Understanding Consistency Level in the Milvus Vector Database
Learn about the four levels of consistency supported in the Milvus vector database: strong, bounded staleness, session, and eventual.
What Is Consistency?
Before getting started, we first need to clarify what consistency means in this article, because "consistency" is an overloaded term in the computing industry. In a distributed database, consistency refers to the property that every node or replica has the same view of the data when writing or reading at a given time. In other words, we are talking about consistency as in the CAP theorem.
Different applications need different levels of data consistency. Luckily, Milvus, a database for AI, is flexible here: you can set the consistency level that best suits your application.
Consistency in the Milvus Vector Database
The concept of consistency level was first introduced with the release of Milvus 2.0. Milvus 1.0 was not a distributed vector database, so it did not offer tunable levels of consistency. Milvus 1.0 flushes data every second, which means that new data are visible almost immediately after insertion, and Milvus reads the most up-to-date data view at the exact moment a vector similarity search or query request arrives.
However, Milvus was refactored in version 2.0, and Milvus 2.0 is a distributed vector database based on a pub-sub mechanism. The PACELC theorem points out that a distributed system must trade off between consistency, availability, and latency. Furthermore, different levels of consistency serve different scenarios. Therefore, Milvus 2.0 introduced the concept of consistency level and supports tuning it.
Four Levels of Consistency in the Milvus Vector Database
Milvus supports four levels of consistency: strong, bounded staleness, session, and eventual. A Milvus user can specify the consistency level when creating a collection or when conducting a vector similarity search or query. This section explains how these four levels differ and which scenarios each is best suited for.
1. Strong
Strong is the highest and the most strict level of consistency. It ensures that users can read the latest version of data.
2. Bounded Staleness
Bounded staleness, as its name suggests, allows data inconsistency during a certain period of time. Outside that period, however, the data are generally globally consistent.
3. Session
Session ensures that all data writes can be immediately perceived in reads during the same session. In other words, when you write data via one client, the newly inserted data instantaneously become searchable.
We recommend choosing session as the consistency level for scenarios where the demand for data consistency within the same session is high. An example is deleting a book entry from a library system. After confirming the deletion and refreshing the page (a different session), the book should no longer be visible in search results.
4. Eventual
There is no guaranteed order of reads and writes, and replicas eventually converge to the same state, provided that no further write operations are performed. Under eventual consistency, replicas start serving read requests with the latest values they have received. Eventual consistency is the weakest of the four levels.
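To make the difference between the four levels concrete, here is a toy model in Python. It is only an illustration under stated assumptions, not Milvus's actual mechanism: writes carry a logical timestamp, a replica has applied writes up to `applied_ts`, and each level is reduced to the minimum timestamp a replica must have applied before it may answer a read. All names (`required_ts`, `ToyReplica`, `staleness`, `session_ts`) are hypothetical.

```python
from dataclasses import dataclass

LEVELS = ("strong", "bounded", "session", "eventual")


def required_ts(level, latest_ts, session_ts=0, staleness=0):
    """Lowest logical timestamp a replica must have applied to answer a read."""
    if level == "strong":
        return latest_ts                      # must see every write so far
    if level == "bounded":
        return max(latest_ts - staleness, 0)  # may lag by at most `staleness`
    if level == "session":
        return session_ts                     # must see this client's own writes
    if level == "eventual":
        return 0                              # any replica state is acceptable
    raise ValueError(f"unknown consistency level: {level!r}")


@dataclass
class ToyReplica:
    applied_ts: int  # writes with timestamp <= applied_ts are visible here

    def can_serve(self, level, latest_ts, session_ts=0, staleness=0):
        """True if this replica is fresh enough for a read at `level`."""
        needed = required_ts(level, latest_ts, session_ts, staleness)
        return self.applied_ts >= needed


# A replica that lags two writes behind the newest write (timestamp 5).
replica = ToyReplica(applied_ts=3)
for level in LEVELS:
    ok = replica.can_serve(level, latest_ts=5, session_ts=2, staleness=2)
    print(f"{level:>8}: {'serve now' if ok else 'wait for newer data'}")
```

In this sketch the lagging replica has to wait only under strong consistency. The weaker levels let it answer immediately, which is the latency side of the trade-off described earlier.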
The default consistency level is bounded staleness (Bounded) in the Milvus vector database. Therefore, the data read might lag behind, and during a similarity search or query Milvus might read a data view from before your delete operations. However, this issue is simple to solve. All you need to do is tune the consistency level when creating a collection or conducting a vector similarity search or query.
For instance, if you want to set the consistency level to strong, you only need to set the value of the parameter consistency_level to Strong.
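As a minimal sketch of setting consistency_level to Strong, the snippet below uses the pymilvus ORM-style API. It is not the article's original example. The host and port, the field names, and the index parameters are assumptions chosen for illustration, and the search helper assumes the collection has already been indexed and loaded.

```python
def create_strong_collection(name, dim=8):
    """Create a collection whose consistency level is Strong."""
    # Imported lazily so this file can be loaded without pymilvus installed.
    from pymilvus import (
        Collection, CollectionSchema, DataType, FieldSchema, connections,
    )

    # Assumed local Milvus instance; adjust host and port for your deployment.
    connections.connect(alias="default", host="localhost", port="19530")
    schema = CollectionSchema(fields=[
        FieldSchema(name="id", dtype=DataType.INT64, is_primary=True, auto_id=False),
        FieldSchema(name="embedding", dtype=DataType.FLOAT_VECTOR, dim=dim),
    ])
    # Consistency level set when the collection is created.
    return Collection(name=name, schema=schema, consistency_level="Strong")


def search_strong(collection, query_vectors, top_k=3):
    """Run a similarity search that requests Strong consistency for this call."""
    return collection.search(
        data=query_vectors,
        anns_field="embedding",  # hypothetical vector field name
        param={"metric_type": "L2", "params": {"nprobe": 10}},  # assumed index settings
        limit=top_k,
        consistency_level="Strong",  # consistency level set per request
    )
```

Setting the level on the collection covers every request against it, while passing it to an individual search lets a single read ask for a different guarantee.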
In the next post, we will unveil the mechanism behind it and explain how the Milvus vector database achieves different levels of consistency. Stay tuned!
Published at DZone with permission of Charles Xie. See the original article here.