Vector Quantization
Note
Atlas Vector Search support for the following is available as a Preview feature:
Ingestion of BSON BinData vector subtype int1.
Automatic scalar quantization.
Automatic binary quantization.
Atlas Vector Search supports automatic quantization of double or 32-bit float values in your vector embeddings. It can also ingest and index your scalar and binary quantized vectors from embedding providers.
About Quantization
Quantization is the process of shrinking full-fidelity vectors into fewer bits. It reduces the amount of main memory required to store each vector in a vector search index because you index reduced-representation vectors instead, which lets you store more vectors or vectors with higher dimensionality. In this way, quantization reduces resource consumption and improves speed. We recommend quantization for applications with a large number of vectors, typically more than 10M.
Scalar Quantization
Scalar quantization involves first identifying the minimum and maximum values for each dimension of the indexed vectors to establish a range of values for a dimension. Then, the range is divided into equally sized intervals or bins. Finally, each float value is mapped to a bin to convert the continuous float values into discrete integers. In Atlas Vector Search, this quantization reduces the vector embedding's RAM cost to one fourth (1/4) of the pre-quantization cost.
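The three steps above can be sketched in a few lines of plain Python. This is only an illustration of the general technique, not Atlas Vector Search's internal implementation: the function name scalar_quantize is hypothetical, and the sketch assumes 256 unsigned bins (0 to 255) where a real int8 representation would typically shift the codes into a signed range.

```python
def scalar_quantize(vectors, bins=256):
    """Map each float to an integer bin using per-dimension min/max ranges.

    Hypothetical helper for illustration only.
    """
    dims = len(vectors[0])
    # Step 1: find the minimum and maximum value of every dimension.
    mins = [min(v[d] for v in vectors) for d in range(dims)]
    maxs = [max(v[d] for v in vectors) for d in range(dims)]

    quantized = []
    for v in vectors:
        row = []
        for d in range(dims):
            span = maxs[d] - mins[d]
            if span == 0:
                row.append(0)  # constant dimension: everything falls in one bin
                continue
            # Steps 2 and 3: split the range into equal bins and pick one.
            idx = int((v[d] - mins[d]) / span * bins)
            row.append(min(idx, bins - 1))  # the maximum lands in the last bin
        quantized.append(row)
    return quantized


vectors = [[0.0, -1.0], [0.5, 0.0], [1.0, 1.0]]
codes = scalar_quantize(vectors)
print(codes)  # [[0, 0], [128, 128], [255, 255]]
```

Each value now fits in one byte instead of four, which is where the roughly fourfold reduction in vector size comes from.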
Binary Quantization
Binary quantization involves assuming a midpoint of 0 for each dimension, which is typically appropriate for embeddings normalized to length 1 such as OpenAI's text-embedding-3-large. Then, each value in the vector is compared to the midpoint and assigned a binary value of 1 if it's greater than the midpoint and a binary value of 0 if it's less than or equal to the midpoint. In Atlas Vector Search, this quantization reduces the vector embedding's RAM cost to one twenty-fourth (1/24) of the pre-quantization cost. The reduction is not 1/32 because the data structure containing the Hierarchical Navigable Small Worlds graph itself, separate from the vector values, isn't compressed.
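The comparison against the midpoint can be sketched as follows. This is a minimal illustration rather than the Atlas Vector Search implementation: the helper names binary_quantize and pack_bits are hypothetical, and packing the most significant bit first is an assumption made for the example.

```python
def binary_quantize(vector, midpoint=0.0):
    """Return one bit per dimension: 1 if the value is greater than the midpoint, else 0."""
    return [1 if x > midpoint else 0 for x in vector]


def pack_bits(bits):
    """Pack bits into bytes, 8 dimensions per byte (most significant bit first, by assumption)."""
    if len(bits) % 8 != 0:
        raise ValueError("number of dimensions must be a multiple of 8")
    packed = bytearray()
    for i in range(0, len(bits), 8):
        byte = 0
        for bit in bits[i:i + 8]:
            byte = (byte << 1) | bit
        packed.append(byte)
    return bytes(packed)


vector = [0.12, -0.40, 0.0, 0.33, -0.05, 0.91, -0.27, 0.08]
bits = binary_quantize(vector)
packed = pack_bits(bits)
print(bits)          # [1, 0, 0, 1, 0, 1, 0, 1]
print(list(packed))  # [149]
```

Eight 32-bit floats (32 bytes) collapse into a single byte, which is the 32x reduction in vector values. Note that the value 0.0 maps to 0 because it is not greater than the midpoint.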
When you run a query, Atlas Vector Search converts the float values in the query vector into a binary vector using the same midpoint, which allows efficient comparison between the query vector and the indexed binary vectors. It then rescores the candidates identified in the binary comparison by reevaluating them with the original float values associated with those results, which further refines the results. The full-fidelity vectors are stored in their own data structure on disk and are referenced only during rescoring when you configure binary quantization, or when you perform exact search against either binary or scalar quantized vectors.
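The two-pass flow described above can be illustrated with a brute-force sketch. The assumptions here are significant: Atlas Vector Search traverses a Hierarchical Navigable Small Worlds graph rather than scanning every vector, and the use of Hamming distance for the binary pass and dot product for rescoring is chosen only for the example. All function names are hypothetical.

```python
def to_bits(vector, midpoint=0.0):
    """Binary-quantize a float vector using the shared midpoint."""
    return [1 if x > midpoint else 0 for x in vector]


def hamming(a, b):
    """Count the positions where two bit vectors differ."""
    return sum(x != y for x, y in zip(a, b))


def dot(a, b):
    """Full-fidelity similarity used for rescoring (assumed: dot product)."""
    return sum(x * y for x, y in zip(a, b))


def search_with_rescoring(query, full_vectors, num_candidates, limit):
    query_bits = to_bits(query)
    # Pass 1: cheap comparison on binary codes to pick candidates.
    by_hamming = sorted(
        range(len(full_vectors)),
        key=lambda i: hamming(query_bits, to_bits(full_vectors[i])),
    )
    candidates = by_hamming[:num_candidates]
    # Pass 2: rescore only the candidates with the original float values.
    rescored = sorted(candidates, key=lambda i: dot(query, full_vectors[i]), reverse=True)
    return rescored[:limit]


full_vectors = [
    [0.9, 0.1, -0.2, 0.3],   # id 0
    [0.1, 0.9, -0.3, 0.2],   # id 1
    [-0.5, -0.5, 0.5, 0.5],  # id 2
    [0.6, 0.6, -0.4, 0.1],   # id 3
]
query = [0.8, 0.2, -0.1, 0.4]
top = search_with_rescoring(query, full_vectors, num_candidates=3, limit=2)
print(top)  # [0, 3]
```

In this example, ids 0, 1, and 3 all share the query's binary code, so the binary pass alone cannot order them. Rescoring with the float values separates them and places id 3 ahead of id 1.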
Requirements
The following table shows the requirements for automatically quantizing and ingesting quantized vectors:
Requirement | For int1 Ingestion | For int8 Ingestion | For Automatic Scalar Quantization | For Automatic Binary Quantization |
---|---|---|---|---|
Requires index definition settings | No | No | Yes | Yes |
Requires BSON binData format | Yes | Yes | No | No |
Storage on mongod | binData(int1) | binData(int8) | binData(float32), array(float32) | binData(float32), array(float32) |
Supported Similarity method | euclidean | cosine, euclidean, dotProduct | cosine, euclidean, dotProduct | cosine, euclidean, dotProduct |
Supported Number of Dimensions | Multiple of 8 | 1 to 4096 | 1 to 4096 | Multiple of 8 |
Supports ENN Search | ENN on int1 | ENN on int8 | ENN on float32 | ENN on float32 |
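The dimension and similarity rows of the table can be expressed as a small pre-flight check that you run before creating an index. This is a hedged sketch: the function check_vector_settings and the labels "int1", "int8", "scalar", and "binary" are hypothetical names used only to mirror the table columns, and the check covers nothing beyond what the table states.

```python
# Rules copied from the requirements table; keys are illustrative labels.
SUPPORTED_SIMILARITY = {
    "int1": {"euclidean"},
    "int8": {"cosine", "euclidean", "dotProduct"},
    "scalar": {"cosine", "euclidean", "dotProduct"},
    "binary": {"cosine", "euclidean", "dotProduct"},
}


def check_vector_settings(kind, num_dimensions, similarity):
    """Return a list of problems; an empty list means the settings match the table."""
    if kind not in SUPPORTED_SIMILARITY:
        raise ValueError(f"unknown kind: {kind}")
    problems = []
    if kind in ("int1", "binary"):
        # int1 ingestion and automatic binary quantization need a multiple of 8.
        if num_dimensions <= 0 or num_dimensions % 8 != 0:
            problems.append("number of dimensions must be a multiple of 8")
    elif not 1 <= num_dimensions <= 4096:
        problems.append("number of dimensions must be between 1 and 4096")
    if similarity not in SUPPORTED_SIMILARITY[kind]:
        problems.append(f"similarity '{similarity}' is not supported for {kind}")
    return problems


print(check_vector_settings("int1", 1024, "euclidean"))  # []
print(check_vector_settings("int1", 1023, "cosine"))
```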
How to Enable Automatic Quantization of Vectors
You can configure Atlas Vector Search to automatically quantize double or 32-bit float values in your vector embeddings to smaller number types such as int8 (scalar) and binary.
For most embedding models, we recommend binary quantization with rescoring. If you want to use lower-dimension models that are not quantization-aware trained (QAT), use scalar quantization because it incurs less loss of representational capacity.
Benefits
Atlas Vector Search provides native capabilities for scalar quantization as well as binary quantization with rescoring. Automatic quantization increases scalability and cost savings for your applications by reducing the storage and computational resources needed to process your vectors efficiently. Automatic quantization reduces the RAM for mongot by 3.75x for scalar and by 24x for binary; the vector values shrink by 4x and 32x respectively, but the Hierarchical Navigable Small Worlds graph itself does not shrink. This improves performance, even at the highest volume and scale.
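The 4x and 32x figures for vector values follow directly from the number of bits stored per dimension. The arithmetic below is a rough sketch that counts only raw vector values; it deliberately excludes the graph, which is why the overall RAM savings quoted above (3.75x and 24x) are smaller. The collection size of 10M vectors and the 1024 dimensions are example values, not measurements.

```python
def vector_value_bytes(num_vectors, num_dimensions, bits_per_dimension):
    """Bytes needed for the raw vector values only (graph overhead excluded)."""
    return num_vectors * num_dimensions * bits_per_dimension // 8


num_vectors, num_dimensions = 10_000_000, 1024  # example values

float32_bytes = vector_value_bytes(num_vectors, num_dimensions, 32)
int8_bytes = vector_value_bytes(num_vectors, num_dimensions, 8)   # scalar quantization
int1_bytes = vector_value_bytes(num_vectors, num_dimensions, 1)   # binary quantization

print(float32_bytes)  # 40960000000
print(int8_bytes)     # 10240000000
print(int1_bytes)     # 1280000000
print(float32_bytes / int8_bytes, float32_bytes / int1_bytes)  # 4.0 32.0
```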
Use Cases
We recommend automatic quantization if you have a large number of full-fidelity vectors, typically more than 10M. After quantization, you index reduced-representation vectors without compromising accuracy when retrieving vectors.
Procedure
To automatically quantize your double or 32-bit float values, specify the type of quantization you want as the value for fields.quantization in your index definition. You can specify one of the following types of quantization as the value:
scalar quantization to produce byte vectors from 32-bit input vectors.
binary quantization to produce bit vectors from 32-bit input vectors.
If you specify automatic quantization on data that is not an array of double or 32-bit float values, Atlas Vector Search silently skips that vector instead of indexing it.
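As an illustration of where the quantization setting goes, the following sketch builds an index definition as a Python dictionary. The helper build_index_definition, the field path "embedding", and the chosen dimensions and similarity are assumptions for the example; the other field keys mirror the index definitions used elsewhere on this page.

```python
import json


def build_index_definition(path, num_dimensions, similarity, quantization):
    """Build a vector index definition with automatic quantization enabled.

    Hypothetical helper; it accepts only the two quantization types
    described in this section.
    """
    if quantization not in ("scalar", "binary"):
        raise ValueError("quantization must be 'scalar' or 'binary'")
    return {
        "fields": [
            {
                "type": "vector",
                "path": path,
                "numDimensions": num_dimensions,
                "similarity": similarity,
                "quantization": quantization,
            }
        ]
    }


definition = build_index_definition("embedding", 1024, "dotProduct", "scalar")
print(json.dumps(definition, indent=2))
```

You could pass a dictionary like this as the definition argument of PyMongo's SearchIndexModel, in the same way as the index examples later on this page.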
How to Ingest Pre-Quantized Vectors
Atlas Vector Search also supports ingestion and indexing of scalar and binary quantized vectors from embedding providers. If you don't already have quantized vectors, you can convert your embeddings to BSON BinData vector subtype float32, int1, or int8 vectors.
Use Cases
We recommend the BSON binData vector subtype for the following use cases:
You need to index quantized vector output from embedding models.
You have a large number of float vectors but want to reduce the storage and WiredTiger footprint (such as disk and memory usage) in mongod.
Benefits
The BinData vector format requires about three times less disk space in your cluster compared to arrays of elements. It allows you to index your vectors with alternate types such as int1 or int8 vectors, reducing the memory needed to build the Atlas Vector Search index for your collection. It reduces the RAM for mongot by 3.75x for scalar and by 24x for binary; the vector values shrink by 4x and 32x respectively, but the Hierarchical Navigable Small Worlds graph itself doesn't shrink.
If you don't already have binData vectors, you can convert your embeddings to this format by using any supported driver before writing your data to a collection. This page walks you through the steps for converting your embeddings to the BinData vector subtype.
Supported Drivers
BSON BinData vector subtype float32, int1, and int8 vector conversion is supported by PyMongo Driver v4.10 or later.
Prerequisites
To convert your embeddings to BSON BinData vector subtype, you need the following:
An Atlas cluster running MongoDB version 6.0.11, 7.0.2, or later.
Ensure that your IP address is included in your Atlas project's access list.
An environment to run interactive Python notebooks such as Colab.
Access to an embedding model that supports byte vector output.
The following embedding model providers support int8 or int1 binData vectors:
Embedding Model Provider | Embedding Model |
---|---|
Cohere | embed-english-v3.0 |
Nomic | nomic-embed-text-v1.5 |
Jina | jina-embeddings-v2-base-en |
Mixedbread | mxbai-embed-large-v1 |
You can use any of these embedding model providers to generate binData vectors. Scalar quantization preserves recall for these models because these models are all trained to be quantization aware. Therefore, recall degradation for scalar quantized embeddings produced by these models is minimal even at lower dimensions like 384.
Procedure
The examples in this procedure use either new data or existing data and Cohere's embed-english-v3.0 model. The example for new data uses sample text strings, which you can replace with your own data. The example for existing data uses a subset of documents without any embeddings from the listingsAndReviews collection in the sample_airbnb database, which you can replace with your own database and collection (with or without any embeddings).
Select the tab based on whether you want to create binData vectors for new data or for data you already have in your Atlas cluster.
Create an interactive Python notebook by saving a file with the .ipynb extension, and then perform the following steps in the notebook. To try the example, replace the placeholders with valid values.
Install the required libraries.
Run the following command to install the PyMongo Driver. If necessary, you can also install libraries from your embedding model provider. This operation might take a few minutes to complete.
pip install pymongo
You must install PyMongo driver v4.10 or later.
Example
Install PyMongo and Cohere
pip --quiet install pymongo cohere
Load the data for which you want to generate BSON vectors in your notebook.
Example
Sample Data to Import
data = [
    "The Great Wall of China is visible from space.",
    "The Eiffel Tower was completed in Paris in 1889.",
    "Mount Everest is the highest peak on Earth at 8,848m.",
    "Shakespeare wrote 37 plays and 154 sonnets during his lifetime.",
    "The Mona Lisa was painted by Leonardo da Vinci.",
]
(Conditional) Generate embeddings from your data.
This step is required if you haven't yet generated embeddings from your data. If you've already generated embeddings, skip this step. To learn more about generating embeddings from your data, see How to Create Vector Embeddings.
Example
Generate Embeddings from Sample Data Using Cohere
Placeholder | Valid Value |
---|---|
<COHERE-API-KEY> | API key for Cohere. |
import cohere

api_key = "<COHERE-API-KEY>"
co = cohere.Client(api_key)

# Request float, int8, and unsigned binary embeddings in a single call
generated_embeddings = co.embed(
    texts=data,
    model="embed-english-v3.0",
    input_type="search_document",
    embedding_types=["float", "int8", "ubinary"]
).embeddings

float32_embeddings = generated_embeddings.float
int8_embeddings = generated_embeddings.int8
int1_embeddings = generated_embeddings.ubinary
Generate the BSON vectors from your embeddings.
You can use the PyMongo driver to convert your native vector embedding to BSON vectors.
Example
Define and Run a Function to Generate BSON Vectors
from bson.binary import Binary, BinaryVectorDtype

def generate_bson_vector(vector, vector_dtype):
    return Binary.from_vector(vector, vector_dtype)

# For all vectors in your collection, generate BSON vectors of float32, int8, and int1 embeddings
bson_float32_embeddings = []
bson_int8_embeddings = []
bson_int1_embeddings = []
for f32_emb, int8_emb, int1_emb in zip(float32_embeddings, int8_embeddings, int1_embeddings):
    bson_float32_embeddings.append(generate_bson_vector(f32_emb, BinaryVectorDtype.FLOAT32))
    bson_int8_embeddings.append(generate_bson_vector(int8_emb, BinaryVectorDtype.INT8))
    bson_int1_embeddings.append(generate_bson_vector(int1_emb, BinaryVectorDtype.PACKED_BIT))
Create documents with the BSON vector embeddings.
If you already have the BSON vector embeddings inside of documents in your collection, skip this step.
Example
Create Documents from the Sample Data
Placeholder | Valid Value |
---|---|
<FIELD-NAME-FOR-FLOAT32-TYPE> | Name of field with float32 values. |
<FIELD-NAME-FOR-INT8-TYPE> | Name of field with int8 values. |
<FIELD-NAME-FOR-INT1-TYPE> | Name of field with int1 values. |
def create_docs_with_bson_vector_embeddings(bson_float32_embeddings, bson_int8_embeddings, bson_int1_embeddings, data):
    docs = []
    for i, (bson_f32_emb, bson_int8_emb, bson_int1_emb, text) in enumerate(zip(bson_float32_embeddings, bson_int8_embeddings, bson_int1_embeddings, data)):
        doc = {
            "_id": i,
            "data": text,
            "<FIELD-NAME-FOR-FLOAT32-TYPE>": bson_f32_emb,
            "<FIELD-NAME-FOR-INT8-TYPE>": bson_int8_emb,
            "<FIELD-NAME-FOR-INT1-TYPE>": bson_int1_emb,
        }
        docs.append(doc)
    return docs

documents = create_docs_with_bson_vector_embeddings(bson_float32_embeddings, bson_int8_embeddings, bson_int1_embeddings, data)
Load your data into your Atlas cluster.
You can load your data from the Atlas UI and programmatically. To learn how to load your data from the Atlas UI, see Insert Your Data. The following steps and associated examples demonstrate how to load your data programmatically by using the PyMongo driver.
Connect to your Atlas cluster.
Placeholder | Valid Value |
---|---|
<ATLAS-CONNECTION-STRING> | Atlas connection string. To learn more, see Connect via Drivers. |
Example
import pymongo

MONGO_URI = "<ATLAS-CONNECTION-STRING>"

def get_mongo_client(mongo_uri):
    # Establish the connection and return the client
    client = pymongo.MongoClient(mongo_uri)
    return client

if not MONGO_URI:
    print("MONGO_URI not set in environment variables")
Load the data into your Atlas cluster.
Placeholder | Valid Value |
---|---|
<DB-NAME> | Name of the database. |
<COLLECTION-NAME> | Name of the collection in the specified database. |
Example
client = get_mongo_client(MONGO_URI)
db = client["<DB-NAME>"]
db.create_collection("<COLLECTION-NAME>")
col = db["<COLLECTION-NAME>"]
col.insert_many(documents)
Create the Atlas Vector Search index on the collection.
You can create Atlas Vector Search indexes by using the Atlas UI, Atlas CLI, Atlas Administration API, and MongoDB drivers. To learn more, see How to Index Fields for Vector Search.
Example
Create Index for the Sample Collection
Placeholder | Valid Value |
---|---|
<FIELD-NAME-FOR-FLOAT32-TYPE> | Name of field with float32 values. |
<FIELD-NAME-FOR-INT8-TYPE> | Name of field with int8 values. |
<FIELD-NAME-FOR-INT1-TYPE> | Name of field with int1 values. |
<INDEX-NAME> | Name of vector type index. |
from pymongo.operations import SearchIndexModel

vector_search_index_definition = {
    "fields": [
        {
            "type": "vector",
            "path": "<FIELD-NAME-FOR-FLOAT32-TYPE>",
            "similarity": "dotProduct",
            "numDimensions": 1024,
        },
        {
            "type": "vector",
            "path": "<FIELD-NAME-FOR-INT8-TYPE>",
            "similarity": "dotProduct",
            "numDimensions": 1024,
        },
        {
            "type": "vector",
            "path": "<FIELD-NAME-FOR-INT1-TYPE>",
            "similarity": "euclidean",
            "numDimensions": 1024,
        }
    ]
}

search_index_model = SearchIndexModel(definition=vector_search_index_definition, name="<INDEX-NAME>", type="vectorSearch")
col.create_search_index(model=search_index_model)
Define a function to run the Atlas Vector Search queries.
The function to run Atlas Vector Search queries must perform the following actions:
Convert the query text to a BSON vector.
Define the pipeline for the Atlas Vector Search query.
Example
Placeholder | Valid Value |
---|---|
<FIELD-NAME-FOR-FLOAT32-TYPE> | Name of field with float32 values. |
<FIELD-NAME-FOR-INT8-TYPE> | Name of field with int8 values. |
<FIELD-NAME-FOR-INT1-TYPE> | Name of field with int1 values. |
<INDEX-NAME> | Name of vector type index. |
<NUMBER-OF-CANDIDATES-TO-CONSIDER> | Number of nearest neighbors to use during the search. |
<NUMBER-OF-DOCUMENTS-TO-RETURN> | Number of documents to return in the results. |
def run_vector_search(query_text, collection, path):
    # Embed the query text in all three representations
    query_text_embeddings = co.embed(
        texts=[query_text],
        model="embed-english-v3.0",
        input_type="search_query",
        embedding_types=["float", "int8", "ubinary"]
    ).embeddings

    # Pick the query vector and BSON dtype that match the indexed field
    if path == "<FIELD-NAME-FOR-FLOAT32-TYPE>":
        query_vector = query_text_embeddings.float[0]
        vector_dtype = BinaryVectorDtype.FLOAT32
    elif path == "<FIELD-NAME-FOR-INT8-TYPE>":
        query_vector = query_text_embeddings.int8[0]
        vector_dtype = BinaryVectorDtype.INT8
    elif path == "<FIELD-NAME-FOR-INT1-TYPE>":
        query_vector = query_text_embeddings.ubinary[0]
        vector_dtype = BinaryVectorDtype.PACKED_BIT
    else:
        raise ValueError(f"Unsupported path: {path}")

    bson_query_vector = generate_bson_vector(query_vector, vector_dtype)

    pipeline = [
        {
            '$vectorSearch': {
                'index': '<INDEX-NAME>',
                'path': path,
                'queryVector': bson_query_vector,
                'numCandidates': <NUMBER-OF-CANDIDATES-TO-CONSIDER>,
                'limit': <NUMBER-OF-DOCUMENTS-TO-RETURN>
            }
        },
        {
            '$project': {
                '_id': 0,
                'data': 1,
                'score': { '$meta': 'vectorSearchScore' }
            }
        }
    ]

    return collection.aggregate(pipeline)
Run the Atlas Vector Search query.
You can run Atlas Vector Search queries programmatically. To learn more, see Run Vector Search Queries.
Example
from pprint import pprint

query_text = "tell me a science fact"
float32_results = run_vector_search(query_text, col, "<FIELD-NAME-FOR-FLOAT32-TYPE>")
int8_results = run_vector_search(query_text, col, "<FIELD-NAME-FOR-INT8-TYPE>")
int1_results = run_vector_search(query_text, col, "<FIELD-NAME-FOR-INT1-TYPE>")

print("results from float32 embeddings")
pprint(list(float32_results))
print("--------------------------------------------------------------------------")
print("results from int8 embeddings")
pprint(list(int8_results))
print("--------------------------------------------------------------------------")
print("results from int1 embeddings")
pprint(list(int1_results))
results from float32 embeddings
[{'data': 'Mount Everest is the highest peak on Earth at 8,848m.',
  'score': 0.6578356027603149},
 {'data': 'The Great Wall of China is visible from space.',
  'score': 0.6420407891273499}]
--------------------------------------------------------------------------
results from int8 embeddings
[{'data': 'Mount Everest is the highest peak on Earth at 8,848m.',
  'score': 0.5149182081222534},
 {'data': 'The Great Wall of China is visible from space.',
  'score': 0.5136760473251343}]
--------------------------------------------------------------------------
results from int1 embeddings
[{'data': 'Mount Everest is the highest peak on Earth at 8,848m.',
  'score': 0.62109375},
 {'data': 'The Great Wall of China is visible from space.',
  'score': 0.61328125}]
Install the required libraries.
Run the following command to install the PyMongo Driver. If necessary, you can also install libraries from your embedding model provider. This operation might take a few minutes to complete.
pip install pymongo
You must install PyMongo driver v4.10 or later.
Example
Install PyMongo and Cohere
pip --quiet install pymongo cohere
Define the functions to generate vector embeddings and convert embeddings to BSON-compatible format.
You must define functions that perform the following by using an embedding model:
Generate embeddings from your existing data if your existing data doesn't have any embeddings.
Convert the embeddings to BSON vectors.
Example
Function to Generate and Convert Embeddings
Placeholder | Valid Value |
---|---|
<COHERE-API-KEY> | API key for Cohere. |
float32 embeddings:

import os
import pymongo
import cohere
from bson.binary import Binary, BinaryVectorDtype

# Specify your Cohere API key and embedding model
os.environ["COHERE_API_KEY"] = "<COHERE-API-KEY>"
cohere_client = cohere.Client(os.environ["COHERE_API_KEY"])

# Function to generate embeddings using Cohere
def get_embedding(text):
    response = cohere_client.embed(
        texts=[text],
        model='embed-english-v3.0',
        input_type='search_document',
        embedding_types=["float"]
    )
    embedding = response.embeddings.float[0]
    return embedding

# Function to convert embeddings to BSON-compatible format
def generate_bson_vector(vector, vector_dtype):
    return Binary.from_vector(vector, vector_dtype)

int8 embeddings:

import os
import pymongo
import cohere
from bson.binary import Binary, BinaryVectorDtype

# Specify your Cohere API key and embedding model
os.environ["COHERE_API_KEY"] = "<COHERE-API-KEY>"
cohere_client = cohere.Client(os.environ["COHERE_API_KEY"])

# Function to generate embeddings using Cohere
def get_embedding(text):
    response = cohere_client.embed(
        texts=[text],
        model='embed-english-v3.0',
        input_type='search_document',
        embedding_types=["int8"]
    )
    embedding = response.embeddings.int8[0]
    return embedding

# Function to convert embeddings to BSON-compatible format
def generate_bson_vector(vector, vector_dtype):
    return Binary.from_vector(vector, vector_dtype)

int1 embeddings:

import os
import pymongo
import cohere
from bson.binary import Binary, BinaryVectorDtype

# Specify your Cohere API key and embedding model
os.environ["COHERE_API_KEY"] = "<COHERE-API-KEY>"
cohere_client = cohere.Client(os.environ["COHERE_API_KEY"])

# Function to generate embeddings using Cohere
def get_embedding(text):
    response = cohere_client.embed(
        texts=[text],
        model='embed-english-v3.0',
        input_type='search_document',
        embedding_types=["ubinary"]
    )
    embedding = response.embeddings.ubinary[0]
    return embedding

# Function to convert embeddings to BSON-compatible format
def generate_bson_vector(vector, vector_dtype):
    return Binary.from_vector(vector, vector_dtype)
Connect to the Atlas cluster and retrieve existing data.
You must provide the following:
Connection string to connect to your Atlas cluster that contains the database and collection for which you want to generate embeddings.
Name of the database that contains the collection for which you want to generate embeddings.
Name of the collection for which you want to generate embeddings.
Example
Connect to Atlas Cluster for Accessing Data
Placeholder | Valid Value |
---|---|
<ATLAS-CONNECTION-STRING> | Atlas connection string. To learn more, see Connect via Drivers. |
# Connect to your Atlas cluster
mongo_client = pymongo.MongoClient("<ATLAS-CONNECTION-STRING>")
db = mongo_client["sample_airbnb"]
collection = db["listingsAndReviews"]

# Filter to exclude null or empty summary fields
filter = { "summary": {"$nin": [None, ""]} }

# Get a subset of documents in the collection
documents = collection.find(filter).limit(50)

# Initialize the count of updated documents
updated_doc_count = 0
Generate, convert, and load embeddings into your collection.
Generate embeddings from your data using any embedding model if your data doesn't already have embeddings. To learn more about generating embeddings from your data, see How to Create Vector Embeddings.
Convert the embeddings to BSON vectors (as shown on line 7 in the following example).
Upload the embeddings to your collection on the Atlas cluster.
These operations might take a few minutes to complete.
Example
Generate, Convert, and Load Embeddings to Collection
float32 embeddings:

for doc in documents:
    # Generate embeddings based on the summary
    summary = doc["summary"]
    embedding = get_embedding(summary)  # Get float32 embedding

    # Convert the float32 embedding to BSON format
    bson_float32 = generate_bson_vector(embedding, BinaryVectorDtype.FLOAT32)

    # Update the document with the BSON embedding
    collection.update_one(
        {"_id": doc["_id"]},
        {"$set": {"embedding": bson_float32}}
    )
    updated_doc_count += 1

print(f"Updated {updated_doc_count} documents with BSON embeddings.")

int8 embeddings:

for doc in documents:
    # Generate embeddings based on the summary
    summary = doc["summary"]
    embedding = get_embedding(summary)  # Get int8 embedding

    # Convert the int8 embedding to BSON format
    bson_int8 = generate_bson_vector(embedding, BinaryVectorDtype.INT8)

    # Update the document with the BSON embedding
    collection.update_one(
        {"_id": doc["_id"]},
        {"$set": {"embedding": bson_int8}}
    )
    updated_doc_count += 1

print(f"Updated {updated_doc_count} documents with BSON embeddings.")

int1 embeddings:

for doc in documents:
    # Generate embeddings based on the summary
    summary = doc["summary"]
    embedding = get_embedding(summary)  # Get int1 embedding

    # Convert the int1 embedding to BSON format
    bson_int1 = generate_bson_vector(embedding, BinaryVectorDtype.PACKED_BIT)

    # Update the document with the BSON embedding
    collection.update_one(
        {"_id": doc["_id"]},
        {"$set": {"embedding": bson_int1}}
    )
    updated_doc_count += 1

print(f"Updated {updated_doc_count} documents with BSON embeddings.")
Create the Atlas Vector Search index on the collection.
You can create Atlas Vector Search indexes by using the Atlas UI, Atlas CLI, Atlas Administration API, and MongoDB drivers in your preferred language. To learn more, see How to Index Fields for Vector Search.
Example
Create Index for the Collection
Placeholder | Valid Value |
---|---|
<INDEX-NAME> | Name of vector type index. |
from pymongo.operations import SearchIndexModel

vector_search_index_definition = {
    "fields": [
        {
            "type": "vector",
            "path": "embedding",
            "similarity": "euclidean",
            "numDimensions": 1024,
        }
    ]
}

search_index_model = SearchIndexModel(definition=vector_search_index_definition, name="<INDEX-NAME>", type="vectorSearch")

collection.create_search_index(model=search_index_model)
The index should take about one minute to build. While it builds, the index is in an initial sync state. When it finishes building, you can start querying the data in your collection.
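Rather than waiting a fixed amount of time, you can poll until the index is ready. The following is a minimal sketch, assuming that PyMongo's list_search_indexes method returns index documents that include a boolean queryable field. The function name wait_until_queryable, the timeout, and the polling interval are illustrative choices, not part of the driver.

```python
import time


def wait_until_queryable(collection, index_name, timeout_s=300, poll_s=5, sleep=time.sleep):
    """Poll the search index until it reports queryable, or give up after timeout_s.

    Assumes each index document returned by list_search_indexes carries
    a boolean 'queryable' field. Returns True if the index became queryable.
    """
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        indexes = list(collection.list_search_indexes(index_name))
        if indexes and indexes[0].get("queryable") is True:
            return True
        sleep(poll_s)
    return False


# Usage with the collection from the previous steps (requires a live cluster):
# if wait_until_queryable(collection, "<INDEX-NAME>"):
#     print("Index is ready for querying.")
```

The sleep parameter exists only so that the function can be exercised without real delays.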
Define a function to run the Atlas Vector Search queries.
The function to run Atlas Vector Search queries must perform the following actions:
Generate embeddings for the query text.
Convert the query text to a BSON vector.
Define the pipeline for the Atlas Vector Search query.
Example
Function to Run Atlas Vector Search Query
Placeholder | Valid Value |
---|---|
<INDEX-NAME> | Name of vector type index. |
<NUMBER-OF-CANDIDATES-TO-CONSIDER> | Number of nearest neighbors to use during the search. |
<NUMBER-OF-DOCUMENTS-TO-RETURN> | Number of documents to return in the results. |
float32 embeddings:

def run_vector_search(query_text, collection, path):
    # Embed the query text itself, not the literal string "query_text"
    query_embedding = get_embedding(query_text)
    bson_query_vector = generate_bson_vector(query_embedding, BinaryVectorDtype.FLOAT32)

    pipeline = [
        {
            '$vectorSearch': {
                'index': '<INDEX-NAME>',
                'path': path,
                'queryVector': bson_query_vector,
                'numCandidates': <NUMBER-OF-CANDIDATES-TO-CONSIDER>,  # for example, 20
                'limit': <NUMBER-OF-DOCUMENTS-TO-RETURN>  # for example, 5
            }
        },
        {
            '$project': {
                '_id': 0,
                'name': 1,
                'summary': 1,
                'score': { '$meta': 'vectorSearchScore' }
            }
        }
    ]

    return collection.aggregate(pipeline)

int8 embeddings:

def run_vector_search(query_text, collection, path):
    # Embed the query text itself, not the literal string "query_text"
    query_embedding = get_embedding(query_text)
    bson_query_vector = generate_bson_vector(query_embedding, BinaryVectorDtype.INT8)

    pipeline = [
        {
            '$vectorSearch': {
                'index': '<INDEX-NAME>',
                'path': path,
                'queryVector': bson_query_vector,
                'numCandidates': <NUMBER-OF-CANDIDATES-TO-CONSIDER>,  # for example, 20
                'limit': <NUMBER-OF-DOCUMENTS-TO-RETURN>  # for example, 5
            }
        },
        {
            '$project': {
                '_id': 0,
                'name': 1,
                'summary': 1,
                'score': { '$meta': 'vectorSearchScore' }
            }
        }
    ]

    return collection.aggregate(pipeline)

int1 embeddings:

def run_vector_search(query_text, collection, path):
    # Embed the query text itself, not the literal string "query_text"
    query_embedding = get_embedding(query_text)
    bson_query_vector = generate_bson_vector(query_embedding, BinaryVectorDtype.PACKED_BIT)

    pipeline = [
        {
            '$vectorSearch': {
                'index': '<INDEX-NAME>',
                'path': path,
                'queryVector': bson_query_vector,
                'numCandidates': <NUMBER-OF-CANDIDATES-TO-CONSIDER>,  # for example, 20
                'limit': <NUMBER-OF-DOCUMENTS-TO-RETURN>  # for example, 5
            }
        },
        {
            '$project': {
                '_id': 0,
                'name': 1,
                'summary': 1,
                'score': { '$meta': 'vectorSearchScore' }
            }
        }
    ]

    return collection.aggregate(pipeline)
Run the Atlas Vector Search query.
You can run Atlas Vector Search queries programmatically. To learn more, see Run Vector Search Queries.
Example
Run a Sample Atlas Vector Search Query
from pprint import pprint

query_text = "ocean view"
query_results = run_vector_search(query_text, collection, "embedding")

print("results from your embeddings")
pprint(list(query_results))
results from your embeddings
[{'name': 'Your spot in Copacabana',
  'score': 0.5468248128890991,
  'summary': 'Having a large airy living room. The apartment is well divided. '
             'Fully furnished and cozy. The building has a 24h doorman and '
             'camera services in the corridors. It is very well located, close '
             'to the beach, restaurants, pubs and several shops and '
             'supermarkets. And it offers a good mobility being close to the '
             'subway.'},
 {'name': 'Twin Bed room+MTR Mongkok shopping&My',
  'score': 0.527062714099884,
  'summary': 'Dining shopping conveniently located Mongkok subway E1, airport '
             'shuttle bus stops A21. Three live two beds, separate WC, 24-hour '
             'hot water. Free WIFI.'},
 {'name': 'Quarto inteiro na Tijuca',
  'score': 0.5222363471984863,
  'summary': 'O quarto disponível tem uma cama de solteiro, sofá e computador '
             'tipo desktop para acomodação.'},
 {'name': 'Makaha Valley Paradise with OceanView',
  'score': 0.5175154805183411,
  'summary': 'A beautiful and comfortable 1 Bedroom Air Conditioned Condo in '
             'Makaha Valley - stunning Ocean & Mountain views All the '
             'amenities of home, suited for longer stays. Full kitchen & large '
             "bathroom. Several gas BBQ's for all guests to use & a large "
             'heated pool surrounded by reclining chairs to sunbathe. The '
             'Ocean you see in the pictures is not even a mile away, known as '
             'the famous Makaha Surfing Beach. Golfing, hiking,snorkeling '
             'paddle boarding, surfing are all just minutes from the front '
             'door.'},
 {'name': 'Cozy double bed room 東涌鄉村雅緻雙人房',
  'score': 0.5149975419044495,
  'summary': 'A comfortable double bed room at G/F. Independent entrance. High '
             'privacy. The room size is around 100 sq.ft. with a 48"x72" '
             'double bed. The village house is close to the Hong Kong Airport, '
             'AsiaWorld-Expo, HongKong-Zhuhai-Macau Bridge, Disneyland, '
             'Citygate outlets, 360 Cable car, shopping centre, main tourist '
             'attractions......'}]
Your results might vary because the example selects a subset of 50 documents from the sample_airbnb.listingsAndReviews namespace in step 3. The selected documents and generated embeddings might be different in your environment.
For an advanced demonstration of this procedure on sample data using Cohere's embed-english-v3.0 embedding model, see this notebook.
Evaluate Your Query Results
You can measure the accuracy of your Atlas Vector Search query by evaluating how closely the results for an ANN search match the results of an ENN search against your quantized vectors. That is, you can compare the results of ANN search with the results of ENN search for the same query criteria and measure how frequently the ANN search results include the nearest neighbors in the results from the ENN search.
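The comparison described above is commonly expressed as recall: the fraction of the ENN (exact) results that also appear in the ANN results. The following is a minimal sketch, assuming you have already collected the document IDs returned by each search for the same query. The function name recall_at_k and the sample IDs are illustrative.

```python
def recall_at_k(ann_ids, enn_ids):
    """Fraction of the exact (ENN) neighbors that the ANN search also returned."""
    if not enn_ids:
        raise ValueError("enn_ids must not be empty")
    return len(set(ann_ids) & set(enn_ids)) / len(enn_ids)


ann_ids = [3, 7, 1, 9, 4]  # IDs returned by the ANN query (sample values)
enn_ids = [3, 1, 7, 2, 4]  # IDs returned by the ENN query (sample values)

recall = recall_at_k(ann_ids, enn_ids)
print(recall)  # 0.8
```

Averaging this value over a representative set of queries gives a more stable estimate than a single query.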
For a demonstration of evaluating your query results, see How to Measure the Accuracy of Your Query Results.