Run Vector Search Queries

Atlas Vector Search queries take the form of an aggregation pipeline stage, $vectorSearch. For $vectorSearch queries, Atlas Vector Search returns the results of your semantic search.

Definition

The $vectorSearch stage performs an approximate nearest neighbor (ANN) search on a vector in the specified field. The field that you want to search must be indexed as the Atlas Vector Search vector type inside a vectorSearch type index.

$vectorSearch

A $vectorSearch pipeline has the following prototype form:

{
  "$vectorSearch": {
    "index": "<index-name>",
    "path": "<field-to-search>",
    "queryVector": [<array-of-numbers>],
    "numCandidates": <number-of-candidates>,
    "limit": <number-of-results>,
    "filter": {<filter-specification>}
  }
}

Fields

The $vectorSearch stage takes a document with the following fields:

Field	Type	Necessity	Description
`filter`	document	Optional	Any MQL match expression that compares an indexed field with a boolean, number (not decimals), or string to use as a prefilter. You can use any of the following comparison query and aggregation pipeline operators in your filter: `$gt`, `$lt`, `$gte`, `$lte`, `$eq`, `$ne`, `$in`, `$nin`, `$and`, `$or`. To learn more, see Atlas Vector Search Pre-Filter.
`index`	string	Required	Name of the Atlas Vector Search index to use. Atlas Vector Search doesn't return results if you misspell the index name or if the specified index doesn't already exist on the cluster.
`limit`	number	Required	Number (of type `int` only) of documents to return in the results. Value can't exceed the value of `numCandidates`.
`numCandidates`	number	Required	Number of nearest neighbors to use during the search. Value must be less than or equal to (`<=`) `10000` and can't be less than the number of documents to return (`limit`). We recommend that you specify a number higher than `limit` to increase accuracy, although this might impact latency. For example, we recommend a ratio of ten to twenty nearest neighbors for a limit of only one document. This overrequest pattern is the recommended way to trade off latency and recall in your ANN searches, and we recommend tuning it on your specific dataset.
`path`	string	Required	Indexed vector type field to search. To learn more, see Path Construction.
`queryVector`	array of numbers	Required	Array of numbers of the BSON `double` type that represent the query vector. The array size must match the number of vector `dimensions` specified in the index definition for the field. Note: You must embed your query with the same model that you used to embed the data.
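
The constraints in the table above can be checked on the client before a query is sent. The following is a minimal sketch, not part of the Atlas API or any MongoDB driver: buildVectorSearchStage is a hypothetical helper, the index name vector_index and the three-element query vector are placeholder values, and the requirements that limit be positive and that numCandidates be an integer are assumptions of this sketch.

```javascript
// Hypothetical helper (not part of any MongoDB driver): assembles a
// $vectorSearch stage and enforces the field constraints described above.
function buildVectorSearchStage({ index, path, queryVector, numCandidates, limit, filter }) {
  if (typeof index !== "string" || index.length === 0) {
    throw new Error("index is required and must be a string");
  }
  if (typeof path !== "string" || path.length === 0) {
    throw new Error("path is required and must be a string");
  }
  if (!Array.isArray(queryVector) || !queryVector.every((x) => typeof x === "number")) {
    throw new Error("queryVector must be an array of numbers");
  }
  // Assumption of this sketch: limit must also be positive.
  if (!Number.isInteger(limit) || limit < 1) {
    throw new Error("limit must be a positive integer");
  }
  // Assumption of this sketch: numCandidates is treated as an integer.
  if (!Number.isInteger(numCandidates) || numCandidates > 10000) {
    throw new Error("numCandidates must be an integer no greater than 10000");
  }
  if (numCandidates < limit) {
    throw new Error("numCandidates can't be less than limit");
  }
  const stage = { index, path, queryVector, numCandidates, limit };
  if (filter !== undefined) {
    stage.filter = filter; // optional pre-filter
  }
  return { $vectorSearch: stage };
}

// Placeholder values for illustration only. A real query vector must have
// as many elements as the dimensions in the index definition.
const stage = buildVectorSearchStage({
  index: "vector_index",
  path: "plot_embedding",
  queryVector: [0.1, 0.2, 0.3],
  numCandidates: 150, // overrequest relative to limit
  limit: 10,
});
```

The helper can't verify that the query vector length matches the indexed dimensions, because that value lives in the index definition rather than in the query.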

Behavior

$vectorSearch must be the first stage of any pipeline where it appears.

Atlas Vector Search Index

Before you can search a field with the $vectorSearch stage, you must index it in a vectorSearch type index definition. You can index the following types of fields in an Atlas Vector Search vectorSearch type index definition:

Fields that contain vector embeddings as vector type.
Fields that contain boolean, numeric, and string values as filter type to enable vector search on pre-filtered data.

To learn more about these Atlas Vector Search field types, see How to Index Fields for Vector Search.

Atlas Vector Search Score

Atlas Vector Search assigns a score, in a fixed range from 0 to 1, to every document that it returns. For cosine and dotProduct similarities, Atlas Vector Search normalizes the score using the following algorithm:

score = (1 + cosine/dot_product(v1,v2)) / 2
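
To make the normalization concrete, the following sketch applies the formula above to cosine similarity computed on the client. It only illustrates the arithmetic and is not how Atlas Vector Search computes scores internally; cosineSimilarity and normalizeScore are hypothetical helper names, and the two-element vectors are made-up inputs.

```javascript
// Cosine similarity of two equal-length numeric vectors.
function cosineSimilarity(v1, v2) {
  if (v1.length !== v2.length) {
    throw new Error("vectors must have the same length");
  }
  let dot = 0;
  let norm1 = 0;
  let norm2 = 0;
  for (let i = 0; i < v1.length; i++) {
    dot += v1[i] * v2[i];
    norm1 += v1[i] * v1[i];
    norm2 += v2[i] * v2[i];
  }
  return dot / (Math.sqrt(norm1) * Math.sqrt(norm2));
}

// The normalization formula: maps a similarity in [-1, 1] to a score in [0, 1].
function normalizeScore(similarity) {
  return (1 + similarity) / 2;
}

const sameDirection = normalizeScore(cosineSimilarity([1, 0], [2, 0]));      // 1
const orthogonal = normalizeScore(cosineSimilarity([1, 0], [0, 3]));         // 0.5
const oppositeDirection = normalizeScore(cosineSimilarity([1, 0], [-4, 0])); // 0
```

Vectors that point in the same direction score 1, orthogonal vectors score 0.5, and opposite vectors score 0.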

The score assigned to a returned document is part of the document's metadata. To include each returned document's score along with the result set, use a $project stage in your aggregation pipeline.

To retrieve the score of your Atlas Vector Search query results, add a $project stage after the $vectorSearch stage and set a field, such as score, to a $meta expression with the value vectorSearchScore.

Example

db.<collection>.aggregate([
  {
    "$vectorSearch": {
      <query-syntax>
    }
  },
  {
    "$project": {
      "<field-to-include>": 1,
      "<field-to-exclude>": 0,
      "score": { "$meta": "vectorSearchScore" }
    }
  }
])

Note

Pre-filtering your data doesn't affect the score that Atlas Vector Search returns using vectorSearchScore for $vectorSearch queries.

Atlas Vector Search Pre-Filter

The $vectorSearch filter option matches only BSON boolean, string, and numeric values. You must index the fields that you want to filter your data by as the filter type in a vectorSearch type index definition. Filtering your data is useful to narrow the scope of your semantic search and ensure that not all vectors are considered for comparison.

The $vectorSearch filter option supports only the following comparison query operators:

$gt
$lt
$gte
$lte
$eq
$ne
$in
$nin

Only matches a single value and doesn't support an array of values.

Note

Atlas Vector Search also supports the short form of $eq. In the short form, you don't need to specify $eq in the query. For example, consider the following $eq query:

{ "genres": { "$eq": "Comedy" } }

You can run the preceding query using the short form of $eq the following way:

{ "genres": "Comedy" }

The $vectorSearch filter option supports only the following aggregation pipeline operators:

$and
$or

Note

The $vectorSearch filter option doesn't support other comparison query and aggregation pipeline operators.
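
Because unsupported operators are rejected, it can help to check a filter document before you send it. The following sketch walks a filter and reports any operator outside the supported set listed on this page. findUnsupportedOperators is a hypothetical helper, not an Atlas or driver function, and the year and title fields are illustrative field names.

```javascript
// Operators that the filter option supports, as listed on this page.
const SUPPORTED_FILTER_OPERATORS = new Set([
  "$gt", "$lt", "$gte", "$lte", "$eq", "$ne", "$in", "$nin", "$and", "$or",
]);

// Hypothetical helper: walks a filter document and returns every
// $-prefixed key that is not in the supported set.
function findUnsupportedOperators(filter) {
  const unsupported = [];
  const visit = (node) => {
    if (Array.isArray(node)) {
      node.forEach(visit);
    } else if (node !== null && typeof node === "object") {
      for (const [key, value] of Object.entries(node)) {
        if (key.startsWith("$") && !SUPPORTED_FILTER_OPERATORS.has(key)) {
          unsupported.push(key);
        }
        visit(value);
      }
    }
  };
  visit(filter);
  return unsupported;
}

const supportedFilter = {
  $and: [
    { genres: "Comedy" },                // short form of $eq
    { year: { $gte: 1990, $lt: 2000 } }, // "year" is an illustrative field
  ],
};
const unsupportedFilter = { title: { $regex: "^The" } };

const okResult = findUnsupportedOperators(supportedFilter);    // empty array
const badResult = findUnsupportedOperators(unsupportedFilter); // contains "$regex"
```

The check covers operator names only. It doesn't confirm that each filtered field is indexed as the filter type, which the query also requires.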

Limitations

$vectorSearch is supported only on Atlas clusters running the following MongoDB versions:

v6.0.11
v7.0.2 and later (including RCs).

$vectorSearch can't be used in a view definition or in the following pipeline stages:

$lookup sub-pipeline
$unionWith sub-pipeline
$facet pipeline stage

You can pass the results of $vectorSearch to this stage.

Supported Clients

You can run $vectorSearch queries using the Atlas Data Explorer, mongosh, and the following drivers:

You can also use Atlas Vector Search with local Atlas deployments that you create with the Atlas CLI. To learn more, see Create a Local Atlas Deployment.

Parallel Query Execution Across Segments

We recommend dedicated search nodes to isolate vector search query processing. You might see improved query performance on dedicated search nodes, and high-CPU systems might provide a larger improvement. When Atlas Vector Search runs on search nodes, it parallelizes query execution across segments of data.

Parallelization of query processing improves the response time in many cases, such as queries on large datasets. Using intra-query parallelism during Atlas Vector Search query processing utilizes more resources, but improves latency for each individual query.

Note

Atlas Vector Search doesn't guarantee that each query will run concurrently. For example, when too many concurrent queries are queued, Atlas Vector Search might fall back to single-threaded execution.

Examples

The following queries search the sample sample_mflix.embedded_movies collection using the $vectorSearch stage. The queries search the plot_embedding field, which contains embeddings created using OpenAI's text-embedding-ada-002 embeddings model. If you added the sample collection to your Atlas cluster and created the sample indexes for the collection, you can run the following queries against the collection.
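
As a hedged illustration of such a query, the following mongosh sketch builds a basic ANN pipeline against the plot_embedding field and projects the score. The index name vector_index is an assumption, as are the title and plot fields in the projection. The query vector is a placeholder with 1536 elements, the output size of text-embedding-ada-002, so the results it returns are not meaningful until you substitute a real embedding of your query text.

```javascript
// Placeholder query vector: text-embedding-ada-002 embeddings have 1536
// dimensions. Replace this with a real embedding of your query text,
// generated by the same model that produced plot_embedding.
const queryVector = new Array(1536).fill(0.01);

const pipeline = [
  {
    $vectorSearch: {
      index: "vector_index",   // assumed index name
      path: "plot_embedding",
      queryVector: queryVector,
      numCandidates: 150,      // overrequest relative to limit
      limit: 10,
    },
  },
  {
    $project: {
      _id: 0,
      title: 1,                // assumed fields of the sample documents
      plot: 1,
      score: { $meta: "vectorSearchScore" },
    },
  },
];

// In mongosh, `db` is defined and the query runs. The guard lets this
// sketch also load in a plain JavaScript runtime without a connection.
if (typeof db !== "undefined") {
  db.getSiblingDB("sample_mflix")
    .embedded_movies.aggregate(pipeline)
    .forEach((doc) => printjson(doc));
}
```

The $vectorSearch stage comes first in the pipeline, and numCandidates is set to fifteen times limit, within the ten-to-twenty ratio recommended for numCandidates.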
