How to Index Fields for Vector Search

On this page

  • Considerations
  • Syntax
  • About the vector Type
  • About the filter Type
  • Atlas Vector Search Index Fields
  • Create and Manage Atlas Vector Search Indexes
  • Node Status

You can use the vectorSearch type to index fields for running $vectorSearch queries. You can define the index for the vector embeddings that you want to query and the boolean, numeric, or string values that you want to use to pre-filter your data. Filtering your data is useful to narrow the scope of your semantic search and ensure that certain vector embeddings are not considered for comparison, such as in a multi-tenant environment.

You can use the Atlas UI and Atlas Administration API to create your Atlas Vector Search index.

Note

You can't use the deprecated knnBeta operator to query fields indexed using the vectorSearch type index definition.

You can't index fields inside arrays of documents or fields inside arrays of objects in a vectorSearch type index definition. You can index fields inside documents using the dot notation.

The following syntax defines the vectorSearch index type:

{
  "fields": [
    {
      "type": "vector",
      "path": "<field-to-index>",
      "numDimensions": <number-of-dimensions>,
      "similarity": "euclidean | cosine | dotProduct"
    },
    {
      "type": "filter",
      "path": "<field-to-index>"
    },
    ...
  ]
}
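
As a rough illustration, you could also assemble this definition programmatically before pasting it into the Atlas UI or sending it through the Atlas Administration API. The following Python sketch builds the fields array and applies the constraints listed in the field reference on this page. The helper names (check_path, build_index_definition) and the sample field names (plot_embedding, genre, year) are hypothetical and not part of any Atlas library. Rejecting empty paths and non-positive dimensions is an added assumption.

```python
import json

ALLOWED_SIMILARITY = {"euclidean", "cosine", "dotProduct"}
MAX_DIMENSIONS = 4096  # upper bound stated in this page's field reference


def check_path(path):
    """Reject paths that this page says can't be indexed."""
    # Empty paths are rejected as an assumption; the page only names the dot rules.
    if not path or ".." in path or path.endswith("."):
        raise ValueError(f"unsupported field path: {path!r}")
    return path


def build_index_definition(vector_path, num_dimensions, similarity, filter_paths=()):
    """Hypothetical helper: return a vectorSearch index definition as a dict."""
    if similarity not in ALLOWED_SIMILARITY:
        raise ValueError(f"unknown similarity function: {similarity!r}")
    # The lower bound of 1 is an assumption; the page only states the upper bound.
    if not 1 <= num_dimensions <= MAX_DIMENSIONS:
        raise ValueError(f"numDimensions must be between 1 and {MAX_DIMENSIONS}")
    fields = [{
        "type": "vector",
        "path": check_path(vector_path),
        "numDimensions": num_dimensions,
        "similarity": similarity,
    }]
    # One "filter" entry per field used for pre-filtering.
    fields.extend({"type": "filter", "path": check_path(p)} for p in filter_paths)
    return {"fields": fields}


definition = build_index_definition("plot_embedding", 1536, "cosine", ["genre", "year"])
print(json.dumps(definition, indent=2))
```

The printed JSON has the same shape as the syntax shown above, with one vector entry followed by one filter entry per pre-filter field.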

The vector field must contain an array of numbers of the BSON double data type for querying using the $vectorSearch pipeline stage. You must index the vector field as the vector type inside the fields array.

The following syntax defines the vector field type:

{
  "fields": [
    {
      "type": "vector",
      "path": "<field-to-index>",
      "numDimensions": <number-of-dimensions>,
      "similarity": "euclidean | cosine | dotProduct"
    },
    ...
  ]
}
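
Because the vector field must hold BSON doubles, it can help to coerce embeddings before you store them. The sketch below assumes you write documents with a driver such as PyMongo, which encodes Python float values as BSON doubles and Python int values as BSON integers. The helper name to_double_array is hypothetical.

```python
def to_double_array(values, num_dimensions):
    """Hypothetical helper: coerce an embedding to a list of Python floats."""
    vector = [float(v) for v in values]  # ints become floats, so they are stored as doubles
    if len(vector) != num_dimensions:
        raise ValueError(f"expected {num_dimensions} dimensions, got {len(vector)}")
    return vector


embedding = to_double_array([1, 0, 0.5], num_dimensions=3)
print(embedding)  # prints [1.0, 0.0, 0.5]
```

The length check mirrors the numDimensions value that you declare in the index definition.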

You can optionally index boolean, numeric, and string values to pre-filter your data. Filtering your data is useful to narrow the scope of your semantic search and ensure that not all vectors are considered for comparison. You must index your boolean, numeric, and string fields using the filter type inside the fields array.

The following syntax defines the filter field type:

{
  "fields": [
    {
      "type": "vector",
      ...
    },
    {
      "type": "filter",
      "path": "<field-to-index>"
    },
    ...
  ]
}

Note

Pre-filtering your data doesn't affect the score that Atlas Vector Search returns using $vectorSearchScore for $vectorSearch queries.
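
To make the link between the filter type and a query concrete, the following Python sketch builds an aggregation pipeline whose $vectorSearch stage pre-filters on a field indexed with the filter type. The index name (vector_index), the field names (plot_embedding, tenant_id, title), the numCandidates multiplier, and the helper name build_pipeline are all assumptions made for illustration.

```python
def build_pipeline(query_vector, tenant_id, limit=10):
    """Hypothetical helper: a $vectorSearch stage with a pre-filter."""
    return [
        {
            "$vectorSearch": {
                "index": "vector_index",      # assumed index name
                "path": "plot_embedding",     # assumed field indexed as type "vector"
                "queryVector": query_vector,
                "numCandidates": limit * 10,  # assumed candidate pool size
                "limit": limit,
                # tenant_id must be indexed as type "filter" for this to work.
                "filter": {"tenant_id": {"$eq": tenant_id}},
            }
        },
        {
            "$project": {
                "_id": 0,
                "title": 1,
                # Expose the similarity score of each returned document.
                "score": {"$meta": "vectorSearchScore"},
            }
        },
    ]


pipeline = build_pipeline([0.1, 0.2, 0.3], tenant_id="acme")
```

You would pass the resulting list to your driver's aggregate method. The filter only narrows which documents are compared and does not change how each score is computed.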

The Atlas Vector Search index definition takes the following fields:

Option: fields
Type: array of documents
Necessity: Required
Purpose: Vector and filter fields to index, one per document. At least one document must contain the field definition for the vector field. You can optionally also index boolean, numeric, and string fields, one per document, for pre-filtering the data.

Option: fields.type
Type: string
Necessity: Required
Purpose: Field type to use to index fields for $vectorSearch. You can specify one of the following values:

  • vector - for fields that contain vector embeddings.

  • filter - for fields that contain boolean, numeric, or string values.

Option: fields.path
Type: string
Necessity: Required
Purpose: Name of the field to index. For nested fields, use dot notation to specify the path to embedded fields. You can't index field names that contain two consecutive dots (.) or that end with a dot (.). For example, Atlas Vector Search doesn't support indexing the following field names:

  • foo..bar - field name has two consecutive dots.

  • foo_bar. - field name ends with a dot.

Option: fields.numDimensions
Type: int
Necessity: Required
Purpose: Number of vector dimensions that Atlas Vector Search enforces at index-time and query-time. You must specify a value less than or equal to 4096. You can set this field only for vector type fields.

Option: fields.similarity
Type: string
Necessity: Required
Purpose: Vector similarity function to use to search for the top K nearest neighbors. You can set this field only for vector type fields. Values include:

  • euclidean - measures the distance between ends of vectors. This value allows you to measure similarity based on varying dimensions. To learn more, see Euclidean.

  • cosine - measures similarity based on the angle between vectors. This value allows you to measure similarity that isn't scaled by magnitude. You can't use zero-magnitude vectors with cosine. To measure cosine similarity, we recommend that you normalize your vectors and use dotProduct instead. To learn more, see Cosine.

  • dotProduct - measures similarity like cosine, but takes into account the magnitude of the vector. This value allows you to efficiently measure similarity based on both angle and magnitude. To use dotProduct, you must normalize the vector to unit length at index-time and query-time. To learn more, see Dot Product.

Note

If you normalize the magnitude, cosine and dotProduct are almost identical in measuring similarity.
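
The relationship between cosine and dotProduct can be checked numerically. The sketch below is plain Python on two small sample vectors; it illustrates the math only and makes no claim about how Atlas Vector Search computes or scales scores internally.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(dot(v, v))
    if norm == 0:
        raise ValueError("cannot normalize a zero-magnitude vector")
    return [x / norm for x in v]


def cosine(a, b):
    """Cosine of the angle between two non-zero vectors."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


a, b = [3.0, 4.0], [4.0, 3.0]
print(cosine(a, b))                     # cosine on the raw vectors
print(dot(normalize(a), normalize(b)))  # dot product on the unit-length vectors
```

Both printed values agree up to floating-point rounding, which is why normalizing once at write time and using dotProduct can stand in for cosine.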

You can create and manage Atlas Vector Search indexes from the Atlas UI and the Atlas Administration API. To learn more, see Create and Manage Atlas Vector Search Indexes.

When you create the Atlas Vector Search index, the Status column shows the current state of the index on the primary node of the cluster. Click the View status details link below the status to view the state of the index on all the nodes of the cluster.

When the Status column reads Active, the index is ready to use. In other states, queries against the index may return incomplete results.

Status: Not Started
Description: Atlas has not yet started building the index.

Status: Initial Sync
Description: Atlas is building the index or re-building the index after an edit. When the index is in this state:

  • For a new index, Atlas Vector Search doesn't serve queries until the index build is complete.

  • For an existing index, you can continue to use the old index for existing and new queries until the index rebuild is complete.

Status: Active
Description: Index is ready to use.

Status: Recovering
Description: Replication encountered an error. This state commonly occurs when the current replication point is no longer available on the mongod oplog. You can still query the existing index until it updates and its status changes to Active. Use the error in the View status details modal window to troubleshoot the issue. To learn more, see Fix Atlas Search Issues.

Status: Failed
Description: Atlas could not build the index. Use the error in the View status details modal window to troubleshoot the issue. To learn more, see Fix Atlas Search Issues.

Status: Delete in Progress
Description: Atlas is deleting the index from the cluster nodes.

While Atlas builds the index and after the build completes, the Documents column shows the percentage and number of documents indexed. The column also shows the total number of documents in the collection.
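
If you track index state in your own tooling, the status descriptions above can be condensed into a small decision function. This is only a sketch of that reading: the helper name can_serve_queries and the has_previous_index flag are hypothetical, and the status strings are taken from the Status column as shown on this page.

```python
def can_serve_queries(status, has_previous_index=False):
    """Hypothetical helper: does an index in this state answer queries at all?"""
    if status == "Active":
        return True
    if status == "Initial Sync":
        # A rebuild keeps serving from the old index; a brand-new index serves nothing yet.
        return has_previous_index
    if status == "Recovering":
        # The existing index stays queryable while replication recovers.
        return True
    # Not Started, Failed, Delete in Progress, or an unknown status.
    return False


print(can_serve_queries("Initial Sync"))                           # prints False
print(can_serve_queries("Initial Sync", has_previous_index=True))  # prints True
```

A True result outside the Active state only means that queries are answered; results may still be incomplete until the status returns to Active.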
