Atlas Vector Search Quick Start
This quick start describes how to index vector embeddings in your data on an Atlas cluster and run queries that search vector embeddings for similar data.
A vector is an array of values arranged in one or more dimensions.
Vector embeddings produced by bidirectional encoder models, such as OpenAI's text-embedding-ada-002, can represent words, phrases, and sentences. You can index vector embeddings and search for items related in meaning or context. To learn more, see Atlas Vector Search Overview.
Objectives
In this quick start, you complete the following steps:
Create an index definition for the sample_mflix.embedded_movies collection that indexes the plot_embedding field as the vector type. The plot_embedding field contains embeddings created using OpenAI's text-embedding-ada-002 embeddings model. The index definition specifies 1536 vector dimensions and measures similarity using cosine.
Run an Atlas Vector Search query that searches the sample_mflix.embedded_movies collection. The query uses the $vectorSearch stage to search the plot_embedding field, which contains embeddings created using OpenAI's text-embedding-ada-002 embeddings model. The query searches the field using vector embeddings for the string "time travel". It considers up to 150 nearest neighbors and returns 10 documents in the results.
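To make the cosine measure mentioned in the steps above concrete, here is a minimal sketch in plain Python. It ranks a few toy vectors by cosine similarity using an exact, brute-force comparison. This is an illustration only: the function names and the 3-dimensional values are hypothetical, and it does not reproduce how Atlas Vector Search selects nearest neighbors internally.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def top_k(query, docs, k):
    """Rank (title, vector) pairs by similarity to the query (exact search)."""
    scored = [(title, cosine_similarity(query, vec)) for title, vec in docs]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy 3-dimensional "embeddings" (hypothetical values, not real model output).
docs = [
    ("A", [1.0, 0.0, 0.0]),
    ("B", [0.9, 0.1, 0.0]),
    ("C", [0.0, 1.0, 0.0]),
]
results = top_k([1.0, 0.0, 0.0], docs, k=2)
```

Vectors that point in nearly the same direction score close to 1, and orthogonal vectors score 0, so "A" and "B" rank ahead of "C" for this query.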
Prerequisites
To complete this quick start, you must meet the following prerequisites.
Required Cluster Configuration
You must have the following cluster configuration:
An Atlas cluster running MongoDB version 6.0.11, 7.0.2, or later (including RCs).
The sample data loaded into your Atlas cluster.
Supported Clients
You must have one of the following applications to run queries on your Atlas cluster:
You can also use Atlas Vector Search with local Atlas deployments that you create with the Atlas CLI. To learn more, see Create a Local Atlas Deployment.
Required Access
To create an Atlas Vector Search index, you must have Project Data Access Admin or higher access to the project.
Index Vector Embeddings
Use the following example to index vector embeddings in your data on an Atlas cluster.
In Atlas, go to the Clusters page for your project.
If it is not already displayed, select the organization that contains your desired project from the Organizations menu in the navigation bar.
If it is not already displayed, select your desired project from the Projects menu in the navigation bar.
If the Clusters page is not already displayed, click Database in the sidebar.
Define your Atlas Vector Search index.
The following example index definition for the sample_mflix.embedded_movies collection indexes the plot_embedding field as the vector type. The plot_embedding field contains embeddings created using OpenAI's text-embedding-ada-002 embeddings model. The index definition specifies 1536 vector dimensions and measures similarity using cosine.
{
  "fields": [{
    "type": "vector",
    "path": "plot_embedding",
    "numDimensions": 1536,
    "similarity": "cosine"
  }]
}
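If you prefer to create the index from code instead of the Atlas UI, the same definition can be expressed as a Python dictionary. The sketch below assumes PyMongo 4.7 or later, where SearchIndexModel accepts a type argument. The helper names, the index name vector_index, and the connection string parameter are hypothetical and are not part of this quick start.

```python
def build_index_definition(path="plot_embedding", dimensions=1536,
                           similarity="cosine"):
    """Return the index definition shown above as a Python dictionary."""
    if not isinstance(dimensions, int) or dimensions <= 0:
        raise ValueError("dimensions must be a positive integer")
    return {
        "fields": [{
            "type": "vector",
            "path": path,
            "numDimensions": dimensions,
            "similarity": similarity,
        }]
    }


def create_index(uri, index_name="vector_index"):
    """Create the index on sample_mflix.embedded_movies.

    Assumption: PyMongo 4.7+ is installed. It is imported lazily so that
    the definition can be built and inspected without the driver.
    """
    from pymongo import MongoClient
    from pymongo.operations import SearchIndexModel

    with MongoClient(uri) as client:
        collection = client["sample_mflix"]["embedded_movies"]
        model = SearchIndexModel(
            definition=build_index_definition(),
            name=index_name,          # hypothetical index name
            type="vectorSearch",
        )
        return collection.create_search_index(model=model)


definition = build_index_definition()
```

Calling create_index with your own Atlas connection string would submit the definition. The index may take some time to build before it can serve queries.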
To learn more, see How to Index Fields for Vector Search.
Run a Vector Search Query
Use the following example to run a query that searches vector embeddings.
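As a rough sketch of the query described in the objectives, the following Python code builds an aggregation pipeline with a $vectorSearch stage that searches the plot_embedding field, considers up to 150 candidates, and returns 10 documents. The index name vector_index, the helper name, and the projected title and plot fields are assumptions. The placeholder vector only shows the expected shape: a real query needs the 1536-dimension embedding of the string "time travel" generated with the same model.

```python
def build_vector_search_pipeline(query_vector, index_name="vector_index",
                                 num_candidates=150, limit=10):
    """Build an aggregation pipeline that starts with a $vectorSearch stage."""
    if limit > num_candidates:
        raise ValueError("limit must not exceed num_candidates")
    return [
        {
            "$vectorSearch": {
                "index": index_name,            # hypothetical index name
                "path": "plot_embedding",
                "queryVector": list(query_vector),
                "numCandidates": num_candidates,
                "limit": limit,
            }
        },
        {
            # Assumed projection: return a few readable fields and the score.
            "$project": {
                "_id": 0,
                "title": 1,
                "plot": 1,
                "score": {"$meta": "vectorSearchScore"},
            }
        },
    ]


# Placeholder only: replace with the real embedding for "time travel".
placeholder_vector = [0.0] * 1536
pipeline = build_vector_search_pipeline(placeholder_vector)
```

With a PyMongo collection object for sample_mflix.embedded_movies, the pipeline would be passed to collection.aggregate(pipeline) and the results iterated like any other cursor.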