Atlas Search Overview

MongoDB's Atlas Search allows fine-grained text indexing and querying of data on your Atlas cluster. It enables advanced search functionality for your applications without requiring you to run or manage a separate search system alongside your database. Atlas Search provides several kinds of text analyzers, a rich query language that uses Atlas Search aggregation pipeline stages like $search and $searchMeta in conjunction with other MongoDB aggregation pipeline stages, and score-based results ranking.

Tip

Quickly try Atlas Search without needing an Atlas account, cluster, or collection, with the Atlas Search Playground. To learn more, see the documentation.

Atlas Search Fundamentals

The following concepts form the basis of Atlas Search and are essential to optimizing your search application.

Indexing

In the context of search, an index is a data structure that categorizes data in an easily searchable format. Like the index in the back of a book, a search index is a mapping between terms and the documents that contain those terms, which enables faster retrieval of documents that contain a given term without having to scan the entire collection. Search indexes also contain other relevant metadata, such as the positions of terms in documents. Although both Atlas Search indexes and regular MongoDB indexes make data retrieval faster, they are not the same.
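The term-to-document mapping described above can be illustrated with a small inverted index. This is a minimal sketch in plain Python, not how Atlas Search stores its indexes internally; the function name `build_inverted_index` and the sample documents are made up for illustration.

```python
from collections import defaultdict

def build_inverted_index(docs):
    """Map each term to {doc_id: [positions]} for a dict of doc_id -> text."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        # Naive analysis: lowercase and split on whitespace.
        for position, term in enumerate(text.lower().split()):
            index[term].setdefault(doc_id, []).append(position)
    return index

# Hypothetical sample data.
docs = {
    1: "the quick brown fox",
    2: "the lazy brown dog",
    3: "quick thinking saves the day",
}
index = build_inverted_index(docs)

# Looking up a term reads one index entry instead of scanning every document.
print(sorted(index["brown"]))  # [1, 2]
print(index["quick"])          # {1: [1], 3: [0]}
```

Storing positions alongside document IDs is what makes phrase-style matching possible, since a query can check whether two terms appear next to each other.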

Creating at least one search index is usually required in any search application. For more information, see Create and Manage Atlas Search Indexes.
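As a rough illustration, a search index definition with static field mappings might look like the following, written here as a Python dictionary. The field names `title` and `plot` are hypothetical; `lucene.standard` and `lucene.english` are built-in analyzers. Treat this as a sketch and check the index definition reference for the options that apply to your data.

```python
import json

# Hypothetical field names ("title", "plot"); adjust to your own schema.
index_definition = {
    "mappings": {
        # Static mapping: index only the fields listed under "fields".
        "dynamic": False,
        "fields": {
            "title": {"type": "string", "analyzer": "lucene.standard"},
            "plot": {"type": "string", "analyzer": "lucene.english"},
        },
    }
}

# Print the definition as JSON, the form used by the Atlas UI and API.
print(json.dumps(index_definition, indent=2))
```

Setting `dynamic` to `True` instead would index all supported fields automatically, at the cost of a larger index.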

Tokenization

When creating a search index, data must first be transformed into a sequence of tokens or terms. An analyzer facilitates this process through steps including:

Tokenization: Breaking up words in a string into indexable tokens. For example, dividing a sentence by whitespace and punctuation.
Normalization: Organizing data so that it is consistently represented and easier to analyze. For example, transforming text to lower case or removing unwanted words called stop words.
Stemming: Reducing words to their root form. For example, ignoring suffixes, prefixes, and plural word forms.
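
The steps above can be sketched as a toy analyzer. This is a simplified illustration in plain Python, not the behavior of any built-in Atlas Search analyzer: the stop word list is deliberately tiny, and the `stem` function is a crude suffix stripper standing in for a real stemming algorithm.

```python
import re

STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in"}  # tiny illustrative list

def tokenize(text):
    # Tokenization: split on anything that is not a letter or digit.
    return [t for t in re.split(r"[^A-Za-z0-9]+", text) if t]

def normalize(tokens):
    # Normalization: lowercase, then drop stop words.
    lowered = (t.lower() for t in tokens)
    return [t for t in lowered if t not in STOP_WORDS]

def stem(token):
    # Stemming: crude suffix stripping, far simpler than a real stemmer.
    for suffix in ("ing", "es", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token

def analyze(text):
    return [stem(t) for t in normalize(tokenize(text))]

print(analyze("The foxes were jumping over the lazy dogs."))
# ['fox', 'were', 'jump', 'over', 'lazy', 'dog']
```

Because the same analysis is applied at index time and at query time, a query for "jumps" and a document containing "jumping" can both reduce to the token `jump` and therefore match.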

Tokenization is language-specific and can require making additional choices. Which analyzer to use depends on your data and application. For more information, see Process Data with Analyzers.

Querying

Search queries consult the index to return a set of results. Search queries differ from traditional database queries in that they are intended to meet more general information needs. Where a database query must follow a strict syntax, a search query can perform simple text matching, but it can also look for similar phrases, number or date ranges, or patterns expressed as regular expressions or wildcards.
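
To make this concrete, the sketch below builds two aggregation pipelines as Python data structures: one uses the `text` operator and one uses the `range` operator inside a `$search` stage. The helper names, the field names `plot` and `year`, and the query values are hypothetical; the index name `default` is an assumption about how your index is named.

```python
def text_search_pipeline(query, path, index="default", limit=5):
    """Build an aggregation pipeline with a $search stage using the text operator."""
    return [
        {"$search": {"index": index, "text": {"query": query, "path": path}}},
        {"$limit": limit},
        # Project the relevance score alongside the matched field.
        {"$project": {"_id": 0, path: 1, "score": {"$meta": "searchScore"}}},
    ]

def range_search_pipeline(path, gte, lte, index="default"):
    """Build a pipeline that matches numeric or date values between gte and lte."""
    return [{"$search": {"index": index, "range": {"path": path, "gte": gte, "lte": lte}}}]

pipeline = text_search_pipeline("space adventure", "plot")
print(pipeline[0])
print(range_search_pipeline("year", 1990, 1999)[0])
```

With a driver such as PyMongo, a pipeline like this would typically be passed to `collection.aggregate(pipeline)` on a collection that has an Atlas Search index.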

For more information, see Create and Run Atlas Search Queries.

Scoring

Each document receives a relevancy score that enables query results to be returned in order from the highest relevance to the lowest. In the simplest form of scoring, documents score higher if the query term appears frequently in a document and lower if the query term appears across many documents in the collection. Scoring can also be customized. Tailoring search to a specific domain often means customizing the relevance-based default score by boosting, decaying, or modifying it in other ways.
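
The simplest form of scoring described above corresponds to the textbook TF-IDF idea. The following sketch is for illustration only and is not the scoring function Atlas Search itself uses; the function name `tf_idf_scores` and the sample documents are made up.

```python
import math

def tf_idf_scores(term, docs):
    """Score each document for a single term with a basic TF-IDF formula."""
    tokenized = {doc_id: text.lower().split() for doc_id, text in docs.items()}
    # Document frequency: how many documents contain the term at all.
    df = sum(1 for tokens in tokenized.values() if term in tokens)
    if df == 0:
        return {}
    idf = math.log(len(docs) / df) + 1.0  # rarer terms get a larger weight
    scores = {}
    for doc_id, tokens in tokenized.items():
        # Term frequency: share of this document's tokens that are the term.
        tf = tokens.count(term) / len(tokens) if tokens else 0.0
        if tf > 0:
            scores[doc_id] = tf * idf
    return scores

# Hypothetical sample data.
docs = {
    "a": "coffee coffee beans",
    "b": "coffee shop downtown",
    "c": "tea shop downtown",
}
scores = tf_idf_scores("coffee", docs)
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)  # ['a', 'b']
```

Document `a` ranks first because the term appears more often in it, and a term that occurs in fewer documents, such as "beans", receives a larger weight than one that occurs in many.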

For more information, see Score Documents.

Atlas Search Architecture

The Atlas Search mongot process, built on Apache Lucene, interfaces with the mongod database process to create and manage your full-text and vector search indexes and queries.

About the mongot Process

The mongot process performs the following tasks:

Creates Atlas Search indexes based on the rules in the index definition for the collection.
Monitors change streams to track the current state of the documents and indexes in the collections for which you defined Atlas Search indexes.
Processes Atlas Search queries and returns the document IDs and other search metadata for the matching documents to mongod, which then does a full document lookup and returns the results to the client.
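
The query flow in the last item can be pictured with a toy model. Everything in this sketch is a stand-in: `SEARCH_INDEX`, `COLLECTION`, `mongot_search`, and `mongod_run_search` are hypothetical names that only mimic the division of labor, and none of it reflects the real internals or wire protocol of either process.

```python
# Toy stand-ins for the two processes; neither reflects real internals.
SEARCH_INDEX = {"fox": [(3, 0.4), (1, 0.9)]}  # term -> [(doc_id, score)]
COLLECTION = {
    1: {"_id": 1, "title": "Red fox"},
    2: {"_id": 2, "title": "Grey wolf"},
    3: {"_id": 3, "title": "Fox and hound"},
}

def mongot_search(term):
    """Return matching document IDs and scores, best first (no full documents)."""
    return sorted(SEARCH_INDEX.get(term, []), key=lambda hit: hit[1], reverse=True)

def mongod_run_search(term):
    """Ask the search process for IDs, then look up the full documents."""
    results = []
    for doc_id, score in mongot_search(term):
        document = dict(COLLECTION[doc_id])  # full document lookup by ID
        document["score"] = score
        results.append(document)
    return results

for doc in mongod_run_search("fox"):
    print(doc)
```

The point of the split is that the search side only needs to hand back identifiers and metadata, while the database side remains the source of the full documents.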

You can choose a deployment model where the Atlas Search mongot process runs alongside the mongod process on each node in the Atlas cluster or where the mongot process runs on separate search nodes. For testing your search queries and prototyping your application, you can choose the default deployment model where both the mongot and mongod processes run on the same node. However, for production-ready applications, deploy mongot on separate search nodes to avoid any resource contention between the mongot and mongod processes in your production environment.

For guidance on choosing a deployment type for pre-production and production environments, see Atlas Search Deployment Options and Atlas Vector Search Deployment Options.
