MongoDB Atlas Best Practices: Part 3
Scaling your MongoDB Atlas Deployment, Delivering Continuous Application Availability
MongoDB Atlas radically simplifies the operation of MongoDB. As with any hosted database as a service, there are still decisions you need to make to ensure the best performance and availability for your application. This blog series provides recommendations that will serve as a solid foundation for getting the most out of the MongoDB Atlas service.
We’ll cover four main areas over this series of blog posts:
- In part 1, we got started by preparing for our deployment, focusing specifically on schema design and application access patterns.
- In part 2, we discussed additional considerations as you prepare for your deployment, including indexing, data migration and instance selection.
- In this part 3 post, we are going to dive into how you scale your MongoDB Atlas deployment and achieve your required availability SLAs.
- In the final part 4, we’ll wrap up with best practices for operational management and ensuring data security.
If you want to get a head start and learn about all of these topics now, just go ahead and download the MongoDB Atlas Best Practices guide.
Scaling a MongoDB Atlas Cluster
Horizontal Scaling with Sharding
MongoDB Atlas provides horizontal scale-out for databases using a technique called sharding, which is transparent to applications. MongoDB distributes data across multiple Replica Sets called shards. With automatic balancing, MongoDB ensures data is equally distributed across shards as data volumes grow or the size of the cluster increases or decreases. Sharding allows MongoDB deployments to scale beyond the limitations of a single server, such as bottlenecks in RAM or disk I/O, without adding complexity to the application.
MongoDB Atlas supports three types of sharding policy, enabling administrators to accommodate diverse query patterns:
- Range-based sharding: Documents are partitioned across shards according to the shard key value. Documents with shard key values close to one another are likely to be co-located on the same shard. This approach is well suited for applications that need to optimize range-based queries.
- Hash-based sharding: Documents are uniformly distributed according to an MD5 hash of the shard key value. Documents with shard key values close to one another are unlikely to be co-located on the same shard. This approach guarantees a uniform distribution of writes across shards – provided that the shard key has high cardinality – making it optimal for write-intensive workloads.
- Location-aware sharding: Documents are partitioned according to a user-specified configuration that "tags" shard key ranges to physical shards residing on specific hardware.
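To make the first two policies concrete, here is a minimal PyMongo sketch. The connection string, database (`appdb`), collections, and shard keys are hypothetical examples, not anything from this post, and your Atlas user needs privileges to run these administrative commands.

```python
from pymongo import MongoClient

# Hypothetical Atlas connection string; replace with your own cluster's URI.
client = MongoClient("mongodb+srv://user:password@cluster0.example.mongodb.net")

# Enable sharding on the database before sharding any of its collections.
client.admin.command("enableSharding", "appdb")

# Range-based sharding: documents with nearby "customer_id" values tend to be
# co-located on the same shard, which favors range queries on that field.
client.admin.command("shardCollection", "appdb.orders",
                     key={"customer_id": 1})

# Hash-based sharding: writes are spread evenly across shards, at the cost of
# efficient range queries on the shard key.
client.admin.command("shardCollection", "appdb.events",
                     key={"device_id": "hashed"})
```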
Users should consider deploying a sharded MongoDB Atlas cluster in the following situations:
- RAM Limitation: The size of the system's active working set plus indexes is expected to exceed the maximum amount of RAM available in the provisioned instance.
- Disk I/O Limitation: The system will have a large amount of write activity, and the operating system will not be able to write data fast enough to meet demand, or I/O bandwidth will limit how fast the writes can be flushed to disk.
- Storage Limitation: The data set will grow to exceed the storage capacity of a single node in the system.
Applications that meet these criteria, or that are likely to do so in the future, should be designed for sharding in advance rather than waiting until they have consumed available capacity. Applications that will eventually benefit from sharding should consider which collections they will want to shard and the corresponding shard keys when designing their data models. If a system has already reached or exceeded its capacity, it will be challenging to deploy sharding without impacting the application's performance.
Between 1 and 12 shards can be configured in MongoDB Atlas.
Sharding Best Practices
Users who choose to shard should consider the following best practices.
Select a good shard key: When selecting fields to use as a shard key, there are at least three key criteria to consider:
- Cardinality: Data partitioning is managed in 64 MB chunks by default. A low-cardinality shard key (e.g., a user's home country) will tend to group documents together on a small number of shards, which in turn requires frequent rebalancing of the chunks; furthermore, a single country is likely to exceed the 64 MB chunk size. Instead, a shard key should exhibit high cardinality.
- Insert Scaling: Writes should be evenly distributed across all shards based on the shard key. If the shard key is monotonically increasing, for example, all inserts will go to the same shard even if they exhibit high cardinality, thereby creating an insert hotspot. Instead, the key should be evenly distributed.
- Query Isolation: Queries should be targeted to a specific shard to maximize scalability. If queries cannot be isolated to a specific shard, all shards will be queried in a pattern called scatter/gather, which is less efficient than querying a single shard.
- Ensure uniform distribution of shard keys: When shard keys are not uniformly distributed for reads and writes, operations may be limited by the capacity of a single shard. When shard keys are uniformly distributed, no single shard will limit the capacity of the system.
For more on selecting a shard key, see Considerations for Selecting Shard Keys.
Avoid scatter-gather queries: In sharded systems, queries that cannot be routed to a single shard must be broadcast to multiple shards for evaluation. Because these queries involve multiple shards for each request they do not scale well as more shards are added.
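Continuing the hypothetical `appdb.orders` collection sharded on `customer_id` from the earlier sketch, the following illustrates the difference between a targeted query and a scatter-gather query:

```python
from pymongo import MongoClient

# Assumes the hypothetical cluster and "appdb.orders" collection, sharded on
# "customer_id", from the earlier sketch.
client = MongoClient("mongodb+srv://user:password@cluster0.example.mongodb.net")
orders = client["appdb"]["orders"]

# Targeted query: the filter includes the shard key, so the query router
# (mongos) can send it to the single shard that owns this key's range.
orders.find_one({"customer_id": 12345, "status": "shipped"})

# Scatter-gather query: no shard key in the filter, so the query is
# broadcast to every shard and the results are merged.
list(orders.find({"status": "shipped"}))

# explain() shows how a query was routed and which shards were involved.
plan = orders.find({"customer_id": 12345}).explain()
```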
Use hash-based sharding when appropriate: For applications that issue range-based queries, range-based sharding is beneficial because operations can be routed to the fewest shards necessary, usually a single shard. However, range-based sharding requires a good understanding of your data and queries, which in some cases may not be practical. Hash-based sharding ensures a uniform distribution of reads and writes, but it does not provide efficient range-based operations.
Apply best practices for bulk inserts: Pre-split data into multiple chunks so that no balancing is required during the insert process. For more information see Create Chunks in a Sharded Cluster in the MongoDB Documentation.
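Pre-splitting can be scripted with the `split` administrative command, as in the hedged sketch below. The split points are purely illustrative; appropriate values depend on the distribution of your own shard key, and your Atlas user needs the privileges required for these commands.

```python
from pymongo import MongoClient

# Assumes the hypothetical "appdb.orders" collection, sharded on
# "customer_id", from the earlier sketch; split points are illustrative only.
client = MongoClient("mongodb+srv://user:password@cluster0.example.mongodb.net")

for point in (250_000, 500_000, 750_000):
    # Split the chunk containing this shard key value at that value.
    client.admin.command("split", "appdb.orders",
                         middle={"customer_id": point})

# The balancer (or explicit moveChunk commands) can then distribute the
# resulting empty chunks across shards before the bulk insert begins.
```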
Add capacity before it is needed: Cluster maintenance is lower risk and simpler to manage if capacity is added before the system is over-utilized.
Continuous Availability & Data Consistency
Data Redundancy
MongoDB maintains multiple copies of data, in what are called replica sets, using native replication. Replica failover is fully automated in MongoDB, so it is not necessary to intervene manually to recover nodes in the event of a failure.
A replica set consists of multiple replica nodes. At any given time, one member acts as the primary replica and the other members act as secondary replicas. If the primary member fails for any reason (e.g., a failure of the host system), one of the secondary members is automatically elected to primary and begins to accept all writes; this is typically completed in 2 seconds or less and reads can optionally continue on the secondaries.
Sophisticated algorithms control the election process, ensuring only the most suitable secondary member is promoted to primary, and reducing the risk of unnecessary failovers (also known as "false positives"). The election algorithm evaluates a range of parameters, including heartbeat and connectivity status and an analysis of operation histories to identify the replica set members that have applied the most recent updates from the primary.
A larger number of replica nodes provides increased protection against database downtime in case of multiple machine failures. A MongoDB Atlas replica set can be configured with 3, 5, or 7 replicas. Replica set members are deployed across availability zones so that the failure of a single data center does not interrupt service to the MongoDB Atlas cluster.
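To illustrate how an application rides through a failover, here is a minimal PyMongo sketch, assuming a hypothetical Atlas connection string. The driver tracks the replica set topology and routes writes to whichever member is currently primary, and retryable writes allow a write that was in flight during an election to be retried automatically.

```python
from pymongo import MongoClient
from pymongo.errors import AutoReconnect

# Hypothetical connection string; the driver discovers all replica set
# members and automatically follows the primary after an election.
client = MongoClient(
    "mongodb+srv://user:password@cluster0.example.mongodb.net/appdb",
    retryWrites=True,  # retry a failed write once, e.g. after a failover
)

try:
    client.appdb.orders.insert_one({"customer_id": 12345, "status": "new"})
except AutoReconnect:
    # Raised if a network error interrupts the operation, for example while
    # a new primary is being elected; the application can back off and retry.
    pass
```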
More information on replica sets can be found on the Replication MongoDB documentation page.
Write Guarantees
MongoDB allows administrators to specify the level of persistence guarantee when issuing writes to the database, which is called the write concern. The following options can be selected in the application code:
- Write Acknowledged: This is the default write concern. The `mongod` will confirm the execution of the write operation, allowing the client to catch network, duplicate key, Document Validation, and other exceptions.
- Journal Acknowledged: The `mongod` will confirm the write operation only after it has flushed the operation to the journal on the primary. This confirms that the write operation can survive a `mongod` crash and ensures that the write operation is durable on disk.
- Replica Acknowledged: It is also possible to wait for acknowledgment of writes to other replica set members. MongoDB supports writing to a specific number of replicas. This mode also ensures that the write is written to the journal on the secondaries. Because replicas can be deployed across racks within data centers and across multiple data centers, ensuring writes propagate to additional replicas can provide extremely robust durability.
- Majority: This write concern waits for the write to be applied to a majority of replica set members, and for the write to be recorded in the journal on these replicas – including on the primary.
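These options map directly onto the driver API. Below is a minimal PyMongo sketch; the connection string, database, and collection names are hypothetical.

```python
from pymongo import MongoClient, WriteConcern

client = MongoClient("mongodb+srv://user:password@cluster0.example.mongodb.net")
db = client["appdb"]

# Write Acknowledged (the default): the primary confirms the write.
acknowledged = db.get_collection("orders", write_concern=WriteConcern(w=1))

# Journal Acknowledged: the primary confirms only after journaling the write.
journaled = db.get_collection("orders", write_concern=WriteConcern(w=1, j=True))

# Replica Acknowledged: wait for a specific number of replica set members.
replicated = db.get_collection("orders", write_concern=WriteConcern(w=2))

# Majority: wait for a majority of members to apply and journal the write.
majority = db.get_collection("orders", write_concern=WriteConcern(w="majority"))

majority.insert_one({"customer_id": 12345, "status": "new"})
```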
Read Preferences
Reading from the primary replica is the default configuration, as it guarantees consistency. Updates are typically replicated to secondaries quickly, depending on network latency, but reads from the secondaries will not normally be consistent with reads from the primary. If your application can tolerate eventual consistency, you may choose to read from secondaries; this can be useful for analytics and ETL applications, as it isolates that traffic from operational workloads. Note that the secondaries are not idle, as they must process all writes replicated from the primary, so to increase read capacity in your operational system, consider sharding.
A very useful option is `primaryPreferred`, which issues reads to a secondary replica only if the primary is unavailable. This configuration allows for the continuous availability of reads during the short failover process.
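Read preferences are likewise set through the driver. The following is a minimal PyMongo sketch with hypothetical names.

```python
from pymongo import MongoClient, ReadPreference

client = MongoClient("mongodb+srv://user:password@cluster0.example.mongodb.net")
db = client["appdb"]

# Default: read from the primary for strongly consistent reads.
primary_orders = db.get_collection(
    "orders", read_preference=ReadPreference.PRIMARY)

# primaryPreferred: read from the primary, but fall back to a secondary
# during the brief failover window so reads remain available.
available_orders = db.get_collection(
    "orders", read_preference=ReadPreference.PRIMARY_PREFERRED)

# Secondary reads, e.g. for analytics/ETL traffic isolated from the primary;
# results may be slightly stale (eventually consistent).
analytics_orders = db.get_collection(
    "orders", read_preference=ReadPreference.SECONDARY)
```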
For more on the subject of configurable reads, see the MongoDB Documentation page on replica set Read Preference.
Next Steps
That’s a wrap for part 3 of the MongoDB Atlas best practices blog series. In the final installment, we’ll dive into best practices for operational management and ensuring data security.
Remember, if you want to get a head start and learn about all of our recommendations now, download the MongoDB Atlas Best Practices guide.