Comprehensive Guide to Optimising MongoDB Performance

Srinivas Mutyala • 7 min read • Published Jul 09, 2024 • Updated Jul 09, 2024
MongoDB is celebrated for its high performance and scalability, making it a popular choice among NoSQL databases. However, to fully leverage its potential, fine-tuning your MongoDB deployment is essential. This guide outlines various strategies and best practices for enhancing MongoDB performance, covering everything from identifying bottlenecks to optimizing queries and hardware.

Understanding your workload

Before diving into performance tuning, it's crucial to understand your workload. MongoDB's performance can vary significantly based on whether your application is read-heavy, write-heavy, or a balanced mix. Utilize tools like MongoDB's Atlas Profiler or the open-source mongostat to analyze your database operations and gain insights into your workload.
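For a quick, self-managed complement to the Atlas Profiler, you can enable the database profiler from the shell and inspect slow operations directly. A minimal sketch, assuming a WiredTiger deployment; the 100 ms threshold is an arbitrary example:
// Capture operations slower than 100 ms in the current database
db.setProfilingLevel(1, { slowms: 100 })
// Review the most recent slow operations recorded in system.profile
db.system.profile.find({ millis: { $gt: 100 } }).sort({ ts: -1 }).limit(5).pretty()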

Indexing for performance

Effective indexing is one of the most impactful ways to enhance query performance in MongoDB. Here are key practices:
  • Create relevant indexes: Tailor indexes to match your application's query patterns. Use the explain() method to understand query behavior and optimize accordingly.
db.collection.find({ field: value }).explain("executionStats")
You can also view this information in MongoDB Compass, which presents the explain output visually, as shown below.
[Screenshot: MongoDB Compass showing the query { state: 'st1' } against the addrInfo collection in the IndexDB database, with the Explain option available above the result documents.]
[Screenshot: MongoDB Compass Explain Plan showing an IXSCAN on the state_1_city_1 index feeding a FETCH stage: 1 document examined, 1 document returned, 1 index key examined, 0 ms execution time, no in-memory sort.]
  • Avoid over-indexing: While indexes improve query speed, they can hinder write operations and consume additional disk space. Regularly review and remove unused or unnecessary indexes (one way to spot them is sketched after this list).
db.collection.dropIndex("indexName")
  • Use compound indexes: For queries involving multiple fields, compound indexes can significantly boost performance.
db.collection.createIndex({ field1: 1, field2: -1 })
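Following up on the over-indexing point above, the $indexStats aggregation stage reports how often each index has been used since its counter was last reset (typically the last mongod restart); indexes with a consistently low ops count are candidates for review. A minimal sketch; the orders collection name is illustrative:
// Report per-index usage counters, least-used indexes first
db.orders.aggregate([
  { $indexStats: {} },
  { $project: { name: 1, "accesses.ops": 1, "accesses.since": 1 } },
  { $sort: { "accesses.ops": 1 } }
])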

Optimising query patterns

Optimizing your query patterns is crucial for reducing execution time and resource usage:
  • Projection: Use projection to limit the fields returned by your queries, minimizing data transfer and processing load. It's also better to exclude _id with 0 (false) when it isn't a field your application needs, since it is auto-generated by MongoDB.
db.collection.find({ field: value }, { _id: 0, field1: 1, field2: 1 })
  • Aggregation framework: Leverage MongoDB's aggregation framework for complex data processing. Ensure aggregations utilize indexed fields where possible.
db.collection.aggregate([ { $match: { field: value } }, { $group: { _id: "$field", total: { $sum: "$amount" } } } ])
  • Avoid $where: The $where operator can be slow and resource-intensive, so use it sparingly and only when necessary. Where possible, prefer $expr with aggregation operators that do not execute JavaScript (i.e., operators other than $function and $accumulator); because it avoids JavaScript execution, it is faster than $where. If you must write a custom JavaScript expression, $function is preferred over $where.
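To illustrate the bullet above, the two queries below express the same predicate; the field names spent and budget are hypothetical. The $expr version is evaluated without the JavaScript engine and is generally faster:
// Slower: evaluates a JavaScript expression per document
db.collection.find({ $where: "this.spent > this.budget" })
// Faster: same predicate expressed with aggregation operators
db.collection.find({ $expr: { $gt: ["$spent", "$budget"] } })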

Hardware considerations

The hardware on which MongoDB runs plays a crucial role in its performance:
  • RAM: MongoDB relies heavily on RAM to hold the working set. If your dataset exceeds your available RAM, consider upgrading your memory (a quick way to check cache pressure is sketched after this list).
  • Storage: Utilize SSDs for storage to enhance I/O throughput and data access speeds.
  • Network: Ensure your network bandwidth and latency are sufficient, especially in distributed deployments.
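Related to the RAM point above, one rough way to gauge whether the working set fits in memory on a self-managed WiredTiger deployment is to compare the configured cache size with how much of it is in use and how much data is being read back into it. A minimal shell sketch; the field names come from the serverStatus output:
// Inspect WiredTiger cache usage from serverStatus
var cache = db.serverStatus().wiredTiger.cache
print("configured (bytes): " + cache["maximum bytes configured"])
print("in use (bytes):     " + cache["bytes currently in the cache"])
print("read into cache:    " + cache["bytes read into cache"])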

Replication and sharding

MongoDB supports replication and sharding to improve availability and scalability:
  • Replication: This ensures data redundancy and high availability. Configure read preference settings to effectively route read operations across replicas. A replica set is initialized with:
rs.initiate()
The following read preference modes are available in MongoDB and can be configured at the application level:
  • primary: Reads from the primary only
  • primaryPreferred: Reads from the primary if available, otherwise from a secondary
  • secondary: Reads from a secondary only
  • secondaryPreferred: Reads from a secondary if available, otherwise from the primary
  • nearest: Reads from the nearest node based on network latency and operational health
Example: Setting read preferences in application code (Node.js)
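A minimal sketch using the official MongoDB Node.js driver; the connection string, database, collection, and query are placeholders. Here the read preference is set per operation (it can also be set on the client, as shown later in this article):
const { MongoClient } = require("mongodb");

const client = new MongoClient("mongodb://host1,host2,host3/?replicaSet=rs0");

async function run() {
  await client.connect();
  const coll = client.db("mydatabase").collection("mycollection");

  // Route this query to a secondary when one is available
  const docs = await coll
    .find({ state: "st1" }, { readPreference: "secondaryPreferred" })
    .toArray();

  console.log(docs.length);
  await client.close();
}

run().catch(console.error);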
  • Sharding: This distributes data across multiple servers and is crucial for managing large datasets and high-throughput operations. Choose a shard key that evenly distributes data and query load.
sh.enableSharding("mydatabase")
sh.shardCollection("mydatabase.mycollection", { shardKey: 1 })
Choosing a shard key in MongoDB can significantly impact performance depending on whether your workload is read-heavy or write-heavy. Here are some guidelines for selecting a shard key based on your workload:
Read-heavy workloads
Shard key selection: Choose a shard key that evenly distributes read operations across shards.
Considerations: Use a high-cardinality field that ensures even distribution of reads. Avoid shard keys that can cause hot spots where most reads target a single shard.
Example: Use a user ID if user-related queries are common.
sh.shardCollection("mydatabase.mycollection", { userID: 1 })
Write-heavy workloads
Shard key selection: Choose a shard key that balances the write load across shards.
Considerations: Use a field that changes frequently and ensures even write distribution. Avoid monotonically increasing keys (e.g., timestamps) as they can lead to a single shard being a bottleneck.
Example: Use a hashed shard key to distribute writes evenly if you cannot find a naturally well-distributed shard key.
sh.shardCollection("mydatabase.mycollection", { hashedField: "hashed" })
Additional considerations
Monitor and adjust: Continuously monitor performance and adjust shard keys if needed.
Indexing: Ensure indexes are aligned with the shard key for optimal query performance.
By selecting an appropriate shard key and considering the nature of your workload, you can optimize your MongoDB deployment for both read and write operations.

Performance monitoring and maintenance

Regular monitoring and maintenance are vital for sustained performance:
  • Monitoring tools: Utilize MongoDB Atlas, mongostat, and mongotop to monitor database performance and resource usage.
mongostat --host <host>
mongotop --host <host>
  • Routine maintenance: Regularly compact collections, repair databases, and rebalance shards to ensure optimal performance.
db.repairDatabase()
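The bullet above mentions compacting collections; on a self-managed deployment this can be done per collection with the compact command, which rewrites data and indexes and attempts to release unneeded disk space. A minimal sketch; the collection name is illustrative:
// Rewrite and defragment the collection's data and indexes
db.runCommand({ compact: "mycollection" })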

Read/write concerns

The choice of write concern can influence both the performance and the durability of the data.

Performance

A lower write concern (e.g., w: 0) can enhance performance by reducing the latency of the write operation. However, it risks data durability.
Impact on latency
Lower write concern (e.g., w: 0):
Latency reduction:
  • The client does not wait for any acknowledgment from the server.
  • The operation is sent to the server and considered complete from the client's perspective.
  • There is no network round-trip latency as there is no need for the server to respond.
Trade-off:
  • There's an increased risk of data loss since the client receives no confirmation of write success.
  • It's suitable for non-critical data or scenarios where high write throughput is needed with minimal latency.
Higher write concern (e.g., w: 1 or w: "majority"):
Latency increase:
  • The client waits for acknowledgment from the server.
  • For w: 1, the client waits for acknowledgment from the primary node.
  • For w: "majority", the client waits for acknowledgment from a majority of replica set members.
  • Network round-trip latency and server processing time add to the overall latency.
Benefits:
  • Enhanced data durability and consistency.
  • Ensures the write operation is replicated and acknowledged.
db.collection.insertOne({ field: "value" }, { writeConcern: { w: 1 } })
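For comparison with the w: 1 example above, here is a sketch of the same insert issued at the two other write concerns discussed; the 5-second wtimeout is an arbitrary example:
// Fire-and-forget: lowest latency, no acknowledgment, risk of silent data loss
db.collection.insertOne({ field: "value" }, { writeConcern: { w: 0 } })
// Acknowledged by a majority of replica set members, with a 5-second timeout
db.collection.insertOne({ field: "value" }, { writeConcern: { w: "majority", wtimeout: 5000 } })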

Read preferences

The choice of read preference can influence both the performance and the availability of the data.
Performance: Distributing read operations to secondary members can enhance performance by reducing the load on the primary. To do this, set the read preference. Here are examples of how to configure it:

MongoDB Shell

db.getMongo().setReadPref("secondaryPreferred")
Connection URI
mongodb://host1,host2,host3/?readPreference=secondaryPreferred
Application code example (Node.js)
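A minimal sketch, assuming the official MongoDB Node.js driver; host names, database, and collection are placeholders. Here the read preference is set once on the client, so it applies to all reads unless overridden per operation:
const { MongoClient } = require("mongodb");

// All reads from this client prefer secondaries when available
const client = new MongoClient("mongodb://host1,host2,host3/?replicaSet=rs0", {
  readPreference: "secondaryPreferred",
});

async function run() {
  await client.connect();
  const docs = await client.db("mydatabase").collection("mycollection").find({}).toArray();
  console.log(docs.length);
  await client.close();
}

run().catch(console.error);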
By setting the read preference to secondaryPreferred, you direct read operations to secondary members when they are available, reducing the load on the primary node and enhancing overall performance.
Checks to identify common causes of performance issues:
  • Run mongotop and mongostat, and check which namespace is causing the issue.
  • System level: check replication from the primary. Is there replication lag, and how large is the oplog window?
  • Application level: check for any batch loads running from the application.
  • Any slow queries (check with db.currentOp())?
  • Are there proper indexes?
  • Sharded cluster: are the majority of the queries using the shard key?
  • WiredTiger cache: is there cache pressure or frequent evictions?
  • Do you see write contention?
  • Open file limits (check with ulimit -a): ensure the limit is sufficiently high (e.g., 65000).
  • Check whether the mongod process alone is causing the server load, or whether other processes are contributing.
  • top or htop: Monitor CPU and memory usage of mongod and other processes.
  • ps and grep: Run ps aux | grep mongod to view mongod resource usage.
  • iostat: Use iostat -x 1 10 to check disk I/O metrics.
  • vmstat: Run vmstat 1 10 for overall system performance snapshots.
[Diagram: write contention in MongoDB. A single db.col.updateOne() produces a new document version (v1 to v2) under MVCC (Multi-Version Concurrency Control), while concurrent conflicting updates (v2-1, v2-2, v2-3) fail due to write contention. The diagram recommends revising the schema design to avoid contention and highlights the WiredTiger storage engine's optimistic concurrency protocol.]
Write contention in MongoDB can be identified by the following indicators:
  • High locking percentages: Use mongostat to monitor lock percentages; high values indicate contention.
  • Slow write operations: Check for slow write operations using db.currentOp(), which may indicate contention.
  • Frequent write conflicts: Review logs for messages about write conflicts or rejections.
  • Increased latency: Observe increased latency in write-heavy operations or applications.
Example command to monitor lock percentages:
mongostat --host <hostname>
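As an illustration of the db.currentOp() check above, the following shell sketch lists write operations that have been running for more than a few seconds, which is often a useful starting point when investigating contention; the 3-second threshold is arbitrary:
// List active write operations running longer than 3 seconds
db.currentOp({
  active: true,
  secs_running: { $gt: 3 },
  op: { $in: ["insert", "update", "remove"] }
})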
Proper schema design, appropriate indexes, and distributing writes to avoid hotspots can all help mitigate write contention.

Conclusion

Achieving optimal MongoDB performance involves a comprehensive approach, including query optimization, proper indexing, sufficient hardware resources, and continuous monitoring. By implementing the strategies outlined in this guide, you can significantly enhance the efficiency and responsiveness of your MongoDB deployment, ensuring it meets the demands of your applications.
Questions? Comments? Head to the MongoDB Developer Community next.