Optimize With MongoDB Atlas: Performance Advisor, Query Analyzer, and More
Optimizing MongoDB performance involves understanding the intricacies of your database's schema and queries, and navigating this landscape might seem daunting. There can be a lot to keep in mind, but MongoDB Atlas provides several tools to help you spot where the way you interact with your data can be improved.
In this tutorial, we're going to go through what some of these tools are, where to find them, and how we can use what they tell us to get the most out of our database. Whether you're a DBA, developer, or just a MongoDB enthusiast, our goal is to empower you with the knowledge to harness the full potential of your data.
As your application grows and use cases evolve, potential problems can present themselves in what was once a well-designed schema. How can you spot these? Well, in Atlas, from the data explorer screen, select the collection you'd like to examine. Above the displayed documents, you'll see a tab called "Schema Anti-Patterns."
Now, in my collection, I have a board that describes the tasks necessary for our next sprint, so my documents look something like this:
    {
      "boardName": "Project Alpha",
      "boardId": "board123",
      "tasks": [
        {
          "taskId": "task001",
          "title": "Design Phase",
          "description": "Complete the initial design drafts.",
          "status": "In Progress",
          "assignedTo": ["user123", "user456"],
          "dueDate": "2024-02-15",
        },
        // 10,000 more tasks
      ]
    }
While this worked fine when our project was small in scope, the list of necessary tasks really grew out of control (relatable, I'm sure). Let's pop over to our "Schema Anti-Patterns" tab and see what it says.
From here, you'll be provided with a list of anti-patterns detected in your database and some potential fixes. If we click the "Avoid using unbounded arrays in documents" item, we can learn a little more.
This collection has a few problems. Inside my documents, I have a substantial array. Large arrays can cause multiple issues, from exceeding the 16 MB document size limit to degrading index performance as the arrays grow. Now that I have identified this, I can click "Learn How to Fix This Issue" to be taken to the MongoDB documentation. In this case, the solution mentioned is referencing. This involves storing the tasks in a separate collection and giving each task a field that indicates which board it belongs to. This will solve my issue of the unbounded array.
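As a rough illustration of referencing, the sketch below splits one board document into a slim board document plus one document per task, each pointing back to its board through `boardId`. The helper name `splitBoardIntoReferences`, the second sample task, and the `boards` and `tasks` collection names in the comments are assumptions made for this example, not something the Atlas documentation prescribes.

```javascript
// Hypothetical helper: turn an embedded "tasks" array into referenced documents.
function splitBoardIntoReferences(board) {
  // Copy every board field except the unbounded "tasks" array.
  const { tasks = [], ...boardWithoutTasks } = board;

  // Each task becomes its own document and records which board it belongs to.
  const taskDocs = tasks.map((task) => ({ ...task, boardId: board.boardId }));

  return { board: boardWithoutTasks, tasks: taskDocs };
}

// Sample data: the second task is invented for illustration.
const embeddedBoard = {
  boardName: "Project Alpha",
  boardId: "board123",
  tasks: [
    { taskId: "task001", title: "Design Phase", status: "In Progress" },
    { taskId: "task002", title: "Build Phase", status: "Not Started" },
  ],
};

const split = splitBoardIntoReferences(embeddedBoard);
console.log(split.board); // board fields only, no "tasks" array
console.log(split.tasks); // one document per task, each with boardId

// In mongosh, the results could then be stored and queried along these lines
// (collection names are assumptions):
//   db.boards.insertOne(split.board)
//   db.tasks.insertMany(split.tasks)
//   db.tasks.createIndex({ boardId: 1 })
//   db.tasks.find({ boardId: "board123" })
```

With this shape, the board document stays small no matter how many tasks the sprint accumulates, and fetching a board's tasks becomes a query on `boardId` in the tasks collection.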
Now, every application is unique, and thus, how you use MongoDB to leverage your data will be equally unique. There is rarely one right answer for how to model your data with MongoDB, but with this tool, you are able to see what is slowing down your database and what you might consider changing, from unused indexes that are increasing your write operation times to over-reliance on the expensive $lookup operation, when embedded documents would do.

While you continue to use your MongoDB database, performance should always be at the back of your mind. Slow performance can hamper the user's experience with your application and can sometimes even make it unusable. With larger datasets and complex operations, these slow operations can become harder to avoid without conscious effort. The Performance Advisor provides a holistic view of your cluster, and as the name suggests, can help identify and solve the performance issues.
The Performance Advisor is a tool available for M10+ clusters and serverless instances. It monitors queries that MongoDB considers slow, based on how long operations on your cluster typically take. When you open up your cluster in MongoDB Atlas, you'll see a tab called "Performance Advisor."
This tool works by continuously monitoring query patterns and workload characteristics in real-time. Leveraging this data, the Performance Advisor automatically identifies potential performance issues and provides specific, actionable recommendations to optimize database operations.
These suggestions often include the creation of new indexes, modifications of existing ones, or the removal of redundant indexes, all aimed at reducing query execution time and improving overall efficiency. Without indexes, MongoDB has to scan all the documents in a collection to return a query result, but indexes allow MongoDB to limit the number of documents it must scan. Indexes improve your query performance, but they come at a performance cost for write operations. Because of this, a tool that can balance the pros and cons of each index becomes invaluable.
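To make that trade-off concrete, here is a toy in-memory model. It is not how MongoDB implements indexes internally; the `ToyCollection` class and its methods are invented for this sketch. It only shows the idea: an equality lookup through an index examines just the matching documents, a scan examines everything, and each index adds work to every insert.

```javascript
// Toy model for illustration only, not MongoDB's real index implementation.
class ToyCollection {
  constructor() {
    this.docs = [];
    this.indexes = new Map(); // field name -> Map(value -> array of docs)
  }

  createIndex(field) {
    const index = new Map();
    for (const doc of this.docs) addToIndex(index, doc[field], doc);
    this.indexes.set(field, index);
  }

  insert(doc) {
    this.docs.push(doc);
    // Write cost: every existing index must also be updated.
    let indexWrites = 0;
    for (const [field, index] of this.indexes) {
      addToIndex(index, doc[field], doc);
      indexWrites += 1;
    }
    return { indexWrites };
  }

  // Equality match on one field; reports how many documents were examined.
  find(field, value) {
    const index = this.indexes.get(field);
    if (index) {
      const matches = index.get(value) ?? [];
      return { matches, docsExamined: matches.length };
    }
    // No index on this field: scan the whole collection.
    const matches = this.docs.filter((doc) => doc[field] === value);
    return { matches, docsExamined: this.docs.length };
  }
}

function addToIndex(index, key, doc) {
  if (!index.has(key)) index.set(key, []);
  index.get(key).push(doc);
}

const rides = new ToyCollection();
for (let i = 0; i < 1000; i++) {
  rides.insert({ rideId: i, passenger_count: (i % 4) + 1 });
}

console.log(rides.find("passenger_count", 1).docsExamined); // 1000: full scan
rides.createIndex("passenger_count");
console.log(rides.find("passenger_count", 1).docsExamined); // 250: matching docs only
console.log(rides.insert({ rideId: 1000, passenger_count: 2 }).indexWrites); // 1
```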
My database seems to be fine with the indexes I have at present, and the Performance Advisor only spots a few areas of improvement in my schemas, but it is important to check this page regularly. That is because it monitors a sample from the 20 most active collections, so the Performance Advisor can regularly present new areas for optimization. Let's take a look at what one of these may look like.
From this example in the MongoDB documentation, we have a database containing information on New York City taxi rides. A typical query on the application would look something like this:
    db.yellow.find({
      "dropoff_datetime": "2014-06-19 21:45:00",
      "passenger_count": 1,
      "trip_distance": { "$gt": 3 }
    })
With a large enough collection, queries that filter on specific fields can become slow if the collection is not properly indexed. If we look at suggested indexes, we're presented with this screen, displaying the indexes we may want to create.
Potential indexes are suggested in order of performance impact, from most impactful to least. Impact is an indication of the potential performance improvement the suggested index will bring. If we're happy with a suggestion, we can select "create index." If we want to refine which recommendations are shown, we can select specific collections from the dropdown or reduce the time range over which we analyze our cluster’s performance.
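For a sense of what a suggested index for the taxi query might look like, the sketch below derives a compound index key pattern from a filter, using MongoDB's documented ESR guideline (equality fields before range fields; sort fields are ignored here because the query has none). The helper `buildIndexKeys` is hypothetical and deliberately simplified. It is not the algorithm the Performance Advisor uses, and the index it produces is only one plausible candidate.

```javascript
// Operators treated as range predicates in this simplified sketch.
const RANGE_OPERATORS = new Set(["$gt", "$gte", "$lt", "$lte"]);

function isRangePredicate(condition) {
  return (
    condition !== null &&
    typeof condition === "object" &&
    !Array.isArray(condition) &&
    Object.keys(condition).some((op) => RANGE_OPERATORS.has(op))
  );
}

// Hypothetical helper: equality fields first, range fields last, all ascending.
function buildIndexKeys(filter) {
  const equalityFields = [];
  const rangeFields = [];
  for (const [field, condition] of Object.entries(filter)) {
    (isRangePredicate(condition) ? rangeFields : equalityFields).push(field);
  }
  const keys = {};
  for (const field of [...equalityFields, ...rangeFields]) keys[field] = 1;
  return keys;
}

// The filter from the taxi query, with the range field listed first on purpose.
const taxiFilter = {
  trip_distance: { $gt: 3 },
  dropoff_datetime: "2014-06-19 21:45:00",
  passenger_count: 1,
};

// Key order: dropoff_datetime, passenger_count, then trip_distance.
console.log(buildIndexKeys(taxiFilter));

// In mongosh, such a key pattern could be passed to createIndex:
//   db.yellow.createIndex(buildIndexKeys(taxiFilter))
```

Whatever index you end up creating, it is worth comparing it against the Performance Advisor's own suggestion and its reported impact before relying on it.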
While the MongoDB Performance Advisor excels at providing proactive, automated recommendations for indexing and general performance improvements, the MongoDB Query Profiler serves a distinct, complementary purpose. It's the go-to tool for a more in-depth and manual examination of your database's operations.
The Query Profiler allows you to capture and analyze detailed information about individual queries, including execution time and resource utilization. This level of granularity is particularly useful when you need to diagnose and troubleshoot specific performance issues that are not directly related to indexing, such as slow-running queries or inefficient query patterns. With the Query Profiler, you can identify queries that are taking longer than expected, understand their execution plans, and pinpoint the exact stages in a query that are causing bottlenecks.
In the cluster dashboard, click on the “Profiler” tab. Atlas clusters have profiling enabled by default. On the Profiler tab, you will see a list of recent queries, commands, and operations. Each entry will display key information such as operation type, namespace, execution time, and more. Look for queries with long execution times. These are your slow queries.
Click on any query to expand and view detailed information, including the query pattern, the number of documents returned, index usage, and execution statistics.
Pay attention to queries that don’t use indexes or that scan a large number of documents. Consider creating new indexes or modifying existing ones to improve query efficiency.
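The same checks can be applied outside the Atlas UI to the `executionStats` section that `explain("executionStats")` returns. The sketch below is one possible way to flag suspicious queries. The field names (`nReturned`, `totalDocsExamined`, `totalKeysExamined`, `executionTimeMillis`) come from MongoDB's explain output, while the helper `assessExecutionStats`, its thresholds, and the sample numbers are assumptions chosen for illustration.

```javascript
// Hypothetical thresholds; tune them for your own workload.
const DEFAULT_LIMITS = { maxExaminedPerReturned: 10, slowMillis: 100 };

// Returns a list of human-readable warnings for one executionStats object.
function assessExecutionStats(stats, limits = DEFAULT_LIMITS) {
  const findings = [];
  // Avoid dividing by zero when a query returns nothing.
  const examinedPerReturned = stats.totalDocsExamined / Math.max(stats.nReturned, 1);

  if (stats.totalKeysExamined === 0 && stats.totalDocsExamined > 0) {
    findings.push("no index keys examined: likely a collection scan");
  }
  if (examinedPerReturned > limits.maxExaminedPerReturned) {
    findings.push(
      `examined about ${Math.round(examinedPerReturned)} documents per document returned`
    );
  }
  if (stats.executionTimeMillis > limits.slowMillis) {
    findings.push(`took ${stats.executionTimeMillis} ms`);
  }
  return findings;
}

// Invented sample values, shaped like the executionStats section of explain output.
const sampleStats = {
  nReturned: 12,
  totalDocsExamined: 120000,
  totalKeysExamined: 0,
  executionTimeMillis: 850,
};

console.log(assessExecutionStats(sampleStats));

// In mongosh, real stats could be obtained with something like:
//   db.yellow.find({ passenger_count: 1 }).explain("executionStats").executionStats
```

An empty result does not prove a query is healthy; it only means none of these three simple heuristics fired.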
After making changes like indexing, monitor the Query Profiler to observe any improvements in query performance. Regularly check the Query Profiler to stay on top of your database’s performance.
Remember, effective database management is an ongoing process, and regular monitoring and optimization are key to maintaining a healthy and efficient database system.
We've seen how to examine our database with a view to improving the data models and queries in our application, but the best method will always be starting with a good design. MongoDB Atlas offers a variety of data modeling templates that are designed to demonstrate best practices for various use cases. These templates serve as a valuable starting point, especially for those who are new to NoSQL databases or are looking to optimize their existing MongoDB schemas. To find them, go to your project overview and you'll see the "Data Toolkit." Under this header, click "Data Modeling Templates."
These templates provide a host of examples demonstrating various use cases, from "Store Sales" to "Financial Analytics." These provide information on data modeling patterns, how best to query data stored like this, and lists of use cases in which these models can be helpful. Are you planning on storing unbounded data? Check out the "Customer Reviews" example. This demonstrates a good example of utilizing the subset pattern.
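As a minimal sketch of the subset pattern, the code below keeps only the most recent reviews embedded in a product document and moves the full list into separate review documents that reference the product. The field names (`productId`, `reviews`, `recentReviews`, `reviewCount`) and the helper `applySubsetPattern` are assumptions for this example and are not taken from the Atlas template itself.

```javascript
// Hypothetical helper: embed only the newest reviews, reference the rest.
function applySubsetPattern(product, maxEmbedded = 3) {
  const { reviews = [], ...productFields } = product;

  // Newest first; ISO 8601 date strings sort correctly as plain strings.
  const newestFirst = [...reviews].sort((a, b) => b.date.localeCompare(a.date));

  return {
    // The product keeps a bounded subset plus a total count.
    product: {
      ...productFields,
      recentReviews: newestFirst.slice(0, maxEmbedded),
      reviewCount: reviews.length,
    },
    // Every review also lives in its own document, pointing at the product.
    reviews: newestFirst.map((review) => ({ ...review, productId: product.productId })),
  };
}

// Invented sample data.
const productWithAllReviews = {
  productId: "prod42",
  name: "Desk Lamp",
  reviews: [
    { reviewId: "r1", date: "2024-01-05", rating: 4 },
    { reviewId: "r2", date: "2024-02-11", rating: 5 },
    { reviewId: "r3", date: "2023-12-20", rating: 3 },
    { reviewId: "r4", date: "2024-02-01", rating: 2 },
  ],
};

const subset = applySubsetPattern(productWithAllReviews, 2);
console.log(subset.product.recentReviews.map((r) => r.reviewId)); // [ 'r2', 'r4' ]
console.log(subset.reviews.length); // 4
```

The embedded array stays bounded by `maxEmbedded`, so the product page can be served from a single small document while older reviews are fetched only when a reader asks for them.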
It is important to consider that even though you might not see an exact example of what your application is, these templates are there to highlight best practices in schema design. You will likely see something that you can apply or should consider with how you are using your data, and learn how best to get the most out of the document model.
Optimizing your MongoDB database is an ongoing process. As your application evolves and your data grows, it's essential to regularly monitor and fine-tune your MongoDB Atlas deployment to ensure consistently high performance.
In this tutorial, we've delved into the world of MongoDB Atlas, exploring the powerful tools and techniques available to optimize both schemas and queries. Stay proactive, monitor performance, and use the tools available to ensure your MongoDB Atlas databases run smoothly and efficiently.
If you would like to learn more about MongoDB and how best to design your data, pop over to the Developer Center and check out "A Summary of Schema Design Anti-Patterns and How to Spot Them," or head to our Developer Community Forums to see what other people are building.