MongoDB Design Reviews Help Customers Achieve Transformative Results

Steve Jurczak

The pressure to deliver flawless software can weigh heavily on developers' minds and cause teams to second-guess their processes. While no amount of preparation can guarantee success, we've found that a design review conducted by members of the MongoDB Developer Relations team goes a long way toward ensuring that best practices are followed and that optimizations are in place, helping the team deliver with confidence. Design reviews are hour-long sessions in which we partner with our customers to help them fine-tune their data models for specific projects or use cases. They give customers a jump start in the early stages of application design, when the development team is new to MongoDB and trying to understand how best to model their data to achieve their goals. A design review is a valuable enablement session that uses the development team's own workload as a case study to illustrate performant and efficient MongoDB design. We also help customers explore the art of the possible and put them on the right path toward achieving their desired outcomes. Participants leave these sessions with the knowledge and confidence to evolve their designs independently.

The principle underlying these reviews is domain-driven design, an indispensable concept in software engineering. Design isn't merely a box to tick; it's a daily routine for developers. Design reviews are more than academic exercises; they have tangible goals. A primary aim is to enable and educate developers on a global scale, helping them transition away from legacy systems like Oracle. It's about supporting developers, helping them overcome obstacles, and providing critical education and training. Mastery of the tools is essential, and our sessions dig deep into access patterns and schema optimization for performance.

At its core, a design review is a catalyst for transformation. It's a collaborative endeavor that merges expertise and fosters an environment where innovation thrives; it's not just a review. When our guidance and expertise are combined with developer innovation and talent, the journey from envisioning a robust data model to implementing it becomes a shared success. During the session, our experts examine the workload's data-related functional requirements, such as data entities and, in particular, reads and writes, along with non-functional requirements like growth rates, performance, and scalability. With these insights in hand, we can recommend target document schemas that help developers achieve the goals they established before committing their first lines of code. A properly designed document schema is fundamental to performant and cost-efficient operations. Getting the schema wrong is one of the most common reasons projects fail, and design reviews help customers avoid the time and effort lost to a poor schema.

Design reviews in practice

Not long ago, we were approached by a customer in financial services who wanted us to conduct a design review for an application they were building in MongoDB Atlas. The application was designed to give regional account managers a comprehensive view of aggregated performance data. Specifically, it aimed to provide insights into individual stock performance within a customer's portfolio across a specified time frame within a designated region.

When we talked to the customer, they highlighted an issue with their aggregation pipeline, which was taking between 20 and 40 seconds to complete, while their SLA demanded a response time of under two seconds.
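
Before redesigning anything, it helps to confirm where a slow pipeline actually spends its time. Below is a minimal sketch using MongoDB's explain command with executionStats verbosity; the connection string, database name, collection name, and pipeline stages are placeholders for illustration, not the customer's actual workload.

```python
from pymongo import MongoClient

# Placeholder connection and names for illustration only.
client = MongoClient("mongodb+srv://user:pass@cluster.example.mongodb.net")
db = client["trading"]

# A stand-in for the slow pipeline under investigation.
pipeline = [
    {"$match": {"region": "EMEA"}},
    {"$group": {"_id": "$symbol", "volume": {"$sum": "$qty"}}},
]

# Run the aggregation through the explain command instead of executing it
# for results; executionStats reports per-stage timing and how many
# documents each stage examined.
explain_output = db.command({
    "explain": {"aggregate": "stock_activity", "pipeline": pipeline, "cursor": {}},
    "verbosity": "executionStats",
})
print(explain_output)
```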

Most design reviews involve a couple of steps to assess and diagnose the problem. The first is assessing the workload. During this step, a few of the things we look at include the following (a minimal sketch showing how several of these can be gathered from the server appears after the list):

  • Number of collections

  • The shape of the documents in those collections

  • How many records the documents contain

  • How frequently data is being written or updated in the collections

  • What hours of the day see the most activity

  • How much storage is being consumed

  • Whether and how old data is being purged from collections

  • The size of the MongoDB cluster the customer is running
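
As a rough illustration, much of this information can be pulled from the server itself. Here is a minimal sketch, assuming a hypothetical "trading" database; dbStats, the $collStats aggregation stage, and listIndexes are standard MongoDB facilities, but the database and collection names are our assumptions.

```python
from pymongo import MongoClient

# Hypothetical deployment; database and collection names are assumptions.
client = MongoClient("mongodb+srv://user:pass@cluster.example.mongodb.net")
db = client["trading"]

# Database-wide view: number of collections, logical data size, and storage consumed.
db_stats = db.command("dbStats")
print(db_stats["collections"], db_stats["dataSize"], db_stats["storageSize"])

# Per-collection view: document count, average document size, and storage used.
for name in db.list_collection_names():
    stats = next(db[name].aggregate([{"$collStats": {"storageStats": {}}}]))
    s = stats["storageStats"]
    print(name, s["count"], s.get("avgObjSize", 0), s["storageSize"])

# TTL indexes reveal whether, and how aggressively, old data is purged.
for index in db["stock_activity"].list_indexes():
    if "expireAfterSeconds" in index:
        print(index["name"], index["expireAfterSeconds"])
```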

Once we had performed this assessment for our finserv customer, we had a better understanding of the nature and scale of the workload. The next step was examining the structure of the aggregation pipeline. We found that the pipeline included unnecessary steps, such as breaking the data apart and then reassembling it through a series of $unwind and $group stages. The MongoDB DevRel experts suggested operating on arrays directly to reduce the pipeline to just two steps: first, finding the right data, and then looking up the necessary information. Eliminating the $group stage cut the response time to 19 seconds, a significant improvement but still short of the target.
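
To make the change concrete, here is an illustrative before-and-after sketch. It assumes hypothetical collections (a stock_activity collection with an embedded trades array, and a stocks reference collection) and hypothetical fields; it mirrors the shape of the fix rather than reproducing the customer's actual pipeline.

```python
from datetime import datetime

# Hypothetical date window for the report.
start, end = datetime(2024, 1, 1), datetime(2024, 2, 1)

# Before: each document's trades array is exploded with $unwind and then
# reassembled with $group, forcing the server to materialize and regroup
# every element before the final lookup.
before = [
    {"$match": {"region": "EMEA", "ts": {"$gte": start, "$lt": end}}},
    {"$unwind": "$trades"},
    {"$group": {"_id": "$symbol", "volume": {"$sum": "$trades.qty"}}},
    {"$lookup": {"from": "stocks", "localField": "_id",
                 "foreignField": "symbol", "as": "stockInfo"}},
]

# After: the trades array stays intact, and array totals can be computed in
# place with expression operators (e.g., {"$sum": "$trades.qty"}), so the
# pipeline shrinks to finding the right documents and one lookup.
after = [
    {"$match": {"region": "EMEA", "ts": {"$gte": start, "$lt": end}}},
    {"$lookup": {"from": "stocks", "localField": "symbol",
                 "foreignField": "symbol", "as": "stockInfo"}},
]

# db.stock_activity.aggregate(after) would run the simplified pipeline.
```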

In the next step of the design review, the MongoDB DevRel team set out to determine which schema design patterns could be applied to optimize pipeline performance. In this particular case, a high volume of stock activity documents was being written to the database every minute, but users were querying only a limited number of times per day. With this in mind, our DevRel team decided to apply the computed design pattern.

The computed pattern is ideal when data must be computed repeatedly in an application. By pre-calculating and saving commonly requested values, it avoids repeating the same calculation each time the data is requested. For our finserv customer, we were able to pre-calculate the trading volume and the opening, closing, high, and low prices for each stock. These values were then stored in a new collection that the $lookup stage could access. This brought the response time down to 1,800 ms, below the two-second target SLA, but our DevRel team wasn't finished. They performed additional optimizations, including applying the extended reference pattern to embed region data in the pre-computed stock activity documents so that all the related data could be retrieved with a single query, avoiding a $lookup-based join entirely. After these optimizations, the final test execution of the pipeline came in at 377 ms: a 60x improvement in the performance of their aggregation pipeline and more than five times faster than the application's target response time.
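
The sketch below shows how the two patterns might be combined. It is our reconstruction under assumed names (stock_activity, stock_daily_summary, and a region field), not the customer's actual code: a periodic job rolls raw activity up into a summary collection with $merge, embedding the region fields so the read path needs no join.

```python
from datetime import datetime
from pymongo import MongoClient

# Hypothetical connection; all collection and field names are assumptions.
client = MongoClient("mongodb+srv://user:pass@cluster.example.mongodb.net")
db = client["trading"]

# Computed pattern: roll raw per-minute activity up into one document per
# symbol per day, so the expensive aggregation runs once per period instead
# of on every read. ($dateTrunc requires MongoDB 5.0+.)
db["stock_activity"].aggregate([
    {"$sort": {"ts": 1}},  # so $first/$last capture opening/closing prices
    {"$group": {
        "_id": {"symbol": "$symbol",
                "day": {"$dateTrunc": {"date": "$ts", "unit": "day"}}},
        "volume": {"$sum": "$qty"},
        "open":   {"$first": "$price"},
        "close":  {"$last": "$price"},
        "high":   {"$max": "$price"},
        "low":    {"$min": "$price"},
        # Extended reference pattern: copy the frequently read region fields
        # into the summary so reads never need a $lookup back to a regions
        # collection.
        "region": {"$first": "$region"},
    }},
    {"$merge": {"into": "stock_daily_summary", "whenMatched": "replace"}},
])

# The read path is now a single query against the precomputed collection.
day_start, day_end = datetime(2024, 1, 1), datetime(2024, 2, 1)
results = list(db["stock_daily_summary"].find(
    {"region": "EMEA", "_id.day": {"$gte": day_start, "$lt": day_end}}
))
```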

Read the complete story, including a step-by-step breakdown with code examples of how we helped one of our financial services customers achieve a 60x performance improvement.

If you'd like to learn more about MongoDB data modeling and aggregation pipelines, the MongoDB documentation and the data modeling courses at MongoDB University are good places to start.

If you're interested in a Design Review, please contact your account representative.