MongoDB Blog

Redefining the Database for AI: Why MongoDB Acquired Voyage AI

February 24, 2025

News

Building Gen AI with MongoDB & AI Partners | February 2025

February was big for MongoDB and, more importantly, for anyone looking to build AI applications that deliver highly accurate, relevant information (in other words, for everyone building AI apps). MongoDB announced the acquisition of Voyage AI, a pioneer in state-of-the-art embedding and reranking models that power next-generation AI applications.

Because generative AI is by nature probabilistic, models can “hallucinate” and generate false or misleading information. This can lead to serious risks, especially in industries (e.g., financial services) where accurate information is paramount. To address this, organizations building AI apps need high-quality retrieval; they need to trust that the most relevant information is extracted from their data with precision.

Voyage AI’s advanced embedding and reranking models enable applications to extract meaning from highly specialized and domain-specific text and unstructured data. With roots at Stanford and MIT, Voyage AI’s world-class team is trusted by AI innovators like Anthropic, LangChain, Harvey, and Replit. Integrating Voyage AI’s technology with MongoDB will enable organizations to easily build trustworthy, AI-powered applications by offering highly accurate and relevant information retrieval deeply integrated with operational data.

For more, check out MongoDB CEO Dev Ittycheria’s blog post about Voyage AI and what this means for developers and businesses (in short, delivering high-quality results at scale). Onward!

P.S. If you’re in Vegas for HumanX this week, stop by booth 412 to say hi to MongoDB!

Welcoming new AI and tech partners

The Voyage AI news was hardly the only exciting development last month. In February 2025, MongoDB welcomed three new AI and tech partners that offer product integrations with MongoDB. Read on to learn more about each great new partner!

CopilotKit

Seattle-based CopilotKit provides open source infrastructure for in-app AI copilots. CopilotKit helps organizations build production-ready copilots and agents effortlessly.

“We’re excited to be partnering with MongoDB to help companies build best-in-class copilots that leverage RAG & take action based on internal data,” said Uli Barkai, Co-Founder and Chief Marketing Officer at CopilotKit. “MongoDB made it dead simple to build a scalable vector database with operational data. This collaboration enables developers to easily ship production-grade RAG applications.”

Varonis

Varonis is the leader in data security, protecting data wherever it lives, across SaaS, IaaS, and hybrid cloud environments. Varonis’ cloud-native Data Security Platform continuously discovers and classifies critical data, removes exposures, and detects advanced threats with AI-powered automation.

“Varonis’s mission is to protect data wherever it lives,” said David Bass, Executive Vice President of Engineering and Chief Technology Officer at Varonis. “We are thrilled to further advance our mission by offering AI-powered data security and compliance for MongoDB, the database of choice for high-performance application and AI development. With this integration, joint customers can automatically discover and classify sensitive data, detect abnormal activities, secure AI data pipelines, and prevent data leaks.”

Xlrt

Xlrt is an automated insight-generation platform that enables financial institutions to create innovative financial credit products at scale by simplifying the financial spreading process.

“We are excited to partner with MongoDB Atlas to transform AI-driven financial workflows,” said Rupesh Chaudhuri, Chief Operating Officer and Co-Founder of Xlrt. “XLRT.ai leverages agentic AI, combining graph-based contextualization, vector search, and LLMs to redefine data-driven decision-making. With MongoDB's robust NoSQL and vector search capabilities, we’re delivering unparalleled efficiency, accuracy, and scalability in automating financial processes.”

To learn more about building AI-powered apps with MongoDB, check out our AI Learning Hub and stop by our Partner Ecosystem Catalog to read about our integrations with MongoDB’s ever-evolving AI partner ecosystem. And visit the MongoDB AI Applications Program (MAAP) page to learn how MongoDB and the MAAP ecosystem help organizations build applications with advanced AI capabilities.

March 12, 2025

Artificial Intelligence

ORiGAMi: A Machine Learning Architecture for the Document Model

The document model has proven to be the optimal paradigm for modern application schemas. At MongoDB, we've long understood that semi-structured data formats like JSON offer superior expressiveness compared to traditional tabular and relational representations. Their flexible schema accommodates dynamic and nested data structures, naturally representing complex relationships between data entities.

However, the machine learning (ML) community has faced persistent challenges when working with semi-structured formats. Traditional ML algorithms, as implemented in popular libraries like scikit-learn and pandas, operate on the assumption of fixed-dimensional tabular data consisting of rows and columns. This fundamental mismatch forces data scientists to manually convert JSON documents into tabular form, a time-consuming process that requires significant domain expertise. Recent advances in natural language processing (NLP) demonstrate the power of Transformers in learning from unstructured data, but their application to semi-structured data has been under-studied.

To bridge this gap, MongoDB's ML research group has developed a novel Transformer-based architecture designed for supervised learning on semi-structured data (e.g., JSON data in a document model database). We call this new architecture ORiGAMi (Object Representation through Generative, Autoregressive Modelling), and we're excited to make it available to the community at github.com/mongodb-labs/origami. It includes components that make training a Transformer model feasible on datasets with as few as 200 labeled samples. By combining this data efficiency with the flexibility of Transformers, ORiGAMi enables prediction directly from semi-structured documents, without the cumbersome flattening and manual feature extraction required for tabular data representation. You can read more about our model on arXiv.
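To make the flattening problem concrete, here is a minimal Python sketch of the manual conversion step described above. It is not part of ORiGAMi; the `flatten` helper and the two sample documents are hypothetical and only illustrate how differently shaped documents turn into a sparse table once nested fields are spread into columns.

```python
from typing import Any


def flatten(doc: dict[str, Any], prefix: str = "") -> dict[str, Any]:
    """Flatten nested dicts and lists into one level with dotted keys."""
    flat: dict[str, Any] = {}
    for key, value in doc.items():
        path = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=f"{path}."))
        elif isinstance(value, list):
            # Each array position becomes its own column.
            for i, item in enumerate(value):
                if isinstance(item, dict):
                    flat.update(flatten(item, prefix=f"{path}.{i}."))
                else:
                    flat[f"{path}.{i}"] = item
        else:
            flat[path] = value
    return flat


# Two hypothetical documents with different shapes.
docs = [
    {"plan": "pro", "devices": [{"os": "ios"}, {"os": "android"}]},
    {"plan": "free", "usage": {"storage_gb": 1.5}},
]

rows = [flatten(d) for d in docs]
columns = sorted({col for row in rows for col in row})
table = [[row.get(col) for col in columns] for row in rows]
missing = sum(cell is None for row in table for cell in row)

print(columns)
print(f"{missing} of {len(columns) * len(rows)} cells are empty")
```

Even with two small documents, three of the eight cells are empty, and the array order has been baked into column names. Real collections with longer arrays and optional subdocuments grow sparser still.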
Technical innovation

The key insight behind ORiGAMi lies in its tokenization strategy: documents are transformed into sequences of key-value pairs and special structural tokens that encode nested types like arrays and subdocuments. These token sequences serve as input to the Transformer model trained to predict the next token given a portion of the document, similar to how large language models (LLMs) are trained on text tokens.

What’s more, our modifications to the standard Transformer architecture include guardrails to ensure that the model only generates valid, well-formed documents, and a novel position encoding strategy that respects the order invariance of key/value pairs in JSON. These modifications also allow for much smaller models compared to LLMs, which can thus be trained on consumer hardware in minutes to hours depending on dataset size and complexity, versus days to weeks for LLMs.

By reformulating classification as a next-token prediction task, ORiGAMi can predict any field within a document, including complex types like arrays and nested subdocuments. This unified approach eliminates the need for separate models or preprocessing pipelines for different prediction tasks.

Example use case

Our initial focus has been supervised learning: training models from labeled data to make predictions on unseen documents. Let's explore a practical example of user segmentation. Consider a collection where each document represents a user profile, containing both simple fields and complex nested structures:

    {
      "_id": "user_7842",
      "email": "sarah.chen@example.com",
      "signup_date": "2024-01-15",
      "device_history": [
        { "device": "mobile_ios", "first_seen": "2024-01-15", "last_seen": "2024-02-11" },
        { "device": "desktop_chrome", "first_seen": "2024-01-16", "last_seen": "2024-02-10" }
      ],
      "subscription": {
        "plan": "pro",
        "billing_cycle": "annual",
        "features_used": ["analytics", "api_access", "team_sharing"],
        "usage_metrics": {
          "storage_gb": 45.2,
          "api_calls_per_day": 1250,
          "active_projects": 8
        }
      },
      "user_segment": "enterprise_power_user" // <-- target field
    }

Suppose you want to automatically classify users into segments like "enterprise_power_user", "smb_growth", or "early_stage_startup" based on their behavior and characteristics. Some documents in your collection already have correct labels, perhaps assigned through manual analysis or customer interviews. Traditional ML approaches would require flattening this rich document structure, leading to very sparse tables and potentially losing important hierarchical relationships. With ORiGAMi, you can:

- Train directly on the raw documents with existing labels
- Preserve the full context of nested structures and arrays
- Make predictions for the "user_segment" field on new users immediately after signup
- Update predictions as user behavior evolves without rebuilding feature pipelines

Getting started with ORiGAMi

We're excited to be open-sourcing ORiGAMi (github.com/mongodb-labs/origami), and you can read more about our model on arXiv. We've also included a command-line interface that lets users make predictions without writing any code. Training a model is as simple as pointing ORiGAMi to your MongoDB collection:

    origami train <mongo-uri> -d app -c users

Once trained, you can generate predictions and seamlessly integrate them back into your MongoDB workflow. For example, to predict user segments for new signups (from the analytics.signups collection) and write the resulting predictions back to MongoDB in an analytics.predicted collection:

    origami predict <mongo-uri> -d analytics -c signups --target user_segment --json | mongoimport -d analytics -c predicted

For those looking to dive deeper, we've also included several Jupyter notebooks in the repository that demonstrate advanced features and customization options. Model performance can be improved by adjusting the hyperparameters.

We're just scratching the surface of what's possible with document-native machine learning, and have many more use cases in mind. We invite you to explore the repository, contribute to the project, and share how you use ORiGAMi to solve real-world problems. Head over to the ORiGAMi GitHub repo, play around with it, and tell us about new ways of applying it and problems it’s well-suited to solving.
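The tokenization idea described under "Technical innovation" can be illustrated with a rough Python sketch. This is not ORiGAMi's actual tokenizer: the token spellings (`<START>`, `<ARRAY_START>`, `key:...`, `value:...`) and the helper names are assumptions made up for this example. It only shows how a nested document can become a flat token sequence, and how predicting a target field reduces to predicting the token that follows its key.

```python
from typing import Any, Iterator


def tokenize_value(value: Any) -> Iterator[str]:
    """Yield tokens for one value, wrapping nested types in structural tokens."""
    if isinstance(value, dict):
        yield "<SUBDOC_START>"
        for key, inner in value.items():
            yield f"key:{key}"
            yield from tokenize_value(inner)
        yield "<SUBDOC_END>"
    elif isinstance(value, list):
        yield "<ARRAY_START>"
        for item in value:
            yield from tokenize_value(item)
        yield "<ARRAY_END>"
    else:
        yield f"value:{value!r}"


def tokenize_document(doc: dict[str, Any]) -> list[str]:
    """Turn a document into a sequence of key, value, and structural tokens."""
    tokens = ["<START>"]
    for key, value in doc.items():
        tokens.append(f"key:{key}")
        tokens.extend(tokenize_value(value))
    tokens.append("<END>")
    return tokens


def prompt_for(doc: dict[str, Any], target: str) -> list[str]:
    """Build the prefix a next-token model would complete to predict `target`."""
    features = {k: v for k, v in doc.items() if k != target}
    # Drop the closing <END> token and end the prefix on the target key.
    return tokenize_document(features)[:-1] + [f"key:{target}"]


doc = {"plan": "pro", "features": ["analytics", "api"], "usage": {"projects": 8}}
tokens = tokenize_document(doc)
prompt = prompt_for({**doc, "user_segment": "smb_growth"}, "user_segment")

print(tokens)
print(prompt[-1])
```

In this framing, a classifier for "user_segment" is simply a model that, given the prompt, assigns probabilities to the possible value tokens that could come next.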

March 11, 2025

Artificial Intelligence

ZEE5: A Masterclass in Migrating Microservices to MongoDB Atlas

ZEE5 is a leading Indian over-the-top (OTT) video-streaming platform that delivers streamed content via Internet-connected devices. The platform offers a wide variety of content (movies, TV shows, web series, and original programming) across multiple genres and languages. Owned by Zee Entertainment Enterprises Limited, ZEE5 produces over 260 hours of content daily, with a monthly active user base of more than 119.5 million users across 190 countries.

ZEE5’s operations and customer satisfaction depend on its backend infrastructure being robust and scalable enough to handle immense traffic and complex workflows. To future-proof its infrastructure and maintain its competitive edge, the company needed to streamline operations and enhance its database management capabilities. This included the migration of its entire OTT platform, including a total of 100+ microservices and 80+ databases, to Google Cloud.

Pramod Prakash, Senior Vice President of Engineering at ZEE5, took the stage at MongoDB.local Bangalore in 2024. He shared insights into how ZEE5 managed this migration without hindering performance or disrupting its services. “It was a massive project which required a very carefully orchestrated migration plan,” said Prakash.

Massive migration, zero downtime: Challenge accepted

ZEE5’s team embarked on an ambitious journey to migrate a total of 40+ microservices (out of its 100+ microservices) to MongoDB Atlas. These were previously running on the Community Edition of MongoDB and on other NoSQL databases.

One of the challenges of this migration was to ensure continuous data flow for the platform’s 119.5 million streaming users. To do so, Prakash and his team created multiple environments using a change data capture tool. This ensured continuous replication of data so the user experience would not be impacted.

“We had to build four environments: dev, QA [Quality Assurance], UAT [User Acceptance Testing], and production,” explained Prakash. “We needed to keep testing and verifying each environment, and then finally enter the production phase when we migrated the data and moved the traffic.”

The approach involved migrating production data twice: first for testing and then for the final cutover. This was to minimize any data loss. ZEE5 used MongoDB Atlas’ integrated tools mongosync and mongomirror. The tools helped achieve an essential goal: avoiding any downtime.

“We migrated this entire mammoth application with zero downtime!” said Prakash. “We have not stopped ZEE5’s operations at all.”

“The second important thing is the performance: you want to be 100% sure that the entire scale and peak traffic will work seamlessly within the new cloud environment,” added Prakash.

ZEE5 relied on support from MongoDB Professional Services (PS). The PS team helped architect and plan the entire migration strategy. They also accompanied Prakash’s team step by step to ensure there would be no unexpected disruptions. The production environment was built and tested rigorously before the final migration to ensure seamless performance at peak traffic levels.

“We iterated until we were 100% sure that the new environment was ready to take up ZEE5’s peak traffic. Functionally, it was all perfect,” said Prakash.

The power of the Atlas platform

According to Prakash, the power of MongoDB Atlas lies in the fact that it offers a fully managed platform. “There is no maintenance overhead at all,” he said. “All upgrades happen automatically without any downtime. We are also leveraging auto-scaling capabilities and point-in-time recovery.”

All of this enables efficient handling of varying traffic loads without manual intervention. Additionally, data recovery capabilities are enhanced, and most importantly, the engineering team can prioritize application development rather than operational maintenance.

As of February 2025, MongoDB Atlas supports a total of seven key use cases at ZEE5: payments, subscriptions, plans and coupons, video engineering, Zee Music (users’ preferences and playlists), content metadata, and the platform’s communication engine (SMS and email notifications).

Looking ahead, ZEE5 is working on more use cases powered by MongoDB. For example, the company is looking to completely migrate its master data source for content metadata to MongoDB Atlas. ZEE5 is also considering relying on MongoDB Atlas to support and enhance its search and recommendations capabilities.

Interested in learning how MongoDB is powering other companies’ applications? Head over to our customer case studies hub to read the latest stories. Visit our product page to learn more about MongoDB Atlas.

March 11, 2025

Applied

Debunking MongoDB Myths: Security, Scale, and Performance

MongoDB has come a long way since its founding in 2007. Many people first encountered MongoDB during its early years and formed opinions about the database based on impressions from 2012 to 2014. However, much has changed since then. Over the past eleven years, MongoDB has made significant strides, foremost among them the launch of MongoDB Atlas in 2016. It has placed a substantial focus on improving the four critical areas that matter most to businesses and developers alike: security, durability, availability, and performance.

- Security: Protecting sensitive data from unauthorized access and ensuring regulatory compliance.
- Durability: Ensuring data remains intact and reliable, even during system failures or unexpected disruptions.
- Availability: Minimizing downtime and maintaining system operation, no matter what happens.
- Performance: Delivering fast, consistent application response times and scaling efficiently to meet growing demand.

These advancements have earned the trust of some of the world’s largest enterprises, including Toyota, Cisco, Wells Fargo, Bosch, and Verizon. Yet despite this progress, outdated myths regarding MongoDB persist, particularly in these four foundational areas. In this post, we will tackle those misconceptions head on and set the record straight about MongoDB’s security, durability, availability, and performance. Let’s dive in.

Myth 1: “MongoDB is not as secure as a relational database”

One of the most persistent myths about MongoDB is that it is not secure, and certainly not as secure as traditional relational databases. This misconception likely stems from a series of ransomware attacks in the mid-2010s. Hackers exploited unsecured databases that lacked proper authentication and were left exposed on default TCP ports. While these incidents highlighted poor configuration practices, they have unfairly cast a shadow over MongoDB’s contemporary security capabilities.

MongoDB provides robust, intelligent security features designed to protect sensitive data at every stage of its lifecycle. MongoDB encrypts data both in transit and at rest, just like other leading NoSQL and relational databases. However, what sets MongoDB apart is its ability to keep data encrypted while in use. With Queryable Encryption, an industry-first innovation unique to MongoDB, sensitive data can remain encrypted even while it is queried. This eliminates the need to decrypt the data and reduces exposure to threats.

MongoDB also supports flexible authentication and authorization that seamlessly integrates with many identity management systems. Features like role-based access control and fine-grained permissions ensure users only have access to what they are authorized for. At the same time, intuitive configuration makes these controls easy to implement.

Beyond encryption and access control, MongoDB includes powerful auditing tools to monitor database activity and advanced network security features, such as IP allow-listing and private networking. Together, these capabilities provide comprehensive protection against unauthorized access and help organizations meet strict compliance requirements.

Best of all, these advanced security features are included by default in both MongoDB Atlas and MongoDB Enterprise Advanced at no additional cost. MongoDB’s approach simplifies security management while minimizing expenditure. This allows teams to focus on building applications with confidence that their data is protected.

Myth 2: “MongoDB’s multi-cloud capabilities do not set it apart from other databases”

At first glance, the claim that MongoDB is multi-cloud may not sound special. After all, plenty of databases are available through more than one cloud provider. However, this should not be confused with them all being multi-cloud. True multi-cloud supports “cross-cloud” deployments, i.e., the ability to deploy individual nodes of a single cluster across multiple cloud providers. This distinction is often obfuscated by those vendors unable to run their clusters in such a configuration.

Support for multi-cloud clusters in Atlas became generally available in October of 2020. MongoDB Atlas enables deployment not only on Amazon Web Services (AWS), Microsoft Azure, or Google Cloud but also across all three clouds simultaneously with a single cluster. It is possible to set up and configure cross-cloud deployments solely from the Atlas management console. No further configuration is required via the individual cloud providers.

This is more than just a convenience; it is a transformative capability that eliminates the boundaries between cloud providers. With MongoDB Atlas, it is as if AWS, Azure, and Google Cloud operate as one unified cloud environment.

Why does this matter? For starters, deploying a single database cluster across multiple clouds removes the operational complexity of managing data replication and migration between providers. Seamless data mobility can be achieved. The hardest part of any application to move, the data, now becomes the easiest. Multi-cloud also enables the creation of application architectures that exploit the best services from multiple cloud providers simultaneously.

In addition, cross-cloud deployments deliver unmatched resiliency. With cross-cloud failover, in the event of an outage, data can be automatically switched to another cloud provider in the same geographic region, ensuring uninterrupted service.

Finally, MongoDB Atlas provides the flexibility to meet regional and cloud provider preferences with ease. Atlas spans 115+ supported regions across all three major cloud providers. This makes it easy to meet customer demands or comply with local regulations using a single database.

“MongoDB Atlas gives us the ability to run our database on multiple clouds through the same service. With Atlas, we have the freedom from lock-in: each client can choose where they are the most comfortable hosting their data.”
Gary Hoberman, CEO and Founder, Unqork

Myth 3: “I get that MongoDB is built for horizontal scaling, but it is so painful to scale”

Horizontal scaling, also known as scale-out, is a core strength of MongoDB. It allows workloads to be distributed by adding more nodes as data and applications expand. However, the belief persists that scaling MongoDB is difficult and complex. The reality? MongoDB makes scaling not just possible, but seamless, whether scaling out horizontally or scaling up vertically.

With MongoDB Atlas, vertical scaling (scale-up) is simple. By enabling auto-scaling, MongoDB Atlas dynamically adjusts cluster resources to meet workload demands. Adding more RAM, CPU, or storage capacity can be performed automatically and on demand. This ensures optimal performance without continual manual intervention or oversight.

If you need to move beyond vertical scaling, MongoDB offers three flexible ways to scale horizontally:

- Hashed sharding: Data is distributed randomly across nodes using a hashed shard key. This ensures an even distribution of data and workloads to prevent bottlenecks.
- Ranged sharding: Data is distributed based on ranges of a specific field. This enables fine-grained control over how data is divided. This approach is especially useful for preventing hotspots in workloads.
- Zone sharding: Data is distributed geographically. This enables compliance with data residency requirements and reduces latency by keeping data closer to users.

What happens if the initial sharding strategy does not go as planned? MongoDB addresses this challenge with the ability to refine shard keys and reshard a collection with zero downtime. This ensures data distribution strategies can adapt as needs evolve, all without disrupting applications or users.
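As a rough illustration of the difference between hashed and ranged sharding described under Myth 3, the Python sketch below simulates both placement rules in memory. It is a deliberate simplification and not how MongoDB routes documents internally: the four-shard layout, the fixed range boundaries, and the use of MD5 with a modulo are all assumptions chosen for the example.

```python
import bisect
import hashlib
from collections import Counter

NUM_SHARDS = 4

# Hypothetical range boundaries: shard 0 holds keys below 250,
# shard 1 holds 250-499, shard 2 holds 500-749, shard 3 the rest.
RANGE_BOUNDS = [250, 500, 750]


def hashed_shard(key: int) -> int:
    """Place a key by hashing it, which scatters neighboring keys."""
    digest = hashlib.md5(str(key).encode()).digest()
    return int.from_bytes(digest[:8], "big") % NUM_SHARDS


def ranged_shard(key: int) -> int:
    """Place a key by the range it falls into, keeping neighbors together."""
    return bisect.bisect_right(RANGE_BOUNDS, key)


all_keys = range(1000)
hashed_counts = Counter(hashed_shard(k) for k in all_keys)
ranged_counts = Counter(ranged_shard(k) for k in all_keys)

# Which shards does a range query for keys 100..149 have to contact?
query_keys = range(100, 150)
shards_hit_hashed = {hashed_shard(k) for k in query_keys}
shards_hit_ranged = {ranged_shard(k) for k in query_keys}

print("documents per shard (hashed):", dict(sorted(hashed_counts.items())))
print("documents per shard (ranged):", dict(sorted(ranged_counts.items())))
print("shards touched by range query (hashed):", sorted(shards_hit_hashed))
print("shards touched by range query (ranged):", sorted(shards_hit_ranged))
```

In this toy setup, the ranged rule answers the range query from a single shard because adjacent keys live together, while the hashed rule spreads the same keys across several shards in exchange for a placement that does not depend on key order.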
Myth 4: “Since MongoDB is built for flexibility, it must not be very performant”

One common misconception about MongoDB is that its flexibility and versatility must come at the expense of performance. After all, can such an agile database, one built for developers to model data however they want, really deliver the speed and efficiency of a performance-first solution? MongoDB is designed to provide both unmatched flexibility and exceptional performance, all while keeping costs low.

MongoDB’s performance stems from its intelligent architecture and powerful features. Ad hoc queries, indexing, and real-time aggregations make it easy to access and analyze data quickly.

How fast are queries? Primary key or indexed queries typically execute in milliseconds. Even complex queries that are not indexed remain efficient. Performance typically depends on factors like collection size and machine specifications.

What about workloads like search and analytics? Some developers might assume these would compete for resources and degrade performance on operational tasks. However, MongoDB solves this with workload isolation. This feature ensures that operational and nonoperational workloads are separated. This enables each to run at peak performance without requiring costly and time-consuming extract, transform, and load (ETL) processes.

Network latency? For globally distributed applications, MongoDB’s hedged reads allow data to be read from the nearest replica nodes rather than waiting for a response from distant nodes. This reduces latency and ensures applications remain highly responsive.

MongoDB’s real-world performance is backed by incredible use cases:

- Amadeus processes 630 million bookings per year.
- Idealo supports 200,000 queries and 60,000 updates per second.
- Temenos achieves 150,080 transactions per second.

This was before the release of MongoDB 8.0, the most performant version of the database yet. MongoDB 8.0 has delivered:

- 36% faster reads
- 32% faster reads and updates
- 56% faster bulk inserts
- A stunning 200% improvement for time series queries

“MongoDB Atlas doesn’t just solve our performance issues. It makes life easier for web developers, who can build and maintain simpler, more straightforward code.”
Moutia Khatiri, CTO, Tech Accelerator, L’Oreal

MongoDB Today

MongoDB has evolved far beyond the myths perpetuated during its early years. MongoDB 8.0 delivers robust capabilities across security, durability, availability, and performance. It encrypts sensitive data throughout its lifecycle and enables seamless cross-cloud deployments. It simplifies horizontal and vertical scaling and powers some of the world’s most demanding applications. These capabilities solidify MongoDB’s position as the database of choice for modern applications.

Read about more MongoDB myths and misconceptions in our previous two posts in this series:

- Debunking MongoDB Myths: Enterprise Use Case
- Busting the Top Myths About MongoDB vs Relational Databases

Don't be held back by outdated misconceptions. Experience the innovation and performance of MongoDB. Start using MongoDB Atlas for free today. Or, to learn more about MongoDB, head over to MongoDB University and take our free Intro to MongoDB course.

March 10, 2025

Applied

Advancing Encryption in MongoDB Atlas

Maintaining a strong security posture and ensuring compliance with regulations and industry standards are core responsibilities of enterprise security teams. However, satisfying these responsibilities is becoming increasingly complex, time-consuming, and high-stakes. The rapid evolution of the threat landscape is a key driver of this challenge. In 2024, the percentage of organizations that experienced a data breach costing $1 million or more jumped from 27% to 36%. 1 This was partly fueled by a 180% surge from 2023 to 2024 in vulnerability exploitation by attackers. 2 Concurrently, regulations are tightening. Laws like the Health Insurance Portability and Accountability Act (HIPAA) 3 and the U.S. Securities and Exchange Commission’s cybersecurity regulations 4 have introduced stricter security requirements. This has raised the bar for compliance. Thousands of enterprises rely on MongoDB Atlas to protect their sensitive data and support compliance efforts. Encryption plays a crucial role on three levels; securing data at rest, in transit, and in use. However, security teams need more than solely strong encryption. Flexibility and control are essential to align with an organization’s specific requirements. MongoDB is introducing significant upgrades to MongoDB Atlas encryption to meet these needs. This includes enhanced customer-managed key (CMK) functionality and support for TLS 1.3. This post explores these improvements, along with the planned deprecation of outdated TLS versions, to strengthen organizations’ security postures. Why customer-managed keys (CMKs) matter Customer-managed keys (CMKs) are a security and data governance feature that delivers enterprises full control over the encryption keys protecting their data. With CMKs, customers can define and manage their encryption strategy. This ensures they have ultimate authority over access to their sensitive information. 
MongoDB Atlas customer key management provides file-level encryption, similar to transparent data encryption (TDE) in other databases. This customer-managed encryption-at-rest feature works alongside always-on volume-level encryption 5 in MongoDB Atlas. CMKs ensure all database files and backups are encrypted. MongoDB Atlas also integrates with AWS Key Management Service (AWS KMS), Azure Key Vault , and Google Cloud KMS . This ensures customers have the flexibility to manage keys as part of their broader enterprise security strategy. Customers using CMKs retain complete control of their encryption keys. If an organization needs to revoke access to data due to a security concern or any other reason, it can do so immediately by freezing or destroying the encryption keys. This capability acts as a “kill switch,” ensuring sensitive information becomes inaccessible when protection is critical. Similarly, an organization can destroy the keys to render the data and backups permanently unreadable and irretrievable. This may be applicable should they choose to retire a cluster permanently. Announcing CMK over private networking As part of a commitment to deliver secure and flexible solutions for enterprise customers, MongoDB is introducing CMKs over private networking. This enhancement enables organizations to manage their encryption keys without exposing their key management service (KMS) to the public internet. Using CMKs in MongoDB Atlas previously required Azure Key Vault and AWS KMS to be accessible via public IP addresses prior to today. While functional, this posed challenges for customers who need to keep KMS traffic private. It forced those customers to either expose their KMS endpoints or manage IP allow lists. By using private networking, customers can now: Eliminate the need for public IP exposure. Simplify network management by removing the need to manage allowed IP addresses. This reduces administrative effort and misconfiguration risk. 
Align with organizational requirements that mandate the use of private networking. Customer key management over private networking is now available for Azure Key Vault and AWS KMS . Customers can enable and manage this feature for all their MongoDB Atlas projects through the MongoDB Atlas UI or the MongoDB Atlas Administration API . More enhancements are coming for MongoDB customer key management in 2025. These include secretless authentication mechanisms and CMKs for search nodes. MongoDB Atlas TLS enhancements advance encryption in transit Securing data in transit is equally vital as a foundation of encryption at rest with CMKs. To address this, MongoDB Atlas enforces TLS by default. This ensures encrypted communication across all aspects of the platform, including client connections. Now MongoDB is reinforcing its TLS implementation with key enhancements for enterprise-grade security. MongoDB is in the process of rolling out fleetwide support for TLS 1.3 in MongoDB Atlas. The latest version of the cryptographic protocol offers several advantages over its predecessors. This includes stronger security defaults, faster handshakes, and reduced latency. Concurrently, TLS versions 1.0 and 1.1 are being deprecated. The rationale for this is known weaknesses and their inability to meet modern security standards. MongoDB is aligning with industry best practices by standardizing on TLS 1.2 and 1.3. This ensures a secure communication environment for all MongoDB Atlas users. Additionally, MongoDB now offers custom cipher suite selection, giving enterprises more control over their cryptographic configurations. This feature lets organizations choose the cipher suites for their TLS connections, ensuring compliance with their security requirements. Achieving encryption everywhere This post covers how MongoDB secures data at rest with CMKs and in transit with TLS. However, what about data in use while it’s being processed in a MongoDB Atlas instance? 
That’s where Queryable Encryption comes in. This groundbreaking feature enables customers to run expressive queries on encrypted data without ever exposing the plaintext or keys outside the client application. Sensitive data and queries never leave the client unencrypted. This ensures sensitive information is protected and inaccessible to anyone without the keys, including database administrators and MongoDB itself.

MongoDB is committed to providing enterprise-grade security that evolves with the changing threat and regulatory landscapes. Organizations now have greater control, flexibility, and protection across every stage of the data lifecycle with enhanced CMK functionality, TLS 1.3 adoption, and custom cipher suite selection. As security challenges grow more complex, MongoDB continues to innovate to enable enterprises to safeguard their most sensitive data.

To learn more about these encryption enhancements and how they can strengthen your security posture, visit MongoDB Data Encryption.

1 PwC, October 2024
2 Verizon Data Breach Investigations Report, 2024
3 U.S. Department of Health and Human Services, December 2024
4 U.S. Securities and Exchange Commission, 2023
5 MongoDB Atlas Security White Paper, Encryption at Rest section, page 12

March 5, 2025

Applied

AI-Powered Java Applications With MongoDB and LangChain4j

MongoDB is pleased to introduce its integration with LangChain4j , a popular framework for integrating large language models (LLMs) into Java applications. This collaboration simplifies the integration of MongoDB Atlas Vector Search into Java applications for building AI applications. The advent of generative AI has opened up many new possibilities for developing novel applications. These advancements have led to the development of AI frameworks that simplify the complexities of orchestrating and integrating LLMs and the various components of the AI stack , where MongoDB plays a key role as an operational and vector database. Simplifying AI development for Java The first AI frameworks to emerge were developed for Python and JavaScript, which were favored by early AI developers. However, Java remains widespread in enterprise software. This has led to the development of LangChain4j to address the needs of the Java ecosystem. While largely inspired by LangChain and other popular AI frameworks, LangChain4j is independently developed. As with other LLM frameworks, LangChain4j offers several advantages for developing AI systems and applications by providing: A unified API for integrating LLM providers and vector stores. This enables developers to adopt a modular approach with an interchangeable stack while ensuring a consistent developer experience. Common abstractions for LLM-powered applications, such as prompt templating, chat memory management, and function calling, offering ready-to-use building blocks for common AI applications like retrieval-augmented generation (RAG) and agents. Powering RAG and agentic systems with MongoDB and LangChain4j MongoDB worked with the LangChain4j open-source community to integrate MongoDB Atlas Vector Search into the framework, enabling Java developers to develop AI-powered applications from simple RAG to agentic applications. 
In practice, this means developers can now use the unified LangChain4j API to store vector embeddings in MongoDB Atlas and use Atlas Vector Search capabilities for retrieving relevant context data. These capabilities are essential for enabling RAG pipelines, where private, often enterprise data is retrieved based on relevancy and combined with the original prompt to get more accurate results in LLM-based applications. LangChain4j supports various levels of RAG, from basic to advanced implementations, making it easy to prototype and experiment before customizing and scaling your solution to your needs. A basic RAG setup with LangChain4j typically involves loading and parsing unstructured data from documents stored locally or on remote services like Amazon S3 or Azure Storage using the Document API. The process then transforms and splits the data, then embeds it to capture the semantic meaning of the content. For more details, check out the documentation on core RAG APIs . However, real-world use cases often demand solutions with advanced RAG and agentic systems. LangChain4j optimizes RAG pipelines with predefined components designed to enhance accuracy, latency, and overall efficiency through techniques like query transformation, routing, content aggregation, and reranking. It also supports AI agent implementation through dedicated APIs, such as AI Services and Tools , with function calling and RAG integration, among others. Learn more about the MongoDB Atlas Vector Search integration in LangChain4j’s documentation . MongoDB’s dedication to providing the best developer experience for building AI applications across different ecosystems remains strong, and this integration reinforces that commitment. We will continue strengthening our integration with LLM frameworks enabling developers to build more-innovative AI applications, agentic systems, and AI agents. Ready to start building AI applications with Java? 
Learn how to create your first RAG system by visiting our tutorial: How to Make a RAG Application With LangChain4j .

March 4, 2025

Artificial Intelligence

MongoDB 8.0: Eating Our Own Dog Food

Key Takeaways We achieve real-world testing by adopting release candidates (RCs) on our internal production systems before finalizing a release. Our diverse internal workloads delivered unique insights. For instance, an internal cluster’s upgrade identified a rare MongoDB server crash and an inefficiency for a specific query shape introduced by a new MongoDB 8.0 feature. Issues encountered while testing MongoDB 8.0 internally were fixed proactively before they went out to customers. For example, during an upgrade to an 8.0 RC, one of our internal databases crashed and the issue was fixed in the next RC. Prerelease testing uncovered gaps in our automated testing, leading to improved coverage with additional tests. Using MongoDB 8.0 internally on mission-critical internal systems demonstrated its reliability. This gave customers confidence that the release could handle their demanding workloads, just as it did for our own engineering teams. Release jitters Every software release, whether it’s a new product or an update of an existing one, comes with an inherent risk: what if users encounter a bug that the development team didn’t anticipate? With a mission-critical product like MongoDB 8.0 , even minor issues can have a significant impact on customer operations, uptime, and business continuity. Unfortunately, no amount of automated testing can guarantee how MongoDB will perform when it lands with customers. So how does MongoDB proactively identify and resolve issues in our software before customers encounter them, thereby ensuring a seamless upgrade experience and maintaining customer trust? Catching issues before you do To address these challenges, we employ a combination of methods to ensure reliability. One approach is to formally model our system to prove the design is correct, such as the effort we undertook to mathematically model our protocols with lightweight formal methods like TLA+. Another method is to prove reliability empirically by dogfooding. 
Dogfooding (🤨)? Eating your own dog food—aka eating your own pizza, aka “dogfooding”—refers to a development process where you put yourself in customers’ shoes by using your own product in your own production systems. In short: you’re your own customer. Why dogfood? Enhanced product quality: Testing in a controlled environment can’t replicate the edge cases of true-to-life workloads, so real-world scenarios are needed to ensure robustness, reliability, and performance under diverse conditions. Early identification of issues: Testing internally surfaces issues earlier in the release process, enabling fixes to be deployed proactively before customers encounter them. Build customer empathy: Acting as users provides direct insight into customer pain points and needs. Engineers gain firsthand understanding of the challenges of using their product, informing more customer-centric solutions. Without dogfooding, things like upgrades are taken for granted and customer pain points can be overlooked. Boost credibility and trust: Relying on our own software to power critical internal systems reassures customers of its dependability. Dogfooding at MongoDB MongoDB has a strong dogfooding culture. Many internal services are built with MongoDB and hosted on MongoDB Atlas , the very same setup we provide our customers. Eating our own dog food is essential to our customer mindset. Because internal teams work alongside MongoDB engineers, acting as users bridges the gap between MongoDB engineers and their customers. Additionally, real-life workloads vet our software and processes in a way automated testing cannot. Release dogfooding With the release of MongoDB 8.0, the company decided to take dogfooding one step further. Driven by a company-wide focus on making 8.0 the most performant version of MongoDB yet, we embarked on an ambitious plan to dogfood the release candidates within our own infrastructure. Before, our release process looked like this: Figure 1. 
Releases without real-world testing.

We wanted it to look more like this:

Figure 2. Releases pregamed on internal clusters.

Adding internal testing to the release process allows us to iterate long before we make the product available to customers. In the past we’d release and then fix issues reactively as customers encountered them. Using the release internally, before it got into customers’ hands, would uncover edge cases so we could fix them proactively. By acting as our own customers, we remove our real customers from the development cycle and build confidence in the release.

The confidence team

To tackle upgrades effectively, we assembled a cross-functional team of MongoDB engineers, Atlas SREs, and internal service developers. A technical program manager (TPM) was assigned to track progress and coordinate efforts across the team. Together, we enumerated the databases, scheduled upgrade dates, and assigned directly responsible individuals (DRIs) to each upgrade. To streamline communication, we created an internal Slack channel and invited everyone on the team to it.

We agreed on a playbook: with the support of the team, the assigned DRI would upgrade their cluster and monitor for any issues. If something came up, we would create a ticket in an internal Jira project and mention it in Slack for visibility. I took on the role of DRI for Evergreen database upgrades.

Evergreen

My team maintains the database clusters for Evergreen, MongoDB’s bespoke continuous integration (CI) system. Evergreen is responsible for running automated tests at scale against MongoDB, Atlas, the drivers, Evergreen itself, and many other products. At last count, Evergreen executes roughly ten years of tests per day, in parallel, and is on the critical path for many teams at the company.

Evergreen runs on two separate clusters in Atlas: the application’s main replica set and a smaller one for our background job coordinator, Amboy.
In terms of scale, the main replica set contains around 9.5TB of data and handles 1 billion CRUD operations per day, while the Amboy cluster contains about 1TB of data and handles 100 million CRUD operations per day. Because of Evergreen’s criticality to the development cycle, historically we’ve taken a cautious approach to any operational changes and database upgrades were not a priority. The initiative to dogfood our internal clusters changed our approach—we were going to use 8.0 before it went out to customers. Enabling a feature flag in Atlas made the RC build available in our Atlas project before it was available to customers. A showstopper Our first target was the Amboy cluster, which handles background jobs for Evergreen. I clicked the button to upgrade our Amboy cluster and we held our collective breath. Atlas upgrades are rolling. This means an upgrade is applied iteratively to each secondary in the cluster until finally the primary is stepped down and upgraded. Usually this works well since any issues will at most affect just a secondary, but in our case it didn’t work out. The secondaries’ upgrades succeeded, but when the primary was stepped down, each node that won the election to be the next primary crashed. The result was that our cluster had no primary and the Amboy database was unavailable, which threw a monkey-wrench in our application. We sounded the alarm and an investigation commenced ASAP. Stack traces, logs, and diagnostics were captured and the cluster was downgraded to 7.0. As it turned out, we’d hit an edge case that was triggered by a malformed TTL index specification with a combination of two irregularities: Its expireAfterSeconds was not an integer. It contained a weights field , which is not valid in an index that’s not a text index . Both irregularities were previously allowed, but became invalid due to strengthened validation checks. 
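To make the two irregularities concrete, here is a minimal Python sketch of an index specification that exhibits both, together with a small checker that flags them. The field name, index name, and the `find_irregularities` helper are hypothetical and exist only for illustration; the checker mirrors the two conditions described above and is not the server's validation logic.

```python
# Hypothetical index specification showing both irregularities at once.
malformed_spec = {
    "key": {"created_at": 1},       # assumed field name, for illustration
    "name": "created_at_ttl",       # assumed index name
    "expireAfterSeconds": 3600.5,   # irregularity 1: not an integer
    "weights": {"created_at": 1},   # irregularity 2: only valid on text indexes
}


def find_irregularities(spec):
    """Return the irregularities described in the text that apply to this spec."""
    problems = []
    ttl = spec.get("expireAfterSeconds")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if ttl is not None and (isinstance(ttl, bool) or not isinstance(ttl, int)):
        problems.append("expireAfterSeconds is not an integer")
    is_text_index = "text" in spec.get("key", {}).values()
    if "weights" in spec and not is_text_index:
        problems.append("weights present on a non-text index")
    return problems


print(find_irregularities(malformed_spec))
```

A spec with only one of the two problems would be reported with a single entry, which corresponds to the case that did not trigger the crash.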
When a node steps up to primary, it corrects these malformed index specifications, but in that 8.0 RC if there were two things wrong with an index it would go down an execution path that ended in a segfault. This bug only occurs when a node steps up to primary, which is why it brought down our cluster despite the rolling upgrade. SERVER-94487 was opened to fix the bug and the fix was rolled into the next RC. When the RC was ready, we upgraded the Amboy database again and the upgrade succeeded. Not a showstopper Next up was the main database cluster for the Evergreen application. We performed the upgrade, and at first all indications were that the upgrade was a success. However, on further inspection a discontinuous jump had appeared in two of the Atlas monitoring graphs. Before the upgrade our Query Executor graph usually looked like this: Figure 3. Query Executor graph before the upgrade. Whereas after the upgrade it looked like this: Figure 4. Query Executor graph after the upgrade. This represented roughly a 5x increase in the rate per second of index keys and documents scanned by queries and query plans. Similarly, the Query Targeting graph looked like this before the upgrade: Figure 5. Query Targeting graph before the upgrade. Whereas after the upgrade it looked like this: Figure 6. Query Targeting graph after the upgrade. This also represented roughly a 5x increase to the ratio of scanned index keys and documents to the number of documents returned. Both these graphs indicated there was at least one query that wasn’t using indexes as well as it had been before the upgrade. We got eyes on the cluster and it was determined that a bug in index pruning (a new feature introduced in 8.0) was causing the query planner to remove the most efficient index for a contained $or query shape. This is when a query contains an $or branch that isn’t the root of the query predicate, such as A and (C or B) . 
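As an illustration of the shape A and (C or B), the filter below is written as a Python dict of the kind a driver would send. The field names and values are hypothetical, and the `has_contained_or` helper is a simplified check that only looks at the top level of the filter; it is a sketch, not the query planner's definition.

```python
# A contained $or: the $or sits under an implicit AND with another predicate,
# so it is not the root of the query. Field names are hypothetical.
contained_or_filter = {
    "status": "failed",                  # A
    "$or": [
        {"project": "evergreen"},        # C
        {"priority": {"$gte": 50}},      # B
    ],
}

# For contrast, a rooted $or: the $or is the entire predicate.
rooted_or_filter = {
    "$or": [
        {"project": "evergreen"},
        {"priority": {"$gte": 50}},
    ],
}


def has_contained_or(query_filter):
    """True when a top-level $or sits alongside at least one other predicate."""
    return "$or" in query_filter and len(query_filter) > 1


print(has_contained_or(contained_or_filter), has_contained_or(rooted_or_filter))
```

Only the first filter has the query shape that the index pruning bug affected.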
For the 8.0 release this was listed as a known issue and disabled in Atlas, and index pruning was disabled entirely by the 8.0.1 release until we can fix the underlying issue in SERVER-94741 . Other clusters Other teams’ clusters followed suit, but their upgrades went off without a hitch. It’s to be expected that the particulars of each dataset and workload would trigger various edge cases. Evergreen’s clusters hit some while the rest did not. This brings out an important lesson: testing against a variegated set of live workloads raises the likelihood we’ll encounter and address all the issues our customers would have encountered. Continuous improvement Although we caught these issues before they reached customers, our shift-left mindset motivates us to catch them earlier in the process through automated testing. As part of this effort, we plan to add additional tests focused on upgrades from older versions of the database. The index pruning issue, in particular, was part of the inspiration for us to investigate property based testing –an approach that has already uncovered several new bugs ( SERVER-89308 ). SERVER-92232 will introduce a property based test specifically for index pruning. What’s next? All told, the exercise was a success. The 8.0 upgrade reduced Evergreen’s operation execution times by an order of magnitude: Figure 7. Drastically faster database operations after the upgrade. For customers, dogfooding uncovered novel issues and gave us the chance to fix them before they could disrupt customer workloads. By the time we cut the release we were confident we were providing our customers a seamless upgrade. Through the dogfooding process we discovered additional internal teams with services built on MongoDB. And now we’re further leaning in on dogfooding by building out a formal framework that will include those teams and their clusters. For the next release, this will uncover even more insights and provide greater confidence. 
Looking ahead, as our CTO aptly put it , "all customers demand security, durability, availability, and performance" from their technology. Our commitment to eating our own dogfood directly strengthens these very pillars. It's a commitment to our customers, a commitment to innovation, and a commitment to making MongoDB the best database in the world. Join our MongoDB Community to learn about upcoming events, hear stories from MongoDB users, and connect with community members from around the world.

March 3, 2025

Engineering Blog

Secure by Default: Mandatory MFA in MongoDB Atlas

On March 26, 2025, MongoDB will start rolling out mandatory multi-factor authentication (MFA) for MongoDB Atlas users. While MFA has long been supported in Atlas, it was previously optional. MongoDB is committed to delivering customers the highest level of security, and the introduction of mandatory MFA adds an extra layer of protection against unauthorized access to MongoDB Atlas. Note: MFA will require users to provide a second form of authentication, such as a one-time passcode or biometrics. To ensure a smooth transition, users are encouraged to set up their preferred MFA method in advance. This process should take around three minutes to set up. If MFA is not configured by March 26, 2025, users will need to enter a one-time password (OTP) sent to their registered email each time they log in. Why are we making MFA mandatory? Stealing users’ credentials is a key tactic in the modern cyberattack playbook. According to a Verizon report, stolen credentials have been involved in 31% of data breaches in the past decade, and credential stuffing is the most common attack type for web applications. 1 Credential stuffing is when attackers use stolen credentials obtained from a data breach on one service to attempt to log in to another service. These breaches are particularly harmful, taking an average of 292 days to detect and contain. 2 This rise in cyber threats has rendered password-only security inadequate. Organizations of all sizes trust MongoDB Atlas to safeguard their mission-critical applications and sensitive data. These range from global enterprises to individual developers. Therefore, to strengthen account security and to reduce the risk of unauthorized access, MongoDB is introducing mandatory MFA. The impact of MFA A large-scale study by Microsoft measured the effectiveness of MFA to prevent cyberattacks on enterprise accounts. The findings indicated enabling MFA reduces the risk of account compromise by 99.22%. 
For accounts with previously leaked credentials, MFA still lowered the risk by 98.56%. This makes MFA one of the most effective defenses against unauthorized access.

Requiring MFA by default strengthens the security of all MongoDB Atlas accounts. By reducing the risk of compromised accounts being used in broader attacks, this proactive step protects individual users and enhances MongoDB Atlas’s overall security. Ensuring strong authentication practices across the Atlas ecosystem maintains the integrity of mission-critical applications and sensitive data, and the result is a safer experience for everyone.

Preparing for mandatory MFA

MFA will be a prerequisite for all users when logging into MongoDB services using Atlas credentials. These services include:

MongoDB Atlas user interface
MongoDB Support portal
MongoDB University
MongoDB Forums

Atlas supports the following MFA methods:

Security key or biometrics: FIDO2 (WebAuthn) compliant security keys (e.g., YubiKey) or biometric authentication (e.g., Apple Touch ID or Windows Hello)
One-time password (OTP) and push notifications: Provided through the Okta Verify app
Authenticator apps: Such as Twilio Authy, Google Authenticator, or Microsoft Authenticator for generating time-based OTPs
Email: For generating OTPs

MongoDB encourages users to choose phishing-resistant MFA methods, such as security keys or biometrics.

Strengthening security with mandatory MFA

Requiring MFA is a significant step that enhances MongoDB Atlas’s default security. Multi-factor authentication protects users from credential-based attacks and unauthorized access. Making this additional layer of authentication mandatory ensures greater account security and safeguards mission-critical applications and data.

To ensure a smooth transition, users are encouraged to set up their preferred MFA method before March 26, 2025. For detailed setup instructions, refer to the MongoDB documentation.
And, please visit the MongoDB security webpage and Trust Center to learn more about MongoDB’s commitment to security.

February 28, 2025

Updates

Why Vector Quantization Matters for AI Workloads

Key takeaways As vector embeddings scale into millions, memory usage and query latency surge, leading to inflated costs and poor user experience. By storing embeddings in reduced-precision formats (int8 or binary), you can dramatically cut memory requirements and speed up retrieval. Voyage AI's quantization-aware embedding models are specifically tuned to handle compressed vectors without significant loss of accuracy. MongoDB Atlas streamlines the workflow by handling the creation, storage, and indexing of compressed vectors, enabling easier scaling and management. MongoDB is built for change, allowing users to effortlessly scale AI workloads as resource demands evolve. Organizations are now scaling AI applications from proofs of concept to production systems serving millions of users. This shift creates scalability, latency, and resource challenges for mission-critical applications leveraging recommendation engines, semantic search, and retrieval-augmented generation (RAG) systems. At scale, minor inefficiencies compound and become major bottlenecks, increasing latency, memory usage, and infrastructure costs. This guide explains how vector quantization enables high-performance, cost-effective AI applications at scale. The challenge: Scaling vector search in production Let’s start by considering a modern voice assistance platform that combines semantic search with natural language understanding. During development, the system only needs to process a few hundred queries per day, converting speech to text and matching the resulting embeddings against a modest database of responses. The initial implementation is straightforward: each query generates a 32-bit floating-point embedding vector that's matched against a database of similar vectors using cosine similarity. This approach works smoothly in the prototype phase—response times are quick, memory usage is manageable, and the development team can focus on improving accuracy and adding features. 
However, as the platform gains traction and scales to processing thousands of queries per second against millions of document embeddings, the simple approach begins to break down. Each incoming query now requires loading massive amounts of high-precision floating-point vectors into memory, computing similarity scores across an exponentially larger dataset, and maintaining increasingly complex vector indexes for efficient retrieval. Without proper optimization, the system struggles as memory usage balloons, query latency increases, and infrastructure costs spiral upward. What started as a responsive, efficient prototype has become a bottleneck production system that struggles to maintain its performance requirements while serving a growing user base. The key challenges are: Loading high-precision 32-bit floating-point vectors into memory Computing similarity scores across massive embedding collections Maintaining large vector indexes for efficient retrieval Which can lead to critical issues like: High memory usage as vector databases struggle to keep float32 embeddings in RAM Increased latency as systems process large volumes of high-precision data Growing infrastructure costs as organizations scale their vector operations Reduced query throughput due to computational overhead AI workloads with tens or hundreds of millions of high-dimensional vectors (e.g., 80M+ documents at 1536 dimensions) face soaring RAM and CPU requirements. Storing float32 embeddings for these workloads can become prohibitively expensive. Vector quantization: A path to efficient scaling The obvious question is: How can you maintain the accuracy of your recommendations, semantic matches, and search queries, while drastically cutting down on compute and memory usage and reducing retrieval latency? Vector quantization is how. It helps you store embeddings more compactly, reduce retrieval times, and keep costs under control. 
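To put a number on the scale mentioned above, the following back-of-the-envelope sketch computes the raw size of the vector values for 80 million embeddings at 1536 dimensions. It counts only the values themselves (no index structures or metadata), and the `embedding_bytes` helper is a hypothetical name used for illustration.

```python
def embedding_bytes(num_vectors, dimensions, bits_per_value):
    """Raw size in bytes of the vector values alone (no index, no metadata)."""
    return num_vectors * dimensions * bits_per_value // 8


num_vectors = 80_000_000
dimensions = 1536

for label, bits in [("float32", 32), ("int8", 8), ("binary", 1)]:
    size_gb = embedding_bytes(num_vectors, dimensions, bits) / 1e9
    print(f"{label:>8}: {size_gb:,.2f} GB")
```

At float32 precision the values alone come to roughly 491.52 GB (decimal gigabytes), which is why keeping them all in RAM becomes expensive; the int8 and binary rows show the same data at 8 bits and 1 bit per value.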
Vector quantization offers a powerful solution to scalability, latency, and resource utilization challenges by compressing high-dimensional embeddings into compact representations while preserving their essential characteristics. This technique can dramatically reduce memory requirements and accelerate similarity computations without compromising retrieval accuracy. What is vector quantization? Vector quantization is a compression technique widely applied in digital signal processing and machine learning. Its core idea is to represent numerical data using fewer bits, reducing storage requirements without entirely sacrificing the data’s informative value. In the context of AI workloads, quantization commonly involves converting embeddings—originally stored as 32-bit floating-point values—into formats like 8-bit integers. By doing so, you can substantially decrease memory and storage consumption while maintaining a level of precision suitable for similarity search tasks. An important point to note is that the quantization mechanism is especially suitable for use cases that involve over 1 million vector embeddings, such as RAG applications, semantic search, or recommendation systems that require tight control of operational costs without a compromise on retrieval accuracy. Smaller datasets with fewer than 1 million embeddings might not see significant gains from quantization procedures. For smaller datasets, the overhead of implementing quantization might outweigh its benefits. Understanding vector quantization Vector quantization operates by mapping high-dimensional vectors to a discrete set of prototype vectors or converting them to lower-precision formats. There are three main approaches: Scalar quantization: Converts individual 32-bit floating-point values to 8-bit integers, reducing memory usage of vector values by 75% while maintaining reasonable precision. 
Product quantization: Compresses entire vectors at once by mapping them to a codebook of representative vectors, offering better compression than scalar quantization at the cost of more complex encoding/decoding. Binary quantization: Transforms vectors into binary (0/1) representations, achieving maximum compression but with more significant information loss. A vector database that applies these compression techniques must effectively manage multiple data structures: Hierarchical navigable small world (HNSW) graph for navigable search Full-fidelity vectors (32-bit float embeddings) Quantized vectors (int8 or binary) When quantization is defined in the vector index, the system builds quantized vectors and constructs the HNSW graph from these compressed vectors. Both structures are placed in memory for efficient search operations, significantly reducing the RAM footprint compared to storing full-fidelity vectors alone. The table below illustrates how different quantization mechanisms impact memory usage and disk consumption. This example focuses on HNSW indexes storing 30 GB of original float32 embeddings alongside a 0.1 GB HNSW graph structure. Our RAM usage estimates include a 10% overhead factor (1.1 multiplier) to account for JVM memory requirements with indexes loaded into page cache, reflecting typical production deployment conditions. Actual overhead may vary based on specific configurations. Here are key attributes to consider based on the table below: Estimated RAM usage: Combines HNSW graph size with either full or quantized vectors, plus a small overhead factor (1.1 for index overhead). Disk usage: Includes storage for full-fidelity vectors, HNSW graph, and quantized vectors when applicable. 
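The RAM estimate described above can be sketched as a short calculation: add the HNSW graph size to the size of whichever vectors are held in memory, then apply the 1.1 overhead factor. This is a rough sketch under one stated assumption: the quantized sizes are derived purely from bit widths (8/32 and 1/32 of the float32 size). Real deployments may keep additional structures in memory, so the results will not necessarily match published reduction factors.

```python
OVERHEAD = 1.1  # overhead multiplier stated in the text


def estimated_ram_gb(vectors_gb, hnsw_graph_gb=0.1, overhead=OVERHEAD):
    """RAM estimate: (in-memory vectors + HNSW graph) times the overhead factor."""
    return (vectors_gb + hnsw_graph_gb) * overhead


full_fidelity_gb = 30.0  # original float32 embeddings from the example

# Assumption: quantized size scales with bits per value only.
scenarios = {
    "float32 (no quantization)": full_fidelity_gb,
    "scalar (int8)": full_fidelity_gb * 8 / 32,
    "binary (1-bit)": full_fidelity_gb * 1 / 32,
}

for name, vectors_gb in scenarios.items():
    print(f"{name}: {estimated_ram_gb(vectors_gb):.2f} GB")
```

Disk usage moves in the opposite direction, because the quantized vectors are stored in addition to the full-fidelity ones.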
Notice that while enabling quantization increases total disk usage —because you still store full-fidelity vectors for exact nearest neighbor queries in both cases and rescoring in the case of binary quantization—it dramatically decreases RAM requirements and speeds up initial retrieval . MongoDB Atlas Vector Search offers powerful scaling capabilities through its automatic quantization system . As illustrated in Figure 1 below, MongoDB Atlas supports multiple vector search indexes with varying precision levels: Float32 for maximum accuracy, Scalar Quantized (int8) for balanced performance with 3.75× RAM reduction, and Binary Quantized (1-bit) for maximum speed with 24× RAM reduction. The quantization variety provided by MongoDB Atlas allows users to optimize their vector search workloads based on specific requirements. For collections exceeding 1M vectors, Atlas automatically applies the appropriate quantization mechanism, with binary quantization particularly effective when combined with Float32 rescoring for final refinement. Figure 1: MongoDB Atlas Vector Search Architecture with Automatic Quantization Data flow through embedding generation, storage, and tiered vector indexing with binary rescoring. Binary quantization with rescoring A particularly effective strategy is to combine binary quantization with a rescoring step using full-fidelity vectors. This approach offers the best of both worlds: extremely fast lookups thanks to binary data formats, plus more precise final rankings from higher-fidelity embeddings. Initial retrieval (Binary) Embeddings are stored as binary to minimize memory usage and accelerate the approximate nearest neighbor (ANN) search. Hamming distance (via XOR + population count) is used, which is computationally faster than Euclidean or cosine similarity on floats. Rescoring The top candidate results from the binary pass are re-evaluated using their float or int8 vectors to refine the ranking. 
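The two-stage flow can be sketched in plain Python. This is a minimal illustration, not Atlas's implementation: it scans every binary code exhaustively instead of traversing an HNSW graph, it rescans with float vectors only, and the sign-based `binarize` rule and the `oversample` parameter are assumptions made for the example.

```python
import math
import random


def binarize(vector):
    """Pack a float vector into an int: bit i is 1 when component i is positive."""
    bits = 0
    for i, value in enumerate(vector):
        if value > 0:
            bits |= 1 << i
    return bits


def hamming(a, b):
    """Hamming distance via XOR followed by a population count."""
    return bin(a ^ b).count("1")


def cosine(u, v):
    """Cosine similarity on the full-fidelity float vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm


def search(query, vectors, codes, k=3, oversample=5):
    """Binary first pass over all codes, then float rescoring of a shortlist."""
    q_code = binarize(query)
    shortlist = sorted(range(len(codes)), key=lambda i: hamming(q_code, codes[i]))
    shortlist = shortlist[: k * oversample]
    rescored = sorted(shortlist, key=lambda i: cosine(query, vectors[i]), reverse=True)
    return rescored[:k]


random.seed(0)
dim = 64
vectors = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(500)]
codes = [binarize(v) for v in vectors]

# Query: a lightly perturbed copy of vector 42, so index 42 should rank first.
query = [x + random.gauss(0, 0.05) for x in vectors[42]]
print(search(query, vectors, codes))
```

The second pass touches only the shortlisted candidates, so its cost stays small relative to the binary scan over the whole collection.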
This step mitigates the loss of detail in binary vectors, balancing result accuracy with the speed of the initial retrieval. By pairing binary vectors for rapid recall with full-fidelity embeddings for final refinement, you can keep your system highly performant and maintain strong relevance. The need for quantization-aware models Not all embedding models perform equally well under quantization. Models need to be specifically trained with quantization in mind to maintain their effectiveness when compressed. Some models—especially those trained purely for high-precision scenarios—suffer significant accuracy drops when their embeddings are represented with fewer bits. Quantization-aware training (QAT) involves: Simulating quantization effects during the training process Adjusting model weights to minimize information loss Ensuring robust performance across different precision levels This is particularly important for production applications where maintaining high accuracy is crucial. Embedding models like those from Voyage AI— which recently joined MongoDB —are specifically designed with quantization awareness, making them more suitable for scaled deployments. These models preserve more of their essential feature information even under aggressive compression. Voyage AI provides a suite of embedding models specifically designed with QAT in mind, ensuring minimal loss in semantic quality when shifting to 8-bit integer or even binary representations. Figure 2: Embedding model performance comparing retrieval quality (NDCG@10) versus storage costs . Voyage AI models (green) maintain superior retrieval quality even with binary quantization (triangles) and int8 compression (squares), achieving up to 100x storage efficiency compared to standard float embeddings (circles) . The graph above shows several important patterns that demonstrate why quantization-aware training (QAT) is crucial for maintaining performance under aggressive compression. 
The Voyage AI family of models (shown in green) demonstrates strong performance in retrieval quality even under extreme compression. The voyage-3-large model demonstrates this dramatically: when using int8 precision at 1024 dimensions, it performs nearly identically to its float-precision, 2048-dimensional counterpart, showing only a minimal 0.31% quality reduction despite using 8 times less storage. This showcases how models specifically designed with quantization in mind can preserve their semantic understanding even under substantial compression.

Even more impressive is how QAT models maintain their edge over larger, uncompressed models. The voyage-3-large model with int8 precision and 1024 dimensions outperforms OpenAI-v3-large (using float precision and 3072 dimensions) by 9.44% while requiring 12 times less storage. This performance gap highlights that raw model size and dimension count aren't the decisive factors; it's the intelligent design for quantization that matters.

The cost implications become truly striking when we examine binary quantization. Using voyage-3-large with 512-dimensional binary embeddings, we still achieve better retrieval quality than OpenAI-v3-large with its full 3072-dimensional float embeddings while using 200 times less storage. To put this in practical terms: what would have cost $20,000 in monthly storage can be reduced to just $100 while actually improving performance.

In contrast, models not specifically trained for quantization, such as OpenAI's v3-small (shown in gray), show a more dramatic drop in retrieval quality as compression increases. While these models perform well in their full floating-point representation (at 1x storage cost), their effectiveness deteriorates more sharply when quantized, especially with binary quantization.
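The storage ratios quoted for these model configurations can be checked with simple arithmetic. The sketch below assumes that "float precision" means 32-bit floats and counts only the raw embedding payload, ignoring index and metadata overhead; the helper name `bytes_per_vector` is purely illustrative.

```python
def bytes_per_vector(dimensions: int, bits_per_dimension: int) -> float:
    """Raw payload size of one embedding, ignoring index and metadata overhead."""
    return dimensions * bits_per_dimension / 8


# Assumption: "float precision" is float32 (32 bits per dimension).
float32_3072 = bytes_per_vector(3072, 32)  # 12288 bytes
float32_2048 = bytes_per_vector(2048, 32)  # 8192 bytes
int8_1024 = bytes_per_vector(1024, 8)      # 1024 bytes
binary_512 = bytes_per_vector(512, 1)      # 64 bytes

ratio_int8_vs_float_2048 = float32_2048 / int8_1024   # 8.0
ratio_int8_vs_float_3072 = float32_3072 / int8_1024   # 12.0
ratio_binary_vs_float_3072 = float32_3072 / binary_512  # 192.0

print(ratio_int8_vs_float_2048, ratio_int8_vs_float_3072, ratio_binary_vs_float_3072)
```

Under these assumptions the first two ratios match the 8x and 12x figures exactly, and the binary case works out to 192x, which is consistent with the rounded "200 times" figure and the $20,000 to $100 cost comparison.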
For production applications where both accuracy and efficiency are crucial, choosing a model that has undergone quantization-aware training can make the difference between a system that degrades under compression and one that maintains its effectiveness while dramatically reducing resource requirements. Read more on the Voyage AI blog.

Impact: Memory, retrieval latency, and cost

Vector quantization addresses the three core challenges of large-scale AI workloads (memory, retrieval latency, and cost) by compressing full-precision embeddings into more compact representations. Below is a breakdown of how quantization drives efficiency in each area.

Figure 3: Quantization performance metrics: memory savings with minimal accuracy trade-offs. Comparison of scalar vs. binary quantization showing RAM reduction (75%/96%), query accuracy retention (99%/95%), and performance gains (>100%) for vector search operations.

Memory and storage optimization

Quantization techniques dramatically reduce compute resource requirements while maintaining search accuracy for vector embeddings at scale.

Lower RAM footprint

Storage in RAM is often the primary bottleneck for vector search systems.
Embeddings stored as 8-bit integers or binary reduce overall memory usage, allowing significantly more vectors to remain in memory.
This compression directly shrinks vector indexes (e.g., HNSW), leading to faster lookups and fewer disk I/O operations.

Reduced disk usage in collection with binData

binData (binary) formats can cut raw storage needs by up to 66%.
Some disk overhead may remain when storing both quantized and original vectors, but the performance benefits justify this tradeoff.

Practical gains

3.75× reduction in RAM usage with scalar (int8) quantization.
Up to 24× reduction with binary quantization, especially when combined with rescoring to preserve accuracy.
Significantly more efficient vector indexes, enabling large-scale deployments without prohibitive hardware upgrades.
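The binary-plus-rescoring approach described earlier can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not how Atlas implements it: the sign-based `binarize` rule, the brute-force scan (instead of an ANN index such as HNSW), the `candidates` parameter, and the toy vectors are all hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def binarize(vector: Sequence[float]) -> int:
    """Pack a vector into an int, one bit per dimension (assumed rule: 1 if value > 0)."""
    bits = 0
    for value in vector:
        bits = (bits << 1) | (1 if value > 0 else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Hamming distance: XOR the codes, then count the set bits (popcount)."""
    return bin(a ^ b).count("1")


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity on the full-fidelity float vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(query: Sequence[float], vectors: List[Sequence[float]],
           k: int = 2, candidates: int = 3) -> List[Tuple[int, float]]:
    """Binary first pass over all vectors, then rescore the shortlist with floats."""
    codes = [binarize(v) for v in vectors]  # would be precomputed at index time
    query_code = binarize(query)
    shortlist = sorted(range(len(vectors)),
                       key=lambda i: hamming(query_code, codes[i]))[:candidates]
    rescored = sorted(shortlist, key=lambda i: cosine(query, vectors[i]), reverse=True)
    return [(i, cosine(query, vectors[i])) for i in rescored[:k]]


vectors = [
    [0.9, 0.1, -0.2, 0.4],
    [0.1, 0.9, -0.3, 0.2],
    [-0.5, -0.4, 0.8, -0.1],
    [0.8, 0.2, -0.1, 0.5],
]
query = [1.0, 0.1, -0.2, 0.5]
results = search(query, vectors)
print([index for index, _ in results])
```

In this toy data, vectors 0, 1, and 3 all share the same binary code as the query, so the Hamming pass alone cannot order them; the float rescoring step is what separates the close matches (0 and 3) from vector 1.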
Retrieval latency

Quantization methods leverage CPU cache optimizations and efficient distance calculations to accelerate vector search operations beyond what's possible with standard float32 embeddings.

Faster similarity computations

Smaller data types are more CPU-cache-friendly, which speeds up distance calculations.
Binary quantization uses Hamming distance (XOR + popcount), yielding dramatically faster top-k candidate retrieval.

Improved throughput

With reduced memory overhead, the system can handle more concurrent queries at lower latencies.
In internal benchmarks, query performance for large-scale retrievals improved by up to 80% when adopting quantized vectors.

Cost efficiency

Vector quantization provides substantial infrastructure savings by reducing memory and computation requirements while maintaining retrieval quality through compression and rescoring techniques.

Lower infrastructure costs

Smaller vectors consume fewer hardware resources, enabling deployments on less expensive instances or tiers.
Reduced CPU/GPU time per query allows resource reallocation to other critical parts of the application.

Better scalability

As data volumes grow, memory and compute requirements don’t escalate as sharply.
Quantization-aware training (QAT) models, such as those from Voyage AI, help maintain accuracy while reaping cost savings at scale.

By compressing vectors into int8 or binary formats, you tackle memory constraints, accelerate lookups, and curb infrastructure expenses, making vector quantization an indispensable strategy for high-volume AI applications.

MongoDB Atlas: Built for Changing Workloads with Automatic Vector Quantization

The good news for developers is that MongoDB Atlas supports automatic scalar quantization and automatic binary quantization in index definitions, reducing the need for external scripts or manual data preprocessing.
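As a rough illustration of what enabling quantization in an index definition might look like, the sketch below builds a definition document as a plain Python dictionary. The field names (`type`, `path`, `numDimensions`, `similarity`, `quantization`) reflect our understanding of the Atlas Vector Search index format and should be checked against the current MongoDB documentation; the helper name, the `embedding` path, and the 1024-dimension setting are hypothetical.

```python
import json
from typing import Any, Dict

# Assumed set of accepted values for the "quantization" field.
ALLOWED_QUANTIZATION = {"none", "scalar", "binary"}


def vector_index_definition(path: str, num_dimensions: int,
                            similarity: str = "cosine",
                            quantization: str = "binary") -> Dict[str, Any]:
    """Build a vector search index definition with automatic quantization enabled."""
    if quantization not in ALLOWED_QUANTIZATION:
        raise ValueError(f"unsupported quantization: {quantization!r}")
    return {
        "fields": [
            {
                "type": "vector",
                "path": path,                    # hypothetical field holding the embedding
                "numDimensions": num_dimensions,
                "similarity": similarity,
                "quantization": quantization,
            }
        ]
    }


definition = vector_index_definition("embedding", 1024, quantization="scalar")
print(json.dumps(definition, indent=2))
```

A document like this would then be supplied when creating the vector search index, for example through the Atlas UI or a driver; the point is that quantization is requested declaratively rather than by preprocessing the vectors yourself.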
By quantizing at index build time and query time, organizations can run large-scale vector workloads on smaller, more cost-effective clusters.

A common question developers ask is when to use quantization. Quantization becomes most valuable once you reach substantial data volumes, on the order of a million or more embeddings. At this scale, memory and compute demands can skyrocket, making reduced memory footprints and faster retrieval speeds essential. Examples of cases that call for quantization include:

High-volume scenarios: Datasets with millions of vector embeddings where you must tightly control memory and disk usage.
Real-time responses: Systems needing low-latency queries under high user concurrency.
High query throughput: Environments with numerous concurrent requests demanding both speed and cost-efficiency.

For smaller datasets (under 1 million vectors), the added complexity of quantization may not justify the benefits. However, for large-scale deployments, it becomes a critical optimization that can dramatically improve both performance and cost-effectiveness.

Now that we have established a strong foundation on the advantages of quantization (specifically the benefits of binary quantization with rescoring), feel free to refer to the MongoDB documentation to learn more about implementing vector quantization. You can also learn more about Voyage AI’s state-of-the-art embedding models on our product page.

February 27, 2025

Artificial Intelligence

Hasura: Powerful Access Control on MongoDB Data

Across industries, and especially in highly regulated sectors like healthcare, financial services, and government, MongoDB has been a preferred modern database solution for organizations handling large volumes of sensitive data that require strict compliance adherence. In such enterprises, secure access to data via APIs is critical, particularly when information is distributed across multiple MongoDB databases and external data stores. Hasura extends and enhances MongoDB's access control capabilities by providing granular permissions at the column and field level across multiple databases through its unified interface.

At the same time, designing a secure API system from scratch to meet this need takes significant development resources and becomes a burden to maintain and update. Hasura solves this problem for enterprises by elegantly serving as a federated data layer, with robust access control policies built in. Hasura enforces powerful access control rules across data domains, joins data from multiple sources, and exposes it to the user via a single API.

In this blog, we'll explore how Hasura and MongoDB work together to empower teams with granular data access control while simplifying data retrieval across collections.

Team-specific data domains

First, Hasura makes it possible for a business unit or team to own a set of databases and collections, also known as a data domain. Within each domain, a team can connect any number of MongoDB databases and other data sources, allowing the domain to have fine-grained role-based access control (RBAC) and attribute-based access control (ABAC) across all sources. More important, though, is the ability to enable relationships that span domains, effectively connecting data from various teams or business units and exposing it to a verified user as necessary. This granular permissioning system means that the right users can access the right data at the right time, without compromising security.
Field-level access control

Hasura’s MongoDB connector also provides a powerful, declarative way to define access control rules at the collection and field level. For each MongoDB collection, roles may be specified for read, create, update, and delete (CRUD) permissions. Within those permissions, access may be further restricted based on the values of specific attributes. By defining these rules declaratively, Hasura makes it easy to implement and reason about complex access control policies.

Joining across collections

In addition to enabling granular access control, Hasura simplifies the retrieval of related data across multiple databases. By inspecting your MongoDB collections, Hasura can automatically create schemas and API endpoints (in GraphQL, REST, etc.) that let you query data along with its relationships. This eliminates the need to manually stitch together data from different collections in your application code. Instead, a graph of related data can be easily retrieved in a single API call, while still having that data filtered through your access control rules.

As companies wrestle with the challenges of secure data access across sprawling database environments, Hasura provides a compelling solution. By serving as a federated data layer on MongoDB and external data, Hasura enables granular access control through a combination of role-based permissions, attribute-based restrictions, and the ability to join data and apply access across sources.

Figure 1. Hasura & MongoDB demo environment

With Hasura’s MongoDB connector, teams can easily implement sophisticated data access policies in a declarative way and provide their applications with secure access to the data they need. This combination of security and simplicity makes Hasura and MongoDB a powerful solution for organizations that strive to modernize, especially those in industries with strict compliance requirements.

Visit the MongoDB Resources Hub to learn more about MongoDB Atlas.

February 26, 2025

Applied

Debunking MongoDB Myths: Enterprise Use Cases

MongoDB is frequently viewed as a go-to database for proof-of-concept (POC) applications. The flexibility of MongoDB’s document model enables teams to rapidly prototype and iterate. This allows for adaptation of the data model as requirements evolve during the early stages of application development. It is common for applications to continuously evolve during initial development. However, moving an application to production requires developers to add validation logic and fully define the data structures.

A frequent assumption is that because MongoDB data models can be flexible, they cannot be structured. However, while MongoDB does not require a defined schema, it does support one. MongoDB allows users to precisely calibrate rules and enforcement levels for every component of data. This enables a level of granular control that traditional databases, with their all-or-nothing approach to schema enforcement, struggle to match.

Data model flexibility is not a binary choice between "schemaless" and "strictly enforced." More accurately, it exists on a spectrum in MongoDB. Users can incrementally define schemas in parallel with the overall “hardening” of the application.

MongoDB's approach to data modeling makes it an ideal platform for business-critical applications. It is designed to support the entire application lifecycle, from nascent concepts and initial prototypes to global rollouts of production environments. Enterprise-grade features like ACID transactions and industry-leading scalability ensure MongoDB can meet the demands of any modern application.

Learning from the past

So why do misconceptions persist regarding MongoDB? These perceptions originated over a decade ago. Teams working with MongoDB back in 2014 or earlier faced challenges when deploying it in production. Applications could slow down under heavy loads, data consistency was not guaranteed when writing to multiple documents, and teams lacked tools to monitor and manage deployments effectively.
As a result, MongoDB gained a perception of being unsuitable for specific use cases or critical workloads. This perception has persisted despite a decade of subsequent development and innovation. Therefore, this is now an inaccurate assessment of today’s preeminent document database. MongoDB has evolved into a mature platform that directly addresses these historical pain points. Today’s MongoDB delivers robust tooling, guaranteed consistency, and comprehensive data validation capabilities.

Myth: MongoDB is a niche database

What are the top use cases for MongoDB? This question is difficult to answer because MongoDB is a general-purpose database that can support any use case. The document model is the primary driver of MongoDB’s versatility. Documents are similar to JSON objects, with data being represented as key-value pairs. Values can be simple types like strings or numbers. However, values can also be arrays or nested objects, which allows documents to easily represent complex hierarchical structures.

The document model's flexibility allows data to be stored exactly as the application consumes it. This enables highly efficient writing and optimizes data for retrieval without needing to set up standard or materialized views, although both are supported.

While MongoDB is no longer a niche database, it does have advanced capabilities to support niche requirements. The aggregation pipeline provides a powerful framework for data analytics and transformation. Time-series collections store and query temporal data efficiently to support IoT and financial applications. Geospatial indexes and queries enable location-based applications to perform complex proximity calculations. MongoDB Atlas includes native support for vector search. This enabled Cisco to experiment with generative AI use cases and streamline their applications to production.

MongoDB handles the diverse data requirements that power modern applications.
The document model provides the foundation for general use. Concurrently, advanced features ensure teams do not need to integrate additional tools as application requirements evolve. The result is a single platform that can grow from prototype to production, handling general requirements and specialized workloads with equal proficiency.

Myth: MongoDB is not suitable for enterprise-grade workloads

A common perception is that MongoDB works well for small applications but falls short at enterprise scale. Ironically, many organizations first consider MongoDB while struggling to scale their relational databases. These organizations have discovered MongoDB’s architecture is specifically designed to support scale-out distributed deployments.

While MongoDB matches relational databases in vertical scaling capabilities, the document model enables a more natural and intuitive approach for horizontal scaling. Related data is stored together in a single document. Therefore, MongoDB can easily distribute complete data units across shards. This contrasts with relational databases. Relational data is split across multiple tables. This makes it difficult to place all related data on the same shard.

Horizontal scaling with MongoDB sets an organization up for better performance. Most MongoDB queries need to access only a single shard. Equivalent queries in a relational database often require costly cross-server communication. Telefonica Tech has leveraged horizontal scaling to nearly double their capacity with a 40% hardware reduction.

MongoDB Atlas further automates and simplifies these scaling capabilities through a fully managed service built to meet demanding enterprise requirements. Atlas provides a 99.995% uptime guarantee and availability across AWS, Google Cloud, and Azure in over 100 regions worldwide.
This frees teams to focus on rapid development and innovation rather than infrastructure maintenance by offloading the operational complexity of deploying and running databases at scale.

Powering the enterprise applications of today and tomorrow

Over 50,000 customers and 70% of the Fortune 100 rely on MongoDB to power their enterprise applications. Independent industry reports from Gartner and Forrester continue to recognize MongoDB as a leader in the database space. Do not let outdated myths prevent your organization from the competitive advantages of MongoDB's enterprise capabilities.

To learn more about MongoDB, head over to MongoDB University and take our free Intro to MongoDB course.
Read more about customers building on MongoDB.
Read our first blog in this series about myths around MongoDB vs relational databases.
Check out the full video to learn about the other 6 myths that we're debunking in this series.

February 25, 2025

Applied

MongoDB & DKatalis’s Bank Jago, Empowering Over 500 Engineers

DKatalis, a technology company specialized in developing scalable digital solutions, is the engineering arm behind Bank Jago, Indonesia’s first digital bank. An app-only institution, Bank Jago enables end-to-end banking with features such as auto budgeting. This allows Bank Jago’s customers to easily and effectively organize their finances by creating “Pockets” for expenses like food, savings, or entertainment. Launched in 2019, Bank Jago has seen tremendous growth in only a few years, with its customer base reaching 14.1 million as of October 2024.

While speaking at MongoDB.local Jakarta, Chris Samuel, Staff Engineer at DKatalis, shared how MongoDB became the data backbone of Bank Jago, and how MongoDB Atlas supported Bank Jago’s growth.

Bank Jago’s journey with MongoDB started in 2019, when DKatalis built the first version of Bank Jago using the on-premise version of MongoDB: MongoDB Community Edition. “We did everything ourselves, up to the point when we realized that the bigger our user [base] grew, the more painful it was for us to monitor everything,” said Samuel.

In 2021, DKatalis decided to migrate Bank Jago [from MongoDB Community Edition] to MongoDB Atlas. This first involved migrating all data to Atlas. Then the database platform had to be set up to facilitate scalability and enable improved maintenance operations in the long term. “In terms of process, it is actually seamless,” said Samuel during his MongoDB.local talk.

Specifically, MongoDB Atlas offers six key capabilities that have facilitated the bank’s daily operations, supported its fast growth, and improved efficiencies:

Flexibility: MongoDB's document model supports diverse data types and adapts to Jago's dynamic requirements.
Scalability: MongoDB Atlas effortlessly supports the rapid growth in user base and data volume.
High performance: The platform enables fast query execution and efficient data retrieval for a seamless customer experience.
Real-time capabilities: MongoDB Atlas prevents delays during transactions, account creation, and balance checking.
Regulation compliance: With MongoDB Atlas, local hosting is possible. This enables DKatalis to meet Indonesian financial regulatory standards.
Community support: MongoDB’s strong developer community and rich ecosystem in Jakarta foster collaboration and learning.

All of these have also helped improve efficiencies for DKatalis’s team of over 500 engineers, who are now able to reduce data architecture complexity and focus on innovation.

Fostering a great engineering culture and community with MongoDB

In another talk at MongoDB.local Singapore, DKatalis’s Chief Engineering Officer, Alex Titlyanov, explained that using MongoDB has been and continues to be a great learning, upskilling, and operational experience for his team.

“DKatalis has a pretty unique organizational culture when it comes to its engineering teams: there are no designated engineering managers or project managers; instead, teams are self-managed,” said Titlyanov. “This encourages a community-driven environment, where engineers are continuously upgrading their skills, particularly with tools like MongoDB.”

The company has established internal communities, such as the MongoDB community led by Principal Software Engineer Boon Hian Tek. These communities focus on knowledge sharing, skill-building, and ensuring that the company’s 500 engineers are proficient in using MongoDB.

This deep knowledge of MongoDB, together with the ease of use offered by the Atlas platform, means that DKatalis’s engineers are also able to build their own bespoke tools to improve daily operations and meet specific needs. For example, the team has built a range of tools aimed at helping deal with the complexity and scale of Bank Jago’s data architecture.

“Most traditional banks offer their customers access to six months, sometimes a year’s worth of transaction history.
But Bank Jago gives access to the entire transaction history,” said Boon.

The engineering team ended up having to deal with 56 different databases and 485 data collections. Some would reach 1.13 billion documents, while others receive up to 42.5 million new documents every day.

Some of the bespoke tools built on MongoDB Atlas include:

Index sync report: DKatalis implemented a custom-built tool using MongoDB’s Atlas API to manage database indexing automatically. This was essential given the bank’s real-time requirements. Adding indexes manually during peak hours would have disrupted performance.
Daily reporting: The team built a tool to monitor for slow queries. This provides daily reports on query performance so issues can be identified and resolved quickly.
Add index: The Rolling Index feature from Atlas was initially used. However, the team required greater context for each index. Therefore, they built a tool that automatically checks at 3:00 am whether there are any indexes to create. The tool calls the Atlas API to create them and publishes the results.
Exporting metrics: The Atlas console was used to source helpful diagrams. However, the team required each metric to be available per database and per collection rather than per cluster. The team built a thin layer on top of the Atlas console to slice up the required metrics using the Atlas API.

“The scalability and flexibility of MongoDB have been essential in helping the team handle the bank’s fast growth and complex feature set. MongoDB’s document-oriented structure enables us to develop innovative features like ‘Pockets’, and we continue to see MongoDB as an integral part of our technology stack in the future,” said Titlyanov.

Visit our product page to learn more about MongoDB Atlas.
To learn how MongoDB powers solutions in the financial services industry, visit our solutions page.

February 24, 2025

Applied
