PLAID, Inc. Optimizes Real-Time Data With MongoDB Atlas Stream Processing
July 17, 2025
A MongoDB customer since 2015, Tokyo, Japan-based PLAID, Inc. works to “maximize the value of people with the power of data,” according to the company’s mission statement. PLAID’s customer experience platform, KARTE, analyzes and visualizes website and application users’ data in real time, offering the company’s customers a one-stop solution that helps them better understand their customers and provide personalized experiences.
After running a self-hosted MongoDB instance for several years, PLAID adopted MongoDB Atlas, a fully managed suite of cloud database services, in 2021. Subsequently, however, the company ran into real-time data challenges.
Specifically, PLAID faced challenges when migrating an existing batch processing system that sent real-time data from MongoDB Atlas to Google BigQuery, which helps organizations “go from data to AI action faster.” While the initial cloud setup with Kafka connectors provided valuable streaming capabilities, capturing events from MongoDB and streaming them to BigQuery, both the cost and the operational complexity scaled with the number of pipelines. The staging environment, which required duplicate pipelines, further exacerbated the issue, and the rising costs threatened PLAID's ability to scale and expand its real-time data processing system efficiently.
Easy event data processing with Atlas Stream Processing
To address these challenges, PLAID turned to MongoDB Atlas Stream Processing, which enables development teams to process streams of complex data using the same query API used in their MongoDB Atlas databases.
Atlas Stream Processing provided PLAID with a cost-effective way of acquiring and processing event data in real time, all while being natively integrated within their existing MongoDB Atlas environment for a seamless developer experience. This allowed them to replace some of their costly Kafka source connectors while maintaining the overall data flow to BigQuery via their existing Confluent Cloud Kafka setup.
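To illustrate how a stream processor can replace a Kafka source connector, a minimal sketch follows: a `$source` stage reads the change stream of an Atlas collection, and a `$emit` stage forwards each event to a Kafka topic. The connection names, database, collection, and topic below are hypothetical placeholders, not details of PLAID's actual setup.

```javascript
// Sketch of a stream processor pipeline that mirrors change events
// from an Atlas collection into a Kafka topic. All names here
// (atlasConn, kafkaConn, karte/events, karte.events) are illustrative.
const pipeline = [
  {
    // Read the change stream of an Atlas collection through a
    // connection registered on the stream processing instance.
    $source: {
      connectionName: "atlasConn", // hypothetical Atlas connection
      db: "karte",
      coll: "events",
    },
  },
  {
    // Forward each change event to Kafka through a Kafka connection.
    $emit: {
      connectionName: "kafkaConn", // hypothetical Kafka connection
      topic: "karte.events",
    },
  },
];

// In mongosh connected to the stream processing instance, the processor
// would then be registered and started with:
//   sp.createStreamProcessor("eventsToKafka", pipeline);
//   sp.eventsToKafka.start();
```

Because the pipeline is an ordinary aggregation pipeline, transformation stages such as `$match` or `$project` can be added between `$source` and `$emit` without any connector-specific configuration.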
Key aspects of the solution included:
- Replacing Kafka source connectors: Atlas Stream Processing efficiently captures event data from MongoDB Atlas databases and writes it to Kafka, reducing the costs associated with the previous Kafka source connectors.
- MongoDB Atlas Stream Processing:
- Stream processing instance (SPI): PLAID used SPIs, where cost is determined by the instance tier and the number of workers, which in turn depends on the number of stream processors. This offered a more optimized cost structure compared to the previous connector-task-based pricing.
- Connection management: Atlas Stream Processing simplifies connection management. Connecting to Atlas databases is straightforward, and a single connection can be used for the Kafka cluster.
- Stream processors: These processing units perform data transformation and routing with the same aggregation pipelines used by MongoDB databases. Thus, the PLAID team leveraged their existing MongoDB knowledge to define pipeline logic, making the transition smoother.
- Custom backfill mechanism: To address the lack of a backfill feature in Stream Processing, PLAID developed a custom application to synchronize existing data.
- Custom metric collection: Since native monitoring integration with Datadog was unavailable, PLAID created a bot to collect Atlas Stream Processing metrics and send them to Datadog for monitoring and alerting.
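A minimal sketch of how such a metric-collection bot might shape its data, assuming each processor's statistics (e.g. input and output message counts, as reported by a processor's `stats()` command) are converted into a Datadog metrics payload for the `v1/series` API. The metric names, stats field names, and tagging scheme below are illustrative assumptions, not PLAID's implementation.

```javascript
// Convert one stream processor's stats into a Datadog v1/series payload.
// The stats shape (inputMessageCount, outputMessageCount) is assumed;
// verify the actual field names against the stats() output.
function toDatadogSeries(processorName, stats, timestampSec) {
  const tags = [`processor:${processorName}`]; // illustrative tag scheme
  return {
    series: [
      {
        metric: "atlas.stream_processing.input_messages",
        points: [[timestampSec, stats.inputMessageCount]],
        type: "count",
        tags,
      },
      {
        metric: "atlas.stream_processing.output_messages",
        points: [[timestampSec, stats.outputMessageCount]],
        type: "count",
        tags,
      },
    ],
  };
}

// The bot would poll each processor for stats on a schedule, then POST
// the resulting payload to Datadog's metrics API (v1/series endpoint,
// authenticated with a DD-API-KEY header) to drive dashboards and alerts.
```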
“Atlas Stream Processing provided us with a robust solution for real-time data processing, which has significantly reduced costs and improved scalability throughout our platform.”
— Hajime Shiozawa, senior software engineer, PLAID, Inc.
The outcome: Lower costs, improved efficiency
By implementing MongoDB Atlas Stream Processing, PLAID achieved significant improvements, from reduced costs to greater operational efficiency:
- Reduced costs: PLAID eliminated the cost structure that was proportional to the number of pipelines, resulting in substantial cost savings. The new cost model, based on Atlas Stream Processing workers, offered a more scalable and predictable pricing structure.
- Improved scalability: The optimized architecture allowed PLAID to scale their real-time data processing system efficiently, supporting the addition of new products and Atlas clusters without escalating costs.
- Simplified management: Because Stream Processing is a native MongoDB Atlas capability, it simplified connection management and pipeline configuration, reducing operational overhead.
- Stable operation: PLAID successfully deployed and operated more than 20 pipelines, processing over 3 million events per day to BigQuery.
- Enhanced real-time data capabilities: The improved system strengthened the real-time nature of their data, improving operational efficiency.
MongoDB Atlas Stream Processing provided PLAID with a robust and cost-effective solution for real-time data processing to BigQuery. By replacing costly Kafka source connectors and optimizing their architecture, PLAID significantly reduced costs and improved scalability. The seamless integration with MongoDB Atlas and the developer-friendly API further enhanced their operational efficiency. PLAID’s success with Atlas Stream Processing demonstrates that it is a valuable tool for organizations looking to streamline their data integration pipelines and leverage real-time data effectively.
To learn how Atlas Stream Processing helps organizations integrate MongoDB with Apache Kafka to build event-driven applications, see the MongoDB Atlas Stream Processing page.