Enable Real-Time SQL with Rockset for MongoDB

MongoDB and the MongoDB Query Language (MQL) are built around the document model, which maps directly to objects in code. This gives developers a natural and efficient way to work with data. It’s exciting to see our partner Rockset offer a real-time, interactive SQL experience for our users that makes it easy to JOIN across MongoDB data and other data sources. This partnership is another way that we are continually seeking to improve developer productivity and make developers’ lives easier.

For some background, our two teams have a long history. Rockset is the original team behind RocksDB and the Facebook online data platform. During their time at Facebook, they worked with us to build MongoRocks. We are excited to work together again. Let me introduce Dhruba Borthakur, the co-founder and CTO of Rockset, to talk more about the Rockset + MongoDB integration.

Towards Intelligent Applications

Rockset’s purpose is to make life easier for developers of intelligent applications. These are applications that take actions in real-time, whether it is deciding what to offer customers, matching supply and demand, or sending alerts when needed. Intelligent applications include IoT automation, instant personalization, real-time customer 360s, and many gaming apps. These kinds of applications will dominate the next decade, and MongoDB is an excellent fit for them, due to its flexibility and scale.

Intelligent applications use as much useful data as they can, combining real-time and historical data, to take actions within limited time windows. They typically need to combine MongoDB data with other data—multiple MongoDB collections or data from Amazon S3 or Apache Kafka—in near real-time. To make optimal decisions or take the best actions, they need to run complex queries over large-scale data but still require low latency. Knowing MongoDB is widely used as the primary database in many of these situations, we set out to make building intelligent applications as simple as possible, using Rockset to serve SQL queries on data from MongoDB.

Real-Time SQL Queries Through REST Endpoints

Rockset allows you to run a real-time SQL query by hitting a REST endpoint from your application, a feature called Query Lambdas. To make it possible to do the same on data from MongoDB, we built the ability for Rockset to continuously ingest and index data from MongoDB Atlas using MongoDB change streams. I’ll walk through what the developer experience might look like when using the Rockset + MongoDB combination.

Connecting Rockset to MongoDB Atlas

We collaborated with the MongoDB team to develop a connector from MongoDB Atlas to Rockset, two fully managed cloud services. Detailed documentation can be found in the Rockset documentation.

In MongoDB Atlas, create a custom role that grants the find, changeStream, and collStats actions.

MongoDB Database Access Settings
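To make the role's privileges concrete, here is a minimal sketch that writes them out in the shape of a MongoDB role document. It is illustrative only: Atlas custom roles are normally created through the Atlas UI rather than by running a createRole command yourself, and the role, database, and collection names below are hypothetical.

```python
def build_rockset_role(role_name, database, collection):
    """Return a createRole-style document with read-only sync privileges."""
    return {
        "createRole": role_name,
        "privileges": [
            {
                # Scope the privilege to a single namespace.
                "resource": {"db": database, "collection": collection},
                # The three actions named in the text above.
                "actions": ["find", "changeStream", "collStats"],
            }
        ],
        # No inherited roles: the user gets only the actions listed above.
        "roles": [],
    }


# Hypothetical names, used purely for illustration.
role = build_rockset_role("rockset-role", "sample_db", "sample_collection")
print(role["privileges"][0]["actions"])
```

Keeping the role limited to these three actions means the integration user can read and watch the collection but cannot modify it.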

Then, create a MongoDB Atlas user that has this custom role.

MongoDB Atlas custom user role

To allow the connection between MongoDB Atlas and Rockset, you must whitelist three Rockset IP addresses.

MongoDB Atlas Network Access Settings

Obtain the connection string for your MongoDB Atlas cluster, as you will specify this to Rockset later.

Connection string for MongoDB Atlas cluster

Next, move to the Rockset console, where you manage Rockset collections, integrations with data sources, queries, and users. Here you can create an integration with MongoDB. You will need to provide the username and password for the MongoDB Atlas user you just created, along with the connection string you retrieved.

Rockset Console

Building an External Index on a MongoDB Collection

Once the integration is created, you can create Rockset collections backed by MongoDB Atlas. Specify the source database and collection in MongoDB Atlas—each MongoDB collection maps to one Rockset collection—and you will see a preview of your data.

Collection preview in Rockset console

Rockset will first perform a full scan of the MongoDB data and then stay in sync with any updates to the collection through a MongoDB change stream.
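The scan-then-sync pattern can be sketched in a few lines of Python. This is a toy, in-memory illustration and not Rockset's implementation: the events mimic the general shape of MongoDB change events (operationType, documentKey, fullDocument, updateDescription), the sample data is made up, and nested (dotted) field paths are not handled.

```python
def initial_scan(source_docs):
    """Copy every document once, keyed by _id (the full-scan phase)."""
    return {doc["_id"]: dict(doc) for doc in source_docs}


def apply_change(replica, event):
    """Apply one change-stream-style event to the in-memory replica."""
    op = event["operationType"]
    doc_id = event["documentKey"]["_id"]
    if op in ("insert", "replace"):
        replica[doc_id] = dict(event["fullDocument"])
    elif op == "update":
        changes = event["updateDescription"]
        doc = replica.setdefault(doc_id, {"_id": doc_id})
        doc.update(changes.get("updatedFields", {}))
        for field in changes.get("removedFields", []):
            doc.pop(field, None)
    elif op == "delete":
        replica.pop(doc_id, None)
    return replica


# Made-up sample data: two existing documents, then three later changes.
source = [{"_id": 1, "city": "Oslo"}, {"_id": 2, "city": "Lima"}]
replica = initial_scan(source)

events = [
    {"operationType": "insert", "documentKey": {"_id": 3},
     "fullDocument": {"_id": 3, "city": "Pune"}},
    {"operationType": "update", "documentKey": {"_id": 1},
     "updateDescription": {"updatedFields": {"city": "Bergen"}, "removedFields": []}},
    {"operationType": "delete", "documentKey": {"_id": 2}},
]
for event in events:
    apply_change(replica, event)

print(replica)
```

After the loop, the replica reflects the insert, the update, and the delete without another full scan, which is the property that keeps the synced copy fresh.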

Rockset automatically indexes all the ingested data, acting as an external index for your MongoDB data. Rockset uses a concept called Converged Indexing™ to build multiple indexes—a search index, a column index, and a row index—on every field of your data. This allows for fast SQL queries, including search-style queries, aggregations, and joins across different data sets, all without having to manually create a schema.
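To build some intuition for why keeping three indexes helps, here is a toy sketch that builds a row index, a column index, and a search index over a handful of documents. It is a simplified illustration under my own assumptions (flat documents, hashable scalar values, in-memory dictionaries) and says nothing about how Rockset actually stores its indexes.

```python
from collections import defaultdict


def build_indexes(docs):
    """Index every field of every document three ways."""
    row_index = {}                     # _id -> whole document
    column_index = defaultdict(dict)   # field -> {_id: value}
    search_index = defaultdict(set)    # (field, value) -> {_id, ...}
    for doc in docs:
        doc_id = doc["_id"]
        row_index[doc_id] = doc
        for field, value in doc.items():
            column_index[field][doc_id] = value
            search_index[(field, value)].add(doc_id)
    return row_index, column_index, search_index


# Made-up sample documents.
docs = [
    {"_id": 1, "city": "Oslo", "temp": 4},
    {"_id": 2, "city": "Lima", "temp": 19},
    {"_id": 3, "city": "Oslo", "temp": 6},
]
row_index, column_index, search_index = build_indexes(docs)

# Search-style lookup: which documents have city == "Oslo"?
oslo_ids = sorted(search_index[("city", "Oslo")])

# Aggregation: average of one field, reading only that column.
temps = column_index["temp"].values()
avg_temp = sum(temps) / len(temps)

# Point read: fetch one whole document by its _id.
first_doc = row_index[1]

print(oslo_ids, avg_temp, first_doc)
```

Each query shape reads from the structure that suits it best: selective filters use the search index, aggregations scan a single column, and point reads go to the row index.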

Creating APIs Powered by SQL on MongoDB Data

Now that your data from MongoDB is being continuously ingested and indexed by Rockset, you can construct your application using Query Lambdas—saved SQL queries, accessed via REST endpoints.

Start by constructing queries on the collection you created from your MongoDB integration. You can even bring in data from other Rockset collections, created from other MongoDB collections or data sources. We use a SQL JOIN to do so in this example. From there, you can create an API endpoint from the query you wrote.

Rockset Query Lambda
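If you want to picture the kind of JOIN involved, the sketch below joins two small data sets with SQL. It uses Python's built-in sqlite3 module purely as a local stand-in so the example runs anywhere. It is not Rockset, Rockset's SQL dialect may differ, and the table names, columns, and rows are all hypothetical.

```python
import sqlite3

# In-memory database standing in for two collections from different sources.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE vehicles (vehicle_id TEXT, driver TEXT)")
conn.execute("CREATE TABLE positions (vehicle_id TEXT, lat REAL, lon REAL)")
conn.executemany(
    "INSERT INTO vehicles VALUES (?, ?)",
    [("v1", "Ana"), ("v2", "Raj")],
)
conn.executemany(
    "INSERT INTO positions VALUES (?, ?, ?)",
    [("v1", 59.91, 10.75), ("v2", -12.05, -77.04)],
)

# Join the two data sets on their shared key.
JOIN_QUERY = """
    SELECT v.driver, p.lat, p.lon
    FROM vehicles AS v
    JOIN positions AS p ON p.vehicle_id = v.vehicle_id
    ORDER BY v.driver
"""
rows = conn.execute(JOIN_QUERY).fetchall()
print(rows)
conn.close()
```

The application never has to stitch the two data sets together itself, which is the logic a SQL JOIN removes from your code.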

Name and describe the Query Lambda, so you know what it is used for. You’ll also get code snippets to embed in your application to make this API call.

Executing a Query Lambda in your Python application will look something like this:

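import requests

def execute_lambda():
    # Call the saved Query Lambda (version 2) through its REST endpoint.
    # Replace API_KEY with your Rockset API key.
    r = requests.post(
        'https://api.rs2.usw2.rockset.com/v1/orgs/self/ws/commons/lambdas/getLatPositionJOIN/versions/2',
        headers={'Authorization': 'ApiKey API_KEY'})
    r.raise_for_status()  # raise on HTTP errors instead of parsing an error body
    return r.json()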

That’s it! No need to embed complex queries in your applications.
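If your saved query takes inputs, the call would also carry them in the request body. The sketch below only builds the request, using the standard library so it runs without sending anything. The parameter payload shape (a "parameters" list of name, type, and value entries) is my assumption about the API and should be checked against the Rockset documentation, and the vehicle_id parameter is hypothetical.

```python
import json
import urllib.request

LAMBDA_URL = (
    "https://api.rs2.usw2.rockset.com/v1/orgs/self/ws/commons"
    "/lambdas/getLatPositionJOIN/versions/2"
)


def build_lambda_request(api_key, parameters=None):
    """Build (but do not send) a POST request for the Query Lambda."""
    # Assumed payload shape: every value is passed as a string parameter.
    body = {
        "parameters": [
            {"name": name, "type": "string", "value": str(value)}
            for name, value in (parameters or {}).items()
        ]
    }
    return urllib.request.Request(
        LAMBDA_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"ApiKey {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Hypothetical parameter; API_KEY is a placeholder, as in the snippet above.
request = build_lambda_request("API_KEY", {"vehicle_id": "v1"})
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Passing the request to urllib.request.urlopen would send it. Keeping construction separate from sending makes the payload easy to inspect and test.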

Using Query Lambdas to create APIs on MongoDB data allows you to:

  • Perform real-time search, aggregations, and joins. Use the full capabilities of SQL to express different types of queries. Since all your data is indexed, you get millisecond-latency APIs to power your applications.
  • Simplify application logic. Write less application-side logic by joining MongoDB and other data sets with SQL JOINs. In addition, your code only needs to call a REST endpoint to run a query, instead of embedding the entire query.
  • Ship features faster. Productionize your queries as version-controlled REST endpoints as part of Rockset’s standard query development flow.
  • Manage less infrastructure. Build serverless applications that auto-scale as needed.

Together, Rockset and MongoDB Atlas make intelligent applications simpler to build.

Get Started with MongoDB and Rockset

Power your application with real-time APIs on MongoDB data by signing up for MongoDB Atlas and Rockset. You can also register to learn more at our session at MongoDB.live in June.