Migrate from Azure CosmosDB to MongoDB Atlas Using Apache Kafka

Robert Walters • 3 min read • Published Nov 09, 2021 • Updated May 09, 2022
Kafka, Atlas, JavaScript

Overview

When you are the best of breed, you have many imitators, and MongoDB is no different in the database world. If you are reading this blog, you are most likely an Azure customer who ended up using CosmosDB.
You needed a database that could handle unstructured data in Azure and eventually realized CosmosDB wasn’t the best fit. Perhaps you found that it is too expensive for your workload, that it does not perform well, or that you simply have no confidence in the platform. You might also have tried the MongoDB API and found that the queries you wanted to use simply don’t work in CosmosDB because it fails 67% of the compatibility tests.
Whatever path you took to CosmosDB, know that you can easily migrate your data to MongoDB Atlas while still leveraging the full power of Azure. With MongoDB Atlas in Azure, there are no more failed queries, slow performance, or surprise bills from not optimizing your RUs (request units). MongoDB Atlas in Azure also gives you access to the latest releases of MongoDB and the flexibility to leverage any of the three supported cloud providers (AWS, Azure, and Google Cloud) if your business needs change.
Note: When you originally created your CosmosDB, you were presented with these API options:
If you created your CosmosDB using Azure Cosmos DB API for MongoDB, you can use mongo tools such as mongodump, mongorestore, mongoimport, and mongoexport to move your data. The Azure CosmosDB Connector for Kafka Connect does not work with CosmosDB databases that were created for the Azure Cosmos DB API for MongoDB.
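For the Azure Cosmos DB API for MongoDB case described above, the move can be sketched with mongodump and mongorestore. This is a minimal sketch, not a tested migration procedure: the function name dump_and_restore, the RUN dry-run switch, and the default dump directory are illustrative assumptions, and both connection strings are placeholders you supply.

```shell
# Sketch: copy data from a CosmosDB account created with the API for MongoDB
# into MongoDB Atlas using the standard MongoDB database tools.
# dump_and_restore, RUN, and ./cosmos-dump are hypothetical names chosen here.
dump_and_restore() {
  src_uri="$1"                    # CosmosDB (API for MongoDB) connection string
  dst_uri="$2"                    # MongoDB Atlas connection string
  dump_dir="${3:-./cosmos-dump}"  # local directory that holds the dump

  # With RUN=echo the commands are printed instead of executed (dry run).
  ${RUN:-} mongodump --uri="$src_uri" --out="$dump_dir" &&
    ${RUN:-} mongorestore --uri="$dst_uri" "$dump_dir"
}

# Dry run:  RUN=echo dump_and_restore "<CosmosDB URI>" "<Atlas URI>"
# Real run: dump_and_restore "<CosmosDB URI>" "<Atlas URI>"
```

Running the dry run first lets you review the exact commands before any data is read or written.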
In this blog post, we will cover how to leverage Apache Kafka to move data from Azure CosmosDB Core (Native API) to MongoDB Atlas. While there are many ways to move data, using Kafka will allow you to not only perform a one-time migration but to stream data from CosmosDB to MongoDB. This gives you the opportunity to test your application and compare the experience so that you can make the final application change to MongoDB Atlas when you are ready. The complete example code is available in this GitHub repository.

Getting started

You’ll need access to an Apache Kafka cluster. There are many options available, including Confluent Cloud, or you can deploy your own Apache Kafka via Docker as shown in this blog. Microsoft Azure also includes an event messaging service called Azure Event Hubs, which provides a Kafka endpoint that can be used as an alternative to running your own Kafka cluster. Azure Event Hubs exposes the same Kafka Connect API, enabling the use of the MongoDB connector and the Azure CosmosDB connector with the Event Hubs service.
If you do not have an existing Kafka deployment, perform these steps. You will need Docker installed on your local machine. First, clone the repository:
git clone https://github.com/RWaltersMA/CosmosDB2MongoDB.git
Next, build the Docker containers.
docker-compose up -d --build
The docker compose script (docker-compose.yml) will stand up all the components you need, including Apache Kafka and Kafka Connect. Install the CosmosDB and MongoDB connectors.
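After the containers start, one way to confirm that both connectors are available is to ask Kafka Connect's REST API for its installed plugins. The sketch below assumes Kafka Connect listens on localhost:8083, the address used for the REST calls in this post. has_plugins is a hypothetical helper name, and the plain-text matching is a shortcut rather than real JSON parsing.

```shell
# Sketch: verify that Kafka Connect reports both connector classes.
# has_plugins is a hypothetical helper; it reads the JSON returned by
# GET /connector-plugins on stdin.
has_plugins() {
  plugins="$(cat)"
  for cls in \
    com.azure.cosmos.kafka.connect.source.CosmosDBSourceConnector \
    com.mongodb.kafka.connect.MongoSinkConnector
  do
    # -F treats the class name as a fixed string, not a regular expression.
    if ! printf '%s' "$plugins" | grep -qF "\"$cls\""; then
      echo "missing connector plugin: $cls"
      return 1
    fi
  done
  echo "both connector plugins are installed"
}

# Usage (assumes Kafka Connect is reachable on localhost:8083):
#   curl -s http://localhost:8083/connector-plugins | has_plugins
```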

Configuring Kafka Connect

Modify the cosmosdb-source.json file and replace the placeholder values with your own.
{
  "name": "cosmosdb-source",
  "config": {
    "connector.class": "com.azure.cosmos.kafka.connect.source.CosmosDBSourceConnector",
    "tasks.max": "1",
    "key.converter": "org.apache.kafka.connect.json.JsonConverter",
    "value.converter": "org.apache.kafka.connect.json.JsonConverter",
    "connect.cosmos.task.poll.interval": "100",
    "connect.cosmos.connection.endpoint": "https://<cosmosinstance-name>.documents.azure.com:443/",
    "connect.cosmos.master.key": "<cosmosdbprimarykey>",
    "connect.cosmos.databasename": "<database name>",
    "connect.cosmos.containers.topicmap": "<containers>#<topicname>",
    "connect.cosmos.offset.useLatest": false,
    "value.converter.schemas.enable": "false",
    "key.converter.schemas.enable": "false"
  }
}
Modify the mongo-sink.json file and replace the placeholder values with your own.
{
  "name": "mongo-sink",
  "config": {
    "connector.class": "com.mongodb.kafka.connect.MongoSinkConnector",
    "tasks.max": "1",
    "topics": "<topicname>",
    "connection.uri": "<MongoDB Atlas Connection String>",
    "database": "<Desired Database Name>",
    "collection": "<Desired Collection Name>",
    "key.converter": "org.apache.kafka.connect.json.JsonConverter",
    "value.converter": "org.apache.kafka.connect.json.JsonConverter",
    "value.converter.schemas.enable": "false",
    "key.converter.schemas.enable": "false"
  }
}
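Because both files ship with angle-bracket placeholders, a quick check before registering the connectors can catch a value you forgot to replace. This is only a sketch: check_placeholders is a hypothetical helper name, and it assumes none of your real values contain text wrapped in < and >.

```shell
# Sketch: fail if a connector config file still contains <placeholder> text.
# check_placeholders is a hypothetical helper name.
check_placeholders() {
  file="$1"
  # grep -n prints every line that still holds an unreplaced <...> token.
  if grep -n '<[^>]*>' "$file"; then
    echo "$file: replace the placeholders above before posting" >&2
    return 1
  fi
  echo "$file: no placeholders left"
}

# Usage:
#   check_placeholders cosmosdb-source.json && check_placeholders mongo-sink.json
```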
Note: Before we configure Kafka Connect, make sure that the network settings on both CosmosDB and MongoDB Atlas allow communication between these two services. In CosmosDB, select Firewall and Virtual Networks. While the easiest configuration is to select “All networks,” you can provide a more secure connection by choosing the Selected networks option and specifying the IP range in the Firewall setting. MongoDB Atlas network access also needs to be configured to allow remote connections, because by default MongoDB Atlas does not allow any external connections. See Configure IP Access List for more information.
To configure our two connectors, make a REST API call to the Kafka Connect service:
curl -X POST -H "Content-Type: application/json" -d @cosmosdb-source.json http://localhost:8083/connectors

curl -X POST -H "Content-Type: application/json" -d @mongo-sink.json http://localhost:8083/connectors
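Once both connectors are registered, you can check their health through the same REST API. The sketch below is a rough check under stated assumptions: the connector names cosmosdb-source and mongo-sink come from the name fields in the two configuration files, all_running is a hypothetical helper, and the text matching assumes the compact JSON that Kafka Connect normally returns.

```shell
# Sketch: succeed only if the connector and all of its tasks report RUNNING.
# all_running is a hypothetical helper; it reads the JSON returned by
# GET /connectors/<name>/status on stdin and assumes compact JSON output.
all_running() {
  # Collect every "state" field, one per line (connector first, then tasks).
  states="$(grep -o '"state":"[A-Z_]*"' || true)"
  # No state fields at all means the response was not a status document.
  [ -n "$states" ] || return 1
  # Fail if any reported state is something other than RUNNING.
  if printf '%s\n' "$states" | grep -qv '"state":"RUNNING"'; then
    return 1
  fi
  return 0
}

# Usage (assumes Kafka Connect is reachable on localhost:8083):
#   for name in cosmosdb-source mongo-sink; do
#     curl -s "http://localhost:8083/connectors/$name/status" | all_running \
#       && echo "$name: RUNNING" || echo "$name: check the Kafka Connect logs"
#   done
```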
That’s it!
Provided the network and database access were configured properly, data from your CosmosDB should begin to flow into MongoDB Atlas. If you don’t see anything, here are some troubleshooting tips:
  • Try connecting to your MongoDB Atlas cluster using the mongosh tool from the server running the docker container.
  • View the docker logs for the Kafka Connect service.
  • Verify that you can connect to the CosmosDB instance using the Azure CLI from the server running the docker container.
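The first two tips above can be sketched as shell commands. Treat this as a starting point rather than a definitive procedure: connect_errors is a hypothetical helper, the service name "connect" is an assumption about how docker-compose.yml names the Kafka Connect service, and the connection string is a placeholder.

```shell
# Sketch: surface only the error lines from the Kafka Connect container logs.
# connect_errors is a hypothetical helper; it filters log text on stdin.
connect_errors() {
  grep -E 'ERROR|Exception' || echo "no errors found in the supplied log text"
}

# Usage. The service name "connect" is an assumption; use the name that
# docker-compose.yml gives the Kafka Connect service:
#   docker-compose logs connect 2>&1 | connect_errors
#
# Connectivity check against Atlas from the same host (placeholder URI):
#   mongosh "<MongoDB Atlas Connection String>" --eval 'db.runCommand({ ping: 1 })'
```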

Summary

In this post, we explored how to move data from CosmosDB to MongoDB using Apache Kafka. If you’d like to explore this method and other ways to migrate data, check out the five-part blog post on CosmosDB migration by PeerIslands, the 2021 MongoDB partner of the year award winner.
