How to Use PyMongo to Connect MongoDB Atlas with AWS Lambda
Anaiya Raisinghani • 6 min read • Published Apr 02, 2024 • Updated Apr 02, 2024
Picture a developer’s paradise: a world where instead of fussing over hardware complexities, we are free to focus entirely on running and executing our applications. With the combination of AWS Lambda and MongoDB Atlas, this vision becomes a reality.
Armed with AWS Lambda’s pay-per-execution structure and MongoDB Atlas’ unparalleled scalability, developers will truly understand what it means for their applications to thrive without the hardware limitations they might be used to.
This tutorial will take you through how to properly set up an Atlas cluster, connect it to AWS Lambda using MongoDB’s Python Driver, write an aggregation pipeline on our data, and return our wanted information. Let’s get started.
- AWS Account; Lambda access is necessary
- GitHub repository
- Python 3.8+
Our first step is to create an Atlas cluster. Log into the Atlas UI and follow the steps to set it up. For this tutorial, the free tier is recommended, but any tier will work!
Please ensure that the cloud provider picked is AWS. It’s also necessary to pick a secure username and password so that we will have the proper authorization later on in this tutorial, along with proper IP address access.
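Atlas connection strings embed the database username and password, so characters such as `@`, `:`, or `/` in a password need to be percent-encoded before they go into the URI. Here is a minimal sketch of that step. The helper name `build_atlas_uri` and the example host are assumptions for illustration; your real host comes from the connection string shown in the Atlas UI.

```python
from urllib.parse import quote_plus


def build_atlas_uri(username, password, host):
    """Return a mongodb+srv URI with percent-encoded credentials.

    `host` is a placeholder such as "cluster0.example.mongodb.net";
    copy the real value from your Atlas connection string.
    """
    user = quote_plus(username)
    pwd = quote_plus(password)
    return f"mongodb+srv://{user}:{pwd}@{host}/?retryWrites=true&w=majority"
```

For example, `build_atlas_uri("demo_user", "p@ss/word", "cluster0.example.mongodb.net")` encodes the `@` and `/` in the password so they are not mistaken for URI delimiters.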
Once your cluster is up and running, click the ellipses next to the Browse Collections button and download the sample dataset. Your finished cluster will look like this:
Once our cluster is provisioned, let’s set up our AWS Lambda function.
Sign into your AWS account and search for “Lambda” in the search bar. Hit the orange “Create function” button at the top right side of the screen, and you’ll be taken to the screen shown below. Here, make sure to first select the “Author from scratch” option. Then, we want to select a name for our function (AWSLambdaDemo), the runtime (Python 3.8), and our architecture (x86_64).
Hit the orange “Create function” button on the bottom right to continue. Once your function is created, you’ll see a page with your function overview above and your code source right below.
Now, we are ready to set up our connection from AWS Lambda to our MongoDB cluster.
To make things easier for ourselves, because we are going to be using PyMongo, a dependency, we will be working in Visual Studio Code instead of editing directly in the code source. AWS Lambda has a limited number of pre-installed libraries and dependencies, so in order to get around this and incorporate PyMongo, we will need to package our code in a special way. Due to this “workaround,” this will not be a typical tutorial with testing at every step. Instead of using a typical requirements.txt file, we will first have to download our dependencies and upload our code to Lambda before we can check that our code works. More on that below.
Now we are ready to establish a connection between AWS Lambda and our MongoDB cluster!
Create a new directory on your local machine and name it awslambda-demo.
Let’s install pymongo. As said above, Lambda doesn’t have every library available. So, we need to download pymongo at the root of our project. We can do it by working with .zip file archives:
In the terminal, enter our awslambda-demo directory:

    cd awslambda-demo

Create a new directory where your dependencies will live:

    mkdir dependencies

Install pymongo directly in your dependencies directory:

    pip install --target ./dependencies pymongo

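Before zipping anything, it can help to confirm that pip actually placed the package in the target directory. Here is a small sketch of such a check. The helper name `has_package` is hypothetical, and the directory name `dependencies` is the one used in the steps above.

```python
from pathlib import Path


def has_package(target_dir, package_name):
    """Return True if `package_name` exists as a package directory
    (one containing __init__.py) directly under `target_dir`."""
    return (Path(target_dir) / package_name / "__init__.py").is_file()
```

Running `has_package("dependencies", "pymongo")` from the awslambda-demo directory should return `True` once the install has finished.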
Open Visual Studio Code, open the awslambda-demo directory, and create a new Python file named lambda_function.py. This is where the heart of our connection will be.
Insert the code below in our lambda_function.py. Here, we are setting up our code to check that we are able to connect to our Atlas cluster. Please keep in mind that since we are incorporating our environment variables in a later step, you will not be able to connect just yet. We have copied the lambda_handler definition from our Lambda code source and have edited it to insert one document stating my full name into a new “test” database and “test” collection. It is best practice to create our MongoClient outside of our lambda_handler because establishing a connection and performing authentication is relatively expensive, and Lambda will re-use this instance.

    import os
    from pymongo import MongoClient


    client = MongoClient(host=os.environ.get("ATLAS_URI"))


    def lambda_handler(event, context):
        # Name of database
        db = client.test

        # Name of collection
        collection = db.test

        # Document to add inside
        document = {"first name": "Anaiya", "last name": "Raisinghani"}

        # Insert document
        result = collection.insert_one(document)

        if result.inserted_id:
            return "Document inserted successfully"
        else:
            return "Failed to insert document"

If this is properly inserted in AWS Lambda, we will see “Document inserted successfully” and in MongoDB Atlas, we will see the creation of our “test” database and collection along with the single document holding the name “Anaiya Raisinghani.” Please keep in mind we will not be seeing this yet since we haven’t configured our environment variables and will be doing this a couple steps down.
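The reuse of the client described above can be illustrated without a live cluster. The sketch below is a stand-in rather than the tutorial's code: `get_client` and its `factory` parameter are hypothetical names, and in a real function the factory would be `MongoClient`. It shows that a module-level object is created once and then handed back on every later call, which mirrors what happens across invocations while Lambda keeps the same execution environment warm.

```python
import os

_client = None  # module-level cache, kept while the execution environment stays warm


def get_client(factory):
    """Create the client on first use, then return the cached instance.

    `factory` is any callable that takes a connection string; in the
    tutorial's code this role is played by pymongo.MongoClient.
    """
    global _client
    if _client is None:
        _client = factory(os.environ.get("ATLAS_URI"))
    return _client
```

Calling `get_client(MongoClient)` inside the handler would behave much like the module-level `client` above, with the difference that the client is only created when the first invocation needs it.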
Now, we need to create a .zip file, so we can upload it in our Lambda function and execute our code. Create a .zip file at the root:

    cd dependencies
    zip -r ../deployment.zip *

This creates a deployment.zip file in your project directory.
Now, we need to add our lambda_function.py file to the root of our .zip file:

    cd ..
    zip deployment.zip lambda_function.py

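For readers without a `zip` command available (for example, on some Windows setups), a similar archive can be produced with Python's standard library. This is a sketch rather than the method used in this tutorial. `build_deployment_zip` is a hypothetical helper, and it assumes the `dependencies` directory and `lambda_function.py` created in the earlier steps.

```python
import zipfile
from pathlib import Path


def build_deployment_zip(project_dir, zip_name="deployment.zip"):
    """Zip the contents of dependencies/ and lambda_function.py at the archive root."""
    project = Path(project_dir)
    deps = project / "dependencies"
    zip_path = project / zip_name
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        # Store each dependency file relative to dependencies/, so that
        # pymongo/ ends up at the top level of the archive.
        for path in sorted(deps.rglob("*")):
            if path.is_file():
                archive.write(path, path.relative_to(deps).as_posix())
        # The handler module also goes at the top level.
        archive.write(project / "lambda_function.py", "lambda_function.py")
    return zip_path
```

Calling `build_deployment_zip(".")` from inside awslambda-demo writes deployment.zip next to lambda_function.py, with the same layout the two `zip` commands produce.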
Once you have your .zip file, access your AWS Lambda function screen, click the “Upload from” button, and select “.zip file” on the right hand side of the page:
Upload your .zip file and you should see the code from your lambda_function.py in your “Code Source”:
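If the code does not appear, or the function later fails with an import error, one thing worth checking is the archive layout: both `lambda_function.py` and the `pymongo` package should sit at the top level of the .zip, not inside a `dependencies/` folder. Here is a hedged sketch of that check, using a hypothetical helper named `missing_from_root`:

```python
import zipfile


def missing_from_root(zip_path, required=("lambda_function.py", "pymongo/")):
    """Return the required entries that are absent from the archive root.

    An entry ending in "/" is treated as a top-level directory and matches
    any archived path that starts with it.
    """
    with zipfile.ZipFile(zip_path) as archive:
        names = archive.namelist()
    missing = []
    for entry in required:
        if entry.endswith("/"):
            found = any(name.startswith(entry) for name in names)
        else:
            found = entry in names
        if not found:
            missing.append(entry)
    return missing
```

An empty list from `missing_from_root("deployment.zip")` means both pieces are where the steps above put them.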
Let’s configure our environment variables. Select the “Configuration” tab and then select the “Environment Variables” tab. Here, put in your “ATLAS_URI” string. To access your connection string, please follow the instructions in our docs.
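Because `os.environ.get` returns `None` when a variable is missing, a function deployed without ATLAS_URI does not fail when the module loads. The client is created without your cluster's address, and the problem only surfaces later, in a less obvious form. One option is to check for the variable up front. Here is a minimal sketch, assuming a hypothetical helper named `require_env`:

```python
import os


def require_env(name, environ=None):
    """Return the value of environment variable `name`, or raise a clear error."""
    env = os.environ if environ is None else environ
    value = env.get(name)
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value
```

In lambda_function.py, this would replace `os.environ.get("ATLAS_URI")` with `require_env("ATLAS_URI")`, so a missing variable shows up as an explicit error in the execution results.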
Once you have your Environment Variables in place, we are ready to run our code and see if our connection works. Hit the “Test” button. If it’s the first time you’re hitting it, you’ll need to name your event. Keep everything else on the default settings. You should see this page with our “Execution results.” Our document has been inserted!
When we double-check in Atlas, we can see that our new database “test” and collection “test” have been created, along with our document with “Anaiya Raisinghani.”
This means our connection works and we are capable of inserting documents from AWS Lambda to our MongoDB cluster. Now, we can take things a step further and input a simple aggregation pipeline!
For our pipeline, let’s change our code to connect to our sample_restaurants database and restaurants collection. We are going to be incorporating our aggregation pipeline to find a sample size of five American cuisine restaurants that are located in Brooklyn, New York. Let’s dive right in!
Since we have our pymongo dependency downloaded, we can directly incorporate our aggregation pipeline into our code source. Change your lambda_function.py to look like this:

    import os
    from pymongo import MongoClient

    connect = MongoClient(host=os.environ.get("ATLAS_URI"))


    def lambda_handler(event, context):
        # Choose our "sample_restaurants" database and our "restaurants" collection
        database = connect.sample_restaurants
        collection = database.restaurants

        # This is our aggregation pipeline
        pipeline = [
            # We are finding American restaurants in Brooklyn
            {"$match": {"borough": "Brooklyn", "cuisine": "American"}},

            # We only want 5 out of our 20k+ documents
            {"$limit": 5},

            # We don't want all the details, project what you need
            {"$project": {"_id": 0, "name": 1, "borough": 1, "cuisine": 1}},
        ]

        # Run our pipeline and collect the results
        result = list(collection.aggregate(pipeline))

        # Print the result
        for restaurant in result:
            print(restaurant)

Here, we are using $match to find all the American cuisine restaurants located in Brooklyn. We are then using $limit to keep only five documents out of our collection. Next, we are using $project to only show the fields we want. We are going to include “borough”, “cuisine”, and the “name” of the restaurant. Then, we are executing our pipeline and printing out our results.
Click on “Deploy” to ensure our changes have been deployed to the code environment. After the changes are deployed, hit “Test.” We will get a sample size of five Brooklyn American restaurants as the result in our console:
Our aggregation pipeline was successful!
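To make the three stages concrete, here is a plain-Python imitation of what they do to a list of documents. It is only an illustration. The helper `run_stages` is hypothetical, the sample documents are made up rather than taken from sample_restaurants, and in the real function the server does this work when `collection.aggregate` runs.

```python
def run_stages(documents, match, limit, fields):
    """Imitate $match (equality only), $limit, and an inclusion-only $project."""
    # $match: keep documents whose fields equal every value in `match`.
    matched = [
        doc for doc in documents
        if all(doc.get(key) == value for key, value in match.items())
    ]
    # $limit: keep at most `limit` documents.
    limited = matched[:limit]
    # $project: keep only the listed fields (like {"_id": 0, "name": 1, ...}).
    return [{key: doc[key] for key in fields if key in doc} for doc in limited]


# Made-up documents for illustration only.
docs = [
    {"_id": 1, "name": "Diner A", "borough": "Brooklyn", "cuisine": "American"},
    {"_id": 2, "name": "Cafe B", "borough": "Queens", "cuisine": "American"},
    {"_id": 3, "name": "Grill C", "borough": "Brooklyn", "cuisine": "American"},
]

result = run_stages(
    docs,
    match={"borough": "Brooklyn", "cuisine": "American"},
    limit=1,
    fields=["name", "borough", "cuisine"],
)
```

With a limit of 1, only the first matching Brooklyn document survives, and its `_id` is dropped because it is not in the list of projected fields.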
This tutorial provided you with hands-on experience connecting a MongoDB Atlas database to AWS Lambda. We also got an inside look at how to write to a cluster from Lambda, how to read back information from an aggregation pipeline, and how to properly configure our dependencies when using Lambda. Hopefully now, you are ready to take advantage of AWS Lambda and MongoDB to create the best applications without worrying about external infrastructure.
If you enjoyed this tutorial and would like to learn more, please check out our MongoDB Developer Center and YouTube channel.