Docs Menu
Docs Home
/ / /
PyMongo
/ /

One-to-One Join

On this page

  • Introduction
  • Aggregation Task Summary
  • Before You Get Started
  • Tutorial
  • Add a match stage for orders in 2020
  • Add a lookup stage to link the collections
  • Add set stages to create new document fields
  • Add an unset stage to remove unneeded fields
  • Run the aggregation pipeline
  • Interpret results

In this tutorial, you can learn how to use PyMongo to construct an aggregation pipeline, perform the aggregation on a collection, and print the results by completing and running a sample app.

This aggregation performs a one-to-one join. A one-to-one join occurs when a document in one collection has a field value that matches a single document in another collection that has the same field value. The aggregation matches these documents on the field value and combines information from both sources into one result.

Tip

A one-to-one join does not require the documents to have a one-to-one relationship. To learn more about this data relationship, see the Wikipedia entry about One-to-one (data model).

This tutorial demonstrates how to combine data from a collection that describes product information with another collection that describes customer orders. The results show a list of all orders placed in 2020 that includes the product details associated with each order.

This example uses two collections:

  • orders: contains documents describing individual orders for products in a shop

  • products: contains documents describing the products that a shop sells

An order can only contain one product, so the aggregation uses a one-to-one join to match an order document to the document for the product. The collections are joined by a field called product_id that exists in documents in both collections.

Before you start this tutorial, complete the Aggregation Template App instructions to set up a working Python application.

After you set up the app, access the orders and products collections by adding the following code to the application:

orders_coll = agg_db["orders"]
products_coll = agg_db["products"]

Delete any existing data and insert sample data into the orders collection as shown in the following code:

orders_coll.delete_many({})
order_data = [
{
"customer_id": "elise_smith@myemail.com",
"orderdate": datetime(2020, 5, 30, 8, 35, 52),
"product_id": "a1b2c3d4",
"value": 431.43
},
{
"customer_id": "tj@wheresmyemail.com",
"orderdate": datetime(2019, 5, 28, 19, 13, 32),
"product_id": "z9y8x7w6",
"value": 5.01
},
{
"customer_id": "oranieri@warmmail.com",
"orderdate": datetime(2020, 1, 1, 8, 25, 37),
"product_id": "ff11gg22hh33",
"value": 63.13
},
{
"customer_id": "jjones@tepidmail.com",
"orderdate": datetime(2020, 12, 26, 8, 55, 46),
"product_id": "a1b2c3d4",
"value": 429.65
}
]
orders_coll.insert_many(order_data)

Delete any existing data and insert sample data into the products collection as shown in the following code:

products_coll.delete_many({})
product_data = [
{
"id": "a1b2c3d4",
"name": "Asus Laptop",
"category": "ELECTRONICS",
"description": "Good value laptop for students"
},
{
"id": "z9y8x7w6",
"name": "The Day Of The Triffids",
"category": "BOOKS",
"description": "Classic post-apocalyptic novel"
},
{
"id": "ff11gg22hh33",
"name": "Morphy Richardds Food Mixer",
"category": "KITCHENWARE",
"description": "Luxury mixer turning good cakes into great"
},
{
"id": "pqr678st",
"name": "Karcher Hose Set",
"category": "GARDEN",
"description": "Hose + nosels + winder for tidy storage"
}
]
products_coll.insert_many(product_data)
1

Add a $match stage that matches orders placed in 2020:

pipeline.append({
"$match": {
"orderdate": {
"$gte": datetime(2020, 1, 1, 0, 0, 0),
"$lt": datetime(2021, 1, 1, 0, 0, 0)
}
}
})
2

Next, add a $lookup stage. The $lookup stage joins the product_id field in the orders collection to the id field in the products collection:

pipeline.append({
"$lookup": {
"from": "products",
"localField": "product_id",
"foreignField": "id",
"as": "product_mapping"
}
})
3

Next, add two $set stages to the pipeline.

The first $set stage sets the product_mapping field to the first element in the product_mapping object created in the previous $lookup stage.

The second $set stage creates two new fields, product_name and product_category, from the values in the product_mapping object field:

pipeline.extend([
{
"$set": {
"product_mapping": {"$first": "$product_mapping"}
}
},
{
"$set": {
"product_name": "$product_mapping.name",
"product_category": "$product_mapping.category"
}
}
])

Tip

Because this is a one-to-one join, the $lookup stage adds only one array element to the input document. The pipeline uses the $first operator to retrieve the data from this element.

4

Finally, add an $unset stage. The $unset stage removes unnecessary fields from the document:

pipeline.append({"$unset": ["_id", "product_id", "product_mapping"]})
5

Add the following code to the end of your application to perform the aggregation on the orders collection:

aggregation_result = orders_coll.aggregate(pipeline)

Finally, run the following command in your shell to start your application:

python3 agg_tutorial.py
6

The aggregated result contains three documents. The documents represent customer orders that occurred in 2020, with the product_name and product_category of the ordered product:

{
'customer_id': 'elise_smith@myemail.com',
'orderdate': datetime.datetime(2020, 5, 30, 8, 35, 52),
'value': 431.43,
'product_name': 'Asus Laptop',
'product_category': 'ELECTRONICS'
}
{
'customer_id': 'oranieri@warmmail.com',
'orderdate': datetime.datetime(2020, 1, 1, 8, 25, 37),
'value': 63.13,
'product_name': 'Morphy Richardds Food Mixer',
'product_category': 'KITCHENWARE'
}
{
'customer_id': 'jjones@tepidmail.com',
'orderdate': datetime.datetime(2020, 12, 26, 8, 55, 46),
'value': 429.65,
'product_name': 'Asus Laptop',
'product_category': 'ELECTRONICS'
}

The result consists of documents that contain fields from documents in the orders collection and the products collection, joined by matching the product_id field present in each original document.

To view the complete code for this tutorial, see the Completed One-to-one Join App on GitHub.

Back

Unpack Arrays and Group

Next

Multi-Field Join