$lookup (aggregation)

On this page

Definition
Compatibility
Syntax
Consideration
Examples

This version of the documentation is archived and no longer supported. View the current documentation to learn how to upgrade your version of MongoDB.

Definition

$lookup: New in version 3.2.
Performs a left outer join to an unsharded collection in the same database to filter in documents from the "joined" collection for processing. To each input document, the $lookup stage adds a new array field whose elements are the matching documents from the "joined" collection. The $lookup stage passes these reshaped documents to the next stage.

Compatibility

You can use $lookup for deployments hosted in the following environments:

MongoDB Atlas: The fully managed service for MongoDB deployments in the cloud

MongoDB Enterprise: The subscription-based, self-managed version of MongoDB
MongoDB Community: The source-available, free-to-use, and self-managed version of MongoDB

Syntax

The $lookup stage has the following syntaxes:

Equality Match with a Single Join Condition

To perform an equality match between a field from the input documents with a field from the documents of the "joined" collection, the $lookup stage has the following syntax:

{
   $lookup:
     {
       from: <collection to join>,
       localField: <field from the input documents>,
       foreignField: <field from the documents of the "from" collection>,
       as: <output array field>
     }
}

The $lookup takes a document with the following fields:

Field	Description
from	Specifies the collection in the same database to perform the join with. The `from` collection cannot be sharded. For details, see Sharded Collection Restrictions.
localField	Specifies the field from the documents input to the `$lookup` stage. `$lookup` performs an equality match on the `localField` to the `foreignField` from the documents of the `from` collection. If an input document does not contain the `localField`, the `$lookup` treats the field as having a value of `null` for matching purposes.
foreignField	Specifies the field from the documents in the `from` collection. `$lookup` performs an equality match on the `foreignField` to the `localField` from the input documents. If a document in the `from` collection does not contain the `foreignField`, the `$lookup` treats the value as `null` for matching purposes.
as	Specifies the name of the new array field to add to the input documents. The new array field contains the matching documents from the `from` collection. If the specified name already exists in the input document, the existing field is overwritten.

The operation corresponds to this pseudo-SQL statement:

SELECT *, (
   SELECT ARRAY_AGG(*)
   FROM <collection to join>
   WHERE <foreignField> = <collection.localField>
) AS <output array field>
FROM collection;

Note

The SQL statements on this page are included for comparison to the MongoDB aggregation pipeline syntax. The SQL statements aren't runnable.

For MongoDB examples, see these pages:

Perform a Single Equality Join with $lookup
Use $lookup with an Array
Use $lookup with $mergeObjects

Join Conditions and Uncorrelated Sub-queries

MongoDB supports:

To perform uncorrelated subqueries between two collections as well as allow other join conditions besides a single equality match, the $lookup stage has the following syntax:

{
   $lookup:
     {
       from: <collection to join>,
       let: { <var_1>: <expression>, …, <var_n>: <expression> },
       pipeline: [ <pipeline to execute on the collection to join> ],
       as: <output array field>
     }
}

The $lookup takes a document with the following fields:

Field	Description
from	Specifies the collection in the same database to perform the join with. The `from` collection cannot be sharded. For details, see Sharded Collection Restrictions.
let	Optional. Specifies variables to use in the pipeline field stages. Use the variable expressions to access the fields from the documents input to the `$lookup` stage. The pipeline cannot directly access the input document fields. Instead, first define the variables for the input document fields, and then reference the variables in the stages in the `pipeline`. Note To reference variables in pipeline stages, use the `"$$<variable>"` syntax. The let variables can be accessed by the stages in the pipeline, including additional `$lookup` stages nested in the `pipeline`. A `$match` stage requires the use of an `$expr` operator to access the variables. `$expr` allows the use of aggregation expressions inside of the `$match` syntax. Without the use of the `$expr` operator, `$match` can refer to fields in a document but cannot access variables defined by a `$lookup` let clause. The `$expr` operator only uses indexes on the `from` collection for equality matches. Non-equality match queries, such as range queries, cannot use indexes on the `from` collection. Other (non-`$match`) stages in the pipeline do not require an `$expr` operator to access the variables.
pipeline	Specifies the pipeline to run on the joined collection. The `pipeline` determines the resulting documents from the joined collection. To return all documents, specify an empty pipeline `[]`. The `pipeline` cannot include the `$out` stage or the `$merge` stage. The `pipeline` cannot directly access the input document fields. Instead, first define the variables for the input document fields, and then reference the variables in the stages in the `pipeline`. Note To reference variables in pipeline stages, use the `"$$<variable>"` syntax. The let variables can be accessed by the stages in the pipeline, including additional `$lookup` stages nested in the `pipeline`. A `$match` stage requires the use of an `$expr` operator to access the variables. `$expr` allows the use of aggregation expressions inside of the `$match` syntax. Without the use of the `$expr` operator, `$match` can refer to fields in a document but cannot access variables defined by a `$lookup` let clause. The `$expr` operator only uses indexes on the `from` collection for equality matches. Non-equality match queries, such as range queries, cannot use indexes on the `from` collection. Other (non-`$match`) stages in the pipeline do not require an `$expr` operator to access the variables.
as	Specifies the name of the new array field to add to the input documents. The new array field contains the matching documents from the `from` collection. If the specified name already exists in the input document, the existing field is overwritten.

The operation would correspond to the following pseudo-SQL statement:

SELECT *, <output array field>
FROM collection
WHERE <output array field> IN (SELECT <documents as determined from the pipeline>
                               FROM <collection to join>
                               WHERE <pipeline> );

See the following examples:

Specify Multiple Join Conditions with $lookup
Uncorrelated Subquery

Consideration

Views and Collation

If performing an aggregation that involves multiple views, such as with $lookup or $graphLookup, the views must have the same collation.

Restrictions

Changed in version 4.2: You cannot include the $out or the $merge stage in the $lookup stage. That is, when specifying a pipeline for the joined collection, you cannot include either stage in the pipeline field.

{
  $lookup:
    {
       from: <collection to join>,
       let: { <var_1>: <expression>, …, <var_n>: <expression> },
       pipeline: [ <pipeline to execute on the joined collection> ],  // Cannot include $out or $merge
       as: <output array field>
    }
}

Sharded Collection Restrictions

In the $lookup stage, the from collection cannot be sharded. However, the collection on which you run the aggregate() method can be sharded. That is, in the following:

db.collection.aggregate([
   { $lookup: { from: "fromCollection", ... } }
])

The collection can be sharded.
The fromCollection cannot be sharded.

As such, to join a sharded collection with an unsharded collection, you can run the aggregation on the sharded collection and lookup the unsharded collection; e.g.:

db.shardedCollection.aggregate([
   { $lookup: { from: "unshardedCollection", ... } }
])

Alternatively, or to join multiple sharded collections, consider:

Modifying client applications to perform manual lookups instead of using the $lookup aggregation stage.
If possible, using an embedded data model that removes the need to join collections.
If possible, using an Atlas Data Lake $lookup pipeline stage to lookup a sharded collection.

Performance Considerations

$lookup performance depends on the type of operation performed. Refer to the following table for performance considerations for different $lookup operations.

`$lookup` Operation	Performance Considerations
Equality Match with a Single Join	`$lookup` operations that perform equality matches with a single join perform better when the foreign collection contains an index on the `foreignField`. Important If a supporting index on the `foreignField` does not exist, a `$lookup` operation that performs an equality match with a single join will likely have poor performance.
Uncorrelated Subqueries	`$lookup` operations that contain uncorrelated subqueries perform better when the inner pipeline can reference an index of the foreign collection. MongoDB only needs to run the `$lookup` subquery once before caching the query because there is no relationship between the source and foreign collections. The `$lookup` subquery is not based on any value in the source collection. This behavior improves performance for subsequent executions of this query.

$lookup Operation

Performance Considerations

Equality Match with a Single Join

$lookup operations that perform equality matches with a single join perform better when the foreign collection contains an index on the foreignField.
Important
If a supporting index on the foreignField does not exist, a $lookup operation that performs an equality match with a single join will likely have poor performance.

Uncorrelated Subqueries

$lookup operations that contain uncorrelated subqueries perform better when the inner pipeline can reference an index of the foreign collection.
MongoDB only needs to run the $lookup subquery once before caching the query because there is no relationship between the source and foreign collections. The $lookup subquery is not based on any value in the source collection. This behavior improves performance for subsequent executions of this query.

For general performance strategies, see Indexing Strategies and Query Optimization.

Important

Excessive use of $lookup within a query may slow down performance. To avoid multiple $lookup stages, consider an embedded data model to optimize query performance.

Examples

Perform a Single Equality Join with `$lookup`

Create a collection orders with the following documents:

db.orders.insert([
   { "_id" : 1, "item" : "almonds", "price" : 12, "quantity" : 2 },
   { "_id" : 2, "item" : "pecans", "price" : 20, "quantity" : 1 },
   { "_id" : 3  }
])

Create another collection inventory with the following documents:

db.inventory.insert([
   { "_id" : 1, "sku" : "almonds", "description": "product 1", "instock" : 120 },
   { "_id" : 2, "sku" : "bread", "description": "product 2", "instock" : 80 },
   { "_id" : 3, "sku" : "cashews", "description": "product 3", "instock" : 60 },
   { "_id" : 4, "sku" : "pecans", "description": "product 4", "instock" : 70 },
   { "_id" : 5, "sku": null, "description": "Incomplete" },
   { "_id" : 6 }
])

The following aggregation operation on the orders collection joins the documents from orders with the documents from the inventory collection using the fields item from the orders collection and the sku field from the inventory collection:

db.orders.aggregate([
   {
     $lookup:
       {
         from: "inventory",
         localField: "item",
         foreignField: "sku",
         as: "inventory_docs"
       }
  }
])

The operation returns the following documents:

{
   "_id" : 1,
   "item" : "almonds",
   "price" : 12,
   "quantity" : 2,
   "inventory_docs" : [
      { "_id" : 1, "sku" : "almonds", "description" : "product 1", "instock" : 120 }
   ]
}
{
   "_id" : 2,
   "item" : "pecans",
   "price" : 20,
   "quantity" : 1,
   "inventory_docs" : [
      { "_id" : 4, "sku" : "pecans", "description" : "product 4", "instock" : 70 }
   ]
}
{
   "_id" : 3,
   "inventory_docs" : [
      { "_id" : 5, "sku" : null, "description" : "Incomplete" },
      { "_id" : 6 }
   ]
}

The operation would correspond to the following pseudo-SQL statement:

SELECT *, inventory_docs
FROM orders
WHERE inventory_docs IN (SELECT *
FROM inventory
WHERE sku= orders.item);

For more information, see Equality Match Performance Considerations.

Use `$lookup` with an Array

Starting MongoDB 3.4, if the localField is an array, you can match the array elements against a scalar foreignField without needing an $unwind stage.

For example, create an example collection classes with the following document:

db.classes.insert( [
   { _id: 1, title: "Reading is ...", enrollmentlist: [ "giraffe2", "pandabear", "artie" ], days: ["M", "W", "F"] },
   { _id: 2, title: "But Writing ...", enrollmentlist: [ "giraffe1", "artie" ], days: ["T", "F"] }
])

Create another collection members with the following documents:

db.members.insert( [
   { _id: 1, name: "artie", joined: new Date("2016-05-01"), status: "A" },
   { _id: 2, name: "giraffe", joined: new Date("2017-05-01"), status: "D" },
   { _id: 3, name: "giraffe1", joined: new Date("2017-10-01"), status: "A" },
   { _id: 4, name: "panda", joined: new Date("2018-10-11"), status: "A" },
   { _id: 5, name: "pandabear", joined: new Date("2018-12-01"), status: "A" },
   { _id: 6, name: "giraffe2", joined: new Date("2018-12-01"), status: "D" }
])

The following aggregation operation joins documents in the classes collection with the members collection, matching on the enrollmentlist field to the name field:

db.classes.aggregate([
   {
      $lookup:
         {
            from: "members",
            localField: "enrollmentlist",
            foreignField: "name",
            as: "enrollee_info"
        }
   }
])

The operation returns the following:

{
   "_id" : 1,
   "title" : "Reading is ...",
   "enrollmentlist" : [ "giraffe2", "pandabear", "artie" ],
   "days" : [ "M", "W", "F" ],
   "enrollee_info" : [
      { "_id" : 1, "name" : "artie", "joined" : ISODate("2016-05-01T00:00:00Z"), "status" : "A" },
      { "_id" : 5, "name" : "pandabear", "joined" : ISODate("2018-12-01T00:00:00Z"), "status" : "A" },
      { "_id" : 6, "name" : "giraffe2", "joined" : ISODate("2018-12-01T00:00:00Z"), "status" : "D" }
   ]
}
{
   "_id" : 2,
   "title" : "But Writing ...",
   "enrollmentlist" : [ "giraffe1", "artie" ],
   "days" : [ "T", "F" ],
   "enrollee_info" : [
      { "_id" : 1, "name" : "artie", "joined" : ISODate("2016-05-01T00:00:00Z"), "status" : "A" },
      { "_id" : 3, "name" : "giraffe1", "joined" : ISODate("2017-10-01T00:00:00Z"), "status" : "A" }
   ]
}

Use `$lookup` with `$mergeObjects`

Changed in version 3.6: MongoDB 3.6 adds the $mergeObjects operator to combine multiple documents into a single document

Create a collection orders with the following documents:

db.orders.insert([
   { "_id" : 1, "item" : "almonds", "price" : 12, "quantity" : 2 },
   { "_id" : 2, "item" : "pecans", "price" : 20, "quantity" : 1 }
])

Create another collection items with the following documents:

db.items.insert([
  { "_id" : 1, "item" : "almonds", description: "almond clusters", "instock" : 120 },
  { "_id" : 2, "item" : "bread", description: "raisin and nut bread", "instock" : 80 },
  { "_id" : 3, "item" : "pecans", description: "candied pecans", "instock" : 60 }
])

The following operation first uses the $lookup stage to join the two collections by the item fields and then uses $mergeObjects in the $replaceRoot to merge the joined documents from items and orders:

db.orders.aggregate([
   {
      $lookup: {
         from: "items",
         localField: "item",    // field in the orders collection
         foreignField: "item",  // field in the items collection
         as: "fromItems"
      }
   },
   {
      $replaceRoot: { newRoot: { $mergeObjects: [ { $arrayElemAt: [ "$fromItems", 0 ] }, "$$ROOT" ] } }
   },
   { $project: { fromItems: 0 } }
])

The operation returns the following documents:

{ "_id" : 1, "item" : "almonds", "description" : "almond clusters", "instock" : 120, "price" : 12, "quantity" : 2 }
{ "_id" : 2, "item" : "pecans", "description" : "candied pecans", "instock" : 60, "price" : 20, "quantity" : 1 }

Specify Multiple Join Conditions with `$lookup`

Create a collection orders with the following documents:

db.orders.insert([
  { "_id" : 1, "item" : "almonds", "price" : 12, "ordered" : 2 },
  { "_id" : 2, "item" : "pecans", "price" : 20, "ordered" : 1 },
  { "_id" : 3, "item" : "cookies", "price" : 10, "ordered" : 60 }
])

Create another collection warehouses with the following documents:

db.warehouses.insert([
  { "_id" : 1, "stock_item" : "almonds", warehouse: "A", "instock" : 120 },
  { "_id" : 2, "stock_item" : "pecans", warehouse: "A", "instock" : 80 },
  { "_id" : 3, "stock_item" : "almonds", warehouse: "B", "instock" : 60 },
  { "_id" : 4, "stock_item" : "cookies", warehouse: "B", "instock" : 40 },
  { "_id" : 5, "stock_item" : "cookies", warehouse: "A", "instock" : 80 }
])

The following operation joins the orders collection with the warehouse collection by the item and whether the quantity in stock is sufficient to cover the ordered quantity:

db.orders.aggregate([
   {
      $lookup:
         {
           from: "warehouses",
           let: { order_item: "$item", order_qty: "$ordered" },
           pipeline: [
              { $match:
                 { $expr:
                    { $and:
                       [
                         { $eq: [ "$stock_item",  "$$order_item" ] },
                         { $gte: [ "$instock", "$$order_qty" ] }
                       ]
                    }
                 }
              },
              { $project: { stock_item: 0, _id: 0 } }
           ],
           as: "stockdata"
         }
    }
])

The operation returns the following documents:

{ "_id" : 1, "item" : "almonds", "price" : 12, "ordered" : 2,
   "stockdata" : [ { "warehouse" : "A", "instock" : 120 },
   { "warehouse" : "B", "instock" : 60 } ] }
{ "_id" : 2, "item" : "pecans", "price" : 20, "ordered" : 1,
   "stockdata" : [ { "warehouse" : "A", "instock" : 80 } ] }
{ "_id" : 3, "item" : "cookies", "price" : 10, "ordered" : 60,
   "stockdata" : [ { "warehouse" : "A", "instock" : 80 } ] }

The operation corresponds to the following pseudo-SQL statement:

SELECT *, stockdata
FROM orders
WHERE stockdata IN ( SELECT warehouse, instock
                     FROM warehouses
                     WHERE stock_item = orders.item
                     AND instock >= orders.ordered );

The $expr operator only uses indexes on the from collection for equality matches. For example, if the index { stock_item: 1, instock: 1 } exists on the warehouses collection:

The equality match on the warehouses.stock_item field uses the index.
The range part of the query on the warehouses.instock field does not use the indexed field in the compound index.

Tip

Uncorrelated Subquery

Changed in version 3.6: MongoDB 3.6 adds support for executing a pipeline on the joined collection, which allows for specifying multiple join conditions as well as uncorrelated sub-queries.

Create a collection absences with the following documents:

db.absences.insert([
   { "_id" : 1, "student" : "Ann Aardvark", sickdays: [ new Date ("2018-05-01"),new Date ("2018-08-23") ] },
   { "_id" : 2, "student" : "Zoe Zebra", sickdays: [ new Date ("2018-02-01"),new Date ("2018-05-23") ] },
])

Create another collection holidays with the following documents:

db.holidays.insert([
   { "_id" : 1, year: 2018, name: "New Years", date: new Date("2018-01-01") },
   { "_id" : 2, year: 2018, name: "Pi Day", date: new Date("2018-03-14") },
   { "_id" : 3, year: 2018, name: "Ice Cream Day", date: new Date("2018-07-15") },
   { "_id" : 4, year: 2017, name: "New Years", date: new Date("2017-01-01") },
   { "_id" : 5, year: 2017, name: "Ice Cream Day", date: new Date("2017-07-16") }
])

The following operation joins the absences collection with 2018 holiday information from the holidays collection:

db.absences.aggregate([
   {
      $lookup:
         {
           from: "holidays",
           pipeline: [
              { $match: { year: 2018 } },
              { $project: { _id: 0, date: { name: "$name", date: "$date" } } },
              { $replaceRoot: { newRoot: "$date" } }
           ],
           as: "holidays"
         }
    }
])

The operation returns the following:

{ "_id" : 1, "student" : "Ann Aardvark", "sickdays" : [ ISODate("2018-05-01T00:00:00Z"), ISODate("2018-08-23T00:00:00Z") ],
    "holidays" : [ { "name" : "New Years", "date" : ISODate("2018-01-01T00:00:00Z") }, { "name" : "Pi Day", "date" : ISODate("2018-03-14T00:00:00Z") }, { "name" : "Ice Cream Day", "date" : ISODate("2018-07-15T00:00:00Z") } ] }
{ "_id" : 2, "student" : "Zoe Zebra", "sickdays" : [ ISODate("2018-02-01T00:00:00Z"), ISODate("2018-05-23T00:00:00Z") ],
    "holidays" : [ { "name" : "New Years", "date" : ISODate("2018-01-01T00:00:00Z") }, { "name" : "Pi Day", "date" : ISODate("2018-03-14T00:00:00Z") }, { "name" : "Ice Cream Day", "date" : ISODate("2018-07-15T00:00:00Z") } ] }

The operation would correspond to the following pseudo-SQL statement:

SELECT *, holidays
FROM absences
WHERE holidays IN (SELECT name, date
                    FROM holidays
                    WHERE year = 2018);

Back

$listSessions

$match (aggregation)

$lookup (aggregation)

Definition

Compatibility

Syntax

Equality Match with a Single Join Condition

Note

Join Conditions and Uncorrelated Sub-queries

Note

Note

Consideration

Views and Collation

Restrictions

Sharded Collection Restrictions

Performance Considerations

Important

Important

Examples

Perform a Single Equality Join with $lookup

Use $lookup with an Array

Use $lookup with $mergeObjects

Specify Multiple Join Conditions with $lookup

Tip

See also:

Uncorrelated Subquery

Perform a Single Equality Join with `$lookup`

Use `$lookup` with an Array

Use `$lookup` with `$mergeObjects`

Specify Multiple Join Conditions with `$lookup`