Map Schema Relationships

When you design your schema, consider how your application needs to query and return related data. How you map relationships between data entities affects your application's performance and scalability.

The recommended way to handle related data is to embed it in a sub-document. Embedding related data lets your application query the data it needs with a single read operation and avoid slow $lookup operations.

For some use cases, you can use a reference to point to related data in a separate collection.

To determine if you should embed related data or use references, consider the relative importance of the following goals for your application:

Improve queries on related data
   If your application frequently queries one entity to return data about another entity, embed the data to avoid the need for frequent $lookup operations.

Improve data returned from different entities
   If your application returns data from related entities together, embed the data in a single collection.

Improve update performance
   If your application frequently updates related data, consider storing the data in its own collection and using a reference to access it. When you use a reference, you reduce your application's write workload because you only need to update the data in a single place.
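
As a rough illustration of the update trade-off described above, the following sketch models documents as plain JavaScript objects instead of real collections. The helper names (`updateEmbeddedEmail`, `updateReferencedEmail`), the extra article titles, and the new email address are hypothetical. The sketch only counts how many documents each approach has to modify when an author's email changes.

```javascript
// Embedded design: every article carries its own copy of the author.
const embeddedArticles = [
  { title: "My Favorite Vacation", author: { name: "alice123", email: "alice@mycompany.com" } },
  { title: "Packing Tips", author: { name: "alice123", email: "alice@mycompany.com" } },
  { title: "City Guide", author: { name: "bob456", email: "bob@mycompany.com" } }
];

// Referenced design: articles would store only an authorId,
// so author details live in exactly one place.
const authors = [
  { _id: 987, name: "alice123", email: "alice@mycompany.com" },
  { _id: 988, name: "bob456", email: "bob@mycompany.com" }
];

// Update every embedded copy of the author; return how many documents changed.
function updateEmbeddedEmail(articles, authorName, newEmail) {
  let touched = 0;
  for (const article of articles) {
    if (article.author.name === authorName) {
      article.author.email = newEmail;
      touched += 1;
    }
  }
  return touched;
}

// Update the single author document; return how many documents changed.
function updateReferencedEmail(authorDocs, authorId, newEmail) {
  let touched = 0;
  for (const author of authorDocs) {
    if (author._id === authorId) {
      author.email = newEmail;
      touched += 1;
    }
  }
  return touched;
}

const embeddedWrites = updateEmbeddedEmail(embeddedArticles, "alice123", "alice@example.com");
const referencedWrites = updateReferencedEmail(authors, 987, "alice@example.com");
console.log(embeddedWrites, referencedWrites); // prints: 2 1
```

With embedded copies, the number of writes grows with the number of articles by that author. With a reference, the write count stays at one.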

To learn more about the benefits of embedded data and references, see Embedded Data Versus References.

Mapping relationships is the second step of the schema design process. Before you map relationships, identify your application's workload to determine the data it needs.

1. Identify related data in your schema

Identify the data that your application queries and how entities relate to each other.

Consider the operations you identified from your application's workload in the first step of the schema design process. Note the information these operations write and return, and what information overlaps between multiple operations.

2. Create a schema map for your related data

Your schema map should show related data fields and the type of relationship between those fields (one-to-one, one-to-many, many-to-many).

Your schema map can resemble an entity-relationship model.
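
One possible way to draft a schema map before drawing it is to record it as plain data. The sketch below is not a MongoDB feature. The `schemaMap` structure and the `relationshipsFor` helper are hypothetical, and the relationship types listed for a blog application are illustrative assumptions.

```javascript
// Hypothetical schema map: each entry names two entities and the
// type of relationship between them.
const schemaMap = [
  { from: "authors", to: "articles", type: "one-to-many" },
  { from: "articles", to: "comments", type: "one-to-many" },
  { from: "articles", to: "tags", type: "many-to-many" }
];

// The three relationship types a schema map should distinguish.
const allowedTypes = ["one-to-one", "one-to-many", "many-to-many"];

// Return every relationship that involves the given entity.
function relationshipsFor(map, entity) {
  return map.filter((rel) => rel.from === entity || rel.to === entity);
}

// Collect entries whose type is not one of the three allowed values.
const invalid = schemaMap.filter((rel) => !allowedTypes.includes(rel.type));

const articleRelationships = relationshipsFor(schemaMap, "articles");
for (const rel of articleRelationships) {
  console.log(`${rel.from} -> ${rel.to}: ${rel.type}`);
}
```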

3. Choose whether to embed related data or use references

The decision to embed data or use references depends on your application's common queries. Review the queries you identified in the Identify Application Workload step and use the guidelines mentioned earlier on this page to design your schema to support frequent and critical queries.

Configure your databases, collections, and application logic to match the approach you choose.
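
The guidelines from earlier on this page can be condensed into a small decision helper. This is a deliberately simplified sketch, not an official rule: the `suggestMapping` function, its two boolean inputs, and the example workloads are all hypothetical, and a real decision should weigh the actual queries your application runs.

```javascript
// readTogether: the application usually returns both entities in one query.
// frequentUpdates: the related data changes often.
function suggestMapping(workload) {
  const { readTogether, frequentUpdates } = workload;
  // Both goals pull in opposite directions, so flag the case for manual review.
  if (readTogether && frequentUpdates) return "review";
  // Frequent updates alone favor a single copy reached through a reference.
  if (frequentUpdates) return "reference";
  // Otherwise fall back to embedding, the recommended default.
  return "embed";
}

// Hypothetical workloads for three relationships in a blog application.
const decisions = {
  articleComments: suggestMapping({ readTogether: true, frequentUpdates: false }),
  articleAuthor: suggestMapping({ readTogether: false, frequentUpdates: true }),
  articleTags: suggestMapping({ readTogether: true, frequentUpdates: true })
};
console.log(decisions);
```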

Consider the following schema map for a blog application:

[Figure: Schema map for a blog application]

The following examples show how to optimize your schema for different queries depending on the needs of your application.

Optimize Queries for Articles

If your application primarily queries articles for information such as title, embed related information in the articles collection to return all data needed by the application in a single operation.

The following document is optimized for queries on articles:

db.articles.insertOne(
   {
      title: "My Favorite Vacation",
      date: ISODate("2023-06-02"),
      text: "We spent seven days in Italy...",
      tags: [
         {
            name: "travel",
            url: "<blog-site>/tags/travel"
         },
         {
            name: "adventure",
            url: "<blog-site>/tags/adventure"
         }
      ],
      comments: [
         {
            name: "pedro123",
            text: "Great article!"
         }
      ],
      author: {
         name: "alice123",
         email: "alice@mycompany.com",
         avatar: "photo1.jpg"
      }
   }
)
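
To see why this document shape needs only one read, the following sketch stands in for the collection with a plain array. The `findArticleByTitle` helper and the `summary` object are hypothetical. In mongosh, the corresponding read would be a single `db.articles.findOne()` call with a filter on `title`.

```javascript
// In-memory stand-in for the articles collection (not a real database).
const articles = [
  {
    title: "My Favorite Vacation",
    tags: [{ name: "travel" }, { name: "adventure" }],
    comments: [{ name: "pedro123", text: "Great article!" }],
    author: { name: "alice123", email: "alice@mycompany.com", avatar: "photo1.jpg" }
  }
];

// One lookup returns the article together with its tags, comments, and author.
function findArticleByTitle(collection, title) {
  return collection.find((doc) => doc.title === title) || null;
}

const article = findArticleByTitle(articles, "My Favorite Vacation");

// Everything the page needs comes from the single document that was read.
const summary = {
  title: article.title,
  author: article.author.name,
  tagNames: article.tags.map((tag) => tag.name),
  commentCount: article.comments.length
};
console.log(summary);
```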

Optimize Queries for Articles and Authors

If your application returns article information and author information separately, consider storing articles and authors in separate collections. This schema design reduces the work required to return author information, and lets you return only author information without including unneeded fields.

In the following schema, the articles collection contains an authorId field, which is a reference to the authors collection.

db.articles.insertOne(
   {
      title: "My Favorite Vacation",
      date: ISODate("2023-06-02"),
      text: "We spent seven days in Italy...",
      authorId: 987,
      tags: [
         {
            name: "travel",
            url: "<blog-site>/tags/travel"
         },
         {
            name: "adventure",
            url: "<blog-site>/tags/adventure"
         }
      ],
      comments: [
         {
            name: "pedro345",
            text: "Great article!"
         }
      ]
   }
)

db.authors.insertOne(
   {
      _id: 987,
      name: "alice123",
      email: "alice@mycompany.com",
      avatar: "photo1.jpg"
   }
)
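
The following sketch shows how the `authorId` reference is resolved, using plain arrays in place of the two collections. The `findAuthorById` and `findArticleWithAuthor` helpers are hypothetical. Against a real deployment, the combined read could instead use a `$lookup` aggregation stage, as noted in the comments.

```javascript
// In-memory stand-ins for the two collections (not a real database).
const articles = [{ title: "My Favorite Vacation", authorId: 987 }];
const authors = [
  { _id: 987, name: "alice123", email: "alice@mycompany.com", avatar: "photo1.jpg" }
];

// Author-only read: touches only the authors data and returns no article fields.
function findAuthorById(authorDocs, id) {
  return authorDocs.find((doc) => doc._id === id) || null;
}

// Article plus author: a second lookup resolves the authorId reference.
// In mongosh, a comparable join could be written as a $lookup stage:
//   { $lookup: { from: "authors", localField: "authorId",
//                foreignField: "_id", as: "author" } }
// Note that $lookup returns the matches as an array, whereas this
// sketch attaches a single author object.
function findArticleWithAuthor(articleDocs, authorDocs, title) {
  const article = articleDocs.find((doc) => doc.title === title);
  if (!article) return null;
  return { ...article, author: findAuthorById(authorDocs, article.authorId) };
}

const authorOnly = findAuthorById(authors, 987);
const articleWithAuthor = findArticleWithAuthor(articles, authors, "My Favorite Vacation");
console.log(authorOnly.name, articleWithAuthor.author.email);
```

The author-only read is where this design pays off. The combined read costs an extra lookup that the embedded design avoids.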

After you map relationships for your application's data, the next step in the schema design process is to apply design patterns to optimize your schema. See Apply Design Patterns.
