Store Computed Data

An application might need to derive a value from source data stored in a database. Computing a new value can require significant CPU resources, especially with large data sets or when multiple documents must be examined.

If a computed value is requested often, it can be more efficient to save that value in the database ahead of time. When the application requests data, only one read operation is required.

If reads are significantly more common than writes, the computed pattern reduces the frequency of data computation. Instead of computing values on every read, the application stores the computed value and recalculates it as needed. The application can recompute the value either with every write that changes the computed value's source data or as part of a periodic job.

Note

With periodic updates, the returned computed value is not guaranteed to be exact. However, this approach may be worth the performance improvement if exact accuracy isn't a requirement.

In this example, an application displays movie viewer and revenue information. Users can look up a particular movie and how much money that movie made.

1

Create the screenings collection:

db.screenings.insertMany( [
   {
      theater: "Alger Cinema",
      location: "Lakeview, OR",
      movie_title: "Lost in the Shadows",
      movie_id: 1,
      num_viewers: 344,
      revenue: 3440
   },
   {
      theater: "City Cinema",
      location: "New York, NY",
      movie_title: "Lost in the Shadows",
      movie_id: 1,
      num_viewers: 1496,
      revenue: 22440
   }
] )

2

Users often want to know how many people saw a certain movie and how much money that movie made. In the current schema, to total num_viewers and revenue for a movie, you must read every screening of that movie (here, the screenings with the title "Lost in the Shadows") and sum the values of those fields.
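
For reference, the per-read computation described above could look like the following sketch. The helper name buildTotalsPipeline is hypothetical and not part of MongoDB; it only assembles a standard $match and $group pipeline that you could pass to db.screenings.aggregate() in mongosh.

```javascript
// Hypothetical helper: builds the pipeline an application would have to run
// on every read if the totals were not stored ahead of time.
function buildTotalsPipeline(movieTitle) {
  return [
    // Keep only the screenings for the requested movie.
    { $match: { movie_title: movieTitle } },
    // Sum viewers and revenue across all matching screenings.
    {
      $group: {
        _id: "$movie_id",
        total_viewers: { $sum: "$num_viewers" },
        total_revenue: { $sum: "$revenue" }
      }
    }
  ];
}

// In mongosh, with the screenings collection from step 1:
//   db.screenings.aggregate( buildTotalsPipeline("Lost in the Shadows") )
const pipeline = buildTotalsPipeline("Lost in the Shadows");
```

Every request for a movie's totals would repeat this scan and sum over all of that movie's screenings, which is the work the computed pattern moves out of the read path.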

To avoid performing that computation every time the information is requested, you can compute the total values and store them in a movies collection with the movie record itself:

db.movies.insertOne(
   {
      _id: 1,
      title: "Lost in the Shadows",
      total_viewers: 1840,
      total_revenue: 25880
   }
)

3

Consider that a new screening is added to the screenings collection:

db.screenings.insertOne(
   {
      theater: "Overland Park Cinema",
      location: "Boise, ID",
      movie_title: "Lost in the Shadows",
      movie_id: 1,
      num_viewers: 760,
      revenue: 7600
   }
)

The computed data in the movies collection no longer reflects the current screening data. How often you update computed data depends on your application:

  • In a low-write environment, the computation can occur in conjunction with any update of the screenings data.

  • In an environment with more regular writes, the computations can be done at defined intervals (every hour, for example). The source data in screenings isn't affected by writes to the movies collection, so you can run calculations at any time.
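
For the low-write case in the first bullet, one option is to adjust the stored totals whenever a screening is inserted. The sketch below assumes a hypothetical helper, buildTotalsIncrement, that turns a new screening document into the filter and $inc update you could pass to db.movies.updateOne(). It only covers inserts; edits or deletions of existing screenings would need to apply the difference instead.

```javascript
// Hypothetical helper: derives the movies update for one newly inserted screening.
function buildTotalsIncrement(screening) {
  return {
    // The movies collection uses the movie's id as _id.
    filter: { _id: screening.movie_id },
    // $inc adds the new screening's figures to the stored totals.
    update: {
      $inc: {
        total_viewers: screening.num_viewers,
        total_revenue: screening.revenue
      }
    }
  };
}

const newScreening = {
  theater: "Overland Park Cinema",
  location: "Boise, ID",
  movie_title: "Lost in the Shadows",
  movie_id: 1,
  num_viewers: 760,
  revenue: 7600
};

const { filter, update } = buildTotalsIncrement(newScreening);

// In mongosh, immediately after the insertOne() shown above:
//   db.movies.updateOne( filter, update )
```

The insert into screenings and the update to movies are separate operations, so an application that needs them to succeed or fail together would have to run them in a transaction.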

To update the computed data based on the screenings data, you can run the following aggregation at a regular interval:

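db.screenings.aggregate( [
   // Sum viewers and revenue across all screenings of each movie
   {
      $group: {
         _id: "$movie_id",
         total_viewers: {
            $sum: "$num_viewers"
         },
         total_revenue: {
            $sum: "$revenue"
         }
      }
   },
   // Write the totals into movies, matching documents on _id
   // (this example uses the "test" database; use your own database name)
   {
      $merge: {
         into: { db: "test", coll: "movies" },
         on: "_id",
         whenMatched: "merge"
      }
   }
] )

4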

To confirm that the movies collection was updated, query the collection:

db.movies.find()

Output:

[
   {
      _id: 1,
      title: 'Lost in the Shadows',
      total_viewers: 2600,
      total_revenue: 33480
   }
]
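
The output keeps the title field because whenMatched: "merge" adds the fields of the aggregation result to the existing document rather than replacing it. The following plain JavaScript sketch imitates that behavior in memory for illustration only; the function names are hypothetical, and it is not how the server implements $group or $merge.

```javascript
// In-memory imitation of the $group stage: sum viewers and revenue per movie_id.
function groupTotals(screenings) {
  const totals = new Map();
  for (const s of screenings) {
    const t = totals.get(s.movie_id) ||
      { _id: s.movie_id, total_viewers: 0, total_revenue: 0 };
    t.total_viewers += s.num_viewers;
    t.total_revenue += s.revenue;
    totals.set(s.movie_id, t);
  }
  return [...totals.values()];
}

// In-memory imitation of $merge with on: "_id" and whenMatched: "merge":
// fields from the result overwrite same-named fields; other fields survive.
function mergeInto(movies, results) {
  for (const r of results) {
    const existing = movies.find((m) => m._id === r._id);
    if (existing) {
      Object.assign(existing, r);
    } else {
      movies.push({ ...r }); // unmatched results are inserted (the $merge default)
    }
  }
  return movies;
}

// Only the fields that the grouping uses are reproduced here.
const screenings = [
  { movie_id: 1, num_viewers: 344, revenue: 3440 },
  { movie_id: 1, num_viewers: 1496, revenue: 22440 },
  { movie_id: 1, num_viewers: 760, revenue: 7600 }
];
const movies = [
  { _id: 1, title: "Lost in the Shadows", total_viewers: 1840, total_revenue: 25880 }
];

console.log(mergeInto(movies, groupTotals(screenings)));
```

After the merge, the single movie document still has its title, and its totals are 2600 viewers and 33480 in revenue, matching the query output above.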

The computed pattern reduces CPU workload and improves application performance. Consider the computed pattern if your application performs the same calculations repeatedly and has a high read-to-write ratio.

Learn More

  • Use the Approximation Pattern

  • Group Data

  • Data Consistency
