Model IoT Data

The Internet of Things (IoT) is a network of physical objects that are connected to the internet. Many of these devices, like sensors, generate data.

To store and retrieve this data efficiently, you can use the bucket pattern.

A common method to organize IoT data is to group the data into buckets. Bucketing organizes specific groups of data to help:

  • Discover historical trends,

  • Forecast future trends, and

  • Optimize storage usage.

Common parameters to group data by are:

  • time

  • data source (if you have multiple data sets)

  • customer

  • type of data (for example, transaction type in financial data)

Note

Starting in MongoDB 5.0, time series collections are the recommended collection type for time series data. Do not use the bucket pattern in conjunction with time series collections as this can degrade performance.

Consider a collection that stores temperature data obtained from a sensor. The sensor records the temperature every minute and stores the data in a collection called temperatures:

// temperatures collection
{
  "_id": 1,
  "sensor_id": 12345,
  "timestamp": ISODate("2019-01-31T10:00:00.000Z"),
  "temperature": 40
}
{
  "_id": 2,
  "sensor_id": 12345,
  "timestamp": ISODate("2019-01-31T10:01:00.000Z"),
  "temperature": 40
}
{
  "_id": 3,
  "sensor_id": 12345,
  "timestamp": ISODate("2019-01-31T10:02:00.000Z"),
  "temperature": 41
}
...

This approach does not scale well in terms of data and index size. For example, if the application requires indexes on the sensor_id and timestamp fields, every incoming reading adds both a new document and new index entries, so the collection and its indexes grow with every reading the sensor records.
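To make that growth concrete, the following sketch counts documents (and therefore index entries per indexed field) for a single sensor over one day. It assumes exactly one reading per minute, as described above, and compares one document per reading with one document per hour of readings. The helper name documentsPerDay is hypothetical and exists only for this illustration.

```javascript
// Rough document counts for one sensor over one day, assuming exactly
// one reading per minute. documentsPerDay is a hypothetical helper.
const READINGS_PER_HOUR = 60;
const HOURS_PER_DAY = 24;

function documentsPerDay(readingsPerDocument) {
  const readingsPerDay = READINGS_PER_HOUR * HOURS_PER_DAY;
  return Math.ceil(readingsPerDay / readingsPerDocument);
}

const perReading = documentsPerDay(1);  // one document per reading
const perHour = documentsPerDay(60);    // one document per hour of readings

console.log(perReading); // 1440
console.log(perHour);    // 24
```

Under these assumptions, grouping readings by hour reduces the number of documents to index by a factor of 60.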

You can leverage the document model to bucket the data into documents that hold the measurements for a particular timespan. Consider the following updated schema, which buckets the readings taken every minute into hour-long groups:

{
  "_id": 1,
  "sensor_id": 12345,
  "start_date": ISODate("2019-01-31T10:00:00.000Z"),
  "end_date": ISODate("2019-01-31T10:59:59.000Z"),
  "measurements": [
    {
      "timestamp": ISODate("2019-01-31T10:00:00.000Z"),
      "temperature": 40
    },
    {
      "timestamp": ISODate("2019-01-31T10:01:00.000Z"),
      "temperature": 40
    },
    ...
    {
      "timestamp": ISODate("2019-01-31T10:42:00.000Z"),
      "temperature": 42
    }
  ],
  "transaction_count": 42,
  "sum_temperature": 1783
}

This updated schema improves scalability and mirrors how the application actually uses the data. A user likely wouldn't query for a specific temperature reading. Instead, a user would likely query for temperature behavior over the course of an hour or day. The bucket pattern facilitates those queries by grouping the data into uniform time periods.
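As a rough illustration of how per-minute readings map onto hour-long buckets, the sketch below groups an in-memory array of readings by sensor and UTC hour. It is plain JavaScript rather than a MongoDB API: bucketByHour is a hypothetical helper, and it uses new Date(...) in place of the shell's ISODate(...). The field names follow the bucketed schema shown earlier.

```javascript
// Group flat readings into hour-long buckets.
// bucketByHour is a hypothetical helper, not part of MongoDB.
const HOUR_MS = 60 * 60 * 1000;

function bucketByHour(readings) {
  const buckets = new Map();
  for (const { sensor_id, timestamp, temperature } of readings) {
    // Round the timestamp down to the start of its UTC hour.
    const startMs = Math.floor(timestamp.getTime() / HOUR_MS) * HOUR_MS;
    const key = `${sensor_id}:${startMs}`;
    if (!buckets.has(key)) {
      buckets.set(key, {
        sensor_id,
        start_date: new Date(startMs),
        end_date: new Date(startMs + HOUR_MS - 1000), // last second of the hour
        measurements: [],
        transaction_count: 0,
        sum_temperature: 0,
      });
    }
    const bucket = buckets.get(key);
    bucket.measurements.push({ timestamp, temperature });
    bucket.transaction_count += 1;
    bucket.sum_temperature += temperature;
  }
  return [...buckets.values()];
}

const hourly = bucketByHour([
  { sensor_id: 12345, timestamp: new Date("2019-01-31T10:00:00.000Z"), temperature: 40 },
  { sensor_id: 12345, timestamp: new Date("2019-01-31T10:01:00.000Z"), temperature: 40 },
  { sensor_id: 12345, timestamp: new Date("2019-01-31T11:00:00.000Z"), temperature: 41 },
]);

console.log(hourly.length);               // 2
console.log(hourly[0].transaction_count); // 2
console.log(hourly[0].sum_temperature);   // 80
```

The first two readings fall in the 10:00 bucket and the third starts a new 11:00 bucket, so three readings produce two documents.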

The example document contains two computed fields: transaction_count and sum_temperature. If the application frequently needs to retrieve the sum of temperatures for a given hour, computing a running total of the sum can help save application resources. This Computed Pattern approach eliminates the need to calculate the sum each time the data is requested.
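One possible way to keep these running totals current is to update them in the same write that appends each reading. The sketch below only builds the filter, update, and options objects; buildBucketUpdate is a hypothetical helper name, and the choice to identify a bucket by sensor_id and start_date is an assumption made for this example. It relies on the standard $push, $inc, and $setOnInsert update operators together with the upsert option.

```javascript
// Build the arguments for an upsert that appends one reading to its
// hour-long bucket and increments the running totals.
// buildBucketUpdate is a hypothetical helper, not part of MongoDB.
const MS_PER_HOUR = 60 * 60 * 1000;

function buildBucketUpdate(sensorId, timestamp, temperature) {
  // Start of the UTC hour that contains this reading.
  const startMs = Math.floor(timestamp.getTime() / MS_PER_HOUR) * MS_PER_HOUR;
  return {
    // Assumption: a bucket is identified by its sensor and start time.
    filter: { sensor_id: sensorId, start_date: new Date(startMs) },
    update: {
      $push: { measurements: { timestamp, temperature } },
      $inc: { transaction_count: 1, sum_temperature: temperature },
      // Applied only when the upsert creates a new bucket document.
      $setOnInsert: { end_date: new Date(startMs + MS_PER_HOUR - 1000) },
    },
    options: { upsert: true },
  };
}

const spec = buildBucketUpdate(12345, new Date("2019-01-31T10:43:00.000Z"), 42);
console.log(JSON.stringify(spec.update.$inc));
// {"transaction_count":1,"sum_temperature":42}

// In the MongoDB shell, the result could be applied with:
//   db.temperatures.updateOne(spec.filter, spec.update, spec.options);
```

With upsert enabled, the first reading of each hour creates the bucket and later readings modify it in place. A bucket created this way receives an automatically generated _id rather than the integer _id used in the example document.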

The pre-aggregated sum_temperature and transaction_count values enable further computations such as the average temperature (sum_temperature / transaction_count) for a particular bucket. It is much more likely that users will query the application for the average temperature between 2:00 and 3:00 PM rather than querying for the specific temperature at 2:03 PM. Bucketing and pre-computing certain values allows the application to more readily provide that information.
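The average calculation can be sketched as follows, using the transaction_count and sum_temperature values from the example document. The helper name averageTemperature is hypothetical. It accepts several buckets so that the same arithmetic also covers a longer period, such as a day, by dividing the total of the sums by the total of the counts.

```javascript
// Average temperature across one or more buckets, computed from the
// pre-aggregated fields only. averageTemperature is a hypothetical helper.
function averageTemperature(buckets) {
  let sum = 0;
  let count = 0;
  for (const bucket of buckets) {
    sum += bucket.sum_temperature;
    count += bucket.transaction_count;
  }
  // Return null instead of dividing by zero when there are no readings.
  return count === 0 ? null : sum / count;
}

// Values taken from the example bucket document.
const exampleBucket = { transaction_count: 42, sum_temperature: 1783 };

console.log(averageTemperature([exampleBucket]).toFixed(2)); // "42.45"
```

The measurements array is never read here, which is the benefit of storing the computed fields.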

MongoDB stores times in UTC by default, and converts any local time representations into this form. Applications that must operate or report on some unmodified local time value may store the time zone alongside the UTC timestamp, and compute the original local time in their application logic.

In the MongoDB shell, you can store both the current date and the current client's offset from UTC.

// Capture the current time and the client's offset from UTC, in minutes
var now = new Date();
db.data.insertOne( {
  date: now,
  offset: now.getTimezoneOffset()
} );

You can reconstruct the original local time by applying the saved offset:

var record = db.data.findOne();
// The offset is stored in minutes; multiplying by 60000 converts it to milliseconds
var localNow = new Date( record.date.getTime() - ( record.offset * 60000 ) );
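The following worked example applies the same arithmetic to a fixed record so that the result is reproducible outside the shell. The record values are invented for illustration, and toLocalWallClock is a hypothetical helper name. Note that getTimezoneOffset() returns the number of minutes that local time is behind UTC, so an offset of 300 corresponds to UTC-05:00.

```javascript
// A stored record with a fixed, illustrative offset of 300 minutes
// (local time is five hours behind UTC).
const storedRecord = {
  date: new Date("2019-01-31T10:00:00.000Z"),
  offset: 300,
};

// toLocalWallClock is a hypothetical helper that applies the saved offset.
function toLocalWallClock(record) {
  return new Date(record.date.getTime() - record.offset * 60000);
}

const localTime = toLocalWallClock(storedRecord);

// The shifted Date carries the local wall-clock values in its UTC fields.
console.log(localTime.toISOString()); // 2019-01-31T05:00:00.000Z
console.log(localTime.getUTCHours()); // 5
```

Because the shifted value no longer represents the true instant, it is best read with the getUTC* accessors and used for display or reporting only, while the stored UTC date remains the value to compare and sort on.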