Model IoT Data
The Internet of Things (IoT) is a network of physical objects that are connected to the internet. Many of these devices, like sensors, generate data.
To store and retrieve this data efficiently, you can use the bucket pattern.
The Bucket Pattern
A common method to organize IoT data is to group the data into buckets. Bucketing organizes specific groups of data to help:
Discover historical trends,
Forecast future trends, and
Optimize storage usage.
Common parameters to group data by are:
time
data source (if you have multiple data sets)
customer
type of data (for example, transaction type in financial data)
Note
Starting in MongoDB 5.0, time series collections are the recommended collection type for time series data. Do not use the bucket pattern in conjunction with time series collections as this can degrade performance.
Consider a collection that stores temperature data obtained from a
sensor. The sensor records the temperature every minute and
stores the data in a collection called temperatures
:
// temperatures collection { "_id": 1, "sensor_id": 12345, "timestamp": ISODate("2019-01-31T10:00:00.000Z"), "temperature": 40 } { "_id": 2, "sensor_id": 12345, "timestamp": ISODate("2019-01-31T10:01:00.000Z"), "temperature": 40 } { "_id": 3, "sensor_id": 12345, "timestamp": ISODate("2019-01-31T10:02:00.000Z"), "temperature": 41 } ...
This approach does not scale well in terms of data and index size. For
example, if the application requires indexes on the sensor_id
and
timestamp
fields, every incoming reading from the sensor would need
to be indexed to improve performance.
You can leverage the document model to bucket the data into documents that hold the measurements for a particular timespan. Consider the following updated schema which buckets the readings taken every minute into hour-long groups:
{ "_id": 1, "sensor_id": 12345, "start_date": ISODate("2019-01-31T10:00:00.000Z"), "end_date": ISODate("2019-01-31T10:59:59.000Z"), "measurements": [ { "timestamp": ISODate("2019-01-31T10:00:00.000Z"), "temperature": 40 }, { "timestamp": ISODate("2019-01-31T10:01:00.000Z"), "temperature": 40 }, ... { "timestamp": ISODate("2019-01-31T10:42:00.000Z"), "temperature": 42 } ], "transaction_count": 42, "sum_temperature": 1783 }
This updated schema improves scalability and mirrors how the application actually uses the data. A user likely wouldn't query for a specific temperature reading. Instead, a user would likely query for temperature behavior over the course of an hour or day. The Bucket pattern helps facilitate those queries by grouping the data into uniform time periods.
Combine the Computed and Bucket Patterns
The example document contains two computed
fields: transaction_count
and sum_temperature
. If the
application frequently needs to retrieve the sum of temperatures for a
given hour, computing a running total of the sum can help save
application resources. This Computed Pattern approach eliminates the
need to calculate the sum each time the data is requested.
The pre-aggregated sum_temperature
and transaction_count
values
enable further computations such as the average temperature
(sum_temperature
/ transaction_count
) for a particular bucket.
It is much more likely that users will query the application for
the average temperature between 2:00 and 3:00 PM rather than querying
for the specific temperature at 2:03 PM. Bucketing and pre-computing
certain values allows the application to more readily provide that
information.
Time Representations in MongoDB
MongoDB stores times in UTC by default, and converts any local time representations into this form. Applications that must operate or report on some unmodified local time value may store the time zone alongside the UTC timestamp, and compute the original local time in their application logic.
Example
In the MongoDB shell, you can store both the current date and the current client's offset from UTC.
var now = new Date(); db.data.save( { date: now, offset: now.getTimezoneOffset() } );
You can reconstruct the original local time by applying the saved offset:
var record = db.data.findOne(); var localNow = new Date( record.date.getTime() - ( record.offset * 60000 ) );