Hi,
We are using MongoDB v7.0 with feature compatibility version 6.0 enabled.
Version: 6.0.5
meta field: userid
time field: event time
granularity: hours
collection size: 40 billion events / 3 TB compressed
indexes: userid & userid_eventtime
We have already migrated around 40 billion events to the time series data model, and writes and reads were fine initially. Over the course of about three weeks, writes and reads started getting slow. After debugging we found that the number of buckets per metaField value (user id) had grown very large, and if I read all the documents for that user, delete them and re-insert them, performance improves by more than 10x.
This is a crude way of merging the buckets, and I would expect there to be a proper function for it. Also, why are hundreds or thousands of buckets created when I expect only one bucket per month per metaField value (userid)?
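For context, this is roughly how I am finding the worst-affected users, by counting buckets per meta value directly on the system.buckets collection (collection and field names as in our setup above):

// count buckets per metaField value and list the most fragmented users
db.system.buckets.userEvents.aggregate([
  { $group: { _id: "$meta", buckets: { $sum: 1 } } },
  { $sort: { buckets: -1 } },
  { $limit: 10 }
])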
Below are the numbers for a sample user:
db.system.buckets.userEvents.find({"meta":380580264}).count()
3120
db.userEvents.find({"uid":380580264}).count()
20539
After deleting and re-inserting the user documents:
db.system.buckets.userEvents.find({"meta":380580264}).count()
370
db.userEvents.find({"uid":380580264}).count()
20539
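For reference, the delete and re-insert was essentially the following (simplified mongosh sketch of what I did per user; batching and error handling omitted):

// crude "bucket merge": read all events for one user, delete them, re-insert in one batch
// (uid is the metaField, so the delete filter only references the metaField)
var uid = 380580264;
var events = db.userEvents.find({ uid: uid }).toArray();
db.userEvents.deleteMany({ uid: uid });
db.userEvents.insertMany(events);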
Please help with:
- How can I make sure events are merged into their existing buckets instead of new buckets being created so frequently?
- How can I merge the existing hundreds of buckets per user?
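If it helps with diagnosis, I can also share the time series section of collStats for this collection; I believe it includes counters for why buckets are opened and closed, though I am not sure of the exact field names on our version:

// dump the time series bucketing stats (exact field names may vary by server version)
var stats = db.runCommand({ collStats: "userEvents" });
printjson(stats.timeseries);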
Let me know if you need any further info.
Thanks.