Aggregating 50M records

I have a time-series collection that is expected to grow to 50M records. Right now I'm doing a PoC with 3M records in which I aggregate using just a $group stage, and it takes 20 seconds. How can I make it faster? Note: I have an index on the source field, but MongoDB still ends up doing a COLLSCAN.

The query is below:

[
  {
    $group: {
      _id: "$source",    // one output document per distinct source
      sum: { $sum: 1 }   // count of documents for that source
    }
  }
]
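For what it's worth, a bare $group has no $match to narrow the input, so the server generally has to touch every document (or every time-series bucket) regardless of the index on source; explain makes the chosen plan visible. A minimal sketch in mongosh, assuming a hypothetical collection name events:

// Show the winning plan and per-stage timings for the pipeline.
// If the output contains COLLSCAN, the index was not used.
db.events.explain("executionStats").aggregate([
  { $group: { _id: "$source", sum: { $sum: 1 } } }
])

If the plan is a COLLSCAN, it is also worth checking whether source is the collection's metaField: for time-series collections, secondary indexes and bucket-level optimizations mostly apply to the metaField and timeField.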

I am getting the same problem: for me it takes around 10 seconds for 6 million records with just a plain $project and $group, and my actual logic takes around a minute.
I wanted to use this for analytics, but the performance is not looking great. Did you find any solution?
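One pattern that usually helps for analytics at this scale is to pre-aggregate into a small summary collection with $merge and have the dashboards read that instead of grouping the raw data on every request. A rough sketch, assuming hypothetical collection names events and source_counts:

// Periodically (e.g. from a scheduled job) fold per-source counts
// into a summary collection keyed by source.
db.events.aggregate([
  { $group: { _id: "$source", sum: { $sum: 1 } } },
  { $merge: {
      into: "source_counts",
      on: "_id",                // match summary docs by source
      whenMatched: "replace",   // overwrite the stale count
      whenNotMatched: "insert"  // first time this source appears
  } }
])

// Reads then scan a handful of summary documents instead of
// millions of raw ones:
db.source_counts.find()

Refreshing the summary on a schedule (or after each ingest batch) trades a little staleness for near-constant-time reads.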

Try some of the things that were shared in

@Ayush_Tiwari2, can you share any findings you got while trying the proposed alternatives?

Please provide some feedback.

Will try this and share the findings.
