Is there a way to do bulk findOne() queries?

Hi all,

In my current NodeJS app I need to get many individual documents by a (unique) id field. Currently we do that with an {$in: [...]} query:

<collection>.find({ guid: { $in: guids } }).limit(guids.length)

However, the list of guids can get as long as 300 ids, and we execute such queries on large (up to multi-million document) collections, making these some of the slowest queries on our cluster despite the index on the guid field.
I'm not sure whether MongoDB recognizes that guid carries a unique index; nothing in the query itself signals that there can be at most one document per guid, so it might be doing a lot of unnecessary work.
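As an illustration, the plan can be inspected like this (a sketch assuming the standard Node.js driver; collection and guids are placeholders for our real names):

// Inspect how much work the $in query actually does
const plan = await collection
  .find({ guid: { $in: guids } })
  .explain('executionStats');
console.log(plan.executionStats.totalKeysExamined, plan.executionStats.nReturned);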

The only alternative I can think of is making separate findOne() queries, but I worry that 300 entirely separate queries from the NodeJS app will add a lot of overhead.
I have done some googling and searched the docs, but have not been able to find anything else.
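For reference, the separate-queries alternative would look something like this (a minimal sketch, again with collection and guids as placeholders):

// One round trip per guid, run concurrently;
// each query is a trivial unique-index lookup, but there are 300 of them
const docs = await Promise.all(
  guids.map(guid => collection.findOne({ guid }))
);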

For writing data we use the <collection>.bulkWrite() feature, which solves the problem of making many separate requests to the cluster. But I can't find an equivalent for reading data, i.e. a way to do bulk findOne() queries.
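For comparison, that write side looks roughly like this (sketch; the $set payload is purely illustrative):

// Many writes batched into a single request to the cluster
await collection.bulkWrite(
  guids.map(guid => ({
    updateOne: {
      filter: { guid },
      update: { $set: { processed: true } }, // illustrative payload
    },
  }))
);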

Is there any alternative to the $in query, or a way to speed it up, that I am missing?

I would try to use an aggregation that starts with a $documents stage where each document holds one of the guids. Then I would $lookup into the collection using localField and foreignField.
The general form of the pipeline would be something like:
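(a sketch, assuming MongoDB 5.1+ for $documents; db, guids, and the collection name are placeholders)

// $documents is only valid in a database-level aggregation
// (or inside $lookup/$unionWith sub-pipelines)
const results = await db.aggregate([
  // One input document per guid to look up
  { $documents: guids.map(guid => ({ guid })) },
  // Equality join against the indexed guid field of the target collection
  { $lookup: {
      from: '<collection>', // placeholder name
      localField: 'guid',
      foreignField: 'guid',
      as: 'doc',
  } },
  // guid is unique, so each input matches at most one document
  { $unwind: '$doc' },
  // Return the matched documents themselves
  { $replaceRoot: { newRoot: '$doc' } },
]).toArray();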

I would worry too about:


Any luck with this?

I linked to another thread where the same $documents technique could be used, so it would be nice if you could share your findings.

Hi Steve,

In the end I found out these were in fact not quite the slowest queries on the cluster,
so my focus has shifted for now.
I haven't actually benchmarked the pipeline approach yet, though it seems promising.

Thank you very much for taking the time to answer.
