I have tested two methods of fetching the same data from a MongoDB Atlas hosted database. embeddingOpenAi3Large is an array of 3072 numbers. Method 1 fetches each of the 50 items with its own query. If I include embeddingOpenAi3Large in the projection it takes 1400ms; if I don’t, it takes 380ms.
let startTime = Date.now();
// Method 1: one query per item, all issued concurrently
const promises: Promise<INewsItemWithEmbeddedContent[]>[] = [];
for (const result of results) {
  promises.push(
    NewsItem.loadItems(
      { _id: result._id },
      { _id: 1, embeddingOpenAi3Large: 1 },
    ),
  );
}
await Promise.all(promises);
let endTime = Date.now();
console.log(endTime - startTime);
Method 2 adds the 50 ids to an array and fetches all the data in a single query. If I include embeddingOpenAi3Large it takes over 21 seconds; if I don’t, it takes 80ms.
startTime = Date.now();
// Method 2: collect the ids (resultIds is declared outside this snippet)
// and fetch everything with a single $in query
for (const result of results) {
  resultIds.push(result._id);
}
const embeddingResult: INewsItemWithEmbeddedContent[] =
  await NewsItem.loadItems(
    { _id: { $in: resultIds } },
    { _id: 1, embeddingOpenAi3Large: 1 },
  );
endTime = Date.now();
console.log(endTime - startTime);
I don’t understand why downloading all the embeddings in a single query should be so much slower than fetching them individually. Some insights would be appreciated.
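For a sense of scale, here is a rough estimate of how much data the embeddings add up to. It is only a sketch: it assumes every value is stored as a BSON double, and the helper name bsonDoubleArrayBytes is made up for this post. It counts the array encoding only (BSON stores each array element under its index as a string key) and ignores _id, field names and any wire compression.

```typescript
// Rough BSON size of an array of numbers.
// Assumption: every element is stored as a BSON double (8 bytes).
function bsonDoubleArrayBytes(length: number): number {
  let bytes = 4 + 1; // int32 length prefix + trailing 0x00 terminator
  for (let i = 0; i < length; i++) {
    const keyBytes = String(i).length + 1; // index as a NUL-terminated string key
    bytes += 1 + keyBytes + 8; // type byte + key + 8-byte double
  }
  return bytes;
}

const perEmbedding = bsonDoubleArrayBytes(3072); // one embeddingOpenAi3Large array
const total = perEmbedding * 50; // all 50 items

console.log(`${perEmbedding} bytes per embedding, ${(total / 1e6).toFixed(2)} MB for 50 items`);
```

Under those assumptions each embedding is roughly 42 KB, so both methods should be transferring about 2 MB in total.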
loadItems does a lean find with the passed filter and projection.
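In case it helps, this is roughly the shape of that call. It is a simplified sketch, not the real code: it assumes a Mongoose-style model whose find(filter, projection) returns a query with lean() and exec(), the two interfaces are hypothetical stand-ins for the Mongoose types, and the model is passed as a parameter here only to keep the snippet self-contained (the real loadItems is a static on NewsItem).

```typescript
// Hypothetical minimal shapes: only the parts of a Mongoose-style model used below.
interface LeanQuery<T> {
  lean(): LeanQuery<T>; // return plain objects instead of hydrated documents
  exec(): Promise<T[]>; // run the query
}

interface FindModel<T> {
  find(
    filter: Record<string, unknown>,
    projection: Record<string, 0 | 1>,
  ): LeanQuery<T>;
}

// Assumed equivalent of NewsItem.loadItems: a lean find with the
// passed filter and projection, nothing else.
async function loadItems<T>(
  model: FindModel<T>,
  filter: Record<string, unknown>,
  projection: Record<string, 0 | 1>,
): Promise<T[]> {
  return model.find(filter, projection).lean().exec();
}
```

So there is no extra processing between the query and the timing code in either method.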