So we have around 312,000 documents in our vectorDB, and around 12.71 GB of data (including replicas/shards).
Right now we are on an M10 cluster, which occasionally scales to an M20 cluster due to memory spikes (these happen mostly during ingestion/syncing of data via AWS Bedrock Knowledge Base/MongoDB Vector Store).
My question is twofold:
First, is there a way to reduce memory usage during periods of inactivity? For example, when no more data is being ingested, can we minimize cluster usage? Maybe “pause the storage of the embeddings in RAM” or some similar strategy.
Second, what is the expected memory/RAM usage if we scale up to 1 million docs (~40 GB of data)? Is there some concrete relationship we can use to estimate the expected system usage, and thereby cost?
Would we need 8 GB of RAM (and hence an M40 or above cluster) for 1 million documents, because that would increase the static memory/RAM usage? Or
would we just need to autoscale to a higher cluster tier during ingestion?
Clarity on these topics would be great to have.
Hi @Abhay_Saini! Thank you for posting to the forums, hopefully I can get you the answers you need.
There is no way to reduce memory usage during periods of activity. Presently, the machine you use needs enough RAM to serve queries capably without resorting to disk seeks. You could always pause your cluster at times when you expect no usage.
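If you have predictable quiet windows, pausing can also be automated through the Atlas Administration API. Here is a minimal sketch; the project ID, cluster name, API keys, and the `set_paused` helper are placeholders/illustrative names, not something from your setup:

```python
import requests
from requests.auth import HTTPDigestAuth

# Placeholders -- substitute your own project (group) ID, cluster name,
# and an Atlas programmatic API key pair.
GROUP_ID = "<your-project-id>"
CLUSTER_NAME = "<your-cluster-name>"
PUBLIC_KEY = "<atlas-public-key>"
PRIVATE_KEY = "<atlas-private-key>"

def set_paused(paused: bool) -> None:
    """Pause or resume an Atlas cluster via the Administration API."""
    url = (
        "https://cloud.mongodb.com/api/atlas/v1.0"
        f"/groups/{GROUP_ID}/clusters/{CLUSTER_NAME}"
    )
    resp = requests.patch(
        url,
        auth=HTTPDigestAuth(PUBLIC_KEY, PRIVATE_KEY),
        json={"paused": paused},  # the documented attribute for pausing
    )
    resp.raise_for_status()

set_paused(True)    # pause during a known-quiet window
# set_paused(False) # resume before the next ingestion/sync
```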
Typically the RAM required to host an index in the filesystem cache, as noted in our deployment options guidance here, is roughly 3 KB per 768-dimension vector. Having additional metadata can increase this, but I am surprised to hear you say that you need 13 GB for 312k documents. Are you sure this is the vectorSearch index size and not the reported collection size?
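To make that rule of thumb concrete, here is a rough back-of-envelope sketch. It assumes ~4 bytes per float32 dimension (768 × 4 bytes ≈ 3 KB, which is where the figure above comes from) and ignores metadata and index graph overhead, so treat the output as a floor rather than an official sizing formula:

```python
BYTES_PER_DIM = 4  # float32 components; 768 dims * 4 bytes = ~3 KB per vector

def index_ram_gb(num_vectors: int, dims: int) -> float:
    """Rough floor on vector-index RAM; metadata and graph overhead add more."""
    return num_vectors * dims * BYTES_PER_DIM / 1024**3

print(f"{index_ram_gb(312_000, 1536):.1f} GB")    # ~1.8 GB for 312k docs today
print(f"{index_ram_gb(1_000_000, 1536):.1f} GB")  # ~5.7 GB at 1 million docs
```

Under those assumptions, 1 million 1536-dimension vectors need roughly 6 GB of filesystem cache for the index alone, which is the number to weigh against each cluster tier's RAM.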
Thanks a lot for responding, Henry!
Before diving into the response, let me give some context. The entire premise is based on the constant RAM/memory usage in MongoDB, which apparently increases as the data increases. We don’t know the exact relationship between data size and RAM/memory usage.
I actually asked whether we can reduce memory during periods of inactivity. So, a period of activity would be when we are syncing our data with a MongoDB vector store (which is in turn linked to AWS Bedrock), while a period of inactivity would be when this is not happening.
And yes, apart from pausing the cluster, would there be any other recourse to minimize memory/RAM usage?
What exactly does this 3 KB refer to? Btw, our embedding dimension is 1536.
And the 12.71 GB is the storage of the embeddings within the collection, not the vectorSearch index size, which is a few megabytes.
Just reiterating my questions:
What exactly is responsible for the constant memory usage within MongoDB, and how does that scale with increasing collection size?