Vector Search index partitioning

Anton_Kulikalov · November 11, 2024, 4:14am

Hey guys,
I couldn’t find any recipe on search index partitioning.
I have 1,000 customers, each with 20,000 to 100,000 records that I need to search with the lowest possible latency. Their data never overlaps, and I never need to search across more than one customer at once.
How do I handle this case?
I guess the only way right now is to create a collection per customer? I wonder how Mongoose is going to handle that.

Henry_Weller · November 11, 2024, 3:00pm

Hi @Anton_Kulikalov!

You should actually be in good shape keeping all of these in a single collection with a single vectorSearch index defined on it. That index should also index a tenant ID field, which you then use as a pre-filter at query time. We should have an update to our docs soon reflecting this as our recommended multitenancy scheme.
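As a rough sketch of the setup described above: one index definition with a vector field plus a filter field, and a `$vectorSearch` stage that pre-filters on the tenant. The field names `embedding` and `tenantId`, the index name `vector_index`, the 1536 dimensions, and the `numCandidates` heuristic are all assumptions for illustration, not something from this thread.

```typescript
// Hypothetical names and sizes: adjust "embedding", "tenantId",
// "vector_index" and numDimensions to match your own schema and model.
const tenantVectorIndex = {
  name: "vector_index",
  type: "vectorSearch",
  definition: {
    fields: [
      // The vector field that similarity search runs against.
      { type: "vector", path: "embedding", numDimensions: 1536, similarity: "cosine" },
      // The tenant field, indexed so it can be used as a pre-filter.
      { type: "filter", path: "tenantId" },
    ],
  },
};

// Builds an approximate (ANN) $vectorSearch stage restricted to one tenant.
function buildTenantSearchStage(tenantId: string, queryVector: number[], limit = 10) {
  return {
    $vectorSearch: {
      index: tenantVectorIndex.name,
      path: "embedding",
      queryVector,
      // Pre-filter: only this tenant's documents are considered.
      filter: { tenantId: { $eq: tenantId } },
      // Assumed heuristic: consider 20x as many candidates as results returned.
      numCandidates: limit * 20,
      limit,
    },
  };
}
```

With the Node.js driver, an object like `tenantVectorIndex` would typically be passed to `collection.createSearchIndex(...)`, and the stage can be used as the first stage of an aggregation, for example `Model.aggregate([buildTenantSearchStage("customer-42", queryVector)])` in Mongoose.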

Although our docs recommend 10k per tenant as an upper limit for exact vector search (ENN), I think you should be fine running ENN with 20k to 100k records per tenant, because the pre-filter is applied first and only the matching documents are then searched exhaustively by vector similarity. You enable this at query time by specifying exact: true in your query. I suspect that without exact: true you would see similar query times, because of how pre-filtering works with our implementation of HNSW, though I would also recommend testing that.
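To make the ENN versus ANN comparison concrete, here is a minimal sketch of a helper that builds either variant of the stage so both can be timed against the same tenant. The helper name, field names, index name, and the `numCandidates` heuristic are hypothetical. It also assumes that `numCandidates` is left out when `exact: true` is set, which is worth confirming against the current `$vectorSearch` docs.

```typescript
interface VectorSearchOptions {
  index: string;
  path: string;
  queryVector: number[];
  filter: { tenantId: { $eq: string } };
  limit: number;
  exact?: boolean;         // set only for ENN
  numCandidates?: number;  // set only for ANN
}

// Hypothetical helper: builds a tenant-filtered $vectorSearch stage,
// either exact (ENN) or approximate (ANN over HNSW).
function buildSearchStage(
  tenantId: string,
  queryVector: number[],
  limit: number,
  exact: boolean,
): { $vectorSearch: VectorSearchOptions } {
  const base = {
    index: "vector_index",   // assumed index name
    path: "embedding",       // assumed vector field
    queryVector,
    filter: { tenantId: { $eq: tenantId } },
    limit,
  };
  return exact
    ? { $vectorSearch: { ...base, exact: true } }
    // Assumed heuristic for ANN: 20x the limit, capped at 10,000 candidates.
    : { $vectorSearch: { ...base, numCandidates: Math.min(limit * 20, 10000) } };
}
```

Running the same query vector through both variants and comparing latency and result overlap is one way to do the testing suggested above.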

Anton_Kulikalov · November 11, 2024, 3:35pm

Thanks! So far looking good, ~100ms latency