Hi team. I have a REST service built with FastAPI. One endpoint accepts a name (a string) as input and uses a vector search index to check whether the MongoDB collection already contains a similar name. If it does not, the endpoint inserts the name; otherwise it increments the count on the record with the highest similarity score. For example:
- request 1 name: “Test”
- request 2 name: “Test”
- request 3 name: “Test”
The similarity score would be 1.0 for these inputs, so the result in the MongoDB collection will be:
{"name": "Test", "count": 3}
Example with different names:
- request 1 name: “Test”
- request 2 name: “Examination”
- request 3 name: “Test”
The result in the MongoDB collection will be:
{"name": "Test", "count": 2}
{"name": "Examination", "count": 1}
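The insert-or-increment behavior in the two examples can be sketched in plain Python. This is only an illustration of the intended logic, not the actual service code: the in-memory `records` list stands in for the MongoDB collection, `upsert_name` and `cosine` are hypothetical helpers, and the 0.99 threshold and the 2-dimensional vectors are assumptions chosen to keep the example small.

```python
import math

# Hypothetical cutoff: scores at or above this count as "the same name".
SIMILARITY_THRESHOLD = 0.99


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def upsert_name(records, name, vector, threshold=SIMILARITY_THRESHOLD):
    """Increment the best match if it is similar enough, else insert a new record.

    `records` is an in-memory stand-in for the collection (an assumption).
    """
    best = max(records, key=lambda r: cosine(r["vector"], vector), default=None)
    if best is not None and cosine(best["vector"], vector) >= threshold:
        best["count"] += 1
        return best
    record = {"name": name, "vector": vector, "count": 1}
    records.append(record)
    return record


# Second example from above: "Test", "Examination", "Test".
records = []
upsert_name(records, "Test", [1.0, 0.0])
upsert_name(records, "Examination", [0.0, 1.0])
upsert_name(records, "Test", [1.0, 0.0])
```

With an in-memory list every insert is visible to the next lookup immediately, so this sketch always produces the expected counts.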
Expected result:
New names are inserted and the count is updated for names that already exist.
Actual results:
- If I send 30 requests one after another, the similarity search returns an empty list most of the time, even when a request contains the same name as the previous one and the inserted document is already present in the collection before the search query starts.
- If I send one request, wait for a while (in debug mode or with time.sleep() before the search), and then send another request with the same name, the search returns the expected data.
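The time.sleep() experiment above could be generalized into a bounded retry around the search call. The following is a minimal sketch of that workaround, not a confirmed fix: `search_with_retry` is a hypothetical helper, `search_fn` is assumed to be any zero-argument callable that runs the search and returns a list, and the attempt count and delay are arbitrary.

```python
import time


def search_with_retry(search_fn, attempts=5, delay_s=0.2):
    """Call search_fn until it returns a non-empty list or attempts run out.

    search_fn: zero-argument callable returning a list of hits (assumption).
    Returns the first non-empty result, or [] if every attempt was empty.
    """
    for attempt in range(attempts):
        results = search_fn()
        if results:
            return results
        # Sleep between attempts, but not after the final one.
        if attempt < attempts - 1:
            time.sleep(delay_s)
    return []
```

A retry like this cannot tell a genuinely new name from one that is not yet searchable, so every first-time name would pay the full retry delay before being inserted.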
Question
Could you please tell me what might cause this behavior? Do I need to wait for some time after an insert before running the vector search? Is the index perhaps still being updated during that time, so that the search operates on outdated data? How can this be resolved?
Index configuration:
{
  "mappings": {
    "dynamic": false,
    "fields": {
      "topic_id": {
        "type": "token"
      },
      "vector": [
        {
          "dimensions": 1024,
          "similarity": "cosine",
          "type": "knnVector"
        }
      ]
    }
  }
}
Vector search pipeline:
[
    {
        "$search": {
            "index": "topic_vector_index",
            "knnBeta": {
                "vector": topic_vector,
                "path": "vector",
                "k": 2,
            },
        },
    },
    {"$project": {"vector": 0, "_id": 0, "score": {"$meta": "searchScore"}}},
]
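For reference, the pipeline above depends on a `topic_vector` variable defined elsewhere in the service. A small builder function makes that dependency explicit. This is a sketch: `build_search_pipeline` is a hypothetical name, and the stage contents are copied unchanged from the pipeline above.

```python
def build_search_pipeline(topic_vector, k=2, index="topic_vector_index"):
    """Return the aggregation pipeline for a kNN search on the "vector" field."""
    return [
        {
            "$search": {
                "index": index,
                "knnBeta": {
                    "vector": topic_vector,  # query embedding, 1024 floats per the index config
                    "path": "vector",
                    "k": k,
                },
            },
        },
        # Drop the stored vector and _id, and expose the similarity score.
        {"$project": {"vector": 0, "_id": 0, "score": {"$meta": "searchScore"}}},
    ]
```

With PyMongo, the result would typically be passed to `collection.aggregate(...)`, where `collection` is the collection that holds the names.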