Hi,
I have a Spark connector that reads from my MongoDB database. The driver reports the following client metadata:
"clientMetadata": {
"driver": {
"name": "mongo-java-driver|legacy|mongo-spark",
"version": "3.12.3|2.4.1"
},
"os": {
"type": "Linux",
"name": "Linux",
"architecture": "amd64",
"version": "3.10.0-1160.99.1.el7.x86_64"
},
"platform": "Java/Red Hat, Inc./1.8.0_382-b05|Scala/2.11.12:Spark/2.4.8.7.1.9.0-387"
},
For a reason whose root cause I have not been able to find, every read query adds a filter of the form:
{
  "$match": {
    "_id": {
      "$lt": "747877945yrhduwedu"
    }
  }
}
I do not specify this $match anywhere in my aggregation pipeline. It forces the query to scan the entire collection and produces slow queries; if I test the same aggregation pipeline with this $match removed, the query is lightning fast.
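For reference, here is a minimal sketch of the kind of read I am running, using the connector's RDD API with withPipeline. The connection URI, namespace, and the "status" match stage are placeholders, not my real values:

import com.mongodb.spark.MongoSpark
import org.apache.spark.sql.SparkSession
import org.bson.Document

object MongoReadRepro {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("mongo-read-repro")
      // Placeholder connection string and namespace.
      .config("spark.mongodb.input.uri", "mongodb://host:27017/mydb.mycoll")
      .getOrCreate()

    // The only pipeline stage I specify myself (placeholder stage).
    // The { "$match": { "_id": { "$lt": ... } } } filter still appears
    // in the server-side query on top of this.
    val rdd = MongoSpark.load(spark.sparkContext)
      .withPipeline(Seq(Document.parse("""{ "$match": { "status": "ACTIVE" } }""")))

    println(rdd.count())
    spark.stop()
  }
}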
Any assistance would be greatly appreciated.
Kindest Regards
Gareth Furnell