Index not working with $project

I have a collection of Address with collation and with following example document.

Collation: { locale: ‘en’, strength: 2}

{
“_id”: {
“$oid”: “66cc60ce52482617a4fbdba7”
},
“pkAddressId”: null,
“addressLine”: “1 A RUSSELL ST”,
“serviceableMarkets”: [
{
“marketId”: {
“$oid”: “66cd95d04d9e9ff7e31da46d”
},
“providerCode”: “ABC”
},
{
“marketId”: {
“$oid”: “66cd95d04d9e9ff7e31da46e”
},
“providerCode”: “DEF”
},
{
“marketId”: {
“$oid”: “66cd95d04d9e9ff7e31da474”
},
“providerCode”: “GHI”
}
],
“apartments”: null,
“zipCode”: “02149”,
“city”: “Everett”,
“state”: “Massachusetts”,
“stateCode”: “MA”
}

I have created an index on providerCode field

I have a 2 stage aggregation which is as follows
Stage 1 : $project
{
_id: 1,
pkAddressId: 1,
addressLine: 1,
serviceableMarkets: 1,
apartments: 1,
zipCode: 1,
city: 1,
state: 1,
stateCode: 1
}

Stage 2 : $match
{
serviceableMarkets: {
$elemMatch: {
providerCode: “GHI”
}
}
}

Due to the projection, my aggregation is skipping index whereas if I switch off the projection stage then my index is working. What changes do I need to do for the index to work?

image

{
“explainVersion”: “1”,
“stages”: [
{
“$cursor”: {
“queryPlanner”: {
“namespace”: “BundleDealer.PrdCtlg.Address”,
“indexFilterSet”: false,
“parsedQuery”: {},
“collation”: {
“locale”: “en”,
“caseLevel”: false,
“caseFirst”: “off”,
“strength”: 2,
“numericOrdering”: false,
“alternate”: “non-ignorable”,
“maxVariable”: “punct”,
“normalization”: false,
“backwards”: false,
“version”: “57.1”
},
“queryHash”: “F862A2A6”,
“planCacheKey”: “F862A2A6”,
“maxIndexedOrSolutionsReached”: false,
“maxIndexedAndSolutionsReached”: false,
“maxScansToExplodeReached”: false,
“winningPlan”: {
“stage”: “PROJECTION_DEFAULT”,
“transformBy”: {
“_id”: “$_id”,
“addressLine”: “$addressLine”,
“serviceableMarkets”: {
“$map”: {
“input”: “$serviceableMarkets”,
“as”: “dtoServiceableMarket”,
“in”: {
“marketId”: “$$dtoServiceableMarket.marketId”,
“providerCode”: “$$dtoServiceableMarket.providerCode”
}
}
},
“apartments”: “$apartments”,
“zipCode”: “$zipCode”,
“city”: “$city”,
“state”: “$state”,
“stateCode”: “$stateCode”
},
“inputStage”: {
“stage”: “COLLSCAN”,
“direction”: “forward”
}
},
“rejectedPlans”:
},
“executionStats”: {
“executionSuccess”: true,
“nReturned”: 12126215,
“executionTimeMillis”: 72553,
“totalKeysExamined”: 0,
“totalDocsExamined”: 12126215,
“executionStages”: {
“stage”: “PROJECTION_DEFAULT”,
“nReturned”: 12126215,
“executionTimeMillisEstimate”: 16906,
“works”: 12126216,
“advanced”: 12126215,
“needTime”: 0,
“needYield”: 0,
“saveState”: 15284,
“restoreState”: 15284,
“isEOF”: 1,
“transformBy”: {
“_id”: “$_id”,
“addressLine”: “$addressLine”,
“serviceableMarkets”: {
“$map”: {
“input”: “$serviceableMarkets”,
“as”: “dtoServiceableMarket”,
“in”: {
“marketId”: “$$dtoServiceableMarket.marketId”,
“providerCode”: “$$dtoServiceableMarket.providerCode”
}
}
},
“apartments”: “$apartments”,
“zipCode”: “$zipCode”,
“city”: “$city”,
“state”: “$state”,
“stateCode”: “$stateCode”
},
“inputStage”: {
“stage”: “COLLSCAN”,
“nReturned”: 12126215,
“executionTimeMillisEstimate”: 1438,
“works”: 12126216,
“advanced”: 12126215,
“needTime”: 0,
“needYield”: 0,
“saveState”: 15284,
“restoreState”: 15284,
“isEOF”: 1,
“direction”: “forward”,
“docsExamined”: 12126215
}
}
}
},
“nReturned”: 12126215,
“executionTimeMillisEstimate”: 63830
},
{
“$match”: {
“serviceableMarkets”: {
“$elemMatch”: {
“providerCode”: {
“$eq”: “SMITHVILLE”
}
}
}
},
“nReturned”: 41520,
“executionTimeMillisEstimate”: 72511
}
],
“serverInfo”: {
“host”: “IBDXANV-DEVRPT1”,
“port”: 51072,
“version”: “6.0.14”,
“gitVersion”: “25225db95574916fecab3af75b184409f8713aef”
},
“serverParameters”: {
“internalQueryFacetBufferSizeBytes”: 104857600,
“internalQueryFacetMaxOutputDocSizeBytes”: 104857600,
“internalLookupStageIntermediateDocumentMaxSizeBytes”: 104857600,
“internalDocumentSourceGroupMaxMemoryBytes”: 104857600,
“internalQueryMaxBlockingSortMemoryUsageBytes”: 104857600,
“internalQueryProhibitBlockingMergeOnMongoS”: 0,
“internalQueryMaxAddToSetBytes”: 104857600,
“internalDocumentSourceSetWindowFieldsMaxMemoryBytes”: 104857600
},
“command”: {
“aggregate”: “PrdCtlg.Address”,
“pipeline”: [
{
“$project”: {
“_id”: “$_id”,
“addressLine”: “$addressLine”,
“serviceableMarkets”: {
“$map”: {
“input”: “$serviceableMarkets”,
“as”: “dtoServiceableMarket”,
“in”: {
“marketId”: “$$dtoServiceableMarket.marketId”,
“providerCode”: “$$dtoServiceableMarket.providerCode”
}
}
},
“apartments”: “$apartments”,
“zipCode”: “$zipCode”,
“city”: “$city”,
“state”: “$state”,
“stateCode”: “$stateCode”
}
},
{
“$match”: {
“serviceableMarkets”: {
“$elemMatch”: {
“providerCode”: “SMITHVILLE”
}
}
}
}
],
“cursor”: {},
“maxTimeMS”: 60000,
“$db”: “BundleDealer”
},
“ok”: 1,
“$clusterTime”: {
“clusterTime”: {
“$timestamp”: “7452783790692237313”
},
“signature”: {
“hash”: “0if33Bi2IeYQhqg7sTLPIgNjWBk=”,
“keyId”: {
“low”: 2,
“high”: 1726729873,
“unsigned”: false
}
}
},
“operationTime”: {
“$timestamp”: “7452783790692237313”
}
}

Please read Formatting code and log snippets in posts and update your post so that we can read, understand and cut-n-paste the code and documents you shared.

As you suspected in

the $project is the culprit. The server cannot assume that the index on the original documents could be use after the $project since it changes the structure of the documents. In your index probably could be used but determining that could be expensive for the server.

If you move the $project after the $match the index will probably be used.

Anyway, if you think about it, index or not, doing a $project on all documents then filtering a subset with a $match is certainly less efficient that filtering first and then altering.

2 Likes

I have provided above details with the formatting now. Please note that I am using Microsoft.AspNetCore.OData 9.1.1 with MongoDB.Driver 2.29.0. And my webapi is designed with CleanArchitecture i.e. I have 3 layers in my project

  1. WebAPI
  2. Application Layer
  3. Repository Layer
    All of these layers have their own data models and data when sent from one layer to the other is being translated using AutoMapper 13.0.1 into their respective models. oData is applied on webapi layer and due the translations on each layer $project is being applied by automapper on IQueryable.

Following is my collation

{ locale: ‘en’, strength: 2}

Following is my Address Document

{
“_id”: {
“$oid”: “66cc60ce52482617a4fbdba7”
},
“pkAddressId”: null,
“addressLine”: “1 A RUSSELL ST”,
“serviceableMarkets”: [
{
“marketId”: {
“$oid”: “66cd95d04d9e9ff7e31da46d”
},
“providerCode”: “ABC”
},
{
“marketId”: {
“$oid”: “66cd95d04d9e9ff7e31da46e”
},
“providerCode”: “DEF”
},
{
“marketId”: {
“$oid”: “66cd95d04d9e9ff7e31da474”
},
“providerCode”: “GHI”
}
],
“apartments”: null,
“zipCode”: “02149”,
“city”: “Everett”,
“state”: “Massachusetts”,
“stateCode”: “MA”
}

Following is my 2 stage aggregation
Stage 1: $project

{
_id: 1,
pkAddressId: 1,
addressLine: 1,
serviceableMarkets: 1,
apartments: 1,
zipCode: 1,
city: 1,
state: 1,
stateCode: 1
}

Stage 2: $match

{
serviceableMarkets: {
$elemMatch: {
providerCode: “GHI”
}
}
}

Following is the explain plan

{
“explainVersion”: “1”,
“stages”: [
{
“$cursor”: {
“queryPlanner”: {
“namespace”: “BundleDealer.PrdCtlg.Address”,
“indexFilterSet”: false,
“parsedQuery”: {},
“collation”: {
“locale”: “en”,
“caseLevel”: false,
“caseFirst”: “off”,
“strength”: 2,
“numericOrdering”: false,
“alternate”: “non-ignorable”,
“maxVariable”: “punct”,
“normalization”: false,
“backwards”: false,
“version”: “57.1”
},
“queryHash”: “F862A2A6”,
“planCacheKey”: “F862A2A6”,
“maxIndexedOrSolutionsReached”: false,
“maxIndexedAndSolutionsReached”: false,
“maxScansToExplodeReached”: false,
“winningPlan”: {
“stage”: “PROJECTION_DEFAULT”,
“transformBy”: {
“_id”: “$_id”,
“addressLine”: “$addressLine”,
“serviceableMarkets”: {
“$map”: {
“input”: “$serviceableMarkets”,
“as”: “dtoServiceableMarket”,
“in”: {
“marketId”: “$dtoServiceableMarket.marketId”,
“providerCode”: “$dtoServiceableMarket.providerCode”
}
}
},
“apartments”: “$apartments”,
“zipCode”: “$zipCode”,
“city”: “$city”,
“state”: “$state”,
“stateCode”: “$stateCode”
},
“inputStage”: {
“stage”: “COLLSCAN”,
“direction”: “forward”
}
},
“rejectedPlans”: 
},
“executionStats”: {
“executionSuccess”: true,
“nReturned”: 12126215,
“executionTimeMillis”: 72553,
“totalKeysExamined”: 0,
“totalDocsExamined”: 12126215,
“executionStages”: {
“stage”: “PROJECTION_DEFAULT”,
“nReturned”: 12126215,
“executionTimeMillisEstimate”: 16906,
“works”: 12126216,
“advanced”: 12126215,
“needTime”: 0,
“needYield”: 0,
“saveState”: 15284,
“restoreState”: 15284,
“isEOF”: 1,
“transformBy”: {
“_id”: “$_id”,
“addressLine”: “$addressLine”,
“serviceableMarkets”: {
“$map”: {
“input”: “$serviceableMarkets”,
“as”: “dtoServiceableMarket”,
“in”: {
“marketId”: “$dtoServiceableMarket.marketId”,
“providerCode”: “$dtoServiceableMarket.providerCode”
}
}
},
“apartments”: “$apartments”,
“zipCode”: “$zipCode”,
“city”: “$city”,
“state”: “$state”,
“stateCode”: “$stateCode”
},
“inputStage”: {
“stage”: “COLLSCAN”,
“nReturned”: 12126215,
“executionTimeMillisEstimate”: 1438,
“works”: 12126216,
“advanced”: 12126215,
“needTime”: 0,
“needYield”: 0,
“saveState”: 15284,
“restoreState”: 15284,
“isEOF”: 1,
“direction”: “forward”,
“docsExamined”: 12126215
}
}
}
},
“nReturned”: 12126215,
“executionTimeMillisEstimate”: 63830
},
{
“$match”: {
“serviceableMarkets”: {
“$elemMatch”: {
“providerCode”: {
“$eq”: “GHI”
}
}
}
},
“nReturned”: 41520,
“executionTimeMillisEstimate”: 72511
}
],
“serverInfo”: {
“host”: “IBDXANV-DEVRPT1”,
“port”: 51072,
“version”: “6.0.14”,
“gitVersion”: “25225db95574916fecab3af75b184409f8713aef”
},
“serverParameters”: {
“internalQueryFacetBufferSizeBytes”: 104857600,
“internalQueryFacetMaxOutputDocSizeBytes”: 104857600,
“internalLookupStageIntermediateDocumentMaxSizeBytes”: 104857600,
“internalDocumentSourceGroupMaxMemoryBytes”: 104857600,
“internalQueryMaxBlockingSortMemoryUsageBytes”: 104857600,
“internalQueryProhibitBlockingMergeOnMongoS”: 0,
“internalQueryMaxAddToSetBytes”: 104857600,
“internalDocumentSourceSetWindowFieldsMaxMemoryBytes”: 104857600
},
“command”: {
“aggregate”: “PrdCtlg.Address”,
“pipeline”: [
{
“$project”: {
“_id”: “$_id”,
“addressLine”: “$addressLine”,
“serviceableMarkets”: {
“$map”: {
“input”: “$serviceableMarkets”,
“as”: “dtoServiceableMarket”,
“in”: {
“marketId”: “$dtoServiceableMarket.marketId”,
“providerCode”: “$dtoServiceableMarket.providerCode”
}
}
},
“apartments”: “$apartments”,
“zipCode”: “$zipCode”,
“city”: “$city”,
“state”: “$state”,
“stateCode”: “$stateCode”
}
},
{
“$match”: {
“serviceableMarkets”: {
“$elemMatch”: {
“providerCode”: “GHI”
}
}
}
}
],
“cursor”: {},
“maxTimeMS”: 60000,
“$db”: “BundleDealer”
},
“ok”: 1,
“$clusterTime”: {
“clusterTime”: {
“$timestamp”: “7452783790692237313”
},
“signature”: {
“hash”: “0if33Bi2IeYQhqg7sTLPIgNjWBk=”,
“keyId”: {
“low”: 2,
“high”: 1726729873,
“unsigned”: false
}
}
},
“operationTime”: {
“$timestamp”: “7452783790692237313”
}
}

One more thing I have witnessed is that if I modify my $match stage by replacing $elemMatch with a filter directly on the field. The index is working along with $project

{
  "serviceableMarkets.providerCode": "GHI"
}

Following is the explain plan

{
  "explainVersion": "1",
  "stages": [
    {
      "$cursor": {
        "queryPlanner": {
          "namespace": "BundleDealer.PrdCtlg.Address",
          "indexFilterSet": false,
          "parsedQuery": {
            "serviceableMarkets.providerCode": {
              "$eq": "GHI"
            }
          },
          "collation": {
            "locale": "en",
            "caseLevel": false,
            "caseFirst": "off",
            "strength": 2,
            "numericOrdering": false,
            "alternate": "non-ignorable",
            "maxVariable": "punct",
            "normalization": false,
            "backwards": false,
            "version": "57.1"
          },
          "queryHash": "D7A189BF",
          "planCacheKey": "323455B5",
          "maxIndexedOrSolutionsReached": false,
          "maxIndexedAndSolutionsReached": false,
          "maxScansToExplodeReached": false,
          "winningPlan": {
            "stage": "PROJECTION_DEFAULT",
            "transformBy": {
              "_id": "$_id",
              "pkAddressId": "$pkAddressId",
              "addressLine": "$addressLine",
              "serviceableMarkets": {
                "$map": {
                  "input": "$serviceableMarkets",
                  "as": "dtoServiceableMarketEntity",
                  "in": {
                    "marketId": "$$dtoServiceableMarketEntity.marketId",
                    "providerCode": "$$dtoServiceableMarketEntity.providerCode"
                  }
                }
              },
              "apartments": "$apartments",
              "zipCode": "$zipCode",
              "city": "$city",
              "state": "$state",
              "stateCode": "$stateCode"
            },
            "inputStage": {
              "stage": "FETCH",
              "inputStage": {
                "stage": "IXSCAN",
                "keyPattern": {
                  "serviceableMarkets.providerCode": 1
                },
                "indexName": "ProviderCode",
                "collation": {
                  "locale": "en",
                  "caseLevel": false,
                  "caseFirst": "off",
                  "strength": 2,
                  "numericOrdering": false,
                  "alternate": "non-ignorable",
                  "maxVariable": "punct",
                  "normalization": false,
                  "backwards": false,
                  "version": "57.1"
                },
                "isMultiKey": true,
                "multiKeyPaths": {
                  "serviceableMarkets.providerCode": [
                    "serviceableMarkets"
                  ]
                },
                "isUnique": false,
                "isSparse": true,
                "isPartial": false,
                "indexVersion": 2,
                "direction": "forward",
                "indexBounds": {
                  "serviceableMarkets.providerCode": [
                    "[CollationKey(0x4d41394f3753393f3f31010e), CollationKey(0x4d41394f3753393f3f31010e)]"
                  ]
                }
              }
            }
          },
          "rejectedPlans": []
        },
        "executionStats": {
          "executionSuccess": true,
          "nReturned": 41520,
          "executionTimeMillis": 400,
          "totalKeysExamined": 41520,
          "totalDocsExamined": 41520,
          "executionStages": {
            "stage": "PROJECTION_DEFAULT",
            "nReturned": 41520,
            "executionTimeMillisEstimate": 81,
            "works": 41521,
            "advanced": 41520,
            "needTime": 0,
            "needYield": 0,
            "saveState": 61,
            "restoreState": 61,
            "isEOF": 1,
            "transformBy": {
              "_id": "$_id",
              "pkAddressId": "$pkAddressId",
              "addressLine": "$addressLine",
              "serviceableMarkets": {
                "$map": {
                  "input": "$serviceableMarkets",
                  "as": "dtoServiceableMarketEntity",
                  "in": {
                    "marketId": "$$dtoServiceableMarketEntity.marketId",
                    "providerCode": "$$dtoServiceableMarketEntity.providerCode"
                  }
                }
              },
              "apartments": "$apartments",
              "zipCode": "$zipCode",
              "city": "$city",
              "state": "$state",
              "stateCode": "$stateCode"
            },
            "inputStage": {
              "stage": "FETCH",
              "nReturned": 41520,
              "executionTimeMillisEstimate": 22,
              "works": 41521,
              "advanced": 41520,
              "needTime": 0,
              "needYield": 0,
              "saveState": 61,
              "restoreState": 61,
              "isEOF": 1,
              "docsExamined": 41520,
              "alreadyHasObj": 0,
              "inputStage": {
                "stage": "IXSCAN",
                "nReturned": 41520,
                "executionTimeMillisEstimate": 20,
                "works": 41521,
                "advanced": 41520,
                "needTime": 0,
                "needYield": 0,
                "saveState": 61,
                "restoreState": 61,
                "isEOF": 1,
                "keyPattern": {
                  "serviceableMarkets.providerCode": 1
                },
                "indexName": "ProviderCode",
                "collation": {
                  "locale": "en",
                  "caseLevel": false,
                  "caseFirst": "off",
                  "strength": 2,
                  "numericOrdering": false,
                  "alternate": "non-ignorable",
                  "maxVariable": "punct",
                  "normalization": false,
                  "backwards": false,
                  "version": "57.1"
                },
                "isMultiKey": true,
                "multiKeyPaths": {
                  "serviceableMarkets.providerCode": [
                    "serviceableMarkets"
                  ]
                },
                "isUnique": false,
                "isSparse": true,
                "isPartial": false,
                "indexVersion": 2,
                "direction": "forward",
                "indexBounds": {
                  "serviceableMarkets.providerCode": [
                    "[CollationKey(0x4d41394f3753393f3f31010e), CollationKey(0x4d41394f3753393f3f31010e)]"
                  ]
                },
                "keysExamined": 41520,
                "seeks": 1,
                "dupsTested": 41520,
                "dupsDropped": 0
              }
            }
          }
        }
      },
      "nReturned": 41520,
      "executionTimeMillisEstimate": 296
    },
    {
      "$project": {
        "_id": "$_id",
        "addressLine": "$addressLine",
        "serviceableMarkets": {
          "$map": {
            "input": "$serviceableMarkets",
            "as": "dtoServiceableMarket",
            "in": {
              "marketId": "$$dtoServiceableMarket.marketId",
              "providerCode": "$$dtoServiceableMarket.providerCode"
            }
          }
        },
        "apartments": "$apartments",
        "zipCode": "$zipCode",
        "city": "$city",
        "state": "$state",
        "stateCode": "$stateCode"
      },
      "nReturned": 41520,
      "executionTimeMillisEstimate": 361
    }
  ],
  "serverInfo": {
    "host": "IBDXANV-DEVRPT1",
    "port": 51072,
    "version": "6.0.14",
    "gitVersion": "25225db95574916fecab3af75b184409f8713aef"
  },
  "serverParameters": {
    "internalQueryFacetBufferSizeBytes": 104857600,
    "internalQueryFacetMaxOutputDocSizeBytes": 104857600,
    "internalLookupStageIntermediateDocumentMaxSizeBytes": 104857600,
    "internalDocumentSourceGroupMaxMemoryBytes": 104857600,
    "internalQueryMaxBlockingSortMemoryUsageBytes": 104857600,
    "internalQueryProhibitBlockingMergeOnMongoS": 0,
    "internalQueryMaxAddToSetBytes": 104857600,
    "internalDocumentSourceSetWindowFieldsMaxMemoryBytes": 104857600
  },
  "command": {
    "aggregate": "PrdCtlg.Address",
    "pipeline": [
      {
        "$project": {
          "_id": "$_id",
          "pkAddressId": "$pkAddressId",
          "addressLine": "$addressLine",
          "serviceableMarkets": {
            "$map": {
              "input": "$serviceableMarkets",
              "as": "dtoServiceableMarketEntity",
              "in": {
                "marketId": "$$dtoServiceableMarketEntity.marketId",
                "providerCode": "$$dtoServiceableMarketEntity.providerCode"
              }
            }
          },
          "apartments": "$apartments",
          "zipCode": "$zipCode",
          "city": "$city",
          "state": "$state",
          "stateCode": "$stateCode"
        }
      },
      {
        "$project": {
          "_id": "$_id",
          "addressLine": "$addressLine",
          "serviceableMarkets": {
            "$map": {
              "input": "$serviceableMarkets",
              "as": "dtoServiceableMarket",
              "in": {
                "marketId": "$$dtoServiceableMarket.marketId",
                "providerCode": "$$dtoServiceableMarket.providerCode"
              }
            }
          },
          "apartments": "$apartments",
          "zipCode": "$zipCode",
          "city": "$city",
          "state": "$state",
          "stateCode": "$stateCode"
        }
      },
      {
        "$match": {
          "serviceableMarkets.providerCode": "ABC"
        }
      }
    ],
    "cursor": {},
    "maxTimeMS": 60000,
    "$db": "BundleDealer"
  },
  "ok": 1,
  "$clusterTime": {
    "clusterTime": {
      "$timestamp": "7454122149746311169"
    },
    "signature": {
      "hash": "2GCki7A3PR/o43JvgibVzxPq2uQ=",
      "keyId": {
        "low": 2,
        "high": 1726729873,
        "unsigned": false
      }
    }
  },
  "operationTime": {
    "$timestamp": "7454122149746311169"
  }
}

I am very much confused at this stage about index behaviour working with field level filter and not working with $elemMatch.

I hope I am able to explain my scenario and situation. Any advise here would be helpful for the resolution.

I think that this is really the key of the issue. When you do

You are querying the array serviceableMarkets, while with

You are querying the indexed field serviceableMarkets.providerCode.

While the server could try to analyze the inside of $elemMatch to see if all the queried fields are indexed, I think that it might be too complex or expensive to do just to accommodate some edge cases like yours.

I think that once you know $elemMatch might force a fetch before deciding if documents match or not, you can decide not to use it when querying a single indexed array field.

1 Like