Create Indexes to Ensure Query Selectivity
On this page
Selectivity is the ability of a query to narrow results using indexes. Effective queries are more selective and allow MongoDB to use indexes for a larger portion of the work associated with fulfilling the query.
To ensure selectivity, write queries that limit the number of possible documents with the indexed field or fields. Write queries that are appropriately selective relative to your indexed data.
Examples
Selectivity with Many Common Values
Consider a collection of documents that have the following form:
{ status: "processed", product_type: "electronics" }
In this example, the status
of 99% of documents in the collection is
processed
. If you add an index on status
and query for documents
with the status
of processed
, the index has low selectivity with
this query. However, if you want to query for documents that do not
have the status
of processed
, this index has high selectivity
because the query only reads 1% of the index.
Selectivity with Distributed Values
Consider a collection of documents where the status
field has three
values distributed across the collection:
[ { _id: ObjectId(), "status": "processed", "product_type": "electronics" }, { _id: ObjectId(), "status": "processed", "product_type": "grocery" }, { _id: ObjectId(), "status": "processed", "product_type": "household" }, { _id: ObjectId(), "status": "pending", "product_type": "electronics" }, { _id: ObjectId(), "status": "pending", "product_type": "grocery" }, { _id: ObjectId(), "status": "pending", "product_type": "household" }, { _id: ObjectId(), "status": "new", "product_type": "electronics" }, { _id: ObjectId(), "status": "new", "product_type": "grocery" }, { _id: ObjectId(), "status": "new", "product_type": "household" } ]
If you add an index on status
and query for { "status": "pending",
"product_type": "electronics" }
, MongoDB must read three index keys to
return the one matching result. Similarly, a query for { "status": {
$in: ["processed", "pending"] }, "product_type" : "electronics" }
must
read six documents to return the two matching documents.
Consider the same index on a collection where status
has nine
values distributed across the collection:
[ { _id: ObjectId(), "status": 1, "product_type": "electronics" }, { _id: ObjectId(), "status": 2, "product_type": "grocery" }, { _id: ObjectId(), "status": 3, "product_type": "household"}, { _id: ObjectId(), "status": 4, "product_type": "electronics" }, { _id: ObjectId(), "status": 5, "product_type": "grocery"}, { _id: ObjectId(), "status": 6, "product_type": "household"}, { _id: ObjectId(), "status": 7, "product_type": "electronics" }, { _id: ObjectId(), "status": 8, "product_type": "grocery" }, { _id: ObjectId(), "status": 9, "product_type": "household" } ]
If you query for { "status": 2, "product_type": "grocery" }
, MongoDB
only reads one document to fulfill the query. The index and query are
more selective because there is only one matching document and the query
can select that specific document using the index.
Although this example's query on status
equality is more selective,
a query such as { "status": { $gt: 5 }, "product_type": "grocery" }
would still need to read four documents. However, if you create a
compound index on product_type
and status
, a query for {
"status": { $gt: 5 }, "product_type": "grocery" }
would only need to
read two documents.
To improve selectivity, you can create a compound index that narrows the documents that queries read. For
example, if you want to improve selectivity for queries on status
and product_type
, you could create a compound index on those two
fields.
If MongoDB reads a high number of documents to return results, some queries may perform faster without indexes. To determine performance, see Measure Index Use.