Review Index Ranking¶

On this page

How Performance Advisor Suggests and Ranks Indexes
Index Field Order
Limiting Proposed Indexes
Example: New York City Taxi Data

The indexes suggested by the Performance Advisor are ordered by their respective Impact scores. Impact indicates the estimated performance improvement that the suggested index would bring.

How Performance Advisor Suggests and Ranks Indexes¶

The Performance Advisor monitors queries that take longer than 100 milliseconds to execute and groups these queries into common query shapes. The Performance Advisor calculates the inefficiency of each query shape by considering the following aggregated metrics from queries which match the shape:

Amount of time spent executing the query.
Number of documents scanned.
Number of documents returned.

To establish recommended indexes, the Performance Advisor uses these metrics in a formula to calculate the Impact, or performance improvement that creating an index matching that query shape would cause. The Performance Advisor compares the amount of time spent executing index-specific operations to the total operational latency in the deployment. When the Performance Advisor suggests indexes, the indexes are ranked by their Impact score.

Index Field Order¶

The type of query operation in the query shape affects the order of the fields used to construct the index. In general, fields are ranked by their cardinality.

The following table shows how the Performance Advisor ranks various operation types by order of relative importance:

Rank	Operation Type	Example Operator
1	Equality match	`$eq`
2	Array query	`$in`
3	Range query	`$gte`
4	Type query	`$type`
5	Exists	`$exists`
6	All other operators	`$nearSphere`
7	Sort	sort()

Limiting Proposed Indexes¶

The Performance Advisor does not suggest indexes which:

Have more than 16 fields, and/or
Contain _id as a field key.

Additionally, the Performance Advisor only suggests the index if:

For impacted queries, the difference between scanned documents and returned documents is greater than 500, and
At least 60 seconds cumulatively were spent executing impacted queries over the past 24 hours.

Index De-Duplication¶

The Performance Advisor de-duplicates overlapping indexes before making suggestions. For example, consider if the Performance Advisor calculates the following potential suggested indexes:

{ a : 1 }
{ a : 1, b : 1 }

Since { a : 1 } is a prefix of { a : 1, b : 1 }, Performance Advisor only suggests { a : 1, b : 1 }. For more information on index prefixes, see Prefixes.

Example: New York City Taxi Data¶

This example uses a database named cab-db containing information about New York City taxi rides, with fields for the times of pickup and dropoff, ride distance, and a breakdown of ride costs. A typical document in the collection yellow looks like this:

{
     "_id" : ObjectId("5db9daab0b2a17b7706cd6a3"),
     "pickup_datetime" : "2014-06-30 02:09:23",
     "dropoff_datetime" : "2014-06-30 02:20:36",
     "passenger_count" : 2,
     "trip_distance" : 3,
     "fare_amount" : 12,
     "tip_amount" : 2.6,
     "total_amount" : 15.6
}

The collection contains more than 10 million documents, so an application which needs to run queries based on specific field data is going to generate some very inefficient operations unless the collection is properly indexed.

Typical queries for this application search for documents which contain a specific dropoff time, combined with one or more other fields. For example:

db.yellow.find({ "dropoff_datetime": "2014-06-19 21:45:00",
                 "passenger_count": 1,
                 "trip_distance": {"$gt": 3 }
               })

The Performance Advisor recommends the following indexes to improve performance:

Note

By default, the Performance Advisor shows index recommendations for all collections in your cluster. To narrow the recommendations down to a specific collection, select one from the Collection dropdown menu.

The recommended indexes are listed in order of performance impact, from greatest to least. In this example, the first recommended index is estimated to decrease overall cluster operation latency time by 50%.

The first recommendation is for an index on three fields:

passenger_count
dropoff_datetime
trip_distance

Click the Create Index button to create the index with any desired additional options. To learn more about creating an index in the Performance Advisor, see Create Suggested Indexes.

Creating this index improves performance by removing the need for the database engine to scan the entire collection to find documents which match the query. Queries with the shape shown in the example return results in 50 milliseconds or less on the indexed collection, as opposed to several seconds on the unindexed collection.

Note

You can also create indexes with the Data Explorer.

← Configure the Slow Query Threshold Review Drop Index Recommendations →