쿼리에 효율적으로 응답하기 위한 선택적 인덱스 생성

이 페이지의 내용

예시

Selectivity is a query property that describes the ratio of documents matching the query versus the total number of documents in a collection. The selectivity of an index describes how many documents a unique index key matches. A query or index has high selectivity when proportionally few documents match a query or a given index key.

Because indexes can have different selectivities depending on the index keys used, ensure that the most selective indexes are available based on the predicates contained in a query. To ensure the most efficient query execution, create indexes that most uniquely match the predicates contained in a query.

예시

많은 공통 값을 지닌 선택성

다음과 같은 형식의 문서 컬렉션을 고려합니다.

{
   status: "processed",
   product_type: "electronics"
}

In this example, the status of 99% of documents in the collection is processed. If you add an index on status and query for documents with the status of processed, both the index and the query have low selectivity. However, if you want to query for documents that do not have the status of processed, the index and the query have high selectivity because the query only returns 1% of the documents in a collection.

Selectivity When Values are Distributed

status 필드에 다음과 같은 세 개의 값이 컬렉션 전체에 분산되어 있는 문서 컬렉션을 예로 들어 보겠습니다.

[
   { _id: ObjectId(), status: "processed", product_type: "electronics" },
   { _id: ObjectId(), status: "processed", product_type: "grocery" },
   { _id: ObjectId(), status: "processed", product_type: "household" },
   { _id: ObjectId(), status: "pending", product_type: "electronics" },
   { _id: ObjectId(), status: "pending", product_type: "grocery" },
   { _id: ObjectId(), status: "pending", product_type: "household" },
   { _id: ObjectId(), status: "new", product_type: "electronics" },
   { _id: ObjectId(), status: "new", product_type: "grocery" },
   { _id: ObjectId(), status: "new", product_type: "household" }
]

If you add an index on status and query for { "status": "pending", "product_type": "electronics" }, MongoDB must read three index keys, retrieve three documents matching that status, and filter those documents further on product_type to return the one matching document. Similarly, a query for { "status": {$in: ["processed", "pending"] }, "product_type" : "electronics" } must read six documents to return the two matching documents.

status에 다음과 같은 9개의 값이 컬렉션 전체에 분산되어 있는 문서 컬렉션을 예로 들어 보겠습니다.

[
   { _id: ObjectId(), status: 1, product_type: "electronics" },
   { _id: ObjectId(), status: 2, product_type: "grocery" },
   { _id: ObjectId(), status: 3, product_type: "household"},
   { _id: ObjectId(), status: 4, product_type: "electronics"  },
   { _id: ObjectId(), status: 5, product_type: "grocery"},
   { _id: ObjectId(), status: 6, product_type: "household"},
   { _id: ObjectId(), status: 7, product_type: "electronics"  },
   { _id: ObjectId(), status: 8, product_type: "grocery" },
   { _id: ObjectId(), status: 9, product_type: "household" }
]

If you query for { "status": 2, "product_type": "grocery" }, MongoDB only reads one document matching the index key, indicating the index is highly selective. By using this index, you can receive a query response more efficiently, since MongoDB must only further filter one document matching the index value. In this case, the filter also matches, and the query only returns one document.

Although this example's query on status equality is more selective, a query such as { "status": { $gt: 5 }, "product_type": "grocery" } still needs to read four documents if you use the same index on status. However, if you create a compound index on product_type and status, MongoDB can more efficiently answer a query for {"status": { $gt: 5 }, "product_type": "grocery" } via the compound index, as the query returns only one matching document.

To improve query performance, you can create a 복합 인덱스 that narrows the documents that queries read. For example, if you want to improve performance for queries on status and product_type, you could create a compound index on those two fields.

If MongoDB reads a relatively large number of documents to return results, some queries may perform faster without indexes. To determine performance, see 측정값 인덱스 사용.

돌아가기

쿼리 결과 정렬

참조