Create Selective Indexes to Answer Queries Efficiently
Selectivity is a query property that describes the ratio of documents matching the query versus the total number of documents in a collection. The selectivity of an index describes how many documents a unique index key matches. A query or index has high selectivity when proportionally few documents match a query or a given index key.
Because indexes can have different selectivities depending on the index keys used, ensure that the most selective indexes are available based on the predicates contained in a query. To ensure the most efficient query execution, create indexes that most uniquely match the predicates contained in a query.
Examples
Selectivity with Many Common Values
Consider a collection of documents that have the following form:
{ status: "processed", product_type: "electronics" }
In this example, the status of 99% of documents in the collection is processed. If you add an index on status and query for documents with the status of processed, both the index and the query have low selectivity. However, if you want to query for documents that do not have the status of processed, the index and the query have high selectivity because the query only returns 1% of the documents in a collection.
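The 99% and 1% figures above can be reproduced with a small in-memory sketch. It runs in plain JavaScript and does not touch MongoDB. The 100-document sample, the "pending" value for the one remaining document, and the matchFraction helper are assumptions made for illustration.

```javascript
// Hypothetical sample: 99 documents with status "processed" and 1 other.
// The "pending" value is an assumption; the text only says 99% are "processed".
const sample = [];
for (let i = 0; i < 99; i++) {
  sample.push({ status: "processed", product_type: "electronics" });
}
sample.push({ status: "pending", product_type: "electronics" });

// Fraction of the collection that matches a predicate.
// A smaller fraction means higher selectivity.
function matchFraction(collection, predicate) {
  const matched = collection.filter(predicate).length;
  return matched / collection.length;
}

const processedFraction = matchFraction(sample, (d) => d.status === "processed");
const otherFraction = matchFraction(sample, (d) => d.status !== "processed");

console.log(processedFraction); // 0.99 (low selectivity)
console.log(otherFraction);     // 0.01 (high selectivity)
```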
Selectivity When Values are Distributed
Consider a collection of documents where the status field has three values distributed across the collection:
[
  { _id: ObjectId(), status: "processed", product_type: "electronics" },
  { _id: ObjectId(), status: "processed", product_type: "grocery" },
  { _id: ObjectId(), status: "processed", product_type: "household" },
  { _id: ObjectId(), status: "pending", product_type: "electronics" },
  { _id: ObjectId(), status: "pending", product_type: "grocery" },
  { _id: ObjectId(), status: "pending", product_type: "household" },
  { _id: ObjectId(), status: "new", product_type: "electronics" },
  { _id: ObjectId(), status: "new", product_type: "grocery" },
  { _id: ObjectId(), status: "new", product_type: "household" }
]
If you add an index on status and query for { "status": "pending", "product_type": "electronics" }, MongoDB must read three index keys, retrieve three documents matching that status, and filter those documents further on product_type to return the one matching document. Similarly, a query for { "status": { $in: ["processed", "pending"] }, "product_type": "electronics" } must read six documents to return the two matching documents.
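The document counts above can be checked with a toy model of a single-field index. This is a simplified in-memory sketch, not how MongoDB stores or traverses its indexes. The names statusIndex and findByStatus are hypothetical, and _id is omitted for brevity.

```javascript
// The nine documents from the example above (_id omitted).
const statusValues = ["processed", "pending", "new"];
const typeValues = ["electronics", "grocery", "household"];
const threeStatusDocs = statusValues.flatMap((status) =>
  typeValues.map((product_type) => ({ status, product_type }))
);

// Toy single-field index: maps each status value to the positions of
// the documents that hold it. A simplification, not MongoDB's B-tree.
const statusIndex = new Map();
threeStatusDocs.forEach((doc, pos) => {
  if (!statusIndex.has(doc.status)) statusIndex.set(doc.status, []);
  statusIndex.get(doc.status).push(pos);
});

// Look up candidate documents by status, then filter on product_type.
function findByStatus(wantedStatuses, productType) {
  let docsExamined = 0;
  let returned = 0;
  for (const value of wantedStatuses) {
    for (const pos of statusIndex.get(value) ?? []) {
      docsExamined++; // each candidate must be fetched to check product_type
      if (threeStatusDocs[pos].product_type === productType) returned++;
    }
  }
  return { docsExamined, returned };
}

const equalityResult = findByStatus(["pending"], "electronics");
const inResult = findByStatus(["processed", "pending"], "electronics");

console.log(equalityResult); // { docsExamined: 3, returned: 1 }
console.log(inResult);       // { docsExamined: 6, returned: 2 }
```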
Consider a collection of documents where the status field has nine values distributed across the collection:
[
  { _id: ObjectId(), status: 1, product_type: "electronics" },
  { _id: ObjectId(), status: 2, product_type: "grocery" },
  { _id: ObjectId(), status: 3, product_type: "household" },
  { _id: ObjectId(), status: 4, product_type: "electronics" },
  { _id: ObjectId(), status: 5, product_type: "grocery" },
  { _id: ObjectId(), status: 6, product_type: "household" },
  { _id: ObjectId(), status: 7, product_type: "electronics" },
  { _id: ObjectId(), status: 8, product_type: "grocery" },
  { _id: ObjectId(), status: 9, product_type: "household" }
]
If you query for { "status": 2, "product_type": "grocery" }, MongoDB only reads one document matching the index key, indicating the index is highly selective. By using this index, you can receive a query response more efficiently, since MongoDB must only further filter one document matching the index value. In this case, the filter also matches, and the query only returns one document.
Although this example's query on status equality is more selective, a query such as { "status": { $gt: 5 }, "product_type": "grocery" } still needs to read four documents if you use the same index on status. However, if you create a compound index on product_type and status, MongoDB can more efficiently answer a query for { "status": { $gt: 5 }, "product_type": "grocery" } via the compound index, as the query returns only one matching document.
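The difference between the two plans can be sketched in memory. The compound index is modeled here as a list of keys sorted by product_type and then status. That is an illustrative simplification rather than MongoDB's internal structure, and the function names are hypothetical.

```javascript
// Nine documents with status 1..9 and product_type cycling as in the example.
const productTypes = ["electronics", "grocery", "household"];
const nineStatusDocs = Array.from({ length: 9 }, (_, i) => ({
  status: i + 1,
  product_type: productTypes[i % 3],
}));

// Plan A: index on status only. Every document with status > minStatus
// is a candidate and must be fetched before product_type can be checked.
function viaStatusIndex(minStatus, productType) {
  const candidates = nineStatusDocs.filter((d) => d.status > minStatus);
  const matches = candidates.filter((d) => d.product_type === productType);
  return { docsExamined: candidates.length, returned: matches.length };
}

// Plan B: toy compound index on (product_type, status). Both predicates
// are resolved from the index keys, so only matching documents are fetched.
const compoundKeys = nineStatusDocs
  .map((d, pos) => ({ product_type: d.product_type, status: d.status, pos }))
  .sort(
    (a, b) =>
      a.product_type.localeCompare(b.product_type) || a.status - b.status
  );

function viaCompoundIndex(minStatus, productType) {
  const keys = compoundKeys.filter(
    (k) => k.product_type === productType && k.status > minStatus
  );
  const fetched = keys.map((k) => nineStatusDocs[k.pos]);
  return { docsExamined: fetched.length, returned: fetched.length };
}

const statusOnly = viaStatusIndex(5, "grocery");
const compound = viaCompoundIndex(5, "grocery");

console.log(statusOnly); // { docsExamined: 4, returned: 1 }
console.log(compound);   // { docsExamined: 1, returned: 1 }
```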
To improve query performance, you can create a compound index that narrows the documents that queries read. For example, if you want to improve performance for queries on status and product_type, you could create a compound index on those two fields.
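As a rough sketch of what that might look like in mongosh, the helper below creates the compound index and reads back the execution statistics for the range query from the earlier example. The function name measureCompoundIndex and the collection name products are hypothetical. The sketch assumes mongosh, where these calls return results directly; with the Node.js driver they return promises instead.

```javascript
// Sketch for mongosh. `collection` is any collection handle, for example
// db.products (a hypothetical collection name, not taken from the text).
function measureCompoundIndex(collection) {
  // Compound index on the two fields used by the query.
  collection.createIndex({ product_type: 1, status: 1 });

  // Ask the server how much work the query needed.
  const stats = collection
    .find({ status: { $gt: 5 }, product_type: "grocery" })
    .explain("executionStats").executionStats;

  return {
    keysExamined: stats.totalKeysExamined,
    docsExamined: stats.totalDocsExamined,
    returned: stats.nReturned,
  };
}

// Usage in mongosh (not run here):
//   measureCompoundIndex(db.products)
```

The closer docsExamined is to returned, the less filtering work remains after the index lookup.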
If MongoDB reads a relatively large number of documents to return results, some queries may perform faster without indexes. To determine performance, see Measure Index Use.