Shard Keys
The shard key is either a single indexed field or multiple fields covered by a compound index that determines the distribution of the collection's documents among the cluster's shards.
MongoDB divides the span of shard key values (or hashed shard key values) into non-overlapping ranges. Each range is associated with a chunk, and MongoDB attempts to distribute chunks evenly among the shards in the cluster.
The shard key has a direct relationship to the effectiveness of chunk distribution. See Choose a Shard Key.
Shard Key Indexes
All sharded collections must have an index that supports the shard key. The index can be an index on the shard key or a compound index where the shard key is a prefix of the index.
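The prefix rule can be illustrated with a small helper. This is only a sketch in plain JavaScript: the function name indexSupportsShardKey is hypothetical, and the strict comparison of field order and of each field's value (1, -1, "hashed") is an assumption of the sketch, not a statement of how MongoDB performs the check internally.

```javascript
// Hypothetical helper: returns true when `shardKey` is a prefix of `indexKey`,
// that is, the same fields in the same order with the same direction or type.
// Assumes ordinary field names, so Object.entries preserves insertion order.
function indexSupportsShardKey(indexKey, shardKey) {
  const indexEntries = Object.entries(indexKey);
  const shardEntries = Object.entries(shardKey);
  if (shardEntries.length === 0 || shardEntries.length > indexEntries.length) {
    return false;
  }
  return shardEntries.every(([field, spec], i) => {
    const [indexField, indexSpec] = indexEntries[i];
    return field === indexField && spec === indexSpec;
  });
}

console.log(indexSupportsShardKey({ x: 1, y: 1 }, { x: 1, y: 1 }));       // true
console.log(indexSupportsShardKey({ x: 1, y: 1, z: 1 }, { x: 1, y: 1 })); // true
console.log(indexSupportsShardKey({ y: 1, x: 1 }, { x: 1 }));             // false
```

In the last call, x appears in the index but not as its leading field, so the index does not have the shard key as a prefix.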
If the collection is empty, sh.shardCollection() creates the index on the shard key if such an index does not already exist.
If the collection is not empty, you must create the index before using sh.shardCollection().
You cannot drop or hide an index if it is the only non-hidden index that supports the shard key.
Unique Indexes
MongoDB can enforce a uniqueness constraint on a ranged shard key index. Through the use of a unique index on the shard key, MongoDB enforces uniqueness on the entire key combination and not individual components of the shard key.
For a ranged sharded collection, only the following indexes can be unique:
the index on the shard key
a compound index where the shard key is a prefix
the default _id index; however, the _id index only enforces the uniqueness constraint per shard if the _id field is not the shard key or the prefix of the shard key.
Important
Uniqueness and the _id Index
If the _id field is not the shard key or the prefix of the shard key, the _id index only enforces the uniqueness constraint per shard and not across shards.
For example, consider a sharded collection (with shard key {x: 1}) that spans two shards A and B. Because the _id key is not part of the shard key, the collection could have a document with _id value 1 in shard A and another document with _id value 1 in shard B.
If the _id field is neither the shard key nor the prefix of the shard key, MongoDB expects applications to enforce the uniqueness of the _id values across the shards.
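The two-shard scenario can be modeled in a few lines of plain JavaScript. This is a toy model, not MongoDB code: the ToyShard class, the insertDoc function, and the routing rule (x below 100 goes to shard A) are all hypothetical and exist only to show that a per-shard uniqueness check cannot see a duplicate _id held by another shard.

```javascript
// Toy model: each shard tracks only its own _id values,
// in the way a per-shard unique _id index would.
class ToyShard {
  constructor(name) {
    this.name = name;
    this.ids = new Set();
  }
  insert(doc) {
    if (this.ids.has(doc._id)) {
      throw new Error(`duplicate _id ${doc._id} on shard ${this.name}`);
    }
    this.ids.add(doc._id);
  }
}

const shardA = new ToyShard("A");
const shardB = new ToyShard("B");

// Hypothetical routing on the shard key { x: 1 }: x < 100 goes to A, the rest to B.
function insertDoc(doc) {
  const shard = doc.x < 100 ? shardA : shardB;
  shard.insert(doc);
  return shard.name;
}

console.log(insertDoc({ _id: 1, x: 5 }));   // prints A
console.log(insertDoc({ _id: 1, x: 500 })); // prints B (same _id, no error)
```

A third insert such as insertDoc({ _id: 1, x: 7 }) would throw, because it is routed to shard A, which already holds that _id.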
The unique index constraints mean that:
For a to-be-sharded collection, you cannot shard the collection if the collection has other unique indexes.
For an already-sharded collection, you cannot create unique indexes on other fields.
A unique index stores a null value for a document missing the indexed field; that is, a missing indexed field is treated as another instance of a null index key value. For more information, see Unique Index and Missing Field.
To enforce uniqueness on the shard key values, pass the unique parameter as true to the sh.shardCollection() method:
If the collection is empty, sh.shardCollection() creates the unique index on the shard key if such an index does not already exist.
If the collection is not empty, you must create the index before using sh.shardCollection().
Although you can have a unique compound index where the shard key is a prefix, if you use the unique parameter, the collection must have a unique index on the shard key itself.
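The sequence for enforcing a unique shard key might look like the following sketch. The helper name shardWithUniqueKey and the example namespace records.users with shard key { email: 1 } are hypothetical. The mongosh globals db and sh are passed in as arguments, which is a choice made here so that the helper can also be exercised with stand-in objects.

```javascript
// Hypothetical helper. `db` and `sh` are the mongosh globals.
function shardWithUniqueKey(db, sh, dbName, collName, shardKey) {
  const coll = db.getSiblingDB(dbName).getCollection(collName);

  // A non-empty collection needs the unique index on the shard key
  // before it can be sharded; an empty one gets it from sh.shardCollection().
  if (coll.countDocuments({}) > 0) {
    coll.createIndex(shardKey, { unique: true });
  }

  // The third argument of sh.shardCollection() is the `unique` flag.
  return sh.shardCollection(`${dbName}.${collName}`, shardKey, true);
}

// Usage in mongosh, connected to a mongos (assumed namespace and key):
// shardWithUniqueKey(db, sh, "records", "users", { email: 1 });
```

The sketch assumes that sharding is already enabled for the database where the MongoDB version requires it, and that no other unique indexes exist on the collection.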
You cannot specify a unique constraint on a hashed index.
Missing Shard Key Fields
Starting in MongoDB 4.4, documents in sharded collections can be missing the shard key fields. To set missing shard key fields, see Set Missing Shard Key Fields.
Chunk Range and Missing Shard Key Fields
Missing shard key fields fall within the same chunk range as shard keys with null values. For example, if the shard key is on the fields { x: 1, y: 1 }, then:
Document Missing Shard Key | Falls into Same Range As |
---|---|
{ x: "hello" } | { x: "hello", y: null } |
{ y: "goodbye" } | { x: null, y: "goodbye" } |
{ z: "oops" } | { x: null, y: null } |
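The mapping in the table amounts to filling each absent shard key field with null. The following plain JavaScript sketch reproduces it; the function name rangeKeyFor is hypothetical, and the code only models the rule stated above rather than MongoDB's routing logic.

```javascript
// Hypothetical helper: builds the shard key value used for range placement,
// treating every missing shard key field as null.
function rangeKeyFor(doc, shardKeyFields) {
  const key = {};
  for (const field of shardKeyFields) {
    const present = Object.prototype.hasOwnProperty.call(doc, field);
    key[field] = present ? doc[field] : null;
  }
  return key;
}

const shardKeyFields = ["x", "y"];
console.log(JSON.stringify(rangeKeyFor({ x: "hello" }, shardKeyFields)));   // {"x":"hello","y":null}
console.log(JSON.stringify(rangeKeyFor({ y: "goodbye" }, shardKeyFields))); // {"x":null,"y":"goodbye"}
console.log(JSON.stringify(rangeKeyFor({ z: "oops" }, shardKeyFields)));    // {"x":null,"y":null}
```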
Read/Write Operations and Missing Shard Key Fields
To target documents with missing shard key fields, you can use the { $exists: false } filter condition on the shard key fields. For example, if the shard key is on the fields { x: 1, y: 1 }, you can find the documents with missing shard key fields by running this query:
db.shardedcollection.find( { $or: [ { x: { $exists: false } }, { y: { $exists: false } } ] } )
If you specify a null equality match filter condition (e.g. { x: null }), the filter matches both those documents with missing shard key fields and those with shard key fields set to null.
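The difference between the two filters can be modeled on an in-memory array. This is a plain JavaScript sketch, not a query run against MongoDB: the predicates isMissing and matchesNullEquality and the sample documents are hypothetical stand-ins for { $exists: false } and { x: null } respectively.

```javascript
// Stand-in for { <field>: { $exists: false } }: the field is absent.
const isMissing = (doc, field) =>
  !Object.prototype.hasOwnProperty.call(doc, field);

// Stand-in for { <field>: null }: the field is absent or explicitly null.
const matchesNullEquality = (doc, field) =>
  isMissing(doc, field) || doc[field] === null;

const sampleDocs = [
  { _id: 1, x: "hello", y: 1 },
  { _id: 2, x: null, y: 2 },
  { _id: 3, y: 3 },
];

const matchingIds = (predicate) =>
  sampleDocs.filter(predicate).map((doc) => doc._id);

console.log(matchingIds((doc) => isMissing(doc, "x")));           // [ 3 ]
console.log(matchingIds((doc) => matchesNullEquality(doc, "x"))); // [ 2, 3 ]
```

Only the document with _id 3 is actually missing x; the null equality match also returns the document whose x is explicitly null.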
Some write operations, such as a write with an upsert specification, require an equality match on the shard key. In those cases, to target a document that is missing the shard key, include another filter condition in addition to the null equality match.
For example:
{ _id: <value>, <shardkeyfield>: null } // _id of the document missing shard key