Docs Menu
Docs Home
/
MongoDB Manual
/ / /

sh.shardAndDistributeCollection()

On this page

  • Definition
  • Parameters
  • Considerations
  • Examples
  • Learn More
sh.shardAndDistributeCollection(namespace, key, unique, options)

Shards a collection and immediately redistributes the data using the provided shard key. The immediate redistribution of data allows for faster data movement and reduced impact to workloads.

Important

mongosh Method

This page documents a mongosh method. This is not the documentation for a language-specific driver, such as Node.js.

For MongoDB API drivers, refer to the language-specific MongoDB driver documentation.

Running sh.shardAndDistributeCollection() in mongosh has the same result as consecutively running the shardCollection and reshardCollection commands.

sh.shardAndDistributeCollection() takes the following parameters:

Parameter
Type
Necessity
Description
namespace
String
Required
The namespace of the collection to shard in the format "<database>.<collection>".
key
Document
Required

The document that specifies the field or fields to use as the shard key.

{ <field1>: <1|"hashed">, ... }

Set the field value to either:

  • 1 for range-based sharding

  • "hashed" to specify a hashed shard key.

    Shard keys must be supported by an index. The index must exist before you run the shardAndDistributeCollection() method.

See also: Shard Key Indexes

unique
Boolean
Optional

Specify true to ensure that the underlying index enforces a unique constraint. Defaults to false.

When using hashed shard keys, you can't specify true.

options
Document
Optional
A document containing optional fields, including numInitialChunks and collation.

The options argument supports the following options:

Parameter
Type
Description
numInitialChunks
Integer
Specifies the initial number of chunks to create across all shards in the cluster when sharding and resharding a collection. MongoDB then creates and balances chunks across the cluster. The numInitialChunks parameter must result in less than 8192 per shard. Defaults to 1000 chunks per shard.
collation
Document
If the collection specified to shardAndDistributeCollection() has a default collation, you must include a collation document with { locale : "simple" }, or the shardAndDistributeCollection() method fails.
presplitHashedZones
Boolean

Specify true to perform initial chunk creation and distribution for an empty or non-existing collection based on the defined zones and zone ranges for the collection. For hashed sharding only.

shardAndDistributeCollection() with presplitHashedZones: true returns an error if any of the following are true:

timeseries
Document

Specify this option to create a new sharded time series collection.

To shard an existing time series collection, omit this parameter.

When the collection specified to shardAndDistributeCollection is a time series collection and the timeseries option is not specified, MongoDB uses the values that define the existing time series collection to populate the timeseries field.

For detailed syntax, see Time Series Options.

The following factors can impact performance or the distribution of your data.

Although you can change your shard key later, carefully consider your shard key choice to optimize scalability and perfomance.

When sharding time series collections, you can only specify the following fields in the shard key:

  • The metaField

  • Sub-fields of metaField

  • The timeField

You may specify combinations of these fields in the shard key. No other fields, including _id, are allowed in the shard key pattern.

When you specify the shard key:

Tip

Avoid specifying only the timeField as the shard key. Since the timeField increases monotonically, it may result in all writes appearing on a single chunk within the cluster. Ideally, data is evenly distributed across chunks.

To learn how to best choose a shard key, see:

Hashed shard keys use a hashed index or a compound hashed index as the shard key.

To specify a hashed shard key field, use field: "hashed" .

Note

If chunk migrations are in progress while creating a hashed shard key collection, the initial chunk distribution may be uneven until the balancer automatically balances the collection.

Tip

See also:

The shard collection operation (i.e. shardCollection command and the sh.shardCollection() helper) can perform initial chunk creation and distribution for an empty or a non-existing collection if zones and zone ranges have been defined for the collection. Initial chunk distribution allows for a faster setup of zoned sharding. After the initial distribution, the balancer manages the chunk distribution going forward per usual.

For an example, see Pre-Define Zones and Zone Ranges for an Empty or Non-Existing Collection. If sharding a collection using a ranged or single-field hashed shard key, the numInitialChunks option has no effect if zones and zone ranges have been defined for the empty collection.

To shard a collection using a compound hashed index, see Initial Chunk Distribution with Compound Hashed Indexes.

MongoDB supports sharding collections on compound hashed indexes. When sharding an empty or non-existing collection using a compound hashed shard key, additional requirements apply in order for MongoDB to perform initial chunk creation and distribution.

The numInitialChunks option has no effect if zones and zone ranges have been defined for the empty collection and presplitHashedZones is false.

For an example, see Pre-Define Zones and Zone Ranges for an Empty or Non-Existing Collection.

Tip

See also:

If you specify unique: true, you must create the index before using sh.shardAndDistributeCollection().

Although you can have a unique compound index where the shard key is a prefix, if you use the unique parameter, the collection must have a unique index that is on the shard key.

If the collection has a default collation, the sh.shardAndDistributeCollection command must include a collation parameter with the value { locale: "simple" }. For non-empty collections with a default collation, you must have at least one index with the simple collation whose fields support the shard key pattern.

You do not need to specify the collation option for collections without a collation. If you do specify the collation option for a collection with no collation, it will have no effect.

mongos uses "majority" for the write concern of the shardCollection command, its helper sh.shardCollection(), and the sh.shardAndDistributeCollection() method.

The following examples show how you can use the sh.shardAndDistributeCollection() method with or without optional parameters.

A database named records contains a collection named people. The following command shards the collection by the zipcode field and immediately redistributes the data in the records.people collection:

sh.shardAndDistributeCollection("records.people", { zipcode: 1 } )

The phonebook database has a contacts collection with no default collation. The following example uses sh.shardAndDistributeCollection() to shard and redistribute the phonebook.contacts collection with:

sh.shardAndDistributeCollection(
"phonebook.contacts",
{ last_name: "hashed" },
false,
{
numInitialChunks: 5,
collation: { locale: "simple" }
}
)

Back

sh.setBalancerState