Reshard a Collection
On this page
- About this Task
- Before you Begin
- Steps
- Start the resharding operation.
- Monitor the resharding operation.
- Finish the resharding operation.
- Block writes early to force resharding to complete
- Abort resharding operation
- Behavior
- Minimum Duration of a Resharding Operation
- Retryable Writes
- Error Cases
- Primary Failovers
- Duplicate
_id
Values
New in version 5.0.
The ideal shard key allows MongoDB to distribute documents evenly throughout the cluster while facilitating common query patterns. A suboptimal shard key can lead to performance or scaling issues due to uneven data distribution. Starting in MongoDB 5.0, you can change the shard key for a collection to change the distribution of your data across a cluster.
Note
Before resharding your collection, read Troubleshoot Shard Keys for information on common performance and scaling issues and advice on how to fix them.
About this Task
Only one collection can be resharded at a time.
writeConcernMajorityJournalDefault
must betrue
.To reshard a collection that has a uniqueness constraint, the new shard key must satisfy the unique index requirements for any existing unique indexes.
The following commands and corresponding shell methods are not supported on the collection that is being resharded while the resharding operation is in progress:
The following commands and methods are not supported on the cluster while the resharding operation is in progress:
Warning
Using any of the preceding commands during a resharding operation causes the resharding operation to fail.
If the collection to be resharded uses Atlas Search, the search index will become unavailable when the resharding operation completes. You need to manually rebuild the search index once the resharding operation completes.
Before you Begin
Before you reshard your collection, ensure that you meet the following requirements:
Your application can tolerate a period of two seconds where the collection that is being resharded blocks writes. During the time period where writes are blocked your application experiences an increase in latency. If your workload cannot tolerate this requirement, consider refining your shard key instead.
Your database meets these resource requirements:
Available storage space: Ensure that the available storage space on each shard the collection will be distributed across is at least twice the size of the collection that you want to reshard and its total index size, divided by the number of shards.
storage_req = ( ( collection_size + index_size ) * 2 ) / shard_count For example, consider a collection that contains 2 TB of data and has a 400 GB index distributed across four shards. To perform a resharding operation on this collection, each shard would require 1.2 TB of available storage.
1.2 TB storage = ( ( 2 TB collection + 0.4 TB index ) * 2 ) / 4 shards To meet storage requirements, you may need to upgrade to the next tier of storage during the resharding operation. You can scale down once the operation completes.
I/O: Ensure that your I/O capacity is below 50%.
CPU load: Ensure that your CPU load is below 80%.
Important
These requirements are not enforced by the database. A failure to allocate enough resources can result in:
the database running out of space and shutting down
decreased performance
the resharding operation taking longer than expected
If your application has time periods with less traffic, reshard your collection during that time if possible.
You must rewrite your application's queries to use both the current shard key and the new shard key.
Tip
If your application can tolerate downtime, you can perform these steps to avoid rewriting your application's queries to use both the current and new shard keys:
Stop your application.
Rewrite your application to use the new shard key.
Wait until resharding completes. To monitor the resharding process, use the
$currentOp
pipeline stage.Deploy your rewritten application.
Before resharding completes, the following queries return an error if the query filter does not include either the current shard key or a unique field (like
_id
):For optimal performance, we recommend that you also rewrite other queries to include the new shard key.
Once the resharding operation completes, you can remove the old shard key from the queries.
No index builds are in progress. Use
db.currentOp()
to check for any running index builds:db.adminCommand( { currentOp: true, $or: [ { op: "command", "command.createIndexes": { $exists: true } }, { op: "none", "msg" : /^Index Build/ } ] } ) In the result document, if the
inprog
field value is an empty array, there are no index builds in progress:{ inprog: [], ok: 1, '$clusterTime': { ... }, operationTime: <timestamp> }
Note
Resharding is a write-intensive process which can generate increased rates of oplog. You may wish to:
set a fixed oplog size to prevent unbounded oplog growth.
increase the oplog size to minimize the chance that one or more secondary nodes becomes stale.
See the Replica Set Oplog documentation for more details.
Steps
Important
We strongly recommend that you check the About this Task and read the Steps section in full before resharding your collection.
In a collection resharding operation, a shard can be a:
donor, which currently stores chunks for the sharded collection.
recipient, which stores new chunks for the sharded collection based on the shard keys and zones.
A shard can be donor and a recipient at the same time. The set of donor shards is identical to the recipient shards, unless you use zones.
The config server primary is always the resharding coordinator and starts each phase of the resharding operation.
Start the resharding operation.
While connected to the mongos
, issue a
reshardCollection
command that specifies the collection
to be resharded and the new shard key:
db.adminCommand({ reshardCollection: "<database>.<collection>", key: <shardkey> })
MongoDB sets the max number of seconds to block writes to two seconds and begins the resharding operation.
Monitor the resharding operation.
To monitor the resharding operation, you can use the
$currentOp
pipeline stage:
db.getSiblingDB("admin").aggregate([ { $currentOp: { allUsers: true, localOps: false } }, { $match: { type: "op", "originatingCommand.reshardCollection": "<database>.<collection>" } } ])
Note
To see updated values, you need to continuously run the preceeding pipeline.
The $currentOp
pipeline outputs:
totalOperationTimeElapsedSecs
: elapsed operation time in secondsremainingOperationTimeEstimatedSecs
: estimated time remaining in seconds for the current resharding operation. It is returned as-1
when a new resharding operation starts.Starting in:
MongoDB 5.0, but before MongoDB 6.1,
remainingOperationTimeEstimatedSecs
is only available on a recipient shard during a resharding operation.MongoDB 6.1,
remainingOperationTimeEstimatedSecs
is also available on the coordinator during a resharding operation.
The resharding operation performs these phases in order:
The clone phase duplicates the current collection data.
The catch-up phase applies any pending write operations to the resharded collection.
remainingOperationTimeEstimatedSecs
is set to a pessimistic time estimate:The catch-up phase time estimate is set to the clone phase time, which is a relatively long time.
In practice, if there are only a few pending write operations, the actual catch-up phase time is relatively short.
[ { shard: '<shard>', type: 'op', desc: 'ReshardingRecipientService | ReshardingDonorService | ReshardingCoordinatorService <reshardingUUID>', op: 'command', ns: '<database>.<collection>', originatingCommand: { reshardCollection: '<database>.<collection>', key: <shardkey>, unique: <boolean>, collation: { locale: 'simple' } }, totalOperationTimeElapsedSecs: <number>, remainingOperationTimeEstimatedSecs: <number>, ... }, ... ]
Finish the resharding operation.
Throughout the resharding process, the estimated time to complete the
resharding operation (remainingOperationTimeEstimatedSecs
)
decreases. When the estimated time is below two seconds, MongoDB
blocks writes and completes the resharding operation. Until the
estimated time to complete the resharing operation is below two
seconds, the resharding operation does not block writes by default.
During the time period where writes are blocked your application
experiences an increase in latency.
Once the resharding process has completed, the resharding command
returns ok: 1
.
{ ok: 1, '$clusterTime': { clusterTime: <timestamp>, signature: { hash: Binary(Buffer.from("0000000000000000000000000000000000000000", "hex"), 0), keyId: <number> } }, operationTime: <timestamp> }
To see whether the resharding operation completed successfully, check
the output of the sh.status()
method:
sh.status()
The sh.status()
method output contains a subsection for the
databases
. If resharding has completed successfully, the output
lists the new shard key for the collection:
databases [ { database: { _id: '<database>', primary: '<shard>', partitioned: true, version: { uuid: <uuid>, timestamp: <timestamp>, lastMod: <number> } }, collections: { '<database>.<collection>': { shardKey: <shardkey>, unique: <boolean>, balancing: <boolean>, chunks: [], tags: [] } } } ... ]
Note
If the resharded collection uses Atlas Search, the search index will become unavailable when the resharding operation completes. You need to manually rebuild the search index once the resharding operation completes.
Block writes early to force resharding to complete
You can manually force the resharding operation to complete by
issuing the commitReshardCollection
command. This is
useful if the current time estimate to complete the resharding
operation is an acceptable duration for your collection to block
writes. The commitReshardCollection
command blocks
writes early and forces the resharding operation to complete. The
command has the following syntax:
db.adminCommand({ commitReshardCollection: "<database>.<collection>" })
Abort resharding operation
You can abort the resharding operation during any stage of the
resharding operation, even after running the
commitReshardCollection
, until shards have fully caught
up.
For example, if remainingOperationTimeEstimatedSecs
does not
decrease, you can abort the resharding operation with the
abortReshardCollection
command:
db.adminCommand({ abortReshardCollection: "<database>.<collection>" })
After canceling the operation, you can retry the resharding operation during a time window with lower write volume. If this is not possible, add more shards before retrying.
Behavior
Minimum Duration of a Resharding Operation
The minimum duration of a resharding operation is always 5 minutes.
Retryable Writes
Retryable writes initiated before or during
resharding can be retried during and after the collection has been
resharded for up to 5 minutes. After 5 minutes you may be unable to find
the definitive result of the write and subsequent attempts to retry the
write fail with an IncompleteTransactionHistory
error.
Error Cases
Primary Failovers
If a primary failover on a replica set shard or config server occurs, the resharding operation aborts.
If a resharding operation aborts due to a primary failover, run the
cleanupReshardCollection
command before starting a new
resharding operation:
db.runCommand({ cleanupReshardCollection: "<database>.<collection>" })
Duplicate _id
Values
The resharding operation fails if _id
values are not globally unique
to avoid corrupting collection data. Duplicate _id
values can also
prevent successful chunk migration. If you have documents with duplicate
_id
values, copy the data from each into a new document, and then
delete the duplicate documents.