Back Up Sharded Clusters with Database Dumps
Starting in MongoDB 7.1, you can back up data on sharded clusters using mongodump.
About this Task
mongodump is a utility that creates a binary export of database content. You can use the mongodump utility to take self-managed backups of a sharded cluster.
To take a consistent backup of a sharded cluster with mongodump, you must first stop the balancer, stop writes, and stop any DDL operations on the cluster. This ensures that the cluster remains in a consistent state for the duration of the backup.
MongoDB provides backup and restore operations that can run with the balancer and running transactions through the following services:
Before you Begin
This task uses mongodump to back up a sharded cluster. Ensure that you have a running cluster that contains data in sharded collections.
Admin Privileges
To perform these tasks, your user must have the fsync authorization, which allows the user to run the fsync and fsyncUnlock commands.
Steps
To take a self-managed backup of a sharded cluster, complete the following steps:
Find a Backup Window
To choose a backup window, monitor your application and database usage for a period when chunk migrations, resharding, and DDL operations are unlikely to occur, because these operations can cause an inconsistent backup.
For more information, see Schedule Backup Window for Sharded Clusters.
Stop the Balancer
To prevent chunk migrations from disrupting the backup, use the sh.stopBalancer() method to stop the balancer:
sh.stopBalancer()
If a balancing round is currently in progress, the operation waits for balancing to complete.
To confirm that the balancer is stopped, use the sh.getBalancerState() method:
sh.getBalancerState()
false
The command returns false when the balancer is stopped.
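The stop-and-confirm sequence above can be wrapped in a small helper for use in a mongosh script. This is a minimal sketch, not part of MongoDB's tooling: the function name stopBalancerAndVerify is hypothetical, and the sh helper is passed in as a parameter so that the function can also be exercised with a stand-in object.

```javascript
// Hypothetical helper: stop the balancer, then confirm that it is disabled.
// `shell` is expected to be the mongosh `sh` object (passed in so the
// function can be tested with a stand-in).
function stopBalancerAndVerify(shell) {
  // Waits for any in-progress balancing round to complete.
  shell.stopBalancer();

  // getBalancerState() returns false when the balancer is stopped.
  if (shell.getBalancerState() !== false) {
    throw new Error("Balancer is still enabled; do not start the backup.");
  }
  return true;
}
```

In mongosh, this sketch would be called as stopBalancerAndVerify(sh) before locking the cluster.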
Lock the Cluster
The sharded cluster must be locked during the backup process to protect the database from writes, which may cause inconsistencies in the backup.
To lock a sharded cluster, use the db.fsyncLock() method:
db.getSiblingDB("admin").fsyncLock()
To confirm the lock, on mongos and the primary mongod of the config servers, run the following aggregation pipeline and ensure that all of the shards are locked:
db.getSiblingDB("admin").aggregate( [
   { $currentOp: { } },
   { $facet: {
      "locked": [
         { $match: { $and: [
            { fsyncLock: { $exists: true } },
            { fsyncLock: true }
         ] } }
      ],
      "unlocked": [
         { $match: { fsyncLock: { $exists: false } } }
      ]
   } },
   { $project: {
      "fsyncLocked": { $gt: [ { $size: "$locked" }, 0 ] },
      "fsyncUnlocked": { $gt: [ { $size: "$unlocked" }, 0 ] }
   } }
] )
[ { fsyncLocked: true }, { fsyncUnlocked: false } ]
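A script can evaluate this result instead of relying on visual inspection. The following is a minimal sketch that assumes the result documents carry the fsyncLocked and fsyncUnlocked fields shown in the sample output. The function name isClusterLocked is hypothetical.

```javascript
// Hypothetical helper: decide whether the $currentOp check shows the
// cluster as locked. `docs` is the array of result documents, for
// example the pipeline result converted with .toArray().
function isClusterLocked(docs) {
  // Merge the result documents so the check works whether the two flags
  // arrive in one document or in separate documents.
  const flags = Object.assign({}, ...docs);
  return flags.fsyncLocked === true && flags.fsyncUnlocked === false;
}
```

The helper returns false for an empty or incomplete result, so a missing flag is treated as "not confirmed locked" rather than as success.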
Take Backup
To back up the sharded cluster, use mongodump to connect to mongos and perform the backup:
mongodump \
   --host mongos.example.net \
   --port 27017 \
   --username user \
   --password "passwd" \
   --out /opt/backups/example-cluster-1
Unlock the Cluster
After the backup completes, you can unlock the cluster to allow writes to resume.
To unlock the cluster, use the db.fsyncUnlock() method:
db.getSiblingDB("admin").fsyncUnlock()
To confirm the unlock, on mongos and the primary mongod of the config servers, run the following aggregation pipeline and ensure that all shards are unlocked:
db.getSiblingDB("admin").aggregate( [
   { $currentOp: { } },
   { $facet: {
      "locked": [
         { $match: { $and: [
            { fsyncLock: { $exists: true } },
            { fsyncLock: true }
         ] } }
      ],
      "unlocked": [
         { $match: { fsyncLock: { $exists: false } } }
      ]
   } },
   { $project: {
      "fsyncLocked": { $gt: [ { $size: "$locked" }, 0 ] },
      "fsyncUnlocked": { $gt: [ { $size: "$unlocked" }, 0 ] }
   } }
] )
[ { fsyncLocked: false }, { fsyncUnlocked: true } ]
Restart the Balancer
To restart the balancer, use the sh.startBalancer() method:
sh.startBalancer()
To confirm that the balancer is running, use the sh.getBalancerState() method:
sh.getBalancerState()
true
The command returns true when the balancer is running.
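Because a failed backup would otherwise leave the cluster locked and the balancer stopped, it can help to script the steps so that cleanup always runs. The following is a minimal sketch under stated assumptions, not an official procedure: the function name backupWithLock and the runBackup callback are hypothetical, and the callback stands in for however you invoke mongodump and wait for it to finish.

```javascript
// Hypothetical orchestration of the steps in this procedure.
// `shell` is the mongosh `sh` object, `adminDb` is the result of
// db.getSiblingDB("admin"), and `runBackup` is a caller-supplied
// function that returns once the mongodump run has finished.
function backupWithLock(shell, adminDb, runBackup) {
  shell.stopBalancer();
  try {
    adminDb.fsyncLock();
    try {
      runBackup();
    } finally {
      // Always release the lock, even if the backup fails.
      adminDb.fsyncUnlock();
    }
  } finally {
    // Always re-enable the balancer after the lock is released.
    shell.startBalancer();
  }
}
```

The nested try/finally blocks mirror the order of the steps: the lock is released before the balancer restarts, and an error from the backup still propagates to the caller after cleanup.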