oplog Sizing
The mongosync program uses change streams to synchronize data between source and destination clusters. mongosync does not access the oplog directly, but when a change stream returns events from the past, the events must be within the oplog time range.
mongosync applies operations in the oplog on the source cluster to the data on the destination cluster. When operations that mongosync has not applied roll off the oplog on the source cluster, the sync fails and mongosync exits.
Note
mongosync does not replicate applyOps operations made on the source cluster during sync to the destination cluster.
During the initial sync, mongosync may apply operations at a slower rate because it is also copying documents concurrently. After the initial sync, mongosync applies changes faster and is more likely to maintain a position in the oplog that is close to the real-time writes occurring on the source cluster.
If you anticipate syncing a large data set, or if you plan to pause synchronization for an extended period of time, you might exceed the oplog window. Use the oplogSizeMB setting to increase the size of the oplog on the source cluster.
Considerations
The destination cluster must have enough disk storage to accommodate the logical data size being migrated and the destination oplog entries from the initial sync. For example, to migrate 10 GB of data, the destination cluster must have at least 10 GB available for the data and another 10 GB for the insert oplog entries from the initial sync.
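The arithmetic in the example above can be sketched as a small helper. This is a rough estimate only: it assumes the insert oplog entries from the initial sync take about as much space as the logical data itself, as in the 10 GB example. The function name estimateDestinationDiskGB is hypothetical and is not part of mongosync.

```javascript
// Rough estimate of the free disk space needed on the destination cluster.
// Assumption: initial-sync insert oplog entries need about as much space as
// the logical data, matching the 10 GB + 10 GB example in the text.
function estimateDestinationDiskGB(logicalDataGB) {
  const dataGB = logicalDataGB;             // space for the migrated documents
  const initialSyncOplogGB = logicalDataGB; // space for the insert oplog entries
  return dataGB + initialSyncOplogGB;
}

console.log(estimateDestinationDiskGB(10)); // 20
```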
To reduce the overhead of the destination oplog entries, you can:
Use the oplogSizeMB setting to lower the destination cluster's oplog size.
Use the oplogMinRetentionHours setting to lower or remove the destination cluster's minimum oplog retention period.
Monitor oplog Size Needed for Initial Sync
Determine oplog Window
To get the difference in seconds between the first and last entry in the oplog, run db.getReplicationInfo(). If you are replicating a sharded cluster, run the command on each shard.
db.getReplicationInfo().timeDiff
The value returned is the minimum oplog window of the cluster. If there are multiple shards, the smallest number is the minimum oplog window.
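For a sharded cluster, picking the smallest value can be sketched as follows. The sketch assumes you have already run db.getReplicationInfo().timeDiff on each shard and collected the results, in seconds, into an array. The helper name minOplogWindowSeconds and the sample values are hypothetical.

```javascript
// timeDiffs: one db.getReplicationInfo().timeDiff value (in seconds) per shard,
// gathered separately by running the command on each shard.
function minOplogWindowSeconds(timeDiffs) {
  if (timeDiffs.length === 0) {
    throw new Error("at least one timeDiff value is required");
  }
  // The shard with the shortest window bounds the whole cluster.
  return Math.min(...timeDiffs);
}

// Illustrative values for three shards: 24 h, 12 h, and 48 h.
console.log(minOplogWindowSeconds([86400, 43200, 172800])); // 43200
```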
Determine mongosync Replication Lag
To get the lagTimeSeconds value, run the /progress command. The lag time is the time in seconds between the last event applied by mongosync and the time of the latest event on the source cluster. It is a measure of how far mongosync is behind the source cluster.
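Reading the lag from a progress response might look like the sketch below. It assumes the response body is JSON with lagTimeSeconds nested under a progress field, and that mongosync listens at a base URL such as http://localhost:27182 with the endpoint at /api/v1/progress. Verify both against your mongosync version. The helper names are hypothetical.

```javascript
// Extract lagTimeSeconds from a parsed progress response body.
// Assumed shape: { progress: { lagTimeSeconds: <number>, ... }, ... }
// Returns null when the field is missing or not a number.
function lagTimeSeconds(body) {
  const lag = body && body.progress ? body.progress.lagTimeSeconds : undefined;
  return typeof lag === "number" ? lag : null;
}

// Fetch the progress document from a running mongosync instance.
// baseUrl is an assumption, for example "http://localhost:27182".
async function fetchLagTimeSeconds(baseUrl) {
  const res = await fetch(`${baseUrl}/api/v1/progress`);
  return lagTimeSeconds(await res.json());
}
```

Keeping the parsing separate from the HTTP call lets you check the extraction logic without a running mongosync instance.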
Validate oplog Size
If the lag time approaches the minimum oplog window, make one of the following changes:
Increase the oplog window. Use replSetResizeOplog to set minRetentionHours greater than the current oplog window.
Scale up the mongosync instance. Add CPU or memory to the mongosync node so that it has a higher copy rate.
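The check and the first remedy above can be sketched together. The 80% threshold and the one-hour margin are arbitrary illustrative choices, not values from mongosync, and the helper names are hypothetical. The command document uses the replSetResizeOplog and minRetentionHours fields named above. Run it with db.adminCommand() on the source cluster, and confirm that your server version accepts minRetentionHours without a size field.

```javascript
// True when the lag has consumed at least `threshold` of the oplog window.
// threshold = 0.8 is an illustrative safety margin, not a documented value.
function lagApproachesWindow(lagSeconds, windowSeconds, threshold = 0.8) {
  return lagSeconds >= windowSeconds * threshold;
}

// Build a replSetResizeOplog command whose minRetentionHours exceeds the
// current oplog window by extraHours (an illustrative margin).
function buildResizeCommand(windowSeconds, extraHours = 1) {
  const minRetentionHours = windowSeconds / 3600 + extraHours;
  return { replSetResizeOplog: 1, minRetentionHours };
}

// Example: a 2-hour window with 1.75 hours of lag.
if (lagApproachesWindow(6300, 7200)) {
  console.log(JSON.stringify(buildResizeCommand(7200)));
  // {"replSetResizeOplog":1,"minRetentionHours":3}
  // In mongosh: db.adminCommand(buildResizeCommand(7200))
}
```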
Note
The oplog window and rate of change for replication lag may vary during synchronization. Repeat these steps during a migration to monitor the progress.