oplog Sizing

The mongosync program uses change streams to synchronize data between source and destination clusters. mongosync does not access the oplog directly, but when a change stream returns events from the past, the events must be within the oplog time range.

mongosync applies operations in the oplog on the source cluster to the data on the destination cluster after the collection copy phase. When operations that mongosync has not applied roll off the oplog on the source cluster, the sync fails and mongosync exits.

Note

mongosync does not replicate applyOps operations made on the source cluster during sync to the destination cluster.

If you anticipate syncing a large data set, or if you plan to pause synchronization for an extended period of time, you might exceed the oplog window. Use the oplogSizeMB setting to increase the size of the oplog on the source cluster.

Considerations

The destination cluster must have enough disk storage to accommodate the logical data size being migrated and the destination oplog entries from the initial sync. For example, to migrate 10 GB of data, the destination cluster must have at least 10 GB available for the data and another 10 GB for the insert oplog entries from the initial sync.
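The arithmetic in the example above can be sketched as a small helper. This is only a rough estimate: the function name is hypothetical, and it assumes, as the example does, that the insert oplog entries take about as much space as the data itself.

```javascript
// Rough disk estimate for the destination cluster, following the
// 10 GB data + 10 GB oplog example above.
// Assumption: oplog overhead from the initial sync is about 1:1 with
// the logical data size. Real overhead can differ.
function estimateDestinationDiskGB(logicalDataGB) {
  if (!Number.isFinite(logicalDataGB) || logicalDataGB < 0) {
    throw new RangeError("logicalDataGB must be a non-negative number");
  }
  const oplogOverheadGB = logicalDataGB; // assumed 1:1 ratio
  return logicalDataGB + oplogOverheadGB;
}

console.log(estimateDestinationDiskGB(10)); // 20
```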

To use embedded verification, you must have a larger oplog on the destination. If you enable the embedded verifier and reduce the size of the destination oplog, the embedded verifier might not be able to keep up, causing mongosync to error.

If you need to reduce the overhead of the destination oplog entries and the embedded verifier is disabled, you can:

Use the oplogSizeMB setting to lower the destination cluster's oplog size.
Use the oplogMinRetentionHours setting to lower or remove the destination cluster's minimum oplog retention period.

Monitor oplog Size Needed for Initial Sync

Determine oplog Window

To get the difference in seconds between the first and last entry in the oplog, run db.getReplicationInfo() and read its timeDiff field. If you are replicating a sharded cluster, run the command on each shard.

db.getReplicationInfo().timeDiff

The value returned is the oplog window of that replica set, in seconds. If there are multiple shards, the smallest value across the shards is the minimum oplog window of the cluster.
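For a sharded cluster, taking the smallest per-shard value might look like the sketch below. Both helper names are hypothetical, and how you connect to each shard is left out. In mongosh you would pass each shard's db object to the first function.

```javascript
// Read the oplog window, in seconds, from one replica set connection.
// `db` is expected to behave like the mongosh `db` object.
function oplogWindowSeconds(db) {
  return db.getReplicationInfo().timeDiff;
}

// Hypothetical helper: the cluster's minimum oplog window is the
// smallest window reported by any shard.
function minOplogWindowSeconds(shardWindows) {
  if (shardWindows.length === 0) {
    throw new Error("need at least one shard window");
  }
  return Math.min(...shardWindows);
}

// Usage sketch with already-collected per-shard values (seconds):
// minOplogWindowSeconds([86400, 43200, 90000]) returns 43200
```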

Determine mongosync Replication Lag

To get the lagTimeSeconds value, call the /progress endpoint. The lag time is the time in seconds between the last event applied by mongosync and the time of the current latest event on the source cluster.

It is a measure of how far behind the source cluster mongosync is.
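A minimal sketch of reading lagTimeSeconds from the /progress response follows. It assumes mongosync is listening on localhost:27182 and serves the endpoint at /api/v1/progress, so adjust the URL for your deployment. The function names are hypothetical, and the fetch call needs a runtime with a global fetch, such as Node 18 or later.

```javascript
// Assumption: default mongosync host, port, and API path.
const PROGRESS_URL = "http://localhost:27182/api/v1/progress";

// Pull lagTimeSeconds out of a parsed /progress response body.
// Returns null when the field is missing or not a number.
function extractLagSeconds(body) {
  const lag = body?.progress?.lagTimeSeconds;
  return typeof lag === "number" ? lag : null;
}

// Fetch the current lag from a running mongosync instance.
async function fetchLagSeconds(url = PROGRESS_URL) {
  const res = await fetch(url);
  if (!res.ok) {
    throw new Error(`progress request failed: HTTP ${res.status}`);
  }
  return extractLagSeconds(await res.json());
}

// Usage sketch:
// fetchLagSeconds().then((lag) => console.log("lagTimeSeconds:", lag));
```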

Validate oplog Size

If the lag time approaches the minimum oplog window, make one of the following changes:

Increase the oplog window. Use replSetResizeOplog to set minRetentionHours greater than the current oplog window.
Note
replSetResizeOplog is unsupported in Atlas. To resize the oplog in Atlas, see Set Minimum Oplog Window.
Scale up the mongosync instance. Add CPU or memory to scale up the mongosync node so that it has a higher copy rate.
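For the first option on a self-managed deployment, the command document could be built as sketched here. The helper name and the one-hour margin are assumptions for illustration. Check the replSetResizeOplog reference for which members to run it on and whether you also need to set size.

```javascript
// Build a replSetResizeOplog command that sets minRetentionHours
// above the current oplog window.
// Assumption: a margin of 1 extra hour; choose a value that fits
// your expected lag.
function buildResizeOplogCommand(currentWindowSeconds, extraHours = 1) {
  if (!(currentWindowSeconds >= 0) || !(extraHours > 0)) {
    throw new RangeError("window must be >= 0 and extraHours > 0");
  }
  const currentHours = currentWindowSeconds / 3600;
  return {
    replSetResizeOplog: 1,
    minRetentionHours: Math.ceil(currentHours + extraHours),
  };
}

// Usage sketch in mongosh (not supported in Atlas):
// db.adminCommand(
//   buildResizeOplogCommand(db.getReplicationInfo().timeDiff)
// );
```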

Note

The oplog window and rate of change for replication lag may vary during synchronization. Repeat these steps during a migration to monitor the progress.
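One way to repeat the check during a migration is to compare the two measurements each time you collect them. The sketch below is illustrative only: the function name is hypothetical, and the 0.5 warning ratio is an assumed threshold for "approaches", not a documented value.

```javascript
// Compare mongosync lag against the minimum oplog window.
// Assumption: flag a risk once lag reaches half of the window.
function checkOplogHeadroom(lagSeconds, windowSeconds, warnRatio = 0.5) {
  if (!(windowSeconds > 0)) {
    throw new RangeError("windowSeconds must be greater than 0");
  }
  const ratio = lagSeconds / windowSeconds;
  return {
    ratio,                                   // fraction of the window consumed by lag
    headroomSeconds: windowSeconds - lagSeconds,
    atRisk: ratio >= warnRatio,              // true when lag approaches the window
  };
}

// Usage sketch: lag of 30 minutes against a 1 hour window.
console.log(checkOplogHeadroom(1800, 3600).atRisk); // true
```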
