mongosync
Behavior
The mongosync binary is the primary process used in Cluster-to-Cluster
Sync. mongosync migrates data from one cluster to another and can keep the
clusters in continuous sync.
For an overview of the mongosync process, see About mongosync.
To get started with mongosync, refer to the Quick Start Guide.
For more detailed information, refer to the Installation or Connecting
mongosync page that best fits your situation.
Embedded Verifier Disclaimer
Starting in 1.9, mongosync includes an embedded verifier. The verifier runs
a series of checks on all supported collections on the destination cluster
to confirm that mongosync successfully transferred documents from the
source cluster to the destination.
When you start the mongosync process, it displays a disclaimer advising
that the verifier is enabled by default:
Embedded verification is enabled by default for replica set to replica set migrations. Verification checks for data consistency between the source and destination clusters. Verification will cause mongosync to fail if any inconsistencies are detected, but it does not check for all possible data inconsistencies. Please see the documentation at https://www.mongodb.com/docs/cluster-to-cluster-sync/current/reference/verification/embedded for more details. Verification requires approximately 0.5 GB of memory per 1 million documents on the source cluster and will fail if insufficient memory is available. Accepting this disclaimer indicates that you understand the limitations and memory requirements for this tool. To skip this disclaimer prompt, use --acceptDisclaimer. To disable the embedded verifier, specify 'verification: false' when starting mongosync. Please see https://www.mongodb.com/docs/cluster-to-cluster-sync/current/reference/verification/ for alternative verification methods. Do you want to continue? (y/n):
If you have already read and accepted the disclaimer, you can start
mongosync with the --acceptDisclaimer option to skip this notification.
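As a rough planning aid, the memory figure quoted in the disclaimer
(approximately 0.5 GB per 1 million source documents) can be turned into a
quick estimate. The following is a minimal sketch, not part of mongosync;
the function name is hypothetical, and real memory use may not scale
exactly linearly.

```python
# Rough estimate of embedded verifier memory, based only on the
# "approximately 0.5 GB per 1 million documents" figure in the disclaimer.
GB_PER_MILLION_DOCS = 0.5


def estimate_verifier_memory_gb(source_document_count: int) -> float:
    """Return the approximate verifier memory requirement in GB."""
    if source_document_count < 0:
        raise ValueError("document count must be non-negative")
    return source_document_count / 1_000_000 * GB_PER_MILLION_DOCS


# A source cluster with 40 million documents needs roughly 20 GB.
print(estimate_verifier_memory_gb(40_000_000))  # prints 20.0
```

Compare the result against the memory available on the host that runs
mongosync before you accept the disclaimer.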
Settings
Cluster Independence
mongosync syncs collection data between a source cluster and destination
cluster. mongosync does not synchronize users or roles. As a result, you
can create users with different access permissions on each cluster.
Configuration File
You can set mongosync options in a YAML configuration file and pass the
file to mongosync with the --config option. For example:
$ mongosync --config /etc/mongosync.conf
For information on available settings, see Configuration.
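If you launch mongosync from a script, the same invocation can be
assembled programmatically. The following is a minimal sketch that uses
only the --config and --acceptDisclaimer options described on this page;
the helper name is hypothetical.

```python
from typing import List


def build_mongosync_command(config_path: str, accept_disclaimer: bool = False) -> List[str]:
    """Build the argument list for launching mongosync with a config file."""
    command = ["mongosync", "--config", config_path]
    if accept_disclaimer:
        # Skips the embedded verifier disclaimer prompt.
        command.append("--acceptDisclaimer")
    return command


command = build_mongosync_command("/etc/mongosync.conf", accept_disclaimer=True)
print(" ".join(command))
# prints: mongosync --config /etc/mongosync.conf --acceptDisclaimer

# To launch the process (requires the mongosync binary on PATH):
#   import subprocess
#   subprocess.run(command, check=True)
```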
Cluster and Collection Types
Sharded Clusters
Cluster-to-Cluster Sync supports replication between sharded clusters.
mongosync replicates individual shards in parallel from the source cluster
to the destination cluster. However, mongosync does not preserve the source
cluster's sharding configuration.
Important
When the source or destination cluster is a sharded cluster, you must stop
the balancer on both clusters and not run the moveChunk or moveRange
commands for the duration of the migration. To stop the balancer, run the
balancerStop command and wait for the command to complete.
Pre-Split Chunks
When mongosync syncs to a sharded destination cluster, it pre-splits chunks
for sharded collections on the destination cluster. For each sharded
collection, mongosync creates twice as many chunks as there are shards in
the destination cluster.
Chunk Distribution
mongosync does not preserve chunk distribution from the source to the
destination, even with multiple mongosync instances. It is not possible to
reproduce a particular pre-split of chunks from a source cluster on the
destination cluster.
The only sharding configuration that mongosync preserves from the source
cluster to the destination cluster is the shard key. Once the migration
finishes, you can enable the destination cluster's balancer, which
distributes documents independently of the source cluster's distribution.
Primary Shards
When you sync to a sharded destination cluster, mongosync assigns a primary
shard to each database in round-robin order.
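The chunk pre-split count and the round-robin assignment described in this
section can be illustrated with a short sketch. This is not mongosync's
implementation: the function names are hypothetical, and the order in which
mongosync visits databases and shards is an assumption made only for
illustration.

```python
from itertools import cycle
from typing import Dict, List


def presplit_chunk_count(destination_shard_count: int) -> int:
    """Chunks created per sharded collection: twice the number of shards."""
    return 2 * destination_shard_count


def assign_primary_shards(databases: List[str], shards: List[str]) -> Dict[str, str]:
    """Assign each database a primary shard in round-robin order."""
    if not shards:
        raise ValueError("at least one shard is required")
    shard_cycle = cycle(shards)
    return {database: next(shard_cycle) for database in databases}


print(presplit_chunk_count(3))  # prints 6
print(assign_primary_shards(["app", "logs", "users"], ["shardA", "shardB"]))
# prints {'app': 'shardA', 'logs': 'shardB', 'users': 'shardA'}
```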
Warning
Running movePrimary on the source or destination cluster during migration
may result in a fatal error or require you to restart the migration from
the start. For more information, see Sharded Clusters.
Multiple Clusters
To sync a source cluster to multiple destination clusters, use one
mongosync instance for each destination cluster. For more information, see
Multiple Clusters Limitations.
Capped Collections
Starting in 1.3.0, Cluster-to-Cluster Sync supports capped collections with some limitations.
- convertToCapped is not supported. If you run convertToCapped, mongosync
  exits with an error.
- cloneCollectionAsCapped is not supported.
Capped collections on the source cluster work normally during sync.
Capped collections on the destination cluster have temporary changes during sync:
- There is no maximum number of documents.
- The maximum collection size is 1PB.
mongosync restores the original values for maximum number of documents and
maximum collection size during commit.
Reads and Writes
Write Blocking
mongosync does not enable write-blocking by default. If you enable
write-blocking, mongosync blocks writes:
- On the destination cluster during sync.
- On the source cluster when commit is received.
To enable write-blocking, use the start API to set enableUserWriteBlocking
to true. You cannot enable write-blocking after the sync starts.
You must enable write-blocking when you start mongosync if you want to use
reverse synchronization later.
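A request to the start API with write-blocking enabled might look like the
following sketch. The enableUserWriteBlocking field comes from this page;
the base URL http://localhost:27182, the /api/v1/start path, and the
cluster0 and cluster1 values are assumptions, and the helper names are
hypothetical. Check the start API reference for the options your version
supports.

```python
import json
import urllib.request
from typing import Any, Dict


def build_start_request(
    source: str = "cluster0",
    destination: str = "cluster1",
    enable_user_write_blocking: bool = False,
) -> Dict[str, Any]:
    """Build the JSON body for a start request."""
    body: Dict[str, Any] = {"source": source, "destination": destination}
    if enable_user_write_blocking:
        # Must be set at start; it cannot be enabled after the sync begins.
        body["enableUserWriteBlocking"] = True
    return body


def send_start_request(body: Dict[str, Any], base_url: str = "http://localhost:27182") -> Dict[str, Any]:
    """POST the start request to a running mongosync (assumed URL and path)."""
    request = urllib.request.Request(
        f"{base_url}/api/v1/start",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


print(json.dumps(build_start_request(enable_user_write_blocking=True)))
# prints {"source": "cluster0", "destination": "cluster1", "enableUserWriteBlocking": true}
```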
User Permissions
To set enableUserWriteBlocking, the mongosync user must have a role that
includes the setUserWriteBlockMode and bypassWriteBlockingMode ActionTypes.
Note
When using enableUserWriteBlocking, writes are only blocked for users that
do not have the bypassWriteBlockingMode ActionType. Users who have this
ActionType are able to perform writes.
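One way to grant these ActionTypes is a custom role. The sketch below only
builds a createRole command document; it assumes that both actions are
granted on the cluster resource, and the role name is hypothetical. Run the
resulting command against the admin database with your driver or shell of
choice.

```python
from typing import Any, Dict


def build_write_blocking_role(role_name: str = "mongosyncWriteBlocking") -> Dict[str, Any]:
    """Build a createRole command granting the write-blocking ActionTypes."""
    return {
        "createRole": role_name,
        "privileges": [
            {
                # Assumption: both actions apply to the cluster resource.
                "resource": {"cluster": True},
                "actions": ["setUserWriteBlockMode", "bypassWriteBlockingMode"],
            }
        ],
        # No inherited roles in this minimal example.
        "roles": [],
    }


role_command = build_write_blocking_role()
print(role_command["privileges"][0]["actions"])
# prints ['setUserWriteBlockMode', 'bypassWriteBlockingMode']
```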
Permissible Reads
Read operations on the source cluster are always permitted.
When the /progress endpoint reports that canWrite is true, the data on the
source and destination clusters is consistent.
Permissible Writes
To see what state mongosync is in, call the /progress API endpoint. The
/progress output includes a boolean value, canWrite.
- When canWrite is true, it is safe to write to the destination cluster.
- When canWrite is false, do not write to the destination cluster.
You can safely write to the source cluster while mongosync is syncing. Do
not write to the destination cluster unless canWrite is true.
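An application can gate destination writes on the canWrite value. The
following is a minimal sketch; it assumes that mongosync listens on
http://localhost:27182, that the endpoint path is /api/v1/progress, and
that canWrite is nested under a top-level progress object. The helper names
are hypothetical.

```python
import json
import urllib.request
from typing import Any, Dict


def fetch_progress(base_url: str = "http://localhost:27182") -> Dict[str, Any]:
    """GET the progress document from a running mongosync (assumed URL)."""
    with urllib.request.urlopen(f"{base_url}/api/v1/progress") as response:
        return json.loads(response.read().decode("utf-8"))


def can_write_to_destination(progress_response: Dict[str, Any]) -> bool:
    """Return True only when the response explicitly reports canWrite: true."""
    progress = progress_response.get("progress") or {}
    return progress.get("canWrite") is True


# Parsing a sample response body (shape assumed for illustration).
sample = json.loads('{"progress": {"canWrite": false}}')
print(can_write_to_destination(sample))  # prints False
```

Treating a missing or malformed canWrite value as false keeps the check
conservative: the application writes to the destination only when
mongosync explicitly reports that it is safe.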
Read and Write Concern
By default, mongosync sets the read concern level to "majority" for reads
on the source cluster. For writes on the destination cluster, mongosync
sets the write concern level to "majority" with j: true.
For more information on read and write concern configuration and behavior, see Read Concern and Write Concern.
Read Preference
mongosync requires the primary read preference when connecting to the
source and destination clusters. For more information, see Read Preference
Options.
Legacy Index Handling
mongosync rewrites legacy index values, like 0 or an empty string, to 1 on
the destination. mongosync also removes any invalid index options on the
destination.
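The effect of the rewrite on an index key specification can be sketched as
follows. This is an illustration only, not mongosync's implementation; it
handles just the two legacy values named above (0 and an empty string), and
the function name is hypothetical.

```python
from typing import Any, Dict


def normalize_index_keys(key_spec: Dict[str, Any]) -> Dict[str, Any]:
    """Rewrite legacy index key values (0 or "") to 1; keep other values."""
    normalized: Dict[str, Any] = {}
    for field, value in key_spec.items():
        is_zero = (
            isinstance(value, (int, float))
            and not isinstance(value, bool)  # bool is an int subclass in Python
            and value == 0
        )
        is_legacy = is_zero or value == ""
        normalized[field] = 1 if is_legacy else value
    return normalized


print(normalize_index_keys({"a": 0, "b": "", "c": -1, "d": "text"}))
# prints {'a': 1, 'b': 1, 'c': -1, 'd': 'text'}
```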
Considerations for Continuous Sync
For any continuous synchronization use cases with mongosync, ensure that
mongosync commits before cutting over from the source to the destination.
If the source cluster shuts down before mongosync can commit, such as in a
disaster scenario, the destination cluster might not have a consistent
snapshot of the source data. To learn more, see Consistency.
Note
After commit, you can't resume continuous sync between two clusters since
mongosync can only sync into empty destination clusters. If you need to use
the same two clusters after cutover, you can call the reverse endpoint to
keep the clusters in sync. Otherwise, start a new continuous sync operation
by using a new empty destination cluster.
Temporary Changes to Collection Characteristics
mongosync temporarily alters the following collection characteristics
during synchronization. The original values are restored during the commit
process.
Change | Description |
---|---|
Unique Indexes | Unique indexes on the source cluster are synced as non-unique indexes on the destination cluster. |
TTL Indexes | Synchronization sets expireAfterSeconds to the value of MAX_INT on the destination cluster. |
Hidden Indexes | Synchronization replicates hidden indexes as non-hidden. |
Write Blocking | If you enable write-blocking, mongosync blocks writes on the destination cluster during sync and on the source cluster when commit is received. To learn more, see Write Blocking. |
Capped Collections | Synchronization sets capped collections to the maximum allowable size. |
Dummy Indexes | In some cases, synchronization may create dummy indexes on the destination to support writes on sharded or collated collections. |
Rolling Index Builds
mongosync does not support rolling index builds during migration. To avoid
building indexes in a rolling fashion during migration, use one of the
following methods to ensure that your destination indexes match your source
indexes:
- Build the index on the source before migration.
- Build the index on the source during migration with a default index build.
- Build the index on the destination after migration.
Destination Clusters
Consistency
mongosync supports eventual consistency on the destination cluster. Read
consistency is not guaranteed on the destination cluster until commit.
Before committing, the source and destination clusters may differ at a
given point in time. To learn more, see Considerations for Continuous Sync.
While mongosync is syncing, it may reorder or combine writes as it relays
them from source to destination. For a given document, the total number of
writes may differ between source and destination.
Transactions might not appear atomically on the destination cluster. Retryable writes may not be retryable on the destination cluster.
Profiling
If profiling is enabled on a source database, MongoDB creates a special
collection named <db>.system.profile. After synchronization is complete,
Cluster-to-Cluster Sync will not drop the <db>.system.profile collection
from the destination even if the source database is dropped at a later
time. The <db>.system.profile collection will not change the accuracy of
user data on the destination.
Views
If a database with views is dropped on the source, the destination may show
an empty system.views collection in that database. The empty system.views
collection will not change the accuracy of user data on the destination.
System Collections
Cluster-to-Cluster Sync does not replicate system collections to the destination cluster.
If you issue a dropDatabase command on the source cluster, this change is
not directly applied on the destination cluster. Instead,
Cluster-to-Cluster Sync drops user collections and views in the database on
the destination cluster, but it does not drop system collections on that
database.
For example, on the destination cluster:
- The drop operation does not affect a user-created system.js collection.
- If you enable profiling, the system.profile collection remains.
- If you create views on the source cluster and then drop the database,
  replicating the drop removes the views, but leaves an empty system.views
  collection.
In these cases, the replication of dropDatabase removes all user-created
collections from the database, but leaves its system collections on the
destination cluster.
UUIDs
mongosync creates collections with new UUIDs on the destination cluster.
There is no relationship between UUIDs on the source cluster and the
destination cluster. If applications contain hard-coded UUIDs (which
MongoDB does not recommend), you may need to update those applications
before they work properly with the migrated cluster.
Sorting
mongosync inserts documents on the destination cluster in an undefined
order, which does not preserve natural sort order from the source cluster.
If applications depend on document order but don't have a defined sort
method, you may need to update those applications to specify the expected
sort order before the applications work properly with the migrated cluster.
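The following sketch illustrates why an explicit sort removes the
dependency on insertion order. It uses plain in-memory documents rather
than a driver query, and the choice of _id as the sort field is an
assumption for illustration. In an application, the equivalent change is to
add an explicit sort to each query that depends on document order.

```python
from operator import itemgetter
from typing import Any, Dict, List


def in_expected_order(documents: List[Dict[str, Any]], sort_field: str = "_id") -> List[Dict[str, Any]]:
    """Return documents sorted by an explicit field, not by arrival order."""
    return sorted(documents, key=itemgetter(sort_field))


# Order in which the documents were inserted on the source.
source_order = [{"_id": 1, "name": "a"}, {"_id": 2, "name": "b"}, {"_id": 3, "name": "c"}]
# A different, arbitrary order in which they may land on the destination.
destination_order = [source_order[2], source_order[0], source_order[1]]

# With an explicit sort, both clusters yield the same sequence.
print(in_expected_order(destination_order) == in_expected_order(source_order))  # prints True
```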
Performance
Resilience
mongosync is resilient and able to handle non-fatal errors. Logs that
contain the word "error" or "failure" do not by themselves indicate that
mongosync is failing or corrupting data. For example, if a network error
occurs, the mongosync log may contain the word "error", but mongosync is
still able to complete the sync. If a sync does not complete, mongosync
writes a fatal log entry.
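When you monitor logs, it can therefore be more useful to look for fatal
entries than to search for the word "error". The following is a minimal
sketch that assumes each log line is a JSON object with a level field. The
sample messages are invented for illustration, and the function name is
hypothetical.

```python
import json
from typing import Any, Dict, Iterable, List


def find_fatal_entries(log_lines: Iterable[str]) -> List[Dict[str, Any]]:
    """Return parsed log entries whose level is "fatal"."""
    fatal_entries: List[Dict[str, Any]] = []
    for line in log_lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            # Skip lines that are not JSON rather than guessing at them.
            continue
        if isinstance(entry, dict) and entry.get("level") == "fatal":
            fatal_entries.append(entry)
    return fatal_entries


sample_lines = [
    '{"level": "info", "message": "retrying after network error"}',
    '{"level": "fatal", "message": "sync could not complete"}',
]
print(len(find_fatal_entries(sample_lines)))  # prints 1
```

The first sample line contains the word "error" but is not fatal, so only
the second line is reported.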
Data Definition Language (DDL) Operations
Using DDL operations (operations that act on collections or databases, such
as db.createCollection() and db.dropDatabase()) during sync increases the
risk of migration failure and may negatively impact mongosync performance.
For best performance, refrain from performing DDL operations on the source
cluster while the sync is in progress.
For more information on DDL operations, see Pending DDL Operations and Transactions.