Production Considerations
On this page
- Availability
- Feature Compatibility
- Runtime Limit
- Oplog Size Limit
- WiredTiger Cache
- Transactions and Security
- Shard Configuration Restriction
- Sharded Clusters and Arbiters
- Acquiring Locks
- Pending DDL Operations and Transactions
- In-progress Transactions and Write Conflicts
- In-progress Transactions and Stale Reads
- In-progress Transactions and Chunk Migration
- Outside Reads During Commit
- Errors
- Additional Information
The following page lists some production considerations for running transactions. These apply whether you run transactions on replica sets or sharded clusters. For running transactions on sharded clusters, see also the Production Considerations (Sharded Clusters) for additional considerations that are specific to sharded clusters.
Availability
In version 4.0, MongoDB supports multi-document transactions on replica sets.
In version 4.2, MongoDB introduces distributed transactions, which adds support for multi-document transactions on sharded clusters and incorporates the existing support for multi-document transactions on replica sets.
To use transactions on MongoDB 4.2 deployments (replica sets and sharded clusters), clients must use MongoDB drivers updated for MongoDB 4.2.
Note
Distributed Transactions and Multi-Document Transactions
Starting in MongoDB 4.2, the two terms are synonymous. Distributed transactions refer to multi-document transactions on sharded clusters and replica sets. Multi-document transactions (whether on sharded clusters or replica sets) are also known as distributed transactions starting in MongoDB 4.2.
Feature Compatibility
To use transactions, the featureCompatibilityVersion for all members of the deployment must be at least:
Deployment | Minimum featureCompatibilityVersion |
---|---|
Replica Set | 4.0 |
Sharded Cluster | 4.2 |
To check the fCV for a member, connect to the member and run the following command:
db.adminCommand( { getParameter: 1, featureCompatibilityVersion: 1 } )
For more information, see the
setFeatureCompatibilityVersion
reference page.
Runtime Limit
By default, a transaction must have a runtime of less than one minute.
You can modify this limit using
transactionLifetimeLimitSeconds
for the
mongod
instances. For sharded clusters, the parameter
must be modified for all shard replica set members. Transactions that
exceeds this limit are considered expired and will be aborted by a
periodic cleanup process.
For sharded clusters, you can also specify a maxTimeMS
limit on
commitTransaction
. For more information, see Sharded Clusters
Transactions Time Limit.
Oplog Size Limit
- Starting in version 4.2,
- MongoDB creates as many oplog entries as necessary to the encapsulate all write operations in a transaction, instead of a single entry for all write operations in the transaction. This removes the 16MB total size limit for a transaction imposed by the single oplog entry for all its write operations. Although the total size limit is removed, each oplog entry still must be within the BSON document size limit of 16MB.
- In version 4.0,
- MongoDB creates a single oplog (operations log) entry at the time of commit if the transaction contains any write operations. That is, the individual operations in the transactions do not have a corresponding oplog entry. Instead, a single oplog entry contains all of the write operations within a transaction. The oplog entry for the transaction must be within the BSON document size limit of 16MB.
WiredTiger Cache
To prevent storage cache pressure from negatively impacting the performance:
When you abandon a transaction, abort the transaction.
When you encounter an error during individual operation in the transaction, abort and retry the transaction.
The transactionLifetimeLimitSeconds
also ensures that
expired transactions are aborted periodically to relieve storage cache
pressure.
Note
If you have an uncommitted transaction that causes excessive pressure on the WiredTiger cache, the transaction aborts and returns a write conflict error.
If a transaction is too large to ever fit in the WiredTiger cache,
the transaction aborts and returns a TransactionTooLargeForCache
error.
Transactions and Security
If running with access control, you must have privileges for the operations in the transaction.
If running with auditing, operations in an aborted transaction are still audited. However, there is no audit event that indicates that the transaction aborted.
Shard Configuration Restriction
You cannot run transactions on a sharded cluster that has a shard
with writeConcernMajorityJournalDefault
set to false
(such as a shard with a voting member that uses the in-memory
storage engine).
Sharded Clusters and Arbiters
Transactions whose write operations span multiple shards will error and abort if any transaction operation reads from or writes to a shard that contains an arbiter.
Acquiring Locks
By default, transactions wait up to 5
milliseconds to acquire locks
required by the operations in the transaction. If the transaction
cannot acquire its required locks within the 5
milliseconds, the
transaction aborts.
Transactions release all locks upon abort or commit.
Tip
When creating or dropping a collection immediately before
starting a transaction, if the collection is accessed within the
transaction, issue the create or drop operation with write
concern "majority"
to ensure that the transaction
can acquire the required locks.
Lock Request Timeout
You can use the maxTransactionLockRequestTimeoutMillis
parameter to adjust how long transactions wait to acquire locks.
Increasing maxTransactionLockRequestTimeoutMillis
allows
operations in the transactions to wait the specified time to acquire
the required locks. This can help obviate transaction aborts on
momentary concurrent lock acquisitions, like fast-running metadata
operations. However, this could possibly delay the abort of deadlocked
transaction operations.
You can also use operation-specific timeout by setting
maxTransactionLockRequestTimeoutMillis
to -1
.
Pending DDL Operations and Transactions
If a multi-document transaction is in progress, new DDL operations that
affect the same database(s) or collection(s) wait behind the
transaction. While these pending DDL operations exist, new transactions
that access the same database(s) or collection(s) as the pending DDL
operations cannot obtain the required locks and and will abort after
waiting maxTransactionLockRequestTimeoutMillis
. In
addition, new non-transaction operations that access the same
database(s) or collection(s) will block until they reach their
maxTimeMS
limit.
Consider the following scenarios:
- DDL Operation That Requires a Collection Lock
While an in-progress transaction is performing various CRUD operations on the
employees
collection in thehr
database, an administrator issues thedb.collection.createIndex()
DDL operation against theemployees
collection.createIndex()
requires an exclusive collection lock on the collection.Until the in-progress transaction completes, the
createIndex()
operation must wait to obtain the lock. Any new transaction that affects theemployees
collection and starts while thecreateIndex()
is pending must wait until aftercreateIndex()
completes.The pending
createIndex()
DDL operation does not affect transactions on other collections in thehr
database. For example, a new transaction on thecontractors
collection in thehr
database can start and complete as normal.- DDL Operation That Requires a Database Lock
While an in-progress transaction is performing various CRUD operations on the
employees
collection in thehr
database, an administrator issues thecollMod
DDL operation against thecontractors
collection in the same database.collMod
requires a database lock on the parenthr
database.Until the in-progress transaction completes, the
collMod
operation must wait to obtain the lock. Any new transaction that affects thehr
database or any of its collections and starts while thecollMod
is pending must wait until aftercollMod
completes.
In either scenario, if the DDL operation remains pending for more than
maxTransactionLockRequestTimeoutMillis
, pending
transactions waiting behind that operation abort. That is, the value of
maxTransactionLockRequestTimeoutMillis
must at least cover
the time required for the in-progress transaction and the pending DDL
operation to complete.
In-progress Transactions and Write Conflicts
If a transaction is in progress and a write outside the transaction modifies a document that an operation in the transaction later tries to modify, the transaction aborts because of a write conflict.
If a transaction is in progress and has taken a lock to modify a document, when a write outside the transaction tries to modify the same document, the write waits until the transaction ends.
In-progress Transactions and Stale Reads
Read operations inside a transaction can return old data, which is known as a stale read. Read operations inside a transaction are not guaranteed to see writes performed by other committed transactions or non-transactional writes. For example, consider the following sequence:
A transaction is in-progress.
A write outside the transaction deletes a document.
A read operation inside the transaction can read the now-deleted document since the operation uses a snapshot from before the write operation.
To avoid stale reads inside transactions for a single document, you
can use the db.collection.findOneAndUpdate()
method. The following
mongosh
example demonstrates how you can use
db.collection.findOneAndUpdate()
to take a write lock and ensure
that your reads are up to date:
Use db.collection.findOneAndUpdate()
inside the transaction
employeeDoc = employeesCollection.findOneAndUpdate( { _id: 1, status: "Active" }, { $set: { lockId: ObjectId() } }, { returnNewDocument: true } )
Note that inside the transaction, the findOneAndUpdate
operation
sets a new lockId
field. You can set lockId
field to any
value, as long as it modifies the document. By updating the
document, the transaction acquires a lock.
If an operation outside of the transaction attempts to modify the document before you commit the transaction, MongoDB returns a write conflict error to the external operation.
In-progress Transactions and Chunk Migration
Chunk migration acquires exclusive collection locks during certain stages.
If an ongoing transaction has a lock on a collection and a chunk migration that involves that collection starts, these migration stages must wait for the transaction to release the locks on the collection, thereby impacting the performance of chunk migrations.
If a chunk migration interleaves with a transaction (for instance, if a transaction starts while a chunk migration is already in progress and the migration completes before the transaction takes a lock on the collection), the transaction errors during the commit and aborts.
Depending on how the two operations interleave, some sample errors include (the error messages have been abbreviated):
an error from cluster data placement change ... migration commit in progress for <namespace>
Cannot find shardId the chunk belonged to at cluster time ...
Outside Reads During Commit
During the commit for a transaction, outside read operations may try to read the same documents that will be modified by the transaction. If the transaction writes to multiple shards, then during the commit attempt across the shards:
Outside reads that use read concern
"snapshot"
or"linearizable"
wait until all writes of a transaction are visible.Outside reads that are part of causally consistent sessions (those that include afterClusterTime) wait until all writes of a transaction are visible.
Outside reads using other read concerns do not wait until all writes of a transaction are visible, but instead read the before-transaction version of the documents.
Errors
Use of MongoDB 4.0 Drivers
To use transactions on MongoDB 4.2 deployments (replica sets and sharded clusters), clients must use MongoDB drivers updated for MongoDB 4.2.
On sharded clusters with multiple mongos
instances,
performing transactions with drivers updated for MongoDB 4.0 (instead
of MongoDB 4.2) will fail and can result in errors, including:
Note
Your driver may return a different error. Refer to your driver's documentation for details.
Error Code | Error Message |
---|---|
251 | cannot continue txnId -1 for session ... with txnId 1 |
50940 | cannot commit with no participants |