Docs Menu
Docs Home
/
MongoDB Manual
/ / /

$changeStreamSplitLargeEvent (aggregation)

On this page

  • Definition
  • Behavior
  • Example
$changeStreamSplitLargeEvent

New in version 6.0.9.

If a change stream has large events that exceed 16 MB, a BSONObjectTooLarge exception is returned. Starting in MongoDB 6.0.9, you can use a $changeStreamSplitLargeEvent stage to split the events into smaller fragments.

You should only use $changeStreamSplitLargeEvent when strictly necessary. For example, if your application requires full document pre- or post-images, and generates large events that exceed 16 MB, use $changeStreamSplitLargeEvent.

Before you decide to use $changeStreamSplitLargeEvent, you should first try to reduce the change event size. For example:

  • Don't request document pre- or post-images unless your application requires them. This generates fullDocument and fullDocumentBeforeChange fields in more cases, which are typically the largest objects in a change event.

  • Use a $project stage to include only the fields necessary for your application. This reduces the change event size and avoids the additional time to split large events into fragments. This allows more change events to be returned in each batch.

You can only have one $changeStreamSplitLargeEvent stage in your pipeline, and it must be the last stage. You can only use $changeStreamSplitLargeEvent in a $changeStream pipeline.

$changeStreamSplitLargeEvent syntax:

{
$changeStreamSplitLargeEvent: {}
}

$changeStreamSplitLargeEvent splits events that exceed 16 MB into fragments and returns the fragments sequentially using the change stream cursor.

The fragments are split so that the maximum number of fields are returned in the first fragment. This ensures the event context is returned as quickly as possible.

When the change event is split, only the size of top-level fields are used. $changeStreamSplitLargeEvent does not recursively process or split subdocuments. For example, if you use a $project stage to create a change event with a single field that is 20 MB in size, the event is not split and the stage returns an error.

Each fragment has a resume token. A stream that is resumed using a fragment's token will either:

  • Begin a new stream from the subsequent fragment.

  • Start at the next event if resuming from the final fragment in the sequence.

Each fragment for an event includes a splitEvent document:

splitEvent: {
fragment: <int>,
of: <int>
}

The following table describes the fields.

Field
Description

fragment

Fragment index, starting at 1.

of

Total number of fragments for the event.

The example scenario in this section shows the use of $changeStreamSplitLargeEvent with a new collection named myCollection.

Create myCollection and insert one document with just under 16 MB of data:

db.myCollection.insertOne(
{ _id: 0, largeField: "a".repeat( 16 * 1024 * 1024 - 1024 ) }
)

largeField contains the repeated letter a.

Enable changeStreamPreAndPostImages for myCollection, which allows a change stream to retrieve a document as it was before an update (pre-image) and after an update (post-image):

db.runCommand( {
collMod: "myCollection",
changeStreamPreAndPostImages: { enabled: true }
} )

Create a change stream cursor to monitor changes to myCollection using db.collection.watch():

myChangeStreamCursor = db.myCollection.watch(
[ { $changeStreamSplitLargeEvent: {} } ],
{ fullDocument: "required", fullDocumentBeforeChange: "required" }
)

For the change stream event:

  • fullDocument: "required" includes the document post-image.

  • fullDocumentBeforeChange: "required" includes the document pre-image.

For details, see $changeStream.

Update the document in myCollection, which also produces a change stream event with the document pre- and post-images:

db.myCollection.updateOne(
{ _id: 0 },
{ $set: { largeField: "b".repeat( 16 * 1024 * 1024 - 1024 ) } }
)

largeField now contains the repeated letter b.

Retrieve the fragments from myChangeStreamCursor using the next() method and store the fragments in objects named firstFragment, secondFragment, and thirdFragment:

const firstFragment = myChangeStreamCursor.next()
const secondFragment = myChangeStreamCursor.next()
const thirdFragment = myChangeStreamCursor.next()

Show firstFragment.splitEvent:

firstFragment.splitEvent

Output with the fragment details:

splitEvent: { fragment: 1, of: 3 }

Similarly, secondFragment.splitEvent and thirdFragment.splitEvent return:

splitEvent: { fragment: 2, of: 3 }
splitEvent: { fragment: 3, of: 3 }

To examine the object keys for firstFragment:

Object.keys( firstFragment )

Output:

[
'_id',
'splitEvent',
'wallTime',
'clusterTime',
'operationType',
'documentKey',
'ns',
'fullDocument'
]

To examine the size in bytes for firstFragment.fullDocument:

bsonsize( firstFragment.fullDocument )

Output:

16776223

secondFragment contains the fullDocumentBeforeChange pre-image, which is approximately 16 MB in size. The following example shows the object keys for secondFragment:

Object.keys( secondFragment )

Output:

[ '_id', 'splitEvent', 'fullDocumentBeforeChange' ]

thirdFragment contains the updateDescription field, which is approximately 16 MB in size. The following example shows the object keys for thirdFragment:

Object.keys( thirdFragment )

Output:

[ '_id', 'splitEvent', 'updateDescription' ]

For more information about change streams and events, see Change Events.

Back

$changeStream