Hi,
I am storing data in a time series collection on a basic 3-node replica set.
The data comes from a Kafka topic at a low rate of 1000 messages/sec.
I have a Kafka-to-MongoDB sink connector running with this config:
{
  "name": "mongo-sink",
  "config": {
    "connector.class": "com.mongodb.kafka.connect.MongoSinkConnector",
    "tasks.max": "1",
    "topics": "mytopic",
    "connection.uri": "mongodb://user:password@rs-1-1:27017,rs-1-2:27017,rs-1-3:27017/?replicaSet=rs-1&w=1&appName=mongosh+2.2.5",
    "database": "mydb",
    "collection": "mycol",
    "key.converter": "org.apache.kafka.connect.storage.StringConverter",
    "value.converter": "org.apache.kafka.connect.json.JsonConverter",
    "value.converter.schemas.enable": "false",
    "document.id.strategy": "com.mongodb.kafka.connect.sink.processor.id.strategy.BsonOidStrategy",
    "document.id.strategy.overwrite.existing": "true",
    "writemodel.strategy": "com.mongodb.kafka.connect.sink.writemodel.strategy.InsertOneDefaultStrategy",
    "delete.on.null.values": "false",
    "timeseries.timefield": "_timestamp",
    "timeseries.metafield": "_metadata",
    "timeseries.timefield.auto.convert": true,
    "errors.tolerance": "all",
    "max.batch.size": "10000",
    "batch.size": "2000"
  }
}
I have tried a few different configurations (more tasks with smaller batches, and fewer tasks with bigger batches). In every case I end up in the following situation:
- The ingestion rate is regular for a few minutes, and then starts to go all over the place.
- The ingestion works in small spikes but cannot keep up with new data coming into Kafka, as shown by the Kafka consumer offset continuously falling behind the current topic offset.
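To make "falling behind" concrete, here is a minimal sketch of how the lag and its growth rate can be computed from two offset samples. The helper names and the offsets are hypothetical and only illustrate the arithmetic; they are not measurements from this cluster.

```python
def consumer_lag(end_offsets, committed_offsets):
    """Return per-partition lag: log-end offset minus committed offset."""
    return {
        partition: end_offsets[partition] - committed_offsets.get(partition, 0)
        for partition in end_offsets
    }


def lag_growth_per_sec(lag_before, lag_after, interval_sec):
    """Average growth of the total lag, in messages/sec, between two samples."""
    return (sum(lag_after.values()) - sum(lag_before.values())) / interval_sec


# Hypothetical samples taken 60 s apart on a 2-partition topic
# (producer writes 1000 messages/sec in total).
before = consumer_lag({0: 50_000, 1: 50_000}, {0: 49_000, 1: 49_500})
after = consumer_lag({0: 80_000, 1: 80_000}, {0: 60_000, 1: 61_000})
growth = lag_growth_per_sec(before, after, 60)
print(before, after, growth)
```

A positive growth rate means the sink is consuming more slowly than the producer writes; in this made-up example the sink sustains only 375 of the 1000 messages/sec.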
Here is a screenshot of my InfluxDB monitoring dashboard for MongoDB.
The graph is split into 3 parts:
- No data is produced at first.
- Data starts to be ingested correctly at a regular rate (middle part).
- Data ingestion abruptly becomes unstable. Write locks, commands, write latency, and CPU start to climb or show irregular profiles.
More info:
- When the last part starts, the primary MongoDB instance uses 100% of one thread, suggesting a CPU bottleneck.
- Disk I/O is very low, so there is no bottleneck there.
- Host: 128 GB of RAM, 48 CPUs. The MongoDB instances are Docker containers with a limit of 30 GB of RAM each.
- Indexes are around 200 MB.
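If raw numbers are more useful than the dashboard, a minimal sketch like the following could turn two `serverStatus` samples into an insert rate and a count of queued writers. The field names (`opcounters.insert`, `globalLock.currentQueue.writers`) are the ones I understand `serverStatus` to expose and may differ between server versions; the sample values are hypothetical, and `sample_server_status` assumes pymongo and a reachable server.

```python
def counter_rates(prev, curr, interval_sec):
    """Derive rates from two serverStatus documents sampled interval_sec apart."""
    inserts = curr["opcounters"]["insert"] - prev["opcounters"]["insert"]
    return {
        "inserts_per_sec": inserts / interval_sec,
        # Point-in-time value: writers currently waiting for a lock.
        "queued_writers": curr["globalLock"]["currentQueue"]["writers"],
    }


def sample_server_status(uri):
    """Fetch one serverStatus document (needs pymongo and a live server)."""
    from pymongo import MongoClient  # lazy import: counter_rates has no dependency

    return MongoClient(uri).admin.command("serverStatus")


# Hypothetical samples 10 s apart, trimmed to the fields used above.
prev = {"opcounters": {"insert": 120_000}, "globalLock": {"currentQueue": {"writers": 0}}}
curr = {"opcounters": {"insert": 126_000}, "globalLock": {"currentQueue": {"writers": 7}}}
rates = counter_rates(prev, curr, 10)
print(rates)
```

Against a real deployment, `prev` and `curr` would come from two calls to `sample_server_status` on the primary, spaced by the chosen interval.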
Here are my questions:
- I understand that MongoDB has some kind of non-parallel, lock-based write mechanism. If so, is it expected to have this much impact on this kind of workload?
- Does MongoDB run background tasks (in general) that may block the ingestion process while they work? If so, why do they kick in this abruptly?
- Considering the low ingest rate (1000 messages/sec), is this normal? Should I use a sharded cluster to get multi-threaded writes?
- Any other clues?
Thank you for your time. I can provide more info if needed.