I’ve been using the MongoDB Connector for Kafka Connect for a while on a Kubernetes cluster (deployed and configured via the Strimzi operator). Until now everything seems to have been working perfectly well… to be honest it still works well until I hit very high load. At that point the distribution of messages across topic partitions becomes uneven; I would say roughly 50% of the partitions are barely being utilised.
According to the Kafka Connect docs it is down to the producer to define the partitioner in use, and I do not see anywhere this could be configured with the MongoDB connector.
So my question is this… is the connector using the DefaultPartitioner, and if so is it possible to force round-robin behaviour?
As an update: I figured out that the normal Kafka producer config can be passed down to the connector to specify the desired partitioner (partitioner.class), see the sketch below. However, I am still seeing half of the partitions for a topic go unused. As a test I manually published to one of the unused partitions, which worked fine.
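For anyone else looking for where this goes, the snippet below is a minimal sketch of how I passed the partitioner down. It assumes a Strimzi KafkaConnector resource, a Connect worker with connector.client.config.override.policy set to All (needed for the producer.override.* prefix), and placeholder names and connection details:

```yaml
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaConnector
metadata:
  name: mongo-source-connector              # placeholder connector name
  labels:
    strimzi.io/cluster: my-connect-cluster  # placeholder; must match the KafkaConnect cluster name
spec:
  class: com.mongodb.kafka.connect.MongoSourceConnector
  tasksMax: 1
  config:
    connection.uri: "mongodb://mongo:27017" # placeholder connection details
    database: mydb
    collection: mycollection
    topic.prefix: mongo
    # Per-connector producer override (KIP-458); only honoured if the worker
    # allows overrides. At this point I was trying to force round-robin.
    producer.override.partitioner.class: org.apache.kafka.clients.producer.RoundRobinPartitioner
```

The same thing can also be set worker-wide via producer.partitioner.class in the KafkaConnect spec.config, but the per-connector override keeps it scoped to this one connector.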
So the question remains… why would half of the partitions for a perfectly valid Kafka topic not be published to by the Kafka Connect connector?
I had thought that I was using the default partitioner class; however, in the end I found an override that was setting the round-robin partitioner. After dropping that and reverting to the defaults (see below), the distribution of messages was pretty even across the partitions.
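In case it helps anyone else: the fix was simply removing the partitioner override so the producer falls back to its built-in default (key hashing for keyed records, sticky partitioning for null keys on recent Kafka versions). The half-unused pattern looks consistent with the known behaviour where the RoundRobinPartitioner interacts badly with the producer's batching logic and can end up skipping alternate partitions, though I have not dug into that further. A rough sketch of the reverted config, assuming the override lived in the worker config of the Strimzi KafkaConnect resource (it could equally have been a producer.override.* entry on the connector itself):

```yaml
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaConnect
metadata:
  name: my-connect-cluster                  # placeholder name
spec:
  # ...replicas, bootstrapServers, image, etc. unchanged...
  config:
    # still allow per-connector overrides where genuinely needed
    connector.client.config.override.policy: All
    # the override that forced round-robin has been removed:
    # producer.partitioner.class: org.apache.kafka.clients.producer.RoundRobinPartitioner
    # with nothing set, producers use the default partitioner
    # (murmur2 hash of the record key; sticky partitioning for null keys)
```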