MongoDB Connector (source)


I’ve been using the MongoDB Connector for Kafka Connect for a while on a Kubernetes cluster (using the Strimzi operator for deployment/config). Until now all seems to have been working perfectly well… tbh it still is working well until I hit very high load. In this case I am seeing the distribution of messages across topic partitions to be uneven. I would say that 50% of the partitions are not really being utilised.

According the the Kafka Connect docs it is down to the producer to define the partitioner in use and I do not see a place this could be configured with the MongoDB connector.

So my question is this… is the connector using the DefaultPartitioner, and if so is it possible to force round-robin behaviour?

Thanks. :slight_smile:

As an update I figured out that the normal Kafka producer config can be passed down to the connector specifying the desired partitioner (partitioner.class). However I am still seeing half of the partitions for a topic unused. As a test I manually published to one of the unused partitions which worked fine.

So the question remains… why would half of the partitions for a perfectly valid Kafka topic not be published to by the Kafka Connect connector?

what partitioner did you set for partitioner.class?

I had thought that I was using the default class, however in the end I found an override that was using the round-robin partitioner. After dropping that and reverting to the defaults the distribution of messages was pretty even across the partitions. :slight_smile: