Scaling Kafka Source Connector

Punith_Kumar · June 3, 2020, 4:37pm

Hi,
I currently work in the WalmartLabs Software Division.

We are trying to read data from a MongoDB collection using the Kafka Source Connector https://docs.mongodb.com/kafka-connector/master/kafka-source/. We are noticing that one task is able to read only 500 documents per second. We have no custom filters and are not doing any processing on the documents read from the MongoDB Change Stream. We also see no spike in CPU or memory on the VM where the Kafka Connector is running. So below are some questions:

Are there any benchmark numbers for the Kafka Source Connector in terms of documents read per second from MongoDB?
Are there any performance best practices for the Kafka Source Connector that would maximize the number of documents read per second?
Is there a reason why the Kafka Source Connector is built as a singleton task?
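
For context on the questions above, a source connector configuration of the kind described might look like the sketch below. The property names are ones documented for the MongoDB Kafka source connector; the connector name, URI, database, collection, and every value shown are placeholder assumptions for illustration, not the settings used in this setup and not tuning recommendations.

```properties
# Hypothetical connector name and connection details (placeholders)
name=mongo-source-example
connector.class=com.mongodb.kafka.connect.MongoSourceConnector
connection.uri=mongodb://localhost:27017
database=exampledb
collection=examplecollection

# The post observes a single task doing all the reading
tasks.max=1

# Settings that plausibly affect read throughput (illustrative values only)
# Maximum number of change stream documents returned per poll
poll.max.batch.size=1000
# How long to wait for new change stream results before returning a batch
poll.await.time.ms=5000
# Cursor batch size used when reading from the change stream
batch.size=0
```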

We request your help in this regard; it will unblock our development and enable us to deliver quality software.

Punith

Hamid_Jawaid · July 8, 2020, 3:55pm

How many tasks are spawned?
How many cores does the CPU have?