Mongo Kafka Connector Collection Listen Limitations

Hi,

We have several collections, one set per tenant (n tenants), and want the Kafka connector to watch only specific collections.

Below is my mongosource.properties file, where I have added a pipeline filter to listen only to specific collections. It works.

pipeline=[{"$match":{"ns.coll":{"$in":["ecom-tesla-cms-instance","ca-tesla-cms-instance","ecom-tesla-cms-page","ca-tesla-cms-page"]}}}]
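
For context, the rest of mongosource.properties is just the standard source settings; the connection URI and database name below are placeholders, not my real values:

name=mongo-source
connector.class=com.mongodb.kafka.connect.MongoSourceConnector
connection.uri=mongodb://localhost:27017
database=tesla
# pipeline line from above goes here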

The collections will grow in the future, maybe to 200 collections that have to be watched, so I wanted to know the three things below:

  1. Is there some performance impact with one connector listening to a huge number of collections?
  2. Is there any limit on the number of collections one connector can watch?
  3. What would be the best practice: one connector listening to 100 collections, or 10 different connectors listening to 10 collections each?

Thanks
Harinder Singh

Can we also specify a regex lookup, so that the connector watches all collections matching some regex?
@Robert_Walters

You can accomplish that by setting the database property and then using a pipeline to match the collection name, something like:

"pipeline" : "[ { $match: { \"ns.coll\": { \"$in\": [\"<collection_1>\", \"<collection_2>\", \"<collection_3>\" ] } } } ]",

Is there any limitation on the number of collections one connector can listen to?

No limit. You'll eventually run into a query size limit if you make your pipeline too long (the pipeline is sent as part of the command document, which is bound by the 16 MB BSON limit), but outside of that there is no predefined limit.

What configuration is needed in the sink connector to listen to changes from multiple databases? In the source connector I can use a pipeline filter to select database and collection names and leave the database field empty, but in the sink connector the database field is mandatory. How can we sync changes in multiple databases (more than 4k of them) to the respective database in another cluster? It is impossible to add a separate connector for each database. As per the documentation, the sink connector can listen to multiple topics, but how will it write to different databases?
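
For reference, the source-side setup described above (database omitted so the connector watches the whole deployment, with a pipeline matching on ns.db) looks roughly like this; the cluster address and database prefix are placeholders:

connector.class=com.mongodb.kafka.connect.MongoSourceConnector
connection.uri=mongodb://source-cluster:27017
# database is omitted, so the connector watches every database in the deployment
pipeline=[{"$match":{"ns.db":{"$regex":"^tenant-"}}}]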