Distribute aggregations in round-robin mode across replicas

I have a large dataset on which I need to run aggregation pipelines in batch workflows, and I am considering using the secondary nodes for these aggregations.

I ran a test by assigning a tag to each node, and in the code I use the tag to distribute the load round-robin across the three nodes.

// mongosh
conf = rs.conf();
conf.members[0].tags = { "myTag": "0" };
conf.members[1].tags = { "myTag": "1" };
conf.members[2].tags = { "myTag": "2" };
rs.reconfig(conf);
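// pseudocode
i := 0
pipelines := [pipeline1, pipeline2, pipeline3, ... ]
FOR pipeline IN pipelines
   i := (i + 1) % 3
   // tag values are stored as strings, so convert the counter;
   // copy it into a local so each async task keeps its own value
   tag := toString(i)
   async {
     collection.secondaryPreferred.tagSet({ myTag: tag }).aggregate(pipeline)
   }
END FOR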
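
The pseudocode above could be sketched in JavaScript roughly as follows. This is only a sketch under assumptions: `planRoundRobin` is a hypothetical helper (not a driver or mongosh API), the example pipelines are placeholders, and the tag values are assumed to be the strings "0", "1", "2" set in the mongosh snippet. It only computes which tag set each pipeline would be sent with; the actual aggregate call is left as a comment because it depends on the driver in use.

```javascript
// Hypothetical helper: pairs each pipeline with a read preference whose
// tag set targets one replica set member, cycling through the members.
function planRoundRobin(pipelines, nodeCount) {
  return pipelines.map((pipeline, index) => ({
    pipeline,
    readPreference: {
      mode: "secondaryPreferred",
      // Replica set tag values are strings, so the counter is converted.
      tags: [{ myTag: String(index % nodeCount) }],
    },
  }));
}

// Placeholder pipelines, for illustration only.
const examplePipelines = [
  [{ $match: { year: 2021 } }],
  [{ $match: { year: 2022 } }],
  [{ $match: { year: 2023 } }],
  [{ $match: { year: 2024 } }],
];

const plan = planRoundRobin(examplePipelines, 3);

// With the Node.js driver, each entry could then be run along these lines
// (assumes `collection` and `ReadPreference` come from the `mongodb` package):
//   const { mode, tags } = entry.readPreference;
//   collection.aggregate(entry.pipeline, {
//     readPreference: new ReadPreference(mode, tags),
//   }).toArray();
for (const entry of plan) {
  console.log(entry.readPreference.tags[0].myTag);
}
```

Note that with secondaryPreferred, a tag set that matches only the current primary falls back to the primary, so one of the three slots ends up on the primary.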

Do you think this technique is appropriate?
Are there any reasons why I shouldn’t do it?

Thank you very much.

Hi @Giacomo_Benvenuti,
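The data will have to be consistent across all three nodes if you want to use this method. Secondaries replicate asynchronously from the primary, so a read served by a secondary can return stale data.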

From documentation:

Regards

Hello Fabio, thank you. Certainly, the data that the pipelines work on must be “cold,” meaning already aligned across all the nodes.
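
One way to sanity-check that the nodes are aligned before starting the batch is to compare the last applied operation time of each secondary with that of the primary. The sketch below is an assumption-laden illustration: `secondaryLagSeconds` is a hypothetical helper, and the sample hosts and timestamps are made up. It relies only on the `name`, `stateStr`, and `optimeDate` fields that `rs.status()` reports for each member.

```javascript
// Hypothetical helper: given an object shaped like the result of rs.status(),
// returns how many seconds each secondary is behind the primary.
function secondaryLagSeconds(status) {
  const primary = status.members.find((m) => m.stateStr === "PRIMARY");
  if (!primary) {
    throw new Error("no primary found in replica set status");
  }
  return status.members
    .filter((m) => m.stateStr === "SECONDARY")
    .map((m) => ({
      name: m.name,
      // Subtracting two Date objects yields milliseconds.
      lagSeconds: (primary.optimeDate - m.optimeDate) / 1000,
    }));
}

// Made-up sample input, for illustration only.
const sampleStatus = {
  members: [
    { name: "node0:27017", stateStr: "PRIMARY", optimeDate: new Date("2024-01-01T00:00:10Z") },
    { name: "node1:27017", stateStr: "SECONDARY", optimeDate: new Date("2024-01-01T00:00:10Z") },
    { name: "node2:27017", stateStr: "SECONDARY", optimeDate: new Date("2024-01-01T00:00:07Z") },
  ],
};

const lag = secondaryLagSeconds(sampleStatus);
console.log(lag);
```

In mongosh the same function could be called as `secondaryLagSeconds(rs.status())`. A lag of zero only shows that the secondaries had caught up at that moment, so it is meaningful mainly when nothing is writing to the collections being aggregated.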
