Handling MongoDB connections at scale

We have an internal processing engine that pulls 5,000 source data IDs and then queries MongoDB for each ID's information for further processing. Fetching one source data ID's info takes about 1 second, and the processing that follows takes about 10 seconds.

The flow is:
List the source_data_ids from a CSV → fetch the 1st source_data_id's info from Mongo → run the batch job → repeat for the 2nd source_data_id, and so on (a minimal sketch follows).
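
Roughly, the current loop looks like this (a minimal sketch; `source_ids.csv`, the `mydb.sources` collection, and `run_batch_job` are placeholders for our actual file, collection, and processing step):

```python
import csv

from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
coll = client.mydb.sources

def run_batch_job(doc):
    ...  # placeholder: ~10 s of processing per document

with open("source_ids.csv") as f:
    for row in csv.reader(f):
        doc = coll.find_one({"source_data_id": row[0]})  # ~1 s fetch
        run_batch_job(doc)
```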

This is currently a sequential process. We are moving to parallel processing, and we may hit 5,000 connections at the same time via AWS Lambda.

Is there a better way to handle these connections, or to reuse existing connections if I batch them 1,000 at a time?
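
Something like the following is what I'm considering (a sketch, assuming pymongo with a thread pool; `run_batch_job` and the connection string are placeholders):

```python
from concurrent.futures import ThreadPoolExecutor

from pymongo import MongoClient

# One shared client per process: pymongo's MongoClient is thread-safe and
# maintains its own pool, so all workers reuse the same connections.
client = MongoClient("mongodb://localhost:27017", maxPoolSize=100)
coll = client.mydb.sources

def run_batch_job(doc):
    ...  # placeholder for our ~10 s processing step

def fetch_and_process(source_data_id):
    doc = coll.find_one({"source_data_id": source_data_id})
    run_batch_job(doc)

def run_in_batches(ids, batch_size=1000, workers=100):
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for i in range(0, len(ids), batch_size):
            # map() drains the whole batch before the next one starts
            list(pool.map(fetch_and_process, ids[i:i + batch_size]))
```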

MongoDB drivers manage and reuse connections on their own through an internal connection pool, and as far as I know they do a good job of it: if a pooled connection is free they reuse it, otherwise they open a new one, up to the configured maximum.
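
In a Lambda context, the usual pattern is to create the client once at module level so warm invocations reuse the same pool instead of opening fresh connections. A minimal sketch, assuming pymongo and a `MONGO_URI` environment variable:

```python
import os

from pymongo import MongoClient

# Created once per Lambda execution environment, outside the handler,
# so warm invocations reuse this client and its connection pool.
client = MongoClient(os.environ["MONGO_URI"])
coll = client.mydb.sources  # placeholder db/collection names

def handler(event, context):
    # Borrows a pooled connection for the query and returns it when done.
    # _id is excluded so the result is JSON-serializable.
    return coll.find_one({"source_data_id": event["source_data_id"]}, {"_id": 0})
```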

You can check the official docs for the connection pool options, e.g. max/min pool size and idle timeout.
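
For illustration, here is how those options can be set in pymongo (the option names are from the driver docs; the values are illustrative, not recommendations):

```python
from pymongo import MongoClient

client = MongoClient(
    "mongodb://localhost:27017",
    maxPoolSize=50,             # hard cap on concurrent connections (driver default: 100)
    minPoolSize=5,              # keep a few connections open even when idle
    maxIdleTimeMS=60_000,       # close pooled connections idle for over 60 s
    waitQueueTimeoutMS=10_000,  # error out if no connection frees up within 10 s
)
```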