We have an internal processing engine that pulls 5,000 source data IDs and then queries MongoDB for each one's information for further processing. Fetching one source_data_id takes 1 second, and the processing that follows completes in about 10 seconds.
The flow is:
List the source_data_id values from the CSV → fetch the 1st source_data_id's info from Mongo → run the batch job → repeat for the 2nd source_data_id, and so on.
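The sequential flow above could be sketched roughly as follows (a minimal Python sketch; `fetch_from_mongo` and `run_batch_job` are hypothetical stand-ins for the real Mongo lookup and the processing step, which are not shown in the question):

```python
import csv

def fetch_from_mongo(source_data_id):
    # Hypothetical stand-in for the MongoDB lookup (~1 s each in the real system).
    return {"_id": source_data_id, "payload": f"data-{source_data_id}"}

def run_batch_job(doc):
    # Hypothetical stand-in for the ~10 s processing step.
    return f"processed {doc['_id']}"

def process_csv(path):
    results = []
    with open(path, newline="") as f:
        for row in csv.reader(f):
            source_data_id = row[0]
            doc = fetch_from_mongo(source_data_id)  # one lookup at a time
            results.append(run_batch_job(doc))      # then one job at a time
    return results
```

At roughly 11 seconds per ID, this sequential loop is what makes 5,000 IDs slow, which motivates the move to parallelism below.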
It's currently a sequential process. We are moving to parallel processing, which could open up to 5,000 MongoDB connections at the same time via AWS Lambda.
Is there a way to handle these connections better, or to reuse existing connections, if I batch them 1,000 at a time?
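To illustrate the batching idea in the question: one common pattern is to create the Mongo client once per Lambda container (at module scope, so warm invocations reuse its connection pool) and to bound concurrency with a fixed-size worker pool instead of opening one connection per ID. A minimal Python sketch, with `lookup` as a hypothetical stand-in for the real Mongo query:

```python
from concurrent.futures import ThreadPoolExecutor

# In a real Lambda handler file, the client would be created once at module
# scope so warm invocations reuse it rather than reconnecting, e.g.:
#   from pymongo import MongoClient
#   client = MongoClient(MONGO_URI, maxPoolSize=50)

def lookup(source_data_id):
    # Hypothetical stand-in for the actual MongoDB query against `client`.
    return {"_id": source_data_id}

def chunks(items, size):
    """Yield successive batches of at most `size` items."""
    for i in range(0, len(items), size):
        yield items[i:i + size]

def process_in_batches(ids, batch_size=1000, workers=50):
    results = []
    # One shared executor caps in-flight lookups at `workers`, so the number
    # of simultaneous connections stays bounded regardless of batch size.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for batch in chunks(ids, batch_size):
            results.extend(pool.map(lookup, batch))
    return results
```

The batch size of 1,000 matches the figure in the question; the worker count and `maxPoolSize` values are illustrative assumptions to be tuned against the Mongo server's connection limits.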