I've been struggling to see why the aggregation queries issued by my service take so long to execute. Let me give you a bit of context.
A Java service runs aggregations (always with allowDiskUse: true) through the MongoDB Java Driver against the secondaries of a replica set (not sharded).
So far so good: indexes are in place and everything looks fine. However, the service has a 45-second timeout for everything related to Mongo data retrieval, and this is where things get dark. Nearly every aggregation requested by the service times out (> 45 seconds). If you log into the secondaries and check the current ops, most of the running aggregations have been going for more than 5 minutes! The service does manage to retrieve data at times, but not always.
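To illustrate the gap between the 45-second timeout on the service side and the > 5 minute ops on the secondaries: a timeout that only stops the caller from waiting does not by itself stop the work it was waiting on. Here is a minimal sketch of that effect in plain Java, with no MongoDB driver involved. The class name, the method name, and the millisecond figures are made up for illustration, and the sleeping task is only a stand-in for an aggregation. I'm not claiming this is what the driver does internally.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class ClientTimeoutSketch {

    // Hypothetical helper: runs a slow task (stand-in for an aggregation),
    // waits for it with a caller-side timeout, and reports whether the task
    // was still running at the moment the caller gave up.
    static boolean stillRunningAfterTimeout(long workMillis, long timeoutMillis) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Future<String> result = pool.submit(() -> {
                Thread.sleep(workMillis); // pretend this is the slow query
                return "done";
            });
            try {
                result.get(timeoutMillis, TimeUnit.MILLISECONDS);
                return false; // finished before the caller's timeout
            } catch (TimeoutException e) {
                // The caller stopped waiting, but nothing cancelled the task.
                return !result.isDone();
            } catch (InterruptedException | ExecutionException e) {
                throw new IllegalStateException(e);
            }
        } finally {
            pool.shutdownNow(); // clean up the worker thread for this demo
        }
    }

    public static void main(String[] args) {
        System.out.println("still running after timeout: "
                + stillRunningAfterTimeout(1000, 200));
    }
}
```

If something similar applies here, it would at least be consistent with the ops staying visible on the secondaries long after the service has reported its timeout, though it says nothing about why they are slow in the first place.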
Now, if I log into the secondaries with the Mongo CLI from a terminal and run the exact same aggregation, the times are consistent: about 3 seconds, 4 at most, for the aggregation to complete!
How can the difference be so large? What could be different or wrong in the driver setup to cause this?