I have data split between two Mongo databases. One is relatively small and its working set fits easily in RAM. The other is much bigger, and its working set does not fit in RAM (yes, I know ideally it should!). Various queries are run against the databases, and when only the first is used, everything is fine. When the second is queried in parallel with the first, things are understandably a little slow because the working set doesn’t fit in RAM, but it’s not catastrophic. Over time, though, queries start to pop up that run for hours. These queries are no different from the thousands of queries that are being run continually and that usually execute in milliseconds. If I manually type these same queries into Compass, they run in milliseconds.
I’m not sure if it’s relevant, but these queries have huge values for numYields (in the millions). I guess this is perhaps just a symptom of how long the queries are running for. They also have "ReplicationStateTransition": "w" in their locks section. Is this expected? It seems strange that a query should require a write lock, especially on
These queries start to appear only after a little while (10 to 30 minutes, maybe?) and then gradually accumulate. I think they’re the reason the number of available read tickets also gradually falls until it reaches zero. At that point I assume the load from these queries (if they’re causing load?), the lack of read tickets, or both, mean everything slows down a lot.
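To make the symptoms above concrete, here is a minimal sketch of how the stuck operations and the read ticket count could be pulled out of the server's diagnostic output. It assumes the documents come from the `currentOp` and `serverStatus` commands; the helper names `long_running_ops` and `available_read_tickets` and the 60-second threshold are hypothetical, and the `wiredTiger.concurrentTransactions` path is an assumption that may not hold on every server version.

```python
from typing import Any, Dict, List, Optional


def long_running_ops(inprog: List[Dict[str, Any]], min_secs: int = 60) -> List[Dict[str, Any]]:
    """Summarise operations from the `inprog` array of currentOp output
    that have been running for at least `min_secs` seconds (hypothetical helper)."""
    summary = []
    for op in inprog:
        secs = op.get("secs_running", 0)
        if secs < min_secs:
            continue
        summary.append({
            "opid": op.get("opid"),
            "ns": op.get("ns"),
            "secs_running": secs,
            "numYields": op.get("numYields", 0),
            "locks": op.get("locks", {}),
        })
    # Longest-running operations first.
    return sorted(summary, key=lambda s: s["secs_running"], reverse=True)


def available_read_tickets(server_status: Dict[str, Any]) -> Optional[int]:
    """Return the available read tickets from serverStatus output, or None if
    the assumed wiredTiger.concurrentTransactions section is not present."""
    tickets = server_status.get("wiredTiger", {}).get("concurrentTransactions", {})
    return tickets.get("read", {}).get("available")


# Possible usage against a live server with pymongo (the URI is a placeholder):
#   from pymongo import MongoClient
#   client = MongoClient("mongodb://localhost:27017")
#   ops = client.admin.command("currentOp")["inprog"]
#   for op in long_running_ops(ops, min_secs=600):
#       print(op)
#   print(available_read_tickets(client.admin.command("serverStatus")))
```

Polling both values every minute or so should show whether the ticket count drops in step with the number of long-running operations.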
Anyway, does anyone have any idea what’s going on here?!