100% client CPU, little server CPU, nodejs event queue blocked

Hello All,

First post so please be nice:)

I am trying to debug an odd problem we're having, and to be honest I'm struggling. We're performing an aggregate query and, depending on the exact query (it can be altered by the end user), the client CPU goes to 100% and the Node event loop is blocked, while CPU load on the server is barely noticeable.

So, let's accept for the moment that the query in question is rubbish and really slow: why would the client go to 100% CPU and block the Node event loop?

The code is something like:

async function RunReport(report) {
    c.LogMe('pre RunReport');
    var cursor = await c.db.collection("activity").aggregate(report);
    c.LogMe('post RunReport');
    var answer = await cursor.toArray();
    c.LogMe('post cursor');
    return answer;
}

The CPU load and blocking is at the line:

var answer = await cursor.toArray();

The result of the query is always small (there is a limit in the aggregate), and if I've interpreted the docs correctly, I can't see how this line could block even if it has to wait some time for the results to be returned.
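In case it helps, here is a minimal sketch of how we could check that the event loop really is stalled while the query runs (the 100 ms heartbeat and 50 ms threshold are arbitrary values I picked, and report stands for the user-built pipeline):

// Event-loop lag probe: if nothing blocks the loop, the callback fires
// roughly every 100 ms; a long gap means the loop was stalled.
let last = Date.now();
setInterval(() => {
    const now = Date.now();
    const lag = now - last - 100;
    if (lag > 50) {
        console.log(`event loop stalled for ~${lag} ms`);
    }
    last = now;
}, 100);

// report is the user-built pipeline mentioned above
RunReport(report).then((rows) => console.log('report returned', rows.length, 'rows'));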

Our Environment:

Server: MongoDB Atlas 4.2.18
Client: Node.js v16.13.2, mongodb@3.6.5

Thanks in advance,

jez.

Hi @Jeremy_White ,

Investigating performance issues is not as simple as looking at the code from the outside.

First we will need to get the full aggregation pipeline you pass to the aggregate command.

The toArray method is the place where the query is actually executed; it waits for the full result set, since toArray is not a cursor-like iterator.
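As a rough sketch only (it reuses the c.db handle and "activity" collection from your post, and handleDoc is a placeholder callback), the cursor can be consumed one document at a time instead of materialising everything with toArray:

async function RunReportStreaming(report, handleDoc) {
    const cursor = c.db.collection("activity").aggregate(report);
    // Each hasNext()/next() awaits at most one batch from the server,
    // so other work on the event loop can interleave between documents.
    while (await cursor.hasNext()) {
        const doc = await cursor.next();
        handleDoc(doc);   // placeholder for per-document processing
    }
    await cursor.close();
}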

Limiting results is not always enough, as the stages before the limit can consume massive resources while the client waits. What is the client spec?

What MongoDB deployment are you running this against? What are the topology and machine sizes? Please also provide sample documents and output.

Thanks
Pavel

Hi Pavel,

Thanks for the reply.

So this query is basically built by the user, i.e., by adding various filters, and it's one of these filters that is causing a slow-running query. There is an expectation from the user that some of these queries will take time to run, and there are appropriate UI controls in place to show this.

I am happy to share the pipeline; however, what we're really confused/concerned about is why there is 100% load on the client side and why the Node.js event loop is blocked (while the DB is doing little).

It may be my misunderstanding, but with the line:

await cursor.toArray();

I assume that it effectively yields to the Node event loop (allowing other tasks/queries to take place) until the cursor has processed all entries. Or are you saying toArray() is effectively a blocking operation and we should use another approach to yield back to the event loop while the query is running on the DB?

Putting the question slightly differently: if I have a pipeline that takes 10 seconds to run, how do I run it without blocking the Node.js event loop for 10 seconds (with the Node.js process at 100% load)?

Cheers,

jez.

I am not a Mongo expert, but:

  1. It looks like an operation that blocks your app because you're using await (this is how it should work)…

  2. If you want to keep the flow that way and just not block the app, use Promises instead of await; with Promises the code doesn't stop, it reaches the return statement once the data comes back from Mongo.

  3. You need to take into account the size of the results, because you're keeping them in memory, which could indirectly cause high CPU usage.

  4. Is there any way you can limit the number of results, to ensure fetching data cannot be abused? (See the sketch after this list.)
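For point 4, a possible sketch of capping a user-built pipeline before running it (MAX_ROWS is a placeholder value, not a recommendation):

const MAX_ROWS = 1000;   // hypothetical cap, tune to your use case

function cappedPipeline(report) {
    // Append a $limit stage so a user-built report can never return
    // (and buffer in client memory) more than MAX_ROWS documents.
    return [...report, { $limit: MAX_ROWS }];
}

// usage: RunReport(cappedPipeline(report))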


Thanks for the reply. So, I was under the assumption that toArray returns a promise, and the await, well, waits until the promise is resolved? It shouldn't block the entire Node.js event loop, i.e., a state where nothing else can run and no other incoming events are processed.

Yes, this may be something I'm misunderstanding; I was expecting the whole pipeline to be run on the server side, not partly on the client. I will look into this more.

I meant to use Promises like this:

async function RunReport(report) {
    c.db.collection("activity").aggregate(report).toArray().then((docs) => {
        // do something with the results…
    });
}

In this case, the code doesn't block the app while fetching the data…

Anyway, I see that toArray() is a pretty heavy operation by itself:
toArray()

There are other async solutions for that, like:
MongoDB forum discussion

Or you can iterate the cursor one by one and accumulate your result batch as you want:
Cursor one-by-one fetch
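Something along these lines, as a sketch (it assumes the async-iterator support that the Node driver's cursors have in recent 3.x releases; collectReport and maxDocs are placeholder names):

async function collectReport(report, maxDocs = 500) {
    const cursor = c.db.collection("activity").aggregate(report);
    const batch = [];
    // Accumulate documents one by one instead of calling toArray(),
    // and stop early once we have enough.
    for await (const doc of cursor) {
        batch.push(doc);
        if (batch.length >= maxDocs) break;
    }
    await cursor.close();
    return batch;
}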

I am having the same problem with Node.js, whereas Python and Java are working fine.

I have also created a Stack Overflow post here:
https://stackoverflow.com/questions/71966751/nodeexpressmongodb-native-client-performance-issue