I realized that when I run a find query with a filter, the sorting and all the processing actually happen on the client. What should I do when I have a huge cluster of nodes and more than 1M documents?
This is a pretty broad statement. Can you provide a minimal set of data/code for replication of this?
It has not been my experience that this occurs.
Let's say Jack sent 3 million messages to John through my server API.
Now John opens the client and sends a TCP message to the server to fetch the 30 newest records. The problem is that Mongo fetches all of them, and by all I mean ALL the documents in the collection; then when you call .Limit(), it cuts off the array. How can I tell Mongo to fetch only 30 on the server?
NOTICE: I am using the C# driver
Hi @Rayyan_Khan,
Can you provide some code for context?
Filtering happens on the server side by default. You can confirm how a query is processed by Explaining Results.
For example, create a collection with a million documents using mongosh:
for (let i=0; i<1000000; i++) {
db.million.insertOne({i:i, widget:'blue'})
}
Find the first 30 documents matching widget:blue:
db.million.find({widget: 'blue'}).limit(30)
I haven’t created an index on the widget field yet, so the MongoDB server will have to use a collection scan strategy to find matching documents. However, since I specified a limit (and no other query criteria), a result can be returned after 30 matching documents are found:
> db.million.find({widget:'blue'}).limit(30).explain(true).executionStats.totalDocsExamined
30
My result seems OK, but in my sample data I have widget:blue in every document. If I search for criteria that do not match any documents, the worst-case scenario is a full collection scan:
> db.million.find({widget:'green'}).limit(30).explain(true).executionStats.totalDocsExamined
1000000
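To illustrate why those two counts differ, here is a toy model in plain JavaScript of a scan that stops once the limit is reached. It is only a sketch of the idea, not how the MongoDB server is implemented; scanWithLimit and the in-memory docs array are made up for this example.

```javascript
// Toy model of a collection scan with a limit (hypothetical helper,
// NOT MongoDB's actual implementation).
function scanWithLimit(docs, predicate, limit) {
  const results = [];
  let examined = 0;
  for (const doc of docs) {
    examined++; // every document visited counts as "examined"
    if (predicate(doc)) {
      results.push(doc);
      if (results.length === limit) break; // stop as soon as the limit is met
    }
  }
  return { results, examined };
}

// Same shape as the sample data above: every document has widget 'blue'.
const docs = Array.from({ length: 1000000 }, (_, i) => ({ i: i, widget: 'blue' }));

const blue = scanWithLimit(docs, (d) => d.widget === 'blue', 30);
const green = scanWithLimit(docs, (d) => d.widget === 'green', 30);

console.log(blue.examined);  // 30: the scan stops after 30 matches
console.log(green.examined); // 1000000: no match, so every document is visited
```

The point is that the limit lets the scan stop early only when enough matches turn up; with no matches, nothing short of an index avoids visiting every document.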
However, if this is a common query shape I would add an index:
> db.million.createIndex({widget:1})
widget_1
… and now the same query only has to look at the index to confirm there are no matching documents:
> db.million.find({widget:'green'}).limit(30).explain(true).executionStats.totalDocsExamined
0
Appropriate indexes to create will depend on your common query criteria and options like sort order.
Since you mentioned “30 most newest”, I assume you will be sorting on a date field and will want to add an index covering your search criteria as well as the sort order. See Use Indexes to Sort Query Results for some relevant examples and Tips and Tricks for Query Performance: Let Us .explain() Them for a helpful presentation around explaining queries.
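As a rough sketch of what that could look like for your messages case: I don't know your schema, so the messages collection and the to and sentAt field names below are assumptions, and newestMessages is just a hypothetical helper that takes the mongosh db object.

```javascript
// Assumed schema (hypothetical): documents in `messages` with a recipient
// field `to` and a date field `sentAt`.
//
// One-time setup in mongosh, so the filter and the sort can both use an index:
//   db.messages.createIndex({ to: 1, sentAt: -1 })

// Builds the "newest n messages for a recipient" query. The filter, sort and
// limit are all part of the query sent to the server, so only n documents
// should come back to the client.
function newestMessages(db, recipient, n = 30) {
  return db.messages
    .find({ to: recipient })
    .sort({ sentAt: -1 }) // newest first
    .limit(n);
}
```

In mongosh you could then run newestMessages(db, 'john').explain('executionStats') and check totalDocsExamined to see whether the index is being used for your data.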
Regards,
Stennie
Ohh, well then that explains my question, ty!