Difference between find() vs aggregate() Lecture: Cursor-like stages: Part 1

Ping_Pong · July 25, 2021, 4:44pm

It is easy to understand the operators used in the aggregate() call below:
db.solarSystem.aggregate([
{$project:{name:1,_id:0}},
{$skip:1},
{$limit:2},
{$sort:{name:1}},
{$count:"count"}
])

However, the result from find() is a surprise to me. Can anyone explain the issues below?

db.solarSystem.find({},{name:1,_id:0})
.skip(1)
.limit(2)
.sort({name:1})

Issue 1:
The result of the find() above shows that all documents are sorted before skip and limit are applied.

db.solarSystem.find({},{name:1,_id:0})
.skip(1)
.limit(2)
.sort({name:1})
.count()

Issue 2:
The result is 9, which is the count after find() but before skip and limit are applied.

Takis · July 26, 2021, 2:12pm

Hello

In an aggregation pipeline it's clear what the order will be.

Besides aggregate we have two other database commands:
1) the find command
2) the count command

The find command can take sort, limit, skip, projection and query.
The count command can take query, limit and skip.

When you give the driver something like the call below (a driver method call):

db.solarSystem.find({},{name:1,_id:0})
.skip(1)
.limit(2)
.sort({name:1})
.count()

MongoDB will see a command that looks like the one below:

{
  count: 'solarSystem',
  query: {},
  limit: 2,
  skip: 1,
  lsid: {
    id: new Binary(Buffer.from("c686ad6eb9f140649cefed4ea6f10708", "hex"), 4)
  },
  '$db': 'solar'
}

The driver will ignore the sort and the projection (they don't make sense for a count anyway) and just send a count command to MongoDB.
If you don't put the count at the end, the driver will send one find command to MongoDB.

No matter what order you call the methods in, the order of execution is defined by the find command.
For example, sort is done before limit, which makes sense.

The aggregate and find methods are lazy: with the cursor methods you say what you want to do,
but nothing goes to the database yet. Only when it's needed, for example when you ask for results,
does the driver take all that information, build a command and run it, and then you get the results.
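
To make the laziness concrete, here is a minimal sketch in plain JavaScript. LazyCursor and its methods are hypothetical stand-ins, not the actual driver classes; the sketch only assumes that the chained calls record options and that the command document is assembled later with a fixed set of fields.

```javascript
// Hypothetical stand-in for a driver cursor: chained calls only record options.
class LazyCursor {
  constructor(collection, filter, projection) {
    this.collection = collection;
    this.opts = { filter, projection };
  }
  skip(n) { this.opts.skip = n; return this; }
  limit(n) { this.opts.limit = n; return this; }
  sort(spec) { this.opts.sort = spec; return this; }

  // Built only when results are requested; the order of the calls is not kept.
  toFindCommand() {
    const { filter, projection, sort, skip, limit } = this.opts;
    return { find: this.collection, filter, projection, sort, skip, limit };
  }

  // A count command has no field for sort or projection, so they are dropped.
  toCountCommand() {
    const { filter, skip, limit } = this.opts;
    return { count: this.collection, query: filter, skip, limit };
  }
}

const a = new LazyCursor("solarSystem", {}, { name: 1, _id: 0 })
  .skip(1).limit(2).sort({ name: 1 });
const b = new LazyCursor("solarSystem", {}, { name: 1, _id: 0 })
  .sort({ name: 1 }).limit(2).skip(1);

// Different call order, identical command document.
console.log(JSON.stringify(a.toFindCommand()) === JSON.stringify(b.toFindCommand())); // true
console.log(a.toCountCommand());
```

Because the command is a document with named fields rather than a list of steps, there is nowhere to store the order in which skip, limit and sort were called.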

Prasad_Saya · July 26, 2021, 2:37pm

Hello.

The aggregation query is a pipeline: at each stage documents are received, processed and forwarded to the next stage. So, you can apply a stage’s functionality and see the result of that stage immediately. This can be observed clearly in MongoDB Compass’s Aggregation Builder.

About the find with similar looking query options:

db.solarSystem.find({},{name:1,_id:0})
  .skip(1)
  .limit(2)
  .sort({name:1})

The find method returns a cursor. And you are applying the skip, limit and sort cursor methods. The behavior of these methods is quite different from that of the similar sounding aggregation stages.

The cursor is a pointer (an interface) to the data of the result set on the database server. You get the cursor on the client (e.g., mongo shell) from where you submitted the query.

And, you must apply skip, limit and sort to the cursor before retrieving any documents from the database. Also, no matter in what order you apply these three methods, the result will be the same: the server always sorts first, then skips, then limits.
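
The difference can be sketched on an in-memory array in plain JavaScript. The document names and their stored order below are assumptions for illustration; the real solarSystem collection may differ.

```javascript
// Assumed sample data; the real collection's contents and order may differ.
const docs = ["Sun", "Mercury", "Venus", "Earth", "Mars",
              "Jupiter", "Saturn", "Uranus", "Neptune"].map(name => ({ name }));

const byName = (a, b) => a.name.localeCompare(b.name);

// aggregate(): stages run in the order written ($skip, $limit, then $sort).
const pipelineOrder = docs.slice(1).slice(0, 2).sort(byName);

// find(): sort is applied first, then skip, then limit, whatever the call order.
const findOrder = [...docs].sort(byName).slice(1).slice(0, 2);

console.log(pipelineOrder.map(d => d.name)); // [ 'Mercury', 'Venus' ]
console.log(findOrder.map(d => d.name));     // [ 'Jupiter', 'Mars' ]
```

With this assumed data, the pipeline only sorts the two documents that survived $skip and $limit, while find() picks the second and third documents of the fully sorted result.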

db.solarSystem.find({},{name:1,_id:0})
  .skip(1)
  .limit(2)
  .sort({name:1})
  .count()

The count method returns the number of documents matching the query, before the skip and limit operations are applied. To get the actual count you need to use the size method (and this would be 2 for the above query). The size method calculates the count after applying the limit and skip operations on the cursor. There is also an itcount method, which gives the result as 2.
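
The arithmetic behind these numbers can be modelled in a few lines of plain JavaScript. makeCursor is a hypothetical helper, not a shell API; it only mimics the legacy mongo shell behaviour in which count() ignores skip and limit unless called as count(true), while size() always applies them.

```javascript
// Hypothetical model of a legacy-shell cursor over `total` matching documents.
function makeCursor(total, skip = 0, limit = 0) {
  // Documents the cursor would actually return (limit 0 means no limit).
  const afterSkip = Math.max(total - skip, 0);
  const returned = limit > 0 ? Math.min(afterSkip, limit) : afterSkip;
  return {
    // count() ignores skip/limit unless applySkipLimit is true.
    count: (applySkipLimit = false) => (applySkipLimit ? returned : total),
    // size() always honours skip/limit.
    size: () => returned,
  };
}

const cursor = makeCursor(9, 1, 2);
console.log(cursor.count());     // 9
console.log(cursor.count(true)); // 2
console.log(cursor.size());      // 2
```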

Please refer to the documentation for more details.

Ping_Pong · July 26, 2021, 9:20pm

And, you must apply skip, limit and sort to the cursor before retrieving any documents from the database.

My understanding is that the code below via aggregate() is performed on the MongoDB server side.
db.solarSystem.aggregate([
{$project:{name:1,_id:0}},
{$skip:1},
{$limit:2},
{$sort:{name:1}},
{$count:"count"}
])

Do you mean the code below can also have skip, limit and sort before retrieving any documents from the database? If so, how? Please elaborate on it if possible.

db.solarSystem.find({},{name:1,_id:0})
.skip(1)
.limit(2)
.sort({name:1})

Prasad_Saya · July 27, 2021, 4:25am

Hello @Ping_Pong,

I suggest you try out some code yourself and see how these work. For example, with the aggregation query apply one stage at a time and see what the results are.

db.collection.aggregate([
    {$project:{title:1,_id:0}},
])

db.collection.aggregate([
    {$project:{title:1,_id:0}},
    {$skip:1},
])

And, apply the remaining stages one at a time and see the results. Then, try the same procedure with the find query also, and compare the results.

One more point to note is that both of the following methods return a cursor.

db.collection.find()
db.collection.aggregate()

Ping_Pong · July 27, 2021, 10:14am

Thanks for your reply, but I don’t think I got the answer.

Anyway, it would be a bonus to understand what you said about the find() method, but I guess aggregate() is the better one to use.

Prasad_Saya · July 27, 2021, 10:21am

@Ping_Pong, the aggregate and find methods have their own use cases depending upon the functionality you are trying to achieve. For example, aggregation queries allow complex data transformation, and that is not possible with find. As you use them you will get to know when to use which. That said, there are quite a few cases where both of them provide the same functionality.