Hi @Dmytro_Sheyko ,
Interesting question!
An aggregation can be thought of as a pipeline through which the documents flow. Each stage works on the set (and shape) of documents returned by the previous stage, not on the original collection.
Consider the scenario where you have the following documents:
{ _id: 1, user_name: "John", department: "Biology", score: 87},
{ _id: 2, user_name: "Harry", department: "Physics", score: 60},
{ _id: 3, user_name: "Roger", department: "Biology", score: 44},
{ _id: 5, user_name: "Jenny", department: "History", score: 82},
{ _id: 6, user_name: "Srivi", department: "Biology", score: 78},
{ _id: 7, user_name: "Tom", department: "History", score: 80}
Now, if we want the average passing score of the users in each department (treating a score above 75 as passing), we can run the following aggregation query:
db.user_data.aggregate([
  { $match: { score: { $gt: 75 } } },
  { $group: { _id: "$department", average_passing_mark: { $avg: "$score" } } }
])
In this aggregation pipeline, all the documents pass through the $match stage first. Only the documents with _id 1, 5, 6, and 7 have a score greater than 75, so only these are passed on to the $group stage, which groups them by department and averages their scores.
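To make the flow between the two stages concrete, here is a minimal sketch in plain JavaScript that imitates what each stage does to the sample documents. It does not use MongoDB at all; the helper names matchStage and groupStage are made up for illustration, and the server's real implementation is of course different.

```javascript
// Sample documents copied from the post above.
const userData = [
  { _id: 1, user_name: "John", department: "Biology", score: 87 },
  { _id: 2, user_name: "Harry", department: "Physics", score: 60 },
  { _id: 3, user_name: "Roger", department: "Biology", score: 44 },
  { _id: 5, user_name: "Jenny", department: "History", score: 82 },
  { _id: 6, user_name: "Srivi", department: "Biology", score: 78 },
  { _id: 7, user_name: "Tom", department: "History", score: 80 },
];

// Hypothetical helper imitating { $match: { score: { $gt: 75 } } }.
function matchStage(docs) {
  return docs.filter((doc) => doc.score > 75);
}

// Hypothetical helper imitating
// { $group: { _id: "$department", average_passing_mark: { $avg: "$score" } } }.
function groupStage(docs) {
  const totals = new Map();
  for (const doc of docs) {
    const entry = totals.get(doc.department) ?? { sum: 0, count: 0 };
    entry.sum += doc.score;
    entry.count += 1;
    totals.set(doc.department, entry);
  }
  return [...totals].map(([department, { sum, count }]) => ({
    _id: department,
    average_passing_mark: sum / count,
  }));
}

// The output of the first stage is the input of the second stage.
const matched = matchStage(userData);
const grouped = groupStage(matched);

console.log(matched.map((doc) => doc._id)); // [ 1, 5, 6, 7 ]
console.log(grouped);
// [ { _id: 'Biology', average_passing_mark: 82.5 },
//   { _id: 'History', average_passing_mark: 81 } ]
```

Note that the $group stage never sees Harry or Roger, because $match has already dropped them. Also, a real $group does not guarantee the order of its output documents, so add a $sort stage if the order matters.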
The idea of having multiple stages is to isolate the operations that we are going to perform on the documents, one after another. Within each stage, we can use the aggregation operators to construct expressions.
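As a small illustration of an operator used inside a stage, the sketch below adds a third stage to the same pipeline. The $project stage and its use of $round are my own addition, not something from your question, and $round needs MongoDB 4.2 or later as far as I know.

```javascript
// The first two stages are the ones from the example above.
// The third stage is a hypothetical addition: it uses the $round expression
// operator on the field produced by $group, keeping one decimal place.
const pipeline = [
  { $match: { score: { $gt: 75 } } },
  { $group: { _id: "$department", average_passing_mark: { $avg: "$score" } } },
  { $project: { average_passing_mark: { $round: ["$average_passing_mark", 1] } } },
];

// In mongosh this would be run as: db.user_data.aggregate(pipeline)
console.log(JSON.stringify(pipeline, null, 2));
```

Here $match, $group and $project are the stages, while $gt, $avg and $round are operators used inside them. The $project stage can only refer to average_passing_mark because the $group stage before it created that field.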
I hope it helps!
Please feel free to reach out if you have any questions.
Kind Regards,
Sonali