Aggregation Vs. Chaining

Ndifreke_Essien · May 25, 2020, 8:49pm

What is the difference between using chaining and aggregation for data transformation? For example:

Using chaining

db.collection.find({}).limit(1).sort({ createdAt: 1 })

Using aggregation

db.collection.aggregate([
  { $sort: { createdAt: 1 } },
  { $limit: 1 }
])

Could someone please help demystify this for me? Thanks.

michael_hoeller · May 25, 2020, 9:39pm

Hi @Ndifreke_Essien, welcome to the community!

Chaining aggregates documents from a single collection. While these operations provide simple access to common aggregation processes, they lack the flexibility and capabilities of the aggregation pipeline and map-reduce. You will find further details here in the MongoDB documentation.

Hope this helps as a starter
Michael

Stennie_X · May 26, 2020, 2:14pm

Hi @Ndifreke_Essien,

The db.collection.find() helper in the mongo shell happens to use method chaining (aka a fluent API), but it would be possible to implement similar syntactic sugar for aggregation pipelines.
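To illustrate what such syntactic sugar might look like, here is a minimal sketch in plain JavaScript. PipelineBuilder is a hypothetical helper invented for this example; it is not part of the mongo shell or any driver. It only accumulates stage documents and hands back an ordinary pipeline array.

```javascript
// Hypothetical helper (an assumption for illustration, not a MongoDB API):
// a tiny fluent wrapper that accumulates aggregation stages in call order.
class PipelineBuilder {
  constructor() {
    this.stages = [];
  }

  match(filter) {
    this.stages.push({ $match: filter });
    return this; // returning `this` is what makes chaining possible
  }

  sort(spec) {
    this.stages.push({ $sort: spec });
    return this;
  }

  limit(n) {
    this.stages.push({ $limit: n });
    return this;
  }

  build() {
    // Return a copy so later chaining cannot mutate an already-built pipeline.
    return this.stages.slice();
  }
}

const builtPipeline = new PipelineBuilder()
  .sort({ createdAt: 1 })
  .limit(1)
  .build();

console.log(JSON.stringify(builtPipeline));
// In the shell, the array could then be passed along as:
//   db.collection.aggregate(builtPipeline)
```

The builder adds nothing the plain array does not already express, which is part of why writing the pipeline directly is the usual approach.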

The general distinction between these two commands is that find (historically, at least) does not perform any data transformation and has fewer options. Chained methods like sort() and limit() set some of the options used to construct a query cursor, and a subset of document fields can be specified in the query projection. If you need to return results without any data transformation, find is the straightforward choice.
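One consequence of chained methods only setting cursor options is that the order of sort() and limit() in the chain does not matter for find: the sort is applied before the limit either way. The sketch below imitates that behaviour with an in-memory array. SketchCursor and the sample documents are assumptions made up for this example, it handles only a single numeric sort field, and it is not how the shell is actually implemented.

```javascript
// Hypothetical in-memory stand-in for a query cursor (illustration only).
class SketchCursor {
  constructor(docs) {
    this.docs = docs;
    this.options = {}; // chained calls only fill in this options object
  }

  sort(spec) {
    this.options.sort = spec;
    return this;
  }

  limit(n) {
    this.options.limit = n;
    return this;
  }

  // Nothing is evaluated until results are requested.
  toArray() {
    let result = this.docs.slice();
    if (this.options.sort) {
      // Simplification: support one numeric field, direction 1 or -1.
      const [[field, direction]] = Object.entries(this.options.sort);
      result.sort((a, b) => (a[field] - b[field]) * direction);
    }
    // The limit is applied after sorting, whatever the chaining order was.
    if (this.options.limit > 0) {
      result = result.slice(0, this.options.limit);
    }
    return result;
  }
}

const sampleDocs = [
  { _id: 1, createdAt: 30 },
  { _id: 2, createdAt: 10 },
  { _id: 3, createdAt: 20 },
];

const limitFirst = new SketchCursor(sampleDocs).limit(1).sort({ createdAt: 1 }).toArray();
const sortFirst = new SketchCursor(sampleDocs).sort({ createdAt: 1 }).limit(1).toArray();

console.log(limitFirst, sortFirst); // both hold the document with the smallest createdAt
```

An aggregation pipeline differs here: stages run in array order, so placing $limit before $sort would sort only the documents that the limit let through.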

The aggregate command includes a wide variety of pipeline stages and expression operators to allow you to reshape and transform documents. Since new aggregation stages and expressions continue to be added in successive major MongoDB releases, a fluent API will become outdated more quickly than constructing pipelines directly. Aggregation pipelines are designed to process larger result sets, so they also have options like allowDiskUse to enable writing data to temporary files if needed, as well as output stages like $out (output results to a new collection) and $merge (merge results into a specified collection). If you are doing any significant data transformation, an aggregation pipeline is the best approach.
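As a rough sketch of what that looks like in practice, the pipeline below reshapes documents with $group and writes the result out with $merge. The collection and field names (orders, status, customerId, amount, customer_totals) are hypothetical and chosen only for illustration; the pipeline is built as a plain JavaScript array so it can be inspected without a database.

```javascript
// All collection and field names here are assumptions for illustration.
const reportPipeline = [
  // Keep only the documents of interest.
  { $match: { status: "shipped" } },
  // Transform: collapse many order documents into one summary per customer.
  {
    $group: {
      _id: "$customerId",
      totalSpent: { $sum: "$amount" },
      orderCount: { $sum: 1 },
    },
  },
  { $sort: { totalSpent: -1 } },
  // Output stage: $merge must be the last stage in the pipeline.
  { $merge: { into: "customer_totals" } },
];

// Allow stages to spill to temporary files if they exceed the memory limit.
const reportOptions = { allowDiskUse: true };

console.log(reportPipeline.map((stage) => Object.keys(stage)[0]));
// In the mongo shell this could be run as:
//   db.orders.aggregate(reportPipeline, reportOptions)
```

None of the grouping or output steps can be expressed with find and its chained cursor methods, which is where the two commands diverge.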

However, there has been some convergence in features of these two commands over time. MongoDB 3.6 introduced the $expr query operator which allows the use of aggregation expressions within a find query, and MongoDB 4.4 adds an allowDiskUse cursor option to allow find queries to write temporary data to disk if needed for in-memory sorts.
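A small sketch of that convergence, assuming hypothetical fields named spent and budget: the same $expr filter document can be handed to find or placed in a $match stage. The filter is defined as a plain JavaScript object here, and the shell calls are shown only as comments.

```javascript
// Field names (spent, budget) are assumptions for illustration.
// $expr lets a query compare two fields of the same document
// using aggregation expression syntax.
const overBudget = { $expr: { $gt: ["$spent", "$budget"] } };

// The same condition expressed as an aggregation pipeline.
const overBudgetPipeline = [{ $match: overBudget }];

console.log(JSON.stringify(overBudget));

// In the mongo shell (MongoDB 3.6+ for $expr):
//   db.collection.find(overBudget)
//   db.collection.aggregate(overBudgetPipeline)
//
// With MongoDB 4.4+, a find with a large in-memory sort can opt in to disk use:
//   db.collection.find(overBudget).sort({ spent: -1 }).allowDiskUse()
```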

Regards,
Stennie

Ndifreke_Essien · June 4, 2020, 7:51pm

Alright, thanks so much, I get it now.
