How to run two different pipelines from the data returned by the common pipeline

Zhihong_GUO · January 28, 2021, 10:39am

Hello,

I have built a pipeline PA which creates a data set DSA. I need to count the number of items in DSA and return the total to the client, so I created a pipeline PB:

PB = append(PA, bson.M{"$count": "number"})

At the same time, I need to return a subset of DSA to the client, so I use another pipeline PC, which looks like this:

PC = append(PA, bson.M{"$skip": offset})

So PB and PC both work on the same collection: I run PB to build data set DSA and count its items, then I run PC to build DSA again and return the subset starting at offset. I am not sure whether this can be improved. For example, could I create DSA once with PA, then run only {"$count": "number"} on DSA to get the total, and then run only {"$skip": offset} on DSA to get the subset after offset?

Thanks for the support!

James

Prasad_Saya · January 28, 2021, 12:51pm

Hello @Zhihong_GUO, you can try the following approach.

Assume your source collection is “collectionA”. Apply the pipeline PA to “collectionA” to create a materialized view called “DSA”.

Then, you can run one Aggregation query on this view DSA to get the two outputs - one is the count of DSA and the second is the subset of the DSA.

How to get two outputs with one query? Use $facet stage with two facets (“$facet processes multiple aggregation pipelines within a single stage on the same set of input documents.”).
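
Here is a minimal sketch of such a pipeline in Go, for illustration only. To stay runnable without the driver it uses a local alias M as a stand-in for bson.M (both are map[string]interface{}). The helper withFacet, the facet names "total" and "items", and the sample $match stage standing in for PA are all hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// M is a stand-in for bson.M from the MongoDB Go driver (assumption:
// swap in bson.M when building the pipeline for the real driver).
type M = map[string]interface{}

// withFacet returns a copy of the common pipeline pa with a single
// $facet stage appended. The facet names "total" and "items" are
// hypothetical; pick whatever names suit your result type.
func withFacet(pa []M, offset int) []M {
	out := make([]M, 0, len(pa)+1)
	out = append(out, pa...) // copy, so pa itself is never modified
	return append(out, M{
		"$facet": M{
			"total": []M{{"$count": "number"}}, // what PB did
			"items": []M{{"$skip": offset}},    // what PC did
		},
	})
}

func main() {
	// Placeholder for the real PA stages.
	pa := []M{{"$match": M{"status": "active"}}}

	b, err := json.MarshalIndent(withFacet(pa, 20), "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(b))
}
```

The query returns a single document: its "total" field holds the $count output and its "items" field holds the documents after the offset. If DSA is empty, $count emits no document, so "total" comes back as an empty array. Copying PA before appending also avoids a Go slice pitfall: two separate append(PA, ...) calls can share PA's backing array when it has spare capacity.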

Updated:

An option is to use a View. The view would be the “DSA”, which is created once. You then run the same aggregation query on it, using the facets I mentioned above. But unlike a collection, the view's contents are not persisted. The definition says:

A MongoDB view is a queryable object whose contents are defined by an aggregation pipeline on other collections or views. MongoDB does not persist the view contents to disk. A view’s content is computed on-demand when a client queries the view.
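
As a rough sketch, a view is defined with the create database command, which takes viewOn and pipeline fields. The Go snippet below only builds that command document. E and D are local stand-ins for the driver's ordered bson.E and bson.D types, createViewCmd is a hypothetical helper, and the $match stage is a placeholder for PA. With the real driver you would pass the equivalent bson.D to the database's RunCommand method.

```go
package main

import "fmt"

// M, E and D are stand-ins for bson.M, bson.E and bson.D from the
// MongoDB Go driver (assumption: use the driver types in real code).
// D is ordered because the command name must be the first key.
type M = map[string]interface{}

type E struct {
	Key   string
	Value interface{}
}

type D []E

// createViewCmd builds the "create" command that defines a view named
// view on top of the collection source, using pa as the view pipeline.
func createViewCmd(view, source string, pa []M) D {
	return D{
		{"create", view},
		{"viewOn", source},
		{"pipeline", pa},
	}
}

func main() {
	pa := []M{{"$match": M{"status": "active"}}} // placeholder for PA
	for _, e := range createViewCmd("DSA", "collectionA", pa) {
		fmt.Printf("%s: %v\n", e.Key, e.Value)
	}
}
```

Because the view is computed on demand, PA still runs every time DSA is queried. The view saves repeating the pipeline definition, not the work.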

Zhihong_GUO · January 28, 2021, 11:56pm

@Prasad_Saya, many thanks for the suggestion. The $facet operator seems a very promising solution for my problem.
Best,
James

Sudhesh_Gnanasekaran · January 29, 2021, 1:52pm

@Zhihong_GUO FYI

From the Documentation

The $facet stage, and its sub-pipelines, cannot make use of indexes, even if its sub-pipelines use $match or if $facet is the first stage in the pipeline. The $facet stage will always perform a COLLSCAN during execution.