How to optimize an aggregation pipeline that uses the $facet operator?

In our project we use the aggregation pipeline and $facet operator extensively.

When we run a pipeline with the explain: true option and look at the query plan, the output related to $facet is really limited and does not mention anything about index usage.

How do you effectively analyze aggregation pipelines with the $facet step?

Hi @Alex_Bjorlig, thanks for your interest, and glad to see you back in the forum. While I cannot say much just yet, we will very soon be releasing something that leverages search indexes and facets in Lucene, which I think your team will be able to use. Stay tuned.

Hi @Alex_Bjorlig,

Can you provide an example of your aggregation pipeline and confirm your version of MongoDB server? The actual explain output would also be helpful if you have a specific query about outcomes.

Per the documentation on $facet Index Use (as at MongoDB 5.0):

The $facet stage, and its sub-pipelines, cannot make use of indexes, even if its sub-pipelines use $match or if $facet is the first stage in the pipeline. The $facet stage will always perform a COLLSCAN during execution.

If you have initial $match & $sort stages before your $facet, those can be candidates for index usage and should be covered in the explain output. For more information, see Pipeline Operators and Indexes and Aggregation Pipeline Optimization.
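
As a rough sketch of that layout, the index-eligible stages go first and $facet only processes the documents that survive them. The collection name `orders` and the fields `status` and `createdAt` are hypothetical and not taken from this thread. The explain command is built as a plain dict; with PyMongo it could be passed to `db.command(...)`, assuming a connected database handle.

```python
# Hypothetical names: collection "orders", fields "status" and "createdAt".

def build_pipeline(status, page_size):
    """Put $match and $sort first so they are candidates for index use;
    $facet then only processes the filtered, sorted documents."""
    return [
        {"$match": {"status": status}},
        {"$sort": {"createdAt": -1}},
        {"$facet": {
            "rows": [{"$limit": page_size}],
            "totalCount": [{"$count": "n"}],
        }},
    ]


def build_explain_command(collection, pipeline, verbosity="queryPlanner"):
    """Wrap an aggregate in the server's `explain` command document."""
    return {
        "explain": {"aggregate": collection, "pipeline": pipeline, "cursor": {}},
        "verbosity": verbosity,
    }


pipeline = build_pipeline("shipped", 20)
explain_cmd = build_explain_command("orders", pipeline)

# With PyMongo and a connected database handle (assumed, not shown here):
#   plan = db.command(explain_cmd)
```

In the resulting plan, the stages ahead of $facet are the part to inspect for an index scan versus a collection scan. A compound index on the hypothetical `status` and `createdAt` fields is the kind of index those leading stages could use.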

As @Marcus notes, there is also some development progress toward Faster Faceting in Atlas Search which may also be of interest when available.

Regards,
Stennie

Thanks for the awesome answers. Now it makes more sense that we get a bunch of warnings about collection scans, because we use facets for every operation. We implement cursor-based pagination, and in a facet operation we:

  1. Look up the actual rows to return
  2. The total count
  3. The paginated count
  4. Optional facet results.
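
For anyone following along, the four parts listed above might look roughly like this as a single $facet stage. This is only a guess at the shape, not our actual pipeline: the field names (`_id` as the cursor key, `category` as a facet field), the cursor filter, and the reading of "paginated count" as the number of documents after the cursor are all assumptions made for illustration.

```python
def build_pagination_facet(after_id, page_size, facet_field=None):
    """Build one $facet stage with rows, counts, and optional facet buckets.

    All field names and the cursor logic are illustrative assumptions.
    """
    # Cursor-based pagination: only documents after the last seen _id.
    after_cursor = {"$match": {"_id": {"$gt": after_id}}}

    facet = {
        # 1. The actual rows to return for this page.
        "rows": [after_cursor, {"$sort": {"_id": 1}}, {"$limit": page_size}],
        # 2. The total count of documents reaching the $facet stage.
        "totalCount": [{"$count": "n"}],
        # 3. The paginated count (assumed: documents after the cursor).
        "paginatedCount": [after_cursor, {"$count": "n"}],
    }
    # 4. Optional facet results, bucketed by a hypothetical field.
    if facet_field is not None:
        facet["facets"] = [{"$sortByCount": "$" + facet_field}]
    return {"$facet": facet}


stage = build_pagination_facet(after_id=100, page_size=20, facet_field="category")
```

Every sub-pipeline receives the same input documents, and per the documentation quoted above none of them can use an index, which is consistent with the collection scan warnings.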

Are there any feature requests or issues tracking the fact that $facet sub-pipelines can’t use an index?

Maybe @Stennie_X can pull strings to get you into the private preview, given every query has this syntax!

We are already thinking about how to rewrite the resolvers, because this is not something we do in Atlas Search, but “just” $facet stages in the aggregation pipeline.