$facet compared to $unionWith

Rekha_Madhava · August 29, 2025, 4:15am

I have a pipeline where facets are used in order to sort the documents differently:

[
  {
    "$facet": {
      "non_past": [
        { "$match": { "CategoryOrder": { "$in": [ 1, 2 ] } } },
        { "$sort": { "CategoryOrder": 1, "advance_option.feature.is_featured": -1, "schedule.start_time": 1 } }
      ],
      "past": [
        { "$match": { "CategoryOrder": { "$in": [ 3, 4 ] } } },
        { "$sort": { "CategoryOrder": 1, "advance_option.feature.is_featured": -1, "schedule.start_time": -1 } }
      ]
    }
  }
]

I have used $facet

How does this compare with using two pipelines and combining them with $unionWith, with regard to latency, memory, and CPU usage?

steevej · August 30, 2025, 9:00pm

Often questions like that are better answered by yourself using your own dataset since there is usually not a good answer for all cases.

And this is something very easy to test. Writing a $unionWith version should take only a few hours.

But in my opinion, the $unionWith would be faster because you do not do anything before the $facet.

But in my opinion, you should not trust anyone who does not back his claims with data.

But in my opinion too, $facet might be faster depending on your indexes. Without a proper index, $facet might do a single collection scan and $unionWith will do two. With proper indexes you might end up with 2 index scans, which are probably faster than 1 collection scan.
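
As a minimal sketch of what is being compared (mongosh/JavaScript), the $unionWith form of the original pipeline and two index key patterns that could back each branch might look like the following. The collection name `events` is an assumption, since the thread never names it, and whether the planner actually picks these indexes is something only explain() on the real dataset can confirm.

```javascript
// Hypothetical collection name; the thread never names the real one.
const COLL = "events";

// Branch 1: categories 1 and 2, earliest start_time first.
const nonPast = [
  { $match: { CategoryOrder: { $in: [1, 2] } } },
  { $sort: { CategoryOrder: 1, "advance_option.feature.is_featured": -1, "schedule.start_time": 1 } },
];

// Branch 2: categories 3 and 4, latest start_time first.
const past = [
  { $match: { CategoryOrder: { $in: [3, 4] } } },
  { $sort: { CategoryOrder: 1, "advance_option.feature.is_featured": -1, "schedule.start_time": -1 } },
];

// $unionWith form: branch 1 runs on the collection the aggregate is issued on,
// branch 2 is read again from the same collection and added to the output.
const unionPipeline = [
  ...nonPast,
  { $unionWith: { coll: COLL, pipeline: past } },
];

// Index key patterns that mirror each $sort, so each branch has a chance of
// being served by an index scan instead of a collection scan plus in-memory sort.
const indexSpecs = [nonPast[1].$sort, past[1].$sort];

// In mongosh (assumed usage, not run here):
//   indexSpecs.forEach((spec) => db.events.createIndex(spec));
//   db.events.explain("executionStats").aggregate(unionPipeline);
```

Comparing the explain output of this pipeline with that of the $facet version on the same data is one way to see whether the 2 index scans really replace the collection scan.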

No clear answer, sorry.

If you do test please share the result.

hermann_samimi2 · September 1, 2025, 9:33am

$facet runs multiple sub-pipelines in one pass over the input, so it’s usually more efficient in terms of I/O, latency, and memory than running two separate pipelines and combining them with $unionWith.

$unionWith means executing two full queries and merging the results: higher CPU and memory usage, more round-trips, and slower.

aneroid · September 1, 2025, 10:16am

A $facet stage will process the documents as they are currently in your pipeline. A $unionWith stage will re-read documents from the collection you specify, even if it’s the same one. So you can think of it as an additional ‘read from disk’ for that collection.

From the docs, emphasis mine:

Input documents are passed to the $facet stage only once. $facet enables various aggregations on the same set of input documents, without needing to retrieve the input documents multiple times.

This also means that if you have 20 stages (for example) in your aggregation which transform your data before $facet/$unionWith, then:
- With $facet, you can just add the two pipelines for the two fields; as you have done.
- But for $unionWith, you’ll need to repeat those 20 stages before you can add the second field.
- (If you are adding more fields using $unionWith, all those stages will need to be repeated every time for each union. For $facet, it’s just once.)
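
The points above can be sketched with two small builder functions (JavaScript). Everything named here is hypothetical: `sharedStages` stands in for the "20 stages" with a single illustrative stage, the `status` field and the `events` collection are assumptions, and the branches are shortened versions of the original ones.

```javascript
// Hypothetical stand-in for the shared preprocessing stages.
const sharedStages = [{ $match: { status: "published" } }];

// Shortened versions of the two branches from the original question.
const branches = {
  non_past: [
    { $match: { CategoryOrder: { $in: [1, 2] } } },
    { $sort: { "schedule.start_time": 1 } },
  ],
  past: [
    { $match: { CategoryOrder: { $in: [3, 4] } } },
    { $sort: { "schedule.start_time": -1 } },
  ],
};

// $facet: the shared stages appear once; each branch is a named sub-pipeline.
function withFacet(shared, namedBranches) {
  return [...shared, { $facet: namedBranches }];
}

// $unionWith: the first branch follows the shared stages directly; every
// further branch re-reads the collection, so it must repeat the shared stages.
function withUnion(coll, shared, namedBranches) {
  const [first, ...rest] = Object.values(namedBranches);
  return [
    ...shared,
    ...first,
    ...rest.map((branch) => ({
      $unionWith: { coll, pipeline: [...shared, ...branch] },
    })),
  ];
}

const facetVersion = withFacet(sharedStages, branches);
const unionVersion = withUnion("events", sharedStages, branches);
```

In `unionVersion`, the shared stage shows up once at the top level and once more inside the $unionWith sub-pipeline, which is the repetition described above.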

Exactly this. And make sure your dataset/collection is large enough to show a difference. Like 1-10 million documents.

steevej · September 1, 2025, 2:26pm

You made valid points.

It is right that $facet is usually more efficient, but I am not sure it is in this case. Since the pipeline starts with the $facet, you most likely incur a collection scan, which might fetch all documents. But we know nothing about the dataset and the indexes. My claim is that with proper indexes you may end up with 2 index scans rather than 1 collection scan. In some cases, 2 index scans might be better than 1 collection scan, especially in this case, since the datasets of the 2 facets are mutually exclusive.

That is why testing should be done by the original poster with their own dataset.

Another point about $facet is that it returns a single top-level document holding both result arrays, which might exceed the BSON document size limit. See $facet (expression operator) - Database Manual - MongoDB Docs.

Testing should also try a variant without $unionWith: 2 completely separate aggregations (yes, 2 round trips to the server) that would run in parallel on the server, with the output combined in the application. All this is possible because nothing is done before or after the $facet.
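
A minimal sketch of that two-aggregation variant in JavaScript follows. It assumes `collection` is a MongoDB Node.js driver Collection, or anything else exposing `aggregate(pipeline).toArray()`; the function name `runSplit` is hypothetical, and the returned object simply mimics the shape of the $facet output.

```javascript
// Same two branches as in the original $facet stage.
const nonPast = [
  { $match: { CategoryOrder: { $in: [1, 2] } } },
  { $sort: { CategoryOrder: 1, "advance_option.feature.is_featured": -1, "schedule.start_time": 1 } },
];
const past = [
  { $match: { CategoryOrder: { $in: [3, 4] } } },
  { $sort: { CategoryOrder: 1, "advance_option.feature.is_featured": -1, "schedule.start_time": -1 } },
];

// Issue both aggregations without waiting for the first to finish,
// then combine the two result arrays in the application.
async function runSplit(collection) {
  const [nonPastDocs, pastDocs] = await Promise.all([
    collection.aggregate(nonPast).toArray(),
    collection.aggregate(past).toArray(),
  ]);
  return { non_past: nonPastDocs, past: pastDocs };
}
```

Timing `runSplit` against the $facet and $unionWith versions on the same dataset would give the third data point suggested here.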

It might be even possible to improve the $facet.

I would try to move the $sort on CategoryOrder and advance_option… before the $facet.
Then I would have 4 facets rather than 2, one for each CategoryOrder; this would limit the risk of busting the BSON limit. Each facet would only $sort on start_time, which could improve the in-memory sort if the indexes cannot be used.

Thanks for pitching in, we all get better when we oppose opinions and ideas.

This was your first post. Keep it going, that's how we like it.

steevej · September 1, 2025, 2:34pm

Many valid points. Thanks for pitching in.

Valid point 1

Yes, but in this case the documents are mutually exclusive, and doing 2 index scans might avoid reading the documents from disk, while a collection scan might need to fetch all documents. I do not know enough about the dataset and its indexes to recommend anything other than testing it.

Valid point 2

Completely agree, hence

Thanks again! I really hope the original author will test and report here.

Rekha_Madhava · September 3, 2025, 5:51pm

Tested and found that $unionWith was slightly better than $facet in providing the output. The $unionWith version included a repeat of the previous stages in my query. However, the larger intent of the question was to understand how Mongo actually handles the data internally in both cases. A ChatGPT reading had this: Key Difference

$facet: blocking, because it must assemble a single combined result document.
$unionWith: streaming, because it just concatenates cursors from multiple pipelines into a single stream.

If this is correct, $facet also requires more memory to hold the data until the processing is complete.
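
To make the difference in output shape concrete, here is a hedged sketch (JavaScript) of the extra stages needed to turn the single $facet result document back into one document per result, which is the shape $unionWith produces directly. The field name `all` is hypothetical, and this illustrates the output shape only, not the server's internal memory behaviour.

```javascript
// The original $facet stage: its output is one document of the form
// { non_past: [...], past: [...] }.
const facetStage = {
  $facet: {
    non_past: [
      { $match: { CategoryOrder: { $in: [1, 2] } } },
      { $sort: { CategoryOrder: 1, "advance_option.feature.is_featured": -1, "schedule.start_time": 1 } },
    ],
    past: [
      { $match: { CategoryOrder: { $in: [3, 4] } } },
      { $sort: { CategoryOrder: 1, "advance_option.feature.is_featured": -1, "schedule.start_time": -1 } },
    ],
  },
};

// Stages that unpack that single document into a stream of result documents.
const flattenFacet = [
  // Join the two arrays into one array field (hypothetical name "all").
  { $project: { all: { $concatArrays: ["$non_past", "$past"] } } },
  // Emit one document per array element.
  { $unwind: "$all" },
  // Promote each element back to a top-level document.
  { $replaceRoot: { newRoot: "$all" } },
];

const facetPipeline = [facetStage, ...flattenFacet];
```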

Thanks to all who provided some insight into my query

steevej · September 3, 2025, 7:53pm

Thanks for the results.

You may mark your own post as the solution.