I’m trying to put together an aggregation pipeline that filters a large data set, then returns a paginated subset, along with the total count and the earliest timestamp within the collection (regardless of the page). Would very much like to avoid a FETCH.
[
{
$match: {
$and: [
{
"identifier.arrayField": {
$in: ["0000000000001"],
},
},
{
$and: [
{
someArrayField: {
$exists: true,
},
},
{
$and: [
{
someArrayField: {
$elemMatch: {
itemId: {
$in: [
"65353aeabf6d627ba9bdc5cc",
],
},
},
},
},
{
anotherField: {
$ne: "65361492e800c0150c7a7ed",
},
},
],
},
],
},
{
anIndexedField: "value",
},
],
},
},
{
$facet: {
items: [
{
$sort: {
"sort_field": 1,
},
},
{
$skip: 0,
},
{
$limit: 10,
},
],
count: [
{
$count: "count",
},
],
ets: [
{
$group: {
_id: null,
earliestTimestamp: {
$min: "$createDate",
},
},
},
],
},
},
{
$addFields: {
total: {
$arrayElemAt: ["$count.count", 0],
},
earliestTimestamp: {
$arrayElemAt: [
"$ets.earliestTimestamp",
0,
],
},
},
},
]
The resulting explanation is extremely long but I can paste that in if it helps.
Ideally, I would like to avoid fetching the entire matching collection for the sake of a count and getting the earliest timestamp.