How to count documents with the same article_id, then take the first one of each group?

I have some data like this:

{
    "uid": 1056066,
    "event_start_time": 1677207512684,
    "event_end_time": 1677207512684,
    "article_id": 5760884
}

// 2
{
    "uid": 1056066,
    "event_start_time": 1677210902918,
    "event_end_time": 1677210902918,
    "article_id": 5760884
}

// 3
{
    "uid": 1056066,
    "event_start_time": 1677211072966,
    "event_end_time": 1677211072966,
    "article_id": 5763688
}

// 4
{
    "uid": 1056066,
    "event_start_time": 1677217109856,
    "event_end_time": 1677217109856,
    "article_id": 5234061
}

// 5
{
    "uid": 1056066,
    "event_start_time": 1677217227239,
    "event_end_time": 1677217227239,
    "article_id": 5376768
}

// 6
{
    "uid": 1056066,
    "event_start_time": 1677217374833,
    "event_end_time": 1677217374833,
    "article_id": 4341130
}

I want to add the number of documents that share the same article_id as a field on each document, then keep only the first document for each article_id (ordered by event_end_time descending).

Here is the code I have written so far:

db.event_log.aggregate([
	{$match: {uid: 1056066}},
	{$project: {
			_id: false, 
			uid: true, 
			article_id: '$event_info.article_id', 
			event_start_time: true, event_end_time: true
		}
	},
	{$sort: {end_time: -1}}
])

This is the goal data:

{
    "uid": 1056066,
    "event_start_time": 1677210902918,
    "event_end_time": 1677210902918,
    "article_id": 5760884,
    "size": 2
}

{
    "uid": 1056066,
    "event_start_time": 1677211072966,
    "event_end_time": 1677211072966,
    "article_id": 5763688,
    "size": 1
}

{
    "uid": 1056066,
    "event_start_time": 1677217109856,
    "event_end_time": 1677217109856,
    "article_id": 5234061,
    "size": 1
}

{
    "uid": 1056066,
    "event_start_time": 1677217227239,
    "event_end_time": 1677217227239,
    "article_id": 5376768,
    "size": 1
}

{
    "uid": 1056066,
    "event_start_time": 1677217374833,
    "event_end_time": 1677217374833,
    "article_id": 4341130,
    "size": 1
}
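
To make the goal concrete, here is a minimal plain-JavaScript sketch of the intended result, with no MongoDB involved. The helper name `latestPerArticle` and the `sample` array are hypothetical and only for illustration; the documents are the first three from the data above, in their flattened shape.

```javascript
// Plain-JavaScript sketch of the goal (hypothetical helper, not a MongoDB API).
// Input uses the flattened shape shown above: uid, event_start_time,
// event_end_time, article_id.
function latestPerArticle(docs) {
  const byArticle = new Map();
  // Visit documents from newest to oldest event_end_time.
  const sorted = [...docs].sort((a, b) => b.event_end_time - a.event_end_time);
  for (const doc of sorted) {
    const seen = byArticle.get(doc.article_id);
    if (seen) {
      // Another document with the same article_id: only bump the counter.
      seen.size += 1;
    } else {
      // First time we see this article_id, so this is its newest document.
      byArticle.set(doc.article_id, { ...doc, size: 1 });
    }
  }
  return [...byArticle.values()];
}

const sample = [
  { uid: 1056066, event_start_time: 1677207512684, event_end_time: 1677207512684, article_id: 5760884 },
  { uid: 1056066, event_start_time: 1677210902918, event_end_time: 1677210902918, article_id: 5760884 },
  { uid: 1056066, event_start_time: 1677211072966, event_end_time: 1677211072966, article_id: 5763688 },
];

console.log(latestPerArticle(sample));
```

For article_id 5760884 this keeps the document with event_end_time 1677210902918 and sets size to 2, which matches the first goal document.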

Hope someone can help me. Thanks :slight_smile:

I finally got the goal data I wanted, but the aggregation steps seem complex and redundant, so I am wondering if there is a simpler way to get this data :slight_smile:

Here is the code:

db.event_log.aggregate([
	{$match: {uid: 1056066}},
	{$project: {
			_id: false, 
			uid: true, 
			article_id: '$event_info.article_id', 
			event_start_time: true, 
			event_end_time: true
		}
	},
	{$sort: {end_time: -1}},
	{
		$group: {
			_id: '$article_id',
			total: {$sum: 1},
			uid: {
				$last: '$$ROOT.uid'
			},
			article_id: {
				$last: '$$ROOT.article_id'
			},
			event_start_time: {
				$last: '$$ROOT.event_start_time'
			},
			event_end_time: {
				$last: '$$ROOT.event_end_time'
			}
		}
	}
])
_id	total	uid	article_id	event_start_time	event_end_time
4300316	1	1056066	4300316	1677467266155	1677467266155
5343571	1	1056066	5343571	1677228247711	1677228247711
3632323	1	1056066	3632323	1677467620237	1677467620237
5002174	1	1056066	5002174	1677639036940	1677639036940
2334732	1	1056066	2334732	1677652265659	1677652265659
5264876	1	1056066	5264876	1677652331254	1677652331254
5763688	1	1056066	5763688	1677211072966	1677211072966
5197303	1	1056066	5197303	1677224565511	1677224565511
4338164	1	1056066	4338164	1677639003483	1677639003483
4595799	4	1056066	4595799	1677570628426	1677570628426
5361738	1	1056066	5361738	1677463603175	1677463603175
5768443	1	1056066	5768443	1677637687937	1677637687937
5689085	1	1056066	5689085	1677462042636	1677462042636
4186307	3	1056066	4186307	1677227656475	1677227656475
669526	1	1056066	669526	1677463695072	1677463695072
5241195	1	1056066	5241195	1677549155401	1677549155401
5167535	1	1056066	5167535	1677462079466	1677462079466
5363806	1	1056066	5363806	1677217613371	1677217613371
5179192	1	1056066	5179192	1677467302756	1677467302756
5343345	4	1056066	5343345	1677570486194	1677570486194

I see a bug in your code.

Your $sort is on end_time, but end_time is not projected in the previous stage. That means you are sorting on a non-existent field.

:joy: Oh, end_time is a name I had given to event_end_time earlier when renaming the field; I must have forgotten to change it back. That must be why the data came back in the default order, which is why I used $last instead of $first.
Thank you!
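
Putting the correction from this thread together, a shorter pipeline might look like the sketch below. It is only a sketch under assumptions: it sorts on event_end_time (the real field name), uses $first on "$$ROOT" to keep the newest document per group, and assumes article_id lives at event_info.article_id as in the original $project stage. The names `pipeline` and `latest` are introduced here for illustration.

```javascript
// Sketch of a shorter pipeline; field names follow the thread, and
// event_info.article_id is assumed to be where article_id lives in event_log.
const pipeline = [
  { $match: { uid: 1056066 } },
  // Sort on the real field name so the newest event comes first.
  { $sort: { event_end_time: -1 } },
  {
    $group: {
      _id: "$event_info.article_id",
      size: { $sum: 1 },            // number of documents sharing this article_id
      latest: { $first: "$$ROOT" }, // newest document in the group, given the sort
    },
  },
  {
    // Flatten the kept document back into the goal shape.
    $project: {
      _id: false,
      uid: "$latest.uid",
      event_start_time: "$latest.event_start_time",
      event_end_time: "$latest.event_end_time",
      article_id: "$_id",
      size: true,
    },
  },
];

// Run it only when a mongosh-style `db` object is available.
if (typeof db !== "undefined") {
  db.event_log.aggregate(pipeline).forEach((doc) => printjson(doc));
}
```

The order in which $group emits its groups is not guaranteed, so if the final result needs a particular order, a trailing $sort stage (for example on event_end_time) would have to be added.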