I have a MongoDB collection called chatMessages
with the following schema (using Mongoose on Node.js):
const schema = {
  _id: ObjectID,
  chatID: String,
  type: String,
  message: String,
  senderID: String,
  date: Date
}
Imagine that I have 5 documents in the collection:
[
  {
    _id: '18c9bb49-03f4-4dc4-b0ba-a3c45c324347',
    chatID: 'alternateChat',
    type: 'txt',
    message: 'first message',
    senderID: 'first_sender_id',
    date: new Date('0001-01-02T01:01:11.001Z')
  },
  {
    _id: '18c9bb4w-03f4-4dc4-b0ba-a3c45c324347',
    chatID: 'alternateChat',
    type: 'groupedMsgs',
    message: 'second message',
    senderID: 'first_sender_id',
    date: new Date('0001-01-03T01:01:11.001Z')
  },
  {
    _id: '18c9bb4w-03g4-4dc4-b0ba-a3c45c324347',
    chatID: 'alternateChat',
    type: 'groupedMsgs',
    message: 'third message',
    senderID: 'first_sender_id',
    date: new Date('0001-01-04T01:01:11.001Z')
  },
  {
    _id: '18c9bb4w-03g4-4dc4-b0ba-a3c45c324347',
    chatID: 'alternateChat',
    type: 'txt',
    message: 'fourth message',
    senderID: 'first_sender_id',
    date: new Date('0001-01-05T01:01:11.001Z')
  },
  {
    _id: '18c9bb4w-03g4-4dc4-10ba-a3c45c324347',
    chatID: 'alternateChat',
    type: 'groupedMsgs',
    message: 'fifth message',
    senderID: 'first_sender_id',
    date: new Date('0001-01-06T01:01:11.001Z')
  }
]
I want to query these documents so that consecutive documents (when sorted by date) with type groupedMsgs
are collapsed into a single document, and so that each collapsed document reports how many consecutive
documents it represents. Concretely, I would like the output shown below:
[
  {
    _id: '18c9bb49-03f4-4dc4-b0ba-a3c45c324347',
    chatID: 'alternateChat',
    type: 'txt',
    message: 'first message',
    senderID: 'first_sender_id',
    date: new Date('0001-01-02T01:01:11.001Z')
  },
  {
    _id: '18c9bb4w-03f4-4dc4-b0ba-a3c45c324347',
    chatID: 'alternateChat',
    type: 'groupedMsgs',
    message: 'second message',
    senderID: 'first_sender_id',
    date: new Date('0001-01-03T01:01:11.001Z'),
    numConsecutiveItems: 2
  },
  {
    _id: '18c9bb4w-03g4-4dc4-b0ba-a3c45c324347',
    chatID: 'alternateChat',
    type: 'txt',
    message: 'fourth message',
    senderID: 'first_sender_id',
    date: new Date('0001-01-05T01:01:11.001Z')
  },
  {
    _id: '18c9bb4w-03g4-4dc4-10ba-a3c45c324347',
    chatID: 'alternateChat',
    type: 'groupedMsgs',
    message: 'fifth message',
    senderID: 'first_sender_id',
    date: new Date('0001-01-06T01:01:11.001Z'),
    numConsecutiveItems: 1
  }
]
Notice that the third message is not in the final output because it has type groupedMsgs and immediately follows another message of type groupedMsgs; the second message has numConsecutiveItems of 2 for the same reason. Likewise, the fifth message is present because it doesn't immediately follow another groupedMsgs message, and its numConsecutiveItems is 1 for the same reason.

What aggregation pipeline can do this? I would prefer to avoid $accumulator, $function, and $where, so that no JavaScript runs during the query, since that can slow the query down, but I'm open to all answers nevertheless.
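
To make the intended rule unambiguous, here is a rough plain JavaScript sketch of the result I am after. It is only a reference for the expected output, not an aggregation pipeline, and it runs in application code rather than in MongoDB. The helper name collapseGroupedRuns is hypothetical, and the sketch assumes all documents passed in belong to the same chat.

```javascript
// Reference implementation of the desired output in plain JavaScript.
// NOTE: collapseGroupedRuns is a hypothetical helper name, used only to
// illustrate the expected result; it is not a MongoDB or Mongoose API.
// Assumption: every message in the input belongs to the same chatID.
function collapseGroupedRuns(messages) {
  // Rank by date ascending without mutating the caller's array.
  const sorted = [...messages].sort((a, b) => a.date - b.date);

  const result = [];
  let prevType = null;

  for (const msg of sorted) {
    if (msg.type === 'groupedMsgs') {
      if (prevType === 'groupedMsgs') {
        // Continues the current run: drop the document, bump the count
        // on the first document of the run.
        result[result.length - 1].numConsecutiveItems += 1;
      } else {
        // Starts a new run: keep the document and start counting at 1.
        result.push({ ...msg, numConsecutiveItems: 1 });
      }
    } else {
      // Any other type passes through unchanged and ends the run.
      result.push({ ...msg });
    }
    prevType = msg.type;
  }

  return result;
}
```

An aggregation pipeline that returns the same documents as this function for the sample data above is what I am looking for.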