I have a face recognition collection with two different kinds of files:
1) Broadcasting files:
{
  "_id": {"$oid": "5e870200adbe1d000183fa4d"},
  "data": {
    "begin": "2020-03-30 10:20:29",
    "end": "2020-03-30 10:20:32",
    "file": "salvamento4.mp4",
    "type": "video"
  },
  "idSensor": 3,
  "idDevice": 5
}
2) Reaction (to broadcasting) files:
{
  "_id": {"$oid": "5e86fe50adbe1d0001472c0f"},
  "data": {
    "Trackings": [{
      "BeginTime": "2020-03-30T08:23:42.034893+00:00",
      "FaceInfo": {
        "Age": 26.34,
        "Emotion": "NEUTRAL",
        "IsDetected": true,
        "MaleProbability": 0.71,
        "gazeTime": 2.37,
        "numGazes": 71
      },
      "ImageSize": {
        "Height": 1080,
        "Width": 1920
      },
      "LookingDuration": 2.37,
      "PersonID": "P-2020-03-30_2749",
      "ReIDInfo": {"NumReIDs": 1},
      "RoiInfo": {"RoiDuration": 0.17},
      "SensorID": 0,
      "TrackingDuration": 2.77,
      "Trajectory": null,
      "direction": null,
      "id": 1,
      "roiName": 0,
      "roiType": 1
    }],
    "timestamp": "2020-03-30T08:23:52.327678"
  },
  "idSensor": 2,
  "idDevice": 5
}
The join field is idDevice: a device broadcasts videos and records reactions at the same time.
I need to cross both kinds of files to determine which emissions are watched by which people, in order to estimate with BI software whether some videos have a greater impact on the audience than others. There are tons of emissions but only a small number of different videos; different reactions might also come from recurring customers (that's why there is a PersonID).
The idea is to check for overlap between the broadcasting time (which starts at data.begin and finishes at data.end) and the reaction time (which starts at data.Trackings.BeginTime and finishes at data.Trackings.BeginTime + data.Trackings.TrackingDuration) in order to get a table similar to this (just a simple example for one video that provokes three different reactions; the final outcome would also include other parameters like Emotion, Age, etc.):
| idDevice | idBroadcast | dtBrBeginTime | dtBrEndTime | idReaction | dtReBeginTime | dtReEndTime |
|---|---|---|---|---|---|---|
| 1 | 1 | 2020-07-03 10:00 | 2020-07-03 10:03 | 1 | 2020-07-03 09:58 | 2020-07-03 10:02 |
| 1 | 1 | 2020-07-03 10:00 | 2020-07-03 10:03 | 2 | 2020-07-03 10:01 | 2020-07-03 10:07 |
| 1 | 1 | 2020-07-03 10:00 | 2020-07-03 10:03 | 3 | 2020-07-03 10:01 | 2020-07-03 10:02 |
In this simple example, 1 emission has been watched by 3 people (or has triggered 3 different reactions); i.e., 1 broadcasting file is related to 3 reaction files. How do we know this? I think the simplest way (correct me if you think there’s a better solution) is to verify these two conditions:
- data.Trackings.BeginTime (or dtReBeginTime) not gte data.end (or dtBrEndTime)
- data.Trackings.BeginTime + data.Trackings.TrackingDuration (or dtReEndTime) not lte data.begin (or dtBrBeginTime)
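To make the check concrete, here is a minimal Python sketch of the overlap test I have in mind, outside MongoDB. The function names (parse_broadcast, reaction_interval, overlaps) are just illustrative, and it rests on two assumptions that may not hold for the real data: that TrackingDuration is expressed in seconds, and that data.begin / data.end (which carry no UTC offset) are in UTC.

```python
from datetime import datetime, timedelta, timezone


def parse_broadcast(ts, tz=timezone.utc):
    """Parse data.begin / data.end, e.g. "2020-03-30 10:20:29".

    ASSUMPTION: these strings have no offset, so a timezone must be chosen;
    UTC is only a default here, not something the data confirms.
    """
    return datetime.strptime(ts, "%Y-%m-%d %H:%M:%S").replace(tzinfo=tz)


def reaction_interval(tracking):
    """Return (begin, end) for one element of data.Trackings.

    ASSUMPTION: TrackingDuration is a number of seconds.
    """
    begin = datetime.fromisoformat(tracking["BeginTime"])
    end = begin + timedelta(seconds=tracking["TrackingDuration"])
    return begin, end


def overlaps(br_begin, br_end, re_begin, re_end):
    """True when the reaction interval overlaps the broadcast interval.

    Same two conditions as in the list above:
      - reaction begin is not >= broadcast end
      - reaction end is not <= broadcast begin
    """
    return re_begin < br_end and re_end > br_begin
```

With the example table, a broadcast from 10:00 to 10:03 overlaps all three reactions (09:58 to 10:02, 10:01 to 10:07 and 10:01 to 10:02), while a reaction starting exactly at 10:03 would not count as overlapping.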
My expertise in MongoDB is limited to very simple aggregation framework (A-F) queries ($match, $project; I have also used $unwind to break data.Trackings into parts), so I have almost no idea how to address this issue… maybe with $lookup? I'd appreciate any kind of help.
Thanks a lot in advance.