In a MongoDB time series collection, get the latest document from each device

Hi we have a time series based collection. Devices continually send their data into this collection. It has a designated time field (ts) and a metaFiled by name metadata.serial_number. There are currently about 4000 device contributing documents (records) to this collection. I want to get the latest document from each of these devices. This is proving to be quite slow, takes about 4 minutes!! Any suggestions on how to speed this up is greatly appreciated. I am currently running this query as these steps:

  1. Get a list of unique serial numbers. This is reasonably fast takes about a couple of seconds.
    Query line in python: db.command({ “distinct”: “collection”, “key”: “metadata.serial_number”})
  2. For each serial number run an aggregate to get that serial number’s last document. This takes a long time (almost 4 minutes). Below is the aggregate pipeline I am using:
    pipeline = [
    { “$match”: { “metadata.serial_number”: serial_number} },
    { “$sort”: { “ts”: -1 }},
    { “$limit”: 1 }
    ]

Again, any suggestions on how to speed this up is greatly appreciated!!

Thanks & regards,
Sreenath

1 Like