`updateOne` consumes a lot more RPU than WPU proportional to document size in a bucket pattern

Iuliu_Teodor_Radu · September 22, 2022, 5:37pm

Overview

My question concerns the amount of read vs write units that register on MongoDB Atlas after updateOne operations on documents in a bucket pattern.

Background

I have a stream of steady IoT data for measurements at minute intervals. I decided to bucket these measurements daily and so they are stored in documents that have a startDate and endDate field compartmentalising each day. Here’s a representative schema:

{
  "_id": "6323bcf78fd1d0c6dc571110",
  "endDate": "2022-09-16T23:59:59.000Z",
  "startDate": "2022-09-16T00:00:00.000Z",
  "batteryVoltageSum": 2455.2400000000002,
  "count": 678,
  "packetRSSISum": -34791,
  "measurements": [
    {
        "timestamp" : ISODate("2022-09-16T01:00:01.000+01:00"),
        "temperature" : 29.19889502762431,
        "batteryVoltage" : 3.64,
    }
    ... (up to 1439 more)
  ]
}

Every time I call updateOne, I use the $push update operator to add a measurement to the relevant array above (note that I am using upsert: true). If the bucket does not exist, the upsert inserts it, and I use the $setOnInsert operator to set the startDate and endDate fields.
Here’s the full query:

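    // `collection` is a placeholder for the bucket collection handle (not named in the original post)
    collection.updateOne({
        _id: someId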
    }, {
      $push: {
        measurements: newMeasurement
      },
      $inc: {
        batteryVoltageSum: measurement.batteryVoltage,
        packetRSSISum: measurement.packetRSSI,
        count: 1,
      },
      $setOnInsert: {
        startDate,
        endDate,
      },
    }, {
      upsert: true
    })
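For readers who want to see how the pieces of this upsert fit together, here is a minimal sketch that builds the filter, update document, and options as plain objects. The function name `buildBucketUpdate`, the `bucketId` parameter, and the way the day boundaries are derived (UTC midnight to 23:59:59, matching the schema above) are assumptions for illustration, as is the idea that the pushed measurement is a subset of the incoming packet; the original post does not show how these values are computed.

```javascript
// Sketch only: `buildBucketUpdate` and `bucketId` are hypothetical names.
// Builds the arguments for an updateOne upsert into a daily bucket.
function buildBucketUpdate(bucketId, measurement) {
  const ts = new Date(measurement.timestamp);

  // Assumed bucket boundaries: the UTC day that contains the timestamp.
  const startDate = new Date(
    Date.UTC(ts.getUTCFullYear(), ts.getUTCMonth(), ts.getUTCDate())
  );
  // One day later minus one second, i.e. 23:59:59.000Z of the same day.
  const endDate = new Date(startDate.getTime() + 24 * 60 * 60 * 1000 - 1000);

  return {
    filter: { _id: bucketId },
    update: {
      // Append the new reading to the bucket's array.
      $push: {
        measurements: {
          timestamp: ts,
          temperature: measurement.temperature,
          batteryVoltage: measurement.batteryVoltage,
        },
      },
      // Maintain the running aggregates stored on the bucket.
      $inc: {
        batteryVoltageSum: measurement.batteryVoltage,
        packetRSSISum: measurement.packetRSSI,
        count: 1,
      },
      // Only applied when the upsert creates a new bucket.
      $setOnInsert: { startDate, endDate },
    },
    options: { upsert: true },
  };
}
```

With a Node.js driver collection handle (here assumed to be called `collection`), the result would be passed as `collection.updateOne(filter, update, options)`.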

Expected Behaviour

Since updateOne with upsert does not retrieve the document that it updates or inserts, there should only be write units registered on MongoDB Atlas.

Actual Behaviour

Every time updateOne is triggered, the expected write units are registered, but I also see a much larger number of read units, which seems to grow in proportion to the size of the measurements array. This indicates that the updateOne operation is also triggering reads for some reason.

Here’s a screenshot of this from my logs. As you can see, at the beginning of each day, when the bucket has no measurements, the read units are much lower, and they grow throughout the day as measurements are recorded at minute intervals.

Concern

Why are reads metered in proportion to the size of a nested array when I am only performing an updateOne (with upsert) operation on the document?

kevinadi · September 23, 2022, 5:39am

Hi @Iuliu_Teodor_Radu welcome to the community!

“Since updateOne with upsert does not retrieve the document that it updates or inserts, there should only be write units registered on MongoDB Atlas.”

The updateOne operation still needs to read the document. This is because it needs to apply the update operations on an existing document, which it cannot do if it doesn’t have the document to start with.

According to the pricing page, RPUs are charged in multiples of 4KB of document size:

“If a document exceeds 4KB or an index exceeds 256B, Atlas covers each excess chunk of 4KB or 256B with an additional RPU.”

Thus if you can isolate the RPU for this update operation, I think you should find that the number of RPU consumed corresponds to the size of the document divided by 4KB.
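As a rough illustration of that arithmetic, the sketch below estimates how the per-update RPU cost could grow as a daily bucket fills up. Everything numeric here is an assumption: 4KB is taken as 4096 bytes, a read is assumed to cost at least one RPU, the per-measurement size (about 70 bytes) and the base document size (about 150 bytes) are ballpark guesses rather than measured values, and index reads are ignored. The function names are hypothetical.

```javascript
// Assumption: one RPU covers up to 4KB (taken here as 4096 bytes) of document read.
const RPU_CHUNK_BYTES = 4 * 1024;

// Estimate RPUs for reading one document of the given size.
// Assumes a minimum of one RPU per read; index bytes are not included.
function estimateDocumentRpus(documentBytes) {
  return Math.max(1, Math.ceil(documentBytes / RPU_CHUNK_BYTES));
}

// Very rough bucket size model: fixed overhead plus a constant size per
// array element. Both defaults are guesses, not measurements.
function estimateBucketBytes(measurementCount, bytesPerMeasurement = 70, baseBytes = 150) {
  return baseBytes + measurementCount * bytesPerMeasurement;
}

// Show how the estimated read cost of each update changes over a day
// of minute-interval measurements.
for (const count of [0, 360, 720, 1440]) {
  const bytes = estimateBucketBytes(count);
  console.log(
    `${count} measurements: ~${bytes} bytes, ~${estimateDocumentRpus(bytes)} RPU per update`
  );
}
```

Under this model the read cost of each update rises in steps as the bucket grows, which is consistent with the pattern described in the question. To check it against real data, the actual document size would need to be measured rather than guessed.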

Best regards
Kevin

Iuliu_Teodor_Radu · September 26, 2022, 7:39pm

Hi @kevinadi , thank you for your answer and welcoming me to the community!
Your answer clears up the issue I wanted to confirm. I was unaware that RPUs are metered at that low a level, as this isn’t fully clear from the documentation. It seems that Serverless is probably not the way to go for my use case. I was only using it for staging some devices, but even that won’t be scalable.
Thanks again.

All the best,
Teodor
