Bulk write operations order inverted

hi,

I have a collection whose documents contain a nested array of documents, “categories”. To update a category, I do a bulk write with two operations:

  • a $pull, to remove the previous item (if any) from the “categories” array
  • a $push, to insert the updated category

For an unknown reason, it appears that the $pull was performed after the $push, despite the operations being sent in the correct order (see the oplog.rs screenshot).
If I read it correctly, the lsid and txnNumber are equal, which means the two entries are part of the same request (the bulk write), and the stmtId is in the correct order (0 for the $pull, 1 for the $push), but the timestamps show that the $pull was performed after the $push (25 seconds later!).

I don’t know if I misunderstood something or if this is a bug. Any help would be appreciated to better understand what’s going on 🙂

Edit: DB version is MongoDB 5.0.9 Community.


Bulk write operations can be ordered or unordered.

If you ran your operations with ordered:true, then we will need to see the exact code you are using. Sample documents in text JSON will also be needed. Use $redact to hide sensitive information.
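For reference, ordered is the default, and it is just an option on the bulk call; a minimal sketch, e.g. with the .NET driver (collection and requests are placeholders, not your code):

// Ordered (the default): the server executes the operations serially,
// stopping at the first error.
await collection.BulkWriteAsync(requests, new BulkWriteOptions { IsOrdered = true });

// Unordered: the server may execute the operations in any order (possibly in parallel).
await collection.BulkWriteAsync(requests, new BulkWriteOptions { IsOrdered = false });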

Hi, thanks for your answer.

I confirm that ordered:true is passed. I’m using the .NET MongoDB.Driver package (2.18.0) to call the database.

The “full” advertiser document looks like

{ advertiserId: 123456, categories: [{ categoryId: 9 }] }

The state of the document when the request was performed was

{ advertiserId: 123456 }

I can’t share the exact code, but here is the equivalent .NET code with hardcoded values:

await mongoClient.GetDatabase("MyDatabase")
    .GetCollection<BsonDocument>("advertisers")
    .BulkWriteAsync(new WriteModel<BsonDocument>[]
    {
        // Operation 0 (stmtId 0): remove the previous { categoryId: 9 } entry, if any
        new UpdateOneModel<BsonDocument>(
            filter: Builders<BsonDocument>.Filter.Eq("advertiserId", 123456),
            update: Builders<BsonDocument>.Update.Pull(
                field: "categories",
                value: new Dictionary<string, object>{ { "categoryId", 9 } }))
            { IsUpsert = false },
        // Operation 1 (stmtId 1): push the updated category
        new UpdateOneModel<BsonDocument>(
            filter: Builders<BsonDocument>.Filter.Eq("advertiserId", 123456),
            update: Builders<BsonDocument>.Update.Push(
                field: "categories",
                value: new Dictionary<string, object>{ { "categoryId", 9 } }))
            { IsUpsert = false }
    },
    options: new BulkWriteOptions { IsOrdered = true },
    cancellationToken: cancellationToken);

I logged the driver command to the console to see what is generated, and here is what I get:

{
    "update": "advertisers",
    "ordered": true,
    "$db": "MyDatabase",
    "lsid": {
        "id": CSUUID("03d3ef41-18a7-4d1b-8c52-e6ef8ee687a8")
    },
    "updates": [{
        "q": {
            "advertiserId": 123456
        },
        "u": {
            "$pull": {
                "categories": {
                    "categoryId": 9
                }
            }
        }
    }, {
        "q": {
            "advertiserId": 123456
        },
        "u": {
            "$push": {
                "categories": {
                    "categoryId": 9
                }
            }
        }
    }]
}
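
In case it is useful, this kind of command logging can be done by subscribing to the driver’s CommandStartedEvent; a minimal sketch of such a setup (the connection string is a placeholder, and this may differ from our actual logging code):

using System;
using MongoDB.Bson;
using MongoDB.Driver;
using MongoDB.Driver.Core.Events;

// Print every "update" command the driver sends to the server.
var settings = MongoClientSettings.FromConnectionString("mongodb://localhost:27017");
settings.ClusterConfigurator = builder =>
    builder.Subscribe<CommandStartedEvent>(e =>
    {
        if (e.CommandName == "update")
        {
            Console.WriteLine(e.Command.ToJson());
        }
    });

var mongoClient = new MongoClient(settings);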

hi again,

We have an investigation in progress on our side, and we recently found that a primary node switch occurred at the same time (#unlucky).

Maybe this kind of query can be corrupted in such a situation? Is that a possible explanation?

But even in the case of a primary switch, I would expect the queries to be cancelled, not mixed up like that.


Thanks for the update. The thread was in my bookmarks, but I could not come up with anything of value to write. Hopefully, someone with more intimate knowledge can enlighten us.

Hi @Adrien_DESMOULES, welcome to the community!

I think the important question here is: does the bulk write produce the correct result, or an unexpected one?

In other words, do you see the data in an unexpected state after the bulk write is finished?
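
For example, something along these lines right after the bulk write would show whether the document ended up in the state you expect (a rough sketch reusing the names from your earlier snippet, not your actual code):

// Fetch the document touched by the bulk write and inspect its categories array.
var advertiser = await mongoClient.GetDatabase("MyDatabase")
    .GetCollection<BsonDocument>("advertisers")
    .Find(Builders<BsonDocument>.Filter.Eq("advertiserId", 123456))
    .FirstOrDefaultAsync();

// With an ordered $pull then $push, you would expect exactly one { categoryId: 9 } entry.
Console.WriteLine(advertiser?["categories"].AsBsonArray.ToJson());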

If everything is as expected, I wouldn’t worry too much about exactly how the operation is performed internally, especially as seen through the oplog. Note that the oplog is for internal use: its format may change without warning, and it may not record things in the manner you expect.

Since MongoDB is a distributed database, it is expected that the topology may change during normal operation due to network issues or maintenance. This is by design, to give you high availability for your data. I would recommend checking out Build a Resilient Application with MongoDB to cater for this situation.

Also, have you tried this bulk write scenario on a steady-state replica set, i.e. one with no unexpected topology change, and do you see the same or a different result?

Best regards
Kevin

Hi Kevin, thanks for your post.

Unfortunately, the result was inconsistent. This inconsistency triggered a sensor and resulted in an oplog investigation to better understand what happened exactly on the replica set.

I tried to reproduce locally without any issue, and when I check the oplog, I only see the array insert and not the delete (which seems OK, as I guess pulling a non-existing value results in a no-op, so no oplog entry).

This bulk write scenario also runs several times a day, and we haven’t been able to reproduce the issue so far.

This really makes me think that it may come from the “unpredictable collision” between the bulk write and the primary switch.

Without a reproducible case on our side, I totally understand that this is almost impossible to investigate further and may remain an unexplained issue.

Best regards,