Oplog does not record operations within a transaction

If you run the code below, the object inserted outside the transaction appears in the oplog as an insert operation, but the oplog has no record of the object inserted inside the transaction, even though both are saved properly to the collection.

I tried to look up the problem with no luck. I was able to confirm that in my MongoDB server version the oplog should create a separate entry for each operation, yet I see neither the 4.0-style nor the 4.2-style entry!

    const { MongoClient } = require('mongodb');

    const dbName = "db-name";
    const collectionName = "collection-name";
    const dbUri = "mongodb+srv://<user>:<pass>@<cluster-url>/<db-name>?retryWrites=true&w=majority";

    // await is not allowed at the top level of a CommonJS module, so the run is wrapped in an async function
    async function main() {
        const client = await new MongoClient(dbUri, { useUnifiedTopology: true }).connect();
        const db = client.db(dbName);
        const collection = db.collection(collectionName);

        const session = client.startSession();
        const transactionOptions = {
            readConcern: { level: 'snapshot' },
            writeConcern: { w: 'majority' }
        };

        // Plain insert: appears in the oplog as an insert entry
        await collection.insertOne({ name: "outside a transaction" });
        // Insert inside a transaction: the one I cannot find in the oplog
        await session.withTransaction(async () => {
            await collection.insertOne({ name: "inside a transaction" }, { session: session });
        }, transactionOptions);

        await session.endSession();
        await client.close();
    }

    main().catch(console.error);

Hi @Mazen_El-Kashef, welcome to the community!

Interesting test that you did. As I understand it, you’re trying to find the oplog operation where a document was inserted, and found out that while a regular non-transaction insert produced the oplog entry you expected, the same cannot be said if the insert was inside a transaction.

However, as you mentioned, both documents are actually inserted, so the insert operation within the transaction must exist in the oplog, just not in the form that you’re familiar with. If that is the case and my understanding is correct, I don’t think this is a problem :slight_smile:

As the oplog is for MongoDB internal use to serve replication, its format and implementation details may change unexpectedly between versions, so I wouldn’t recommend depending on previous knowledge of the oplog.
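
To illustrate what "a different form" could look like: as far as I know, a committed transaction is usually written as a command entry (`op: 'c'`) that wraps the individual operations in an `applyOps` array, rather than as a plain `op: 'i'` entry. Below is a minimal sketch of unpacking such an entry, assuming that shape holds on your server version. `unpackOplogEntry` is a hypothetical helper and the sample entries are made up for illustration, not copied from a real oplog.

```javascript
// Hypothetical helper: flatten an oplog entry into the operations it carries.
// Assumes transactional writes are wrapped as { op: 'c', o: { applyOps: [...] } };
// this shape is internal to MongoDB and may change between versions.
function unpackOplogEntry(entry) {
    const wrapped = entry.op === 'c' && entry.o && Array.isArray(entry.o.applyOps);
    return wrapped ? entry.o.applyOps : [entry];
}

// Made-up sample entries, for illustration only.
const plainInsert = {
    op: 'i',
    ns: 'db-name.collection-name',
    o: { name: 'outside a transaction' }
};
const txnCommit = {
    op: 'c',
    ns: 'admin.$cmd',
    o: {
        applyOps: [
            { op: 'i', ns: 'db-name.collection-name', o: { name: 'inside a transaction' } }
        ]
    }
};

// Collect the names of all inserted documents, wrapped or not.
const inserts = [plainInsert, txnCommit]
    .flatMap(unpackOplogEntry)
    .filter((e) => e.op === 'i')
    .map((e) => e.o.name);

console.log(inserts);
```

A reader that only filters on `op: 'i'` at the top level would miss the second document, which may be what you are seeing.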

If your goal is to follow changes in a collection, I would recommend using a change stream instead. The change stream API is a better solution than looking directly into the oplog, since the oplog only concerns a single replica set. A change stream can be opened against a collection, a database, or a whole deployment, even if it’s a sharded cluster.

As for transactions, the change event emitted by a change stream indicates whether an operation is part of a transaction by way of the txnNumber field.
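
As a rough sketch of what that could look like with the Node.js driver: the code below opens a change stream on a collection and tags each event by whether it carries a `txnNumber`. The helper names `isTransactional` and `watchCollection` are my own, and `collection` is assumed to be a driver `Collection` like the one in your script.

```javascript
// Hypothetical helper: treat a change event as transactional when it
// carries a txnNumber field.
function isTransactional(changeEvent) {
    return changeEvent.txnNumber !== undefined;
}

// Sketch: log every change on a collection and flag transactional ones.
// `collection` is assumed to be a Collection from the Node.js driver.
function watchCollection(collection) {
    const changeStream = collection.watch();
    changeStream.on('change', (event) => {
        const tag = isTransactional(event) ? 'txn' : 'non-txn';
        console.log(`[${tag}] ${event.operationType}`, event.documentKey);
    });
    return changeStream; // call changeStream.close() when you are done
}
```

Calling `watchCollection(collection)` before running the two inserts should report both of them, with only the second one tagged as part of a transaction.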

Best regards,
Kevin


Hello @kevinadi, thanks for the welcome and the help.

I think you got everything right, except for the part about the transaction format in the oplog. I didn’t have a problem with its format; the entry just didn’t show up at all. After talking to support, they referred me to this hidden note that applyOps is not supported on my test database. When we checked the production instance, which is a dedicated instance, it worked just fine. Now it’s up to my ETL provider to support, read, and parse that format.

The change stream sounds like a cool feature. However, my ETL provider (Stitch Data) uses the oplog to capture changes and replicate data, but maybe I could suggest that they support change streams as an alternative method of replication.

Thanks again for taking the time to help :slight_smile:
