Mongomirror & oplog issue

Hello,

I’ve been testing mongomirror on a 3.4.2 Linux environment (3-node replica set, not sharded) to do a data migration to Atlas. It worked initially, but it took 4 days to move 400 GB, so we made some revisions to the destination config and are trying again. Now I can never get it started because of oplog errors. I’ve resized the oplog across the cluster, but still cannot start mongomirror.

Here are my errors:

2020-10-05T09:25:52.839-0500    Read timestamp from bookmark file: {1598788862 3}
2020-10-05T09:25:52.839-0500    Proceeding to tail oplog.
2020-10-05T09:25:52.841-0500    Current lag from source: 866h24m47s
2020-10-05T09:25:52.841-0500    NewOplogReader start time is greater than the final buffered timestamp
2020-10-05T09:25:52.841-0500    Tailing the oplog on the source cluster starting at timestamp: {1598788862 3}
2020-10-05T09:25:52.877-0500    Oplog tailer has shut down. Oplog applier will exit.
2020-10-05T09:25:52.877-0500    Waiting for new oplog entries to apply.
2020-10-05T09:25:52.877-0500    Fatal error while tailing the oplog on the source: Checkpoint not available in oplog! expected: {1598788862 3}; got: {1601235193 1}
2020-10-05T09:25:52.877-0500    Timestamp file written to /var/lib/mongo/mongomirror-linux-x86_64-rhel70-0.9.1/bin/mongomirror.timestamp.
2020-10-05T09:25:52.877-0500    Failed: error while tailing the oplog on the source: Checkpoint not available in oplog! expected: {1598788862 3}; got: {1601235193 1}

Is there some way to reset the oplog so that it does not cause this? Is mongomirror for one-time use only?

Here’s the oplog info:

rs.printReplicationInfo()
configured oplog size:   77824MB
log length start to end: 677656secs (188.24hrs)
oplog first event time:  Sun Sep 27 2020 14:33:13 GMT-0500 (CDT)
oplog last event time:   Mon Oct 05 2020 10:47:29 GMT-0500 (CDT)
now:                     Mon Oct 05 2020 10:47:37 GMT-0500 (CDT)

It is too far behind to pick up from the oplog at 866h behind. You’ll have to restart.

Mongomirror is resumable after the initial sync and catchup, but you have to remain within the oplog window.
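A quick way to see this in the mongo shell (a small sketch using the numbers from the error and oplog output above): the first field of those {seconds, increment} timestamps is a Unix epoch second, so it can be converted to a date and compared against the oplog’s first event time.

// Bookmark timestamp from the error vs. the oldest entry the source still has.
new Date(1598788862 * 1000)   // bookmark {1598788862 3}  -> late August 2020
new Date(1601235193 * 1000)   // oldest oplog entry {1601235193 1} -> Sep 27 2020

The bookmark points well before the start of the ~188-hour window shown by rs.printReplicationInfo(), which is exactly what the “Checkpoint not available in oplog” error is saying.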

MongoDB support were a fantastic help when I went through this.

Thanks Chris, I figured I’d have to restart; I’m just still a little unsure how the oplog side of things works. Back in my relational DB days, the equivalent of the oplog would basically cycle out based on a commit point, so I was thinking that rerunning mongomirror would not try to continue from the endpoint.

So, if I resized my oplog in the intervening period, shouldn’t it have ‘reset’ the oplog? As part of the steps I dropped the oplog on each of the 3 nodes, so I thought that would’ve cleared/reset the oplog window. Is that the case?

You’ll have to remove the bookmark file from mongomirror. It is trying to restart from that point.
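For example, renaming rather than deleting it is a safe way to do that (a sketch; the path comes from the “Timestamp file written to …” line in your log, so adjust it for wherever you run mongomirror from):

# Rename the bookmark file so mongomirror starts fresh instead of resuming.
mv /var/lib/mongo/mongomirror-linux-x86_64-rhel70-0.9.1/bin/mongomirror.timestamp \
   /var/lib/mongo/mongomirror-linux-x86_64-rhel70-0.9.1/bin/mongomirror.timestamp.bak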

You’ll want to drop all the databases in the target cluster manually or with the --drop option.
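If you go the manual route, something along these lines in the mongo shell on the destination works (a sketch; review the database list before running it, and leave the system databases alone):

// Connected to the *destination* cluster: drop every non-system database
// so mongomirror can start the initial sync cleanly.
db.getMongo().getDBNames().forEach(function (name) {
    if (["admin", "local", "config"].indexOf(name) === -1) {
        db.getSiblingDB(name).dropDatabase();
    }
});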

The oplog will remove the oldest entries when it hits the configured limit (size, or a time-based retention period in 4.4+), not based on any commit point.
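You can see both the size cap and the current window directly in the mongo shell on a source node (a sketch; the oplog is just a capped collection in the local database):

// Configured size cap and the entries that bound the current oplog window.
var oplog = db.getSiblingDB("local").getCollection("oplog.rs");
oplog.stats().maxSize;                                  // configured size cap in bytes
oplog.find({}, {ts: 1}).sort({$natural: 1}).limit(1);   // oldest entry still retained
oplog.find({}, {ts: 1}).sort({$natural: -1}).limit(1);  // newest entry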


Chris, you gave me the exact hint that I needed. I’m rerunning mongomirror as we speak, having renamed the bookmark file and specified the forceDump option.
Thanks!
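For anyone finding this later, a rerun with --forceDump looks roughly like the following (a hedged sketch with placeholder hosts, replica set names, and credentials, not the exact command from this thread; check the mongomirror documentation for the flags available in your version):

# Rerun from the mongomirror bin directory after renaming the old bookmark file.
# Host names, replica set names, user, and password below are placeholders.
./mongomirror \
  --host "rs0/source-node-1:27017,source-node-2:27017,source-node-3:27017" \
  --destination "atlas-shard-0/atlas-node-1:27017,atlas-node-2:27017,atlas-node-3:27017" \
  --destinationUsername "atlasAdminUser" \
  --destinationPassword "********" \
  --forceDump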
