New in version 1.9.
mongosync includes an embedded verifier to perform a series
of checks on the destination cluster to verify the sync of
supported collections. mongosync enables the verifier by
default on replica set clusters.
Starting in version 1.10, mongosync enables the verifier by
default on sharded clusters.
Note
mongosync reads using primary read preference, so it
preserves document field order from the source cluster's primary node. The
embedded verifier also checks documents based on the source cluster’s primary
node, but at a different time from when mongosync reads them. Because of
this, in rare cases, discrepancies in document field order between the source
cluster’s nodes can cause the embedded verifier to fail the migration, even
if mongosync copied the documents correctly.
About this Task
Compatibility
The embedded verifier is not available in mongosync 1.8 and earlier.
For alternative verification methods, see Verify Data Transfer.
Limitations
The embedded verifier has the following limitations:
mongosync stores the verifier state in memory, which can result in a significant memory overhead. To run the verifier, mongosync requires approximately 10 GB of memory, plus an additional 500 MB for every 1 million documents.
The verifier cannot be resumed. If a user stops or pauses sync and then starts mongosync again for any reason, the verification process restarts from the beginning. This can cause verification to fall substantially behind the migration.
When migrating from a replica set to a sharded cluster, you cannot rename source collections that you specify in the sharding options. If you rename a collection included in the sharding options during the change event application (CEA) phase, the verifier reports a sharding mismatch.
If you start sync with verification enabled and buildIndexes set to never, the migration fails if mongosync finds a TTL collection on the source cluster. This can happen right after you call the /start endpoint or much later, such as when a user creates a TTL index on the source cluster while a migration is in progress. To sync TTL collections without building indexes on the destination cluster, you must start sync with the verifier disabled.
Unsupported Verification Checks
The verifier doesn't check the following namespaces:
Capped collections
Collections with TTL indexes, including TTL indexes that are added or dropped during migration
Collections that don't use the default collation
To verify unsupported collections, write additional scripts that examine them. For more information, see Verify Data Transfer.
Note
Starting in version 1.10, the verifier checks for data inconsistencies caused by DDL events that occur on a pre-6.0 source cluster during migration. It performs this check because pre-6.0 migrations do not support DDL events.
To learn more, see Pre-6.0 Migration Limitations.
Steps
Initialize mongosync
Initialize the mongosync process:
./bin/mongosync \
      --logPath /var/log/mongosync \
      --cluster0 "mongodb://clusterAdmin:superSecret@clusterOne01.fancyCorp.com:20020,clusterOne02.fancyCorp.com:20020,clusterOne03.fancyCorp.com:20020" \
      --cluster1 "mongodb://clusterAdmin:superSecret@clusterTwo01.fancyCorp.com:20020,clusterTwo02.fancyCorp.com:20020,clusterTwo03.fancyCorp.com:20020"
Start the Sync
To start syncing data from the source cluster to the destination, use the /start endpoint.
curl localhost:27182/api/v1/start -XPOST \
--data '
   {
      "source": "cluster0",
      "destination": "cluster1"
   } '
Example output:
{"success":true}
Examine Progress
To examine the status of the sync, use the /progress endpoint:
curl localhost:27182/api/v1/progress -XGET
Example output:
{
   "progress": {
      "state": "RUNNING",
      "canCommit": true,
      "canWrite": false,
      "info": "change event application",
      "lagTimeSeconds": 0,
      "collectionCopy": {
         "estimatedTotalBytes": 694,
         "estimatedCopiedBytes": 694
      },
      "directionMapping": {
         "Source": "cluster0: localhost:27017",
         "Destination": "cluster1: localhost:27018"
      },
      "source": {
         "pingLatencyMs": 250
      },
      "destination": {
         "pingLatencyMs": -1
      },
      "verification": {
         "source": {
            "estimatedDocumentCount": 42,
            "hashedDocumentCount": 42,
            "lagTimeSeconds": 2,
            "totalCollectionCount": 42,
            "scannedCollectionCount": 10,
            "phase": "stream hashing"
         },
         "destination": {
            "estimatedDocumentCount": 42,
            "hashedDocumentCount": 42,
            "lagTimeSeconds": 2,
            "totalCollectionCount": 42,
            "scannedCollectionCount": 10,
            "phase": "stream hashing"
         }
      }
   },
   "success": true
}
Examine the verification response field for
information on the status of the embedded verifier.
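As a rough sketch of how the verifier status might be pulled out of a /progress response, the helper below prints every "phase" value it finds on standard input. The function name verifier_phases is hypothetical and not part of mongosync. It assumes that "phase" appears only inside the verification field, as in the example output above, and it uses grep and sed only so that it runs without extra tools. A JSON-aware tool would be more robust.

```shell
#!/bin/sh
# Hypothetical helper (not part of mongosync): read a /progress response
# from stdin and print each "phase" value, one per line.
# Assumption: "phase" keys occur only under "verification", as in the
# example output in this section.
verifier_phases() {
    grep -o '"phase"[[:space:]]*:[[:space:]]*"[^"]*"' |
        sed 's/.*:[[:space:]]*"\([^"]*\)"/\1/'
}

# Assumed usage against a mongosync instance listening on the default port:
#   curl -s localhost:27182/api/v1/progress | verifier_phases
```

With the example output above, this would print "stream hashing" twice: once for the source and once for the destination.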
Behavior
Verification Checks
The embedded verifier performs a series of checks on the
destination cluster. It checks all supported collections to
confirm that mongosync was successful in transferring
documents from the source cluster to the destination.
If the verifier encounters errors, it fails the migration with
an error. If the verifier finds no errors, the /progress
endpoint returns canWrite: true. To learn more about the canWrite field,
see canWrite and COMMITTED.
Starting in version 1.15, the embedded verifier examines collection metadata, indexes, and views. If the verifier finds a mismatch during metadata verification, it returns an error that contains the mismatch types and a count of their occurrences.
Please contact support to investigate verification issues.
Memory Requirements
Verification requires 10 GB of memory plus an additional 500 MB for every 1 million documents in the migration.
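The sizing rule above can be turned into a quick back-of-the-envelope estimate. The sketch below is illustrative only: the function name estimate_verifier_memory_mb is hypothetical, it assumes 1 GB = 1024 MB, and it scales the 500 MB allowance proportionally to the document count using integer arithmetic. Treat the result as a rough planning figure, not a guarantee.

```shell
#!/bin/sh
# Hypothetical helper: estimate verifier memory in MB from a document count.
# Assumptions: 1 GB = 1024 MB; the per-document allowance scales linearly
# (500 MB per 1,000,000 documents), rounded down by integer division.
estimate_verifier_memory_mb() {
    docs="$1"
    base_mb=$((10 * 1024))                  # fixed 10 GB overhead
    per_docs_mb=$((docs * 500 / 1000000))   # 500 MB per million documents
    echo $((base_mb + per_docs_mb))
}

# Example: a migration of 42 million documents.
estimate_verifier_memory_mb 42000000   # prints 31240
```

Under these assumptions, 42 million documents would call for roughly 31 GB of memory for verification.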
If the available memory is insufficient, the /start endpoint
returns an error. If this occurs and you want to keep the
verifier enabled, first increase the memory of the server and
then resume the migration.
If increasing server memory isn't an option, restart
mongosync with the verifier disabled.