Data Validation for Cluster Consistency

Atlas automatically runs data validation to proactively detect data inconsistencies across all clusters in a project. Data validation helps identify silent data corruption before it impacts your applications.

Silent data corruption occurs when data differs across replica set nodes without triggering errors or warnings. This can happen due to hardware failures, network issues, or other system-level problems. Examples include:

  • Missing documents: A document exists on some nodes, but is missing from others.

  • Content differences: A document exists on all nodes, but the content differs between them.

  • Index inconsistencies: Index entries differ across nodes.

Without validation, these inconsistencies can remain undetected and cause application errors, data loss, or incorrect query results.

Atlas detects silent data corruption by comparing data across the nodes of each replica set.

During each validation run, Atlas does the following:

  1. Creates temporary validation instances in the same cloud provider and region as your cluster.

  2. Restores node snapshots to the validation instances.

  3. Compares data across node snapshots to detect inconsistencies.

  4. Stores validation results in Atlas systems (AWS us-east-1) when inconsistencies are detected.
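
The comparison in step 3 can be pictured with a minimal Python sketch. This is not Atlas's implementation: the function names (`doc_hash`, `compare_nodes`), the in-memory dictionaries standing in for restored node snapshots, and the choice of SHA-256 over canonical JSON are all assumptions made for illustration. The sketch covers missing documents and content differences only, not index inconsistencies.

```python
import hashlib
import json


def doc_hash(doc):
    """Hash a document via its canonical JSON form (illustrative choice, not Atlas's)."""
    canonical = json.dumps(doc, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def compare_nodes(snapshots):
    """Compare documents across nodes.

    snapshots: {node_name: {doc_id: document}} (hypothetical stand-in for
    restored node snapshots).
    Returns a list of (doc_id, inconsistency_type, affected_nodes).
    """
    all_ids = set()
    for docs in snapshots.values():
        all_ids.update(docs)

    findings = []
    for doc_id in sorted(all_ids, key=str):
        # A document present on some nodes but absent from others.
        missing = sorted(n for n, docs in snapshots.items() if doc_id not in docs)
        if missing:
            findings.append((doc_id, "missing_document", missing))
            continue
        # Present everywhere: the content hashes must all agree.
        hashes = {doc_hash(docs[doc_id]) for docs in snapshots.values()}
        if len(hashes) > 1:
            findings.append((doc_id, "content_difference", sorted(snapshots)))
    return findings


# Hypothetical three-node replica set with two injected inconsistencies.
snapshots = {
    "node-a": {"1": {"x": 1}, "2": {"x": 2}, "3": {"x": 3}},
    "node-b": {"1": {"x": 1}, "2": {"x": 20}, "3": {"x": 3}},
    "node-c": {"1": {"x": 1}, "2": {"x": 2}},
}
findings = compare_nodes(snapshots)
for finding in findings:
    print(finding)
```

In this example, document `2` is reported as a content difference and document `3` as missing from `node-c`. Only document IDs and inconsistency types appear in the findings, not document content.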

Data validation runs automatically on all clusters in your project. If you need to, you can disable validation at the project level.

During validation, Atlas reads database and collection data to compute hashes and detect inconsistencies. Atlas does not modify your cluster data or store decrypted document content. Decrypted data exists only temporarily during the validation process and is not persisted.

When validation detects inconsistencies, Atlas stores only the following metadata:

| Data Type | Retention Period | Description |
| --- | --- | --- |
| Run metadata | 3 years | Validation run status, timestamps, and inconsistency summary including counts by database, collection, and inconsistency type. |
| Inconsistency details | 90 days | Database name, collection name, document IDs, and inconsistency types for documents that failed validation. |
| Validation logs | 90 days | Detailed validation output stored in S3. Available through the Atlas UI for troubleshooting. |
| Drill-down results | 21 days | Intermediate hash results used during the validation process. |

Atlas retains only inconsistency metadata for analysis and troubleshooting.
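
To make the retention periods concrete, the following Python sketch computes when each kind of metadata would age out. It is only an illustration: the dictionary keys and the `expires_at` helper are hypothetical names, "3 years" is approximated as 1,095 days, and the sketch assumes the retention window starts when the record is created, which the page does not state.

```python
from datetime import datetime, timedelta, timezone

# Retention periods from the table above; keys are hypothetical labels.
RETENTION = {
    "run_metadata": timedelta(days=3 * 365),  # "3 years", approximated
    "inconsistency_details": timedelta(days=90),
    "validation_logs": timedelta(days=90),
    "drill_down_results": timedelta(days=21),
}


def expires_at(data_type, created):
    """Return the assumed expiry time for a record created at `created`."""
    return created + RETENTION[data_type]


created = datetime(2025, 1, 1, tzinfo=timezone.utc)
for data_type in RETENTION:
    print(data_type, expires_at(data_type, created).date())
```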

For clusters with encryption at rest using customer-managed keys, validation requires additional access to your key management service.

Validation instances must decrypt data using your customer-managed key to perform validation checks. This results in additional KMS API requests and associated costs.

For detailed information about KMS usage, costs, security considerations, and IP allowlist configuration requirements, see Data Validation KMS Usage.

You can disable data validation for all clusters in a project.

Important

Disabling data validation means Atlas cannot proactively detect data inconsistencies in your clusters. Only disable validation if you have specific requirements that prevent validation from running.

1
  1. If it's not already displayed, select the organization that contains your desired project from the Organizations menu in the navigation bar.

  2. If it's not already displayed, select your desired project from the Projects menu in the navigation bar.

  3. In the sidebar, click the icon next to Project Overview.

The Project Settings page displays.

2
3
4

Disabling validation affects all clusters in the project, including both encrypted and non-encrypted clusters.