Guidance for Atlas Backups

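MongoDB Atlas provides fully managed, customizable backups to ensure data retention and recovery:

  • Cloud Backups: Taken using the cloud provider's native snapshot functionality, with support for full-copy snapshots and localized snapshot storage. These snapshots are always incremental and build on the cloud provider's underlying snapshot mechanism, which keeps costs low and restores fast. The backup policy you choose specifies how many daily, weekly, and monthly snapshots to keep.

  • Continuous Cloud Backups: An add-on to Cloud Backups that provides point-in-time recovery. By backing up the oplog and capturing the data changes between snapshots, it lets you restore to a specific minute, so you can recover data to the exact moment before a failure or incident and meet a Recovery Point Objective (RPO) as low as 1 minute.

We don't recommend enabling backups for development and test environments. For staging and production environments, we recommend that you develop automated deployment templates that incorporate the recommendations on this page.

Atlas provides fully managed backups, including point-in-time recovery and consistent, cluster-wide snapshots for all clusters, including sharded clusters. In Atlas, you can choose from four snapshot frequencies (hourly, daily, weekly, and monthly), each with its own retention period.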

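Cloud Backups

This feature provides localized backup storage using the native snapshot functionality of the cluster's cloud service provider. Benefits include a robust default backup retention schedule of 12 months, fully flexible custom snapshot and retention schedules, and the ability to set different snapshot frequencies (for example, hourly snapshots for fast recovery and weekly or monthly snapshots for long-term retention) to meet industry regulations. You get immediate access to backup data, which is useful for auditing, compliance, or data recovery, and you can query backup data directly, which saves time and resources.

Continuous Cloud Backups

This feature provides point-in-time (PIT) recovery, which lets you restore to any timestamp, so you can recover data to the precise moment before a failure or an incident such as a cyberattack. You can also set a custom restore window that specifies how many days back you want to be able to perform a point-in-time restore.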
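As a rough illustration of how a restore window bounds point-in-time recovery, the sketch below checks whether a target timestamp is still restorable. The helper name `in_restore_window` is hypothetical, and the sketch assumes the window is simply the most recent N × 24 hours; Atlas itself decides what is actually restorable.

```shell
# Hypothetical helper for illustration; not an Atlas command.
# Succeeds when target (epoch seconds, UTC) lies within the last window_days days.
in_restore_window() {
  local now="$1" target="$2" window_days="$3"
  local age=$(( now - target ))
  [ "$age" -ge 0 ] && [ "$age" -le $(( window_days * 86400 )) ]
}

now=1714566600                         # 2024-05-01T12:30:00Z
three_days_ago=$(( now - 3 * 86400 ))
ten_days_ago=$(( now - 10 * 86400 ))

in_restore_window "$now" "$three_days_ago" 7 && echo "3 days ago: restorable"
in_restore_window "$now" "$ten_days_ago" 7 || echo "10 days ago: outside the 7-day window"
```

A longer restore window widens the range of restorable timestamps, at the cost of retaining more oplog data.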

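Multi-Region Snapshot Distribution

This feature increases resilience by automatically distributing backup snapshots and oplogs across geographic regions instead of storing them only in the primary region. It helps you meet compliance requirements to store backups in different, physically separate locations, so that disaster recovery remains possible during a regional outage.

To learn more, see Snapshot Distribution.

Backup Compliance Policy

This feature further protects business-critical data by preventing all snapshots and oplogs stored in Atlas from being modified or deleted during a predefined retention period that you specify, which keeps backups fully WORM (Write Once Read Many) compliant. Only a designated, authorized user can turn off this protection, and only after completing a verification process with MongoDB Support. This adds a mandatory manual delay and cool-down period so that an attacker can't change the backup policy and export data. To learn more, see Configure a Backup Compliance Policy.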

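You must align your backup strategy with specific Recovery Point Objectives (RPO) and Recovery Time Objectives (RTO) to meet business continuity requirements, especially for critical applications where near-instant recovery is essential. RPO defines the maximum amount of data loss that is acceptable during an incident, and RTO defines how quickly the application must recover. Because data varies in importance, you must evaluate RPO and RTO separately for each application. For example, mission-critical data likely has different requirements than clickstream analytics. Your requirements for RTO, RPO, and backup retention affect the cost and performance considerations of maintaining backups.

In development and test environments, we recommend that you disable backups to save costs. In staging and production environments, make sure that backups are enabled in your deployment templates and that you have successfully tested your backup and restore procedures.

Restoring large replica sets (and shards) from backup takes longer. In staging and production environments, we recommend that you determine replica set or shard size limits through testing, so that your sizing meets your RTO requirements. Make sure that your snapshot schedule and retention policy satisfy your RPO requirements.

In production environments, we recommend that you enable continuous cloud backup by default, in addition to Atlas cloud backups, with a restore window of seven days. Extend this window for more critical workloads. Continuous cloud backup lets you replay the oplog to restore a cluster to a specific point in time and satisfy your RTO.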
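To make the effect of continuous cloud backup on data loss concrete, the following sketch compares the worst-case RPO of a snapshot-only schedule with the 1-minute RPO cited for continuous cloud backup. The function name `worst_case_rpo_minutes` is hypothetical, and the snapshot-only figure assumes that data loss is bounded by the interval between snapshots.

```shell
# Hypothetical helper, not part of any Atlas tool.
# Prints the worst-case RPO in minutes for a backup configuration.
#   $1: snapshot interval in hours
#   $2: "true" if continuous cloud backup is enabled, otherwise "false"
worst_case_rpo_minutes() {
  local interval_hours="$1" continuous="$2"
  if [ "$continuous" = "true" ]; then
    # Oplog replay allows recovery to a specific minute.
    echo 1
  else
    # Snapshots only: everything written since the last snapshot may be lost.
    echo $(( interval_hours * 60 ))
  fi
}

worst_case_rpo_minutes 12 false   # snapshots every 12 hours, no continuous backup
worst_case_rpo_minutes 12 true    # same schedule plus continuous cloud backup
```

With snapshots every 12 hours, up to 720 minutes of writes are at risk unless continuous cloud backup is enabled.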

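Tip

Atlas provides predefined backup snapshot schedules, which include the snapshot frequency and retention period. Retaining backup snapshots for a long time can be expensive. We recommend that you build automated deployment templates that fit the size and criticality of your data and environments (development, test, staging, production). For snapshot frequency and retention, we recommend the following: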

| Tier | RTO | RPO | Recommended frequency and retention | Total Atlas backup snapshots |
|---|---|---|---|---|
| Tier 1 | 30 minutes | Near zero (within 7 days) | Hourly: every 12 hours, retain for 7 days = 14 snapshots; Daily: once a day, retain for 7 days = 7 snapshots; Weekly: Saturday, retain for 4 weeks = 4 snapshots; Monthly: last day of month, retain for 3 months = 6 snapshots | 31 |
| Tier 2 | 12 hours | Near zero (within 7 days) | Daily: once a day, retain for 7 days = 7 snapshots; Weekly: Saturday, retain for 4 weeks = 4 snapshots; Monthly: last day of month, retain for 3 months = 3 snapshots | 14 |
| Tier 3 | 3 days | Near zero (within 2 days) | Daily: once a day, retain for 7 days = 7 snapshots; Weekly: Saturday, retain for 4 weeks = 4 snapshots; Monthly: last day of month, retain for 3 months = 3 snapshots | 14 |
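The per-frequency snapshot counts can be derived by dividing each retention period by the interval between snapshots. The sketch below reproduces the Tier 1 hourly count and the Tier 2 total; the function name `retained_snapshots` is hypothetical, and it assumes both arguments are expressed in the same unit.

```shell
# Hypothetical helper: number of snapshots held once the schedule reaches steady state.
#   $1: retention period, $2: interval between snapshots (same unit as $1)
retained_snapshots() {
  echo $(( $1 / $2 ))
}

hourly=$(retained_snapshots $(( 7 * 24 )) 12)   # every 12 hours, kept for 7 days
daily=$(retained_snapshots 7 1)                 # once a day, kept for 7 days
weekly=$(retained_snapshots 4 1)                # once a week, kept for 4 weeks
monthly=$(retained_snapshots 3 1)               # once a month, kept for 3 months

echo "Tier 1 hourly snapshots: $hourly"
echo "Tier 2 total: $(( daily + weekly + monthly ))"
```

The same arithmetic gives a quick estimate of how many snapshots a custom schedule keeps, which is the main driver of backup storage cost.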

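Atlas lets you choose where backups are stored. To further increase resilience, we recommend distributing backups to both a local region and a separate disaster recovery region, so that data can be restored even during a regional outage. For an Atlas cluster deployed across three regions, multi-region snapshot distribution copies backups to the two secondary regions, so that a backup copy is available for restores. You can also copy critical backups and their point-in-time data to any secondary region that the cloud provider offers in Atlas.

When you configure snapshot frequency, retention, and distribution, we recommend balancing availability against cost. Your critical workloads, however, might need multiple snapshot copies in different locations.

We recommend implementing a Backup Compliance Policy in Atlas to prevent unauthorized modification or deletion of backups, which maintains data integrity and supports robust disaster recovery.

Continuous cloud backup enables precise point-in-time (PIT) recovery, which minimizes data loss during a failure. Atlas can quickly restore to the exact timestamp before a failure event, even during a primary region outage, and its optimized restore capability gives you an RPO as low as 1 minute and an RTO of less than 15 minutes. To do this, Atlas restores the most recent snapshot taken before the requested point in time and then replays oplog changes up to that point. Restore times vary with cloud provider disk warming and with the amount of oplog that must be replayed. Until disk warming completes, the restored cluster might perform more slowly. If your recovery requirements are flexible, we recommend designing templates that strike the best trade-off between reasonable recovery options and cost.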
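The restore sequence described above (pick the most recent snapshot before the target, then replay the oplog from there) can be sketched as follows. This only illustrates the selection logic and is not how Atlas implements it; the function name `latest_snapshot_before` is hypothetical and the timestamps are made-up sample values.

```shell
# Hypothetical helper: prints the newest snapshot timestamp that is not after the target.
#   $1: target point in time (epoch seconds); remaining args: snapshot timestamps
latest_snapshot_before() {
  local target="$1"
  shift
  local best="" ts
  for ts in "$@"; do
    if [ "$ts" -le "$target" ] && { [ -z "$best" ] || [ "$ts" -gt "$best" ]; }; then
      best="$ts"
    fi
  done
  # Print nothing and return non-zero when no snapshot precedes the target.
  [ -n "$best" ] && echo "$best"
}

target=1714550000   # 2024-05-01T07:53:20Z, the moment just before the incident
# Sample snapshots: 2024-04-30T12:00Z, 2024-05-01T00:00Z, 2024-05-01T12:00Z
base=$(latest_snapshot_before "$target" 1714478400 1714521600 1714564800)

echo "base snapshot: $base"
echo "oplog to replay: $(( target - base )) seconds"
```

The longer the gap between the base snapshot and the target, the more oplog must be replayed, which is one reason more frequent snapshots can shorten restore times.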

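To optimize Atlas backup costs, align backup frequency and retention with the criticality of your data to cut unnecessary storage charges. For example, disable backups in lower environments, and in higher environments with high availability requirements, distribute backups to every region where the Atlas cluster is deployed. You can also rely on incremental backups, in which snapshots capture only incremental changes and use built-in compression, to minimize the amount of data stored. Choosing backup regions strategically avoids cross-region data transfer charges, and choosing a cluster disk size that fits your workload prevents overspending. These strategies let you manage costs while keeping backups secure and reliable.

See the Terraform examples on GitHub, which implement the staging and production recommendations across all pillars in one place.

The following examples use Atlas tools for automation to enable backup and restore operations.

These examples apply only to staging and production environments, where clusters have backups enabled.

Run the following command to take a backup snapshot of the cluster named myDemo and retain that snapshot for 7 days: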

atlas backups snapshots create myDemo --desc "my backup snapshot" --retention 7
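After taking an on-demand snapshot, you typically confirm that it completed. The sketch below lists the snapshots for myDemo and waits for one to become available. The `run` wrapper and the `<snapshot-id>` placeholder are illustrative assumptions; by default the wrapper only prints each command, so check the subcommands and flags against `atlas backups snapshots --help` for your Atlas CLI version before setting DRY_RUN=0.

```shell
# Illustrative wrapper: prints each command unless DRY_RUN=0 is set.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

snapshot_id="<snapshot-id>"   # placeholder: use the ID returned by the create command

# List the snapshots that exist for the cluster.
run atlas backups snapshots list myDemo

# Wait until the snapshot becomes available.
run atlas backups snapshots watch "$snapshot_id" --clusterName myDemo
```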

Enable a Backup Compliance Policy for your project. The designated authorized user (governance@example.org) can turn off this protection only after completing a verification process with MongoDB Support.

atlas backups compliancePolicy enable \
  --projectId 67212db237c5766221eb6ad9 \
  --authorizedEmail governance@example.org \
  --authorizedUserFirstName john \
  --authorizedUserLastName doe
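Because a Backup Compliance Policy is hard to turn off once enabled, it is worth reviewing what is in effect. The sketch below prints the project's current policy as JSON. The `run` wrapper is an illustrative assumption that only prints the command by default; verify the subcommand with `atlas backups compliancePolicy --help` before setting DRY_RUN=0.

```shell
# Illustrative wrapper: prints each command unless DRY_RUN=0 is set.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Show the Backup Compliance Policy currently applied to the project.
run atlas backups compliancePolicy describe \
  --projectId 67212db237c5766221eb6ad9 \
  --output json
```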

Run the following command to create a compliance policy for scheduled backup snapshots that enforces how often snapshots must be taken (every 6 hours) and how long they must be retained (1 month).

atlas backups compliancePolicy policies scheduled create \
  --projectId 67212db237c5766221eb6ad9 \
  --frequencyInterval 6 \
  --frequencyType hourly \
  --retentionValue 1 \
  --retentionUnit months
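The commands so far cover the backup side; a point-in-time restore can be started from the Atlas CLI in a similar way. The sketch below is a hedged example: the target cluster name `myDemoRestore` and the timestamp are hypothetical, the `run` wrapper only prints the command by default, and the flags should be confirmed with `atlas backups restores start --help` for your Atlas CLI version.

```shell
# Illustrative wrapper: prints each command unless DRY_RUN=0 is set.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

pit_seconds=1714566600   # 2024-05-01T12:30:00Z in seconds since the Unix epoch (sample value)

# Restore myDemo as of pit_seconds into a separate, hypothetical target cluster.
run atlas backups restores start pointInTime \
  --clusterName myDemo \
  --pointInTimeUTCSeconds "$pit_seconds" \
  --targetClusterName myDemoRestore \
  --targetProjectId 67212db237c5766221eb6ad9
```

Restoring into a separate target cluster leaves the source cluster untouched while you validate the recovered data.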

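The following examples demonstrate how to configure backups during deployment. Before you can create resources with Terraform, you must:

  • Create your paying organization and create an API key for the paying organization. Store your API key as environment variables by running the following commands in the terminal:

    export MONGODB_ATLAS_PUBLIC_KEY="<insert your public key here>"
    export MONGODB_ATLAS_PRIVATE_KEY="<insert your private key here>"
  • Install Terraform.

You must create the following files for each example. Place the files for each example in their own directory. Change the IDs and names to use your values. Then run the commands to initialize Terraform, view the Terraform plan, and apply the changes.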

variable "org_id" {
  description = "Atlas organization ID"
  type        = string
}

variable "project_name" {
  description = "Atlas project name"
  type        = string
}

variable "cluster_name" {
  description = "Atlas Cluster Name"
  type        = string
}

variable "point_in_time_utc_seconds" {
  description = "PIT in UTC"
  default     = 0
  type        = number
}
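One way to supply the variables declared above and run the Terraform workflow is through `TF_VAR_` environment variables, which Terraform reads for input variables of the same name. The values below are placeholders, and the `run` wrapper is an illustrative assumption that only prints each command unless DRY_RUN=0 is set.

```shell
# Illustrative wrapper: prints each command unless DRY_RUN=0 is set.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Placeholder values: replace with your own organization, project, and cluster names.
export TF_VAR_org_id="<your-org-id>"
export TF_VAR_project_name="<your-project-name>"
export TF_VAR_cluster_name="<your-cluster-name>"

run terraform init               # download the provider and initialize the directory
run terraform plan -out=tfplan   # review the planned changes and save them
run terraform apply tfplan       # apply exactly the saved plan
```

Saving the plan to a file and applying that file ensures that what you reviewed is what gets applied.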

Use the following to configure a Tier 1 backup schedule for the clusters.

locals {
  atlas_clusters = {
    "cluster_1" = { name = "m10-aws-1e", region = "US_EAST_1" },
    "cluster_2" = { name = "m10-aws-2e", region = "US_EAST_2" },
  }
}

resource "mongodbatlas_project" "atlas-project" {
  org_id = var.org_id
  name   = var.project_name
}

resource "mongodbatlas_advanced_cluster" "automated_backup_test_cluster" {
  for_each     = local.atlas_clusters
  project_id   = mongodbatlas_project.atlas-project.id
  name         = each.value.name
  cluster_type = "REPLICASET"
  replication_specs {
    region_configs {
      electable_specs {
        instance_size = "M10"
        node_count    = 3
      }
      analytics_specs {
        instance_size = "M10"
        node_count    = 1
      }
      provider_name = "AWS"
      region_name   = each.value.region
      priority      = 7
    }
  }
  backup_enabled = true # enable cloud backup snapshots
  pit_enabled    = true # enable continuous cloud backup (point-in-time restore)
}

resource "mongodbatlas_cloud_backup_schedule" "test" {
  for_each                 = local.atlas_clusters
  project_id               = mongodbatlas_project.atlas-project.id
  cluster_name             = mongodbatlas_advanced_cluster.automated_backup_test_cluster[each.key].name
  reference_hour_of_day    = 3  # backup start hour in UTC
  reference_minute_of_hour = 45 # backup start minute in UTC
  restore_window_days      = 7  # restore window for near-zero RPO
  copy_settings {
    cloud_provider = "AWS"
    frequencies = [
      "HOURLY",
      "DAILY",
      "WEEKLY",
      "MONTHLY",
      "YEARLY",
      "ON_DEMAND",
    ]
    region_name        = "US_WEST_1"
    zone_id            = mongodbatlas_advanced_cluster.automated_backup_test_cluster[each.key].replication_specs.*.zone_id[0]
    should_copy_oplogs = true
  }
  policy_item_hourly {
    frequency_interval = 12 # every 12 hours; accepted values = 1, 2, 4, 6, 8, 12 (every n hours)
    retention_unit     = "days"
    retention_value    = 7 # retain for 7 days
  }
  policy_item_daily {
    frequency_interval = 1 # every day; accepted value = 1
    retention_unit     = "days"
    retention_value    = 7 # retain for 7 days
  }
  policy_item_weekly {
    frequency_interval = 7 # every Sunday; accepted values = 1 to 7 (1 = Monday, ..., 6 = Saturday, 7 = Sunday)
    retention_unit     = "weeks"
    retention_value    = 4 # retain for 4 weeks
  }
  policy_item_monthly {
    frequency_interval = 28 # 28th day of the month; accepted values = 1 to 28 (nth day of the month)
    retention_unit     = "months"
    retention_value    = 3 # retain for 3 months
  }
  depends_on = [
    mongodbatlas_advanced_cluster.automated_backup_test_cluster
  ]
}

Use the following to configure a Tier 2 backup schedule for the clusters.

locals {
  atlas_clusters = {
    "cluster_1" = { name = "m10-aws-1e", region = "US_EAST_1" },
    "cluster_2" = { name = "m10-aws-2e", region = "US_EAST_2" },
  }
}

resource "mongodbatlas_project" "atlas-project" {
  org_id = var.org_id
  name   = var.project_name
}

resource "mongodbatlas_advanced_cluster" "automated_backup_test_cluster" {
  for_each     = local.atlas_clusters
  project_id   = mongodbatlas_project.atlas-project.id
  name         = each.value.name
  cluster_type = "REPLICASET"
  replication_specs {
    region_configs {
      electable_specs {
        instance_size = "M10"
        node_count    = 3
      }
      analytics_specs {
        instance_size = "M10"
        node_count    = 1
      }
      provider_name = "AWS"
      region_name   = each.value.region
      priority      = 7
    }
  }
  backup_enabled = true # enable cloud backup snapshots
  pit_enabled    = true # enable continuous cloud backup (point-in-time restore)
}

resource "mongodbatlas_cloud_backup_schedule" "test" {
  for_each                 = local.atlas_clusters
  project_id               = mongodbatlas_project.atlas-project.id
  cluster_name             = mongodbatlas_advanced_cluster.automated_backup_test_cluster[each.key].name
  reference_hour_of_day    = 3  # backup start hour in UTC
  reference_minute_of_hour = 45 # backup start minute in UTC
  restore_window_days      = 7  # restore window for near-zero RPO
  copy_settings {
    cloud_provider = "AWS"
    frequencies = [
      "HOURLY",
      "DAILY",
      "WEEKLY",
      "MONTHLY",
      "YEARLY",
      "ON_DEMAND",
    ]
    region_name        = "US_WEST_1"
    zone_id            = mongodbatlas_advanced_cluster.automated_backup_test_cluster[each.key].replication_specs.*.zone_id[0]
    should_copy_oplogs = true
  }
  policy_item_daily {
    frequency_interval = 1 # every day; accepted value = 1
    retention_unit     = "days"
    retention_value    = 7 # retain for 7 days
  }
  policy_item_weekly {
    frequency_interval = 7 # every Sunday; accepted values = 1 to 7 (1 = Monday, ..., 6 = Saturday, 7 = Sunday)
    retention_unit     = "weeks"
    retention_value    = 4 # retain for 4 weeks
  }
  policy_item_monthly {
    # accepted values = 1 to 28 (nth day of the month) or 40 (last day of the month)
    frequency_interval = 28
    retention_unit     = "months"
    retention_value    = 3 # retain for 3 months
  }
  depends_on = [
    mongodbatlas_advanced_cluster.automated_backup_test_cluster
  ]
}

Use the following to configure a Tier 3 backup schedule for the clusters.

locals {
  atlas_clusters = {
    "cluster_1" = { name = "m10-aws-1e", region = "US_EAST_1" },
    "cluster_2" = { name = "m10-aws-2e", region = "US_EAST_2" },
  }
}

resource "mongodbatlas_project" "atlas-project" {
  org_id = var.org_id
  name   = var.project_name
}

resource "mongodbatlas_advanced_cluster" "automated_backup_test_cluster" {
  for_each     = local.atlas_clusters
  project_id   = mongodbatlas_project.atlas-project.id
  name         = each.value.name
  cluster_type = "REPLICASET"
  replication_specs {
    region_configs {
      electable_specs {
        instance_size = "M10"
        node_count    = 3
      }
      analytics_specs {
        instance_size = "M10"
        node_count    = 1
      }
      provider_name = "AWS"
      region_name   = each.value.region
      priority      = 7
    }
  }
  backup_enabled = true # enable cloud backup snapshots
  pit_enabled    = true # enable continuous cloud backup (point-in-time restore)
}

resource "mongodbatlas_cloud_backup_schedule" "test" {
  for_each                 = local.atlas_clusters
  project_id               = mongodbatlas_project.atlas-project.id
  cluster_name             = mongodbatlas_advanced_cluster.automated_backup_test_cluster[each.key].name
  reference_hour_of_day    = 3  # backup start hour in UTC
  reference_minute_of_hour = 45 # backup start minute in UTC
  restore_window_days      = 7  # restore window for near-zero RPO
  copy_settings {
    cloud_provider = "AWS"
    frequencies = [
      "HOURLY",
      "DAILY",
      "WEEKLY",
      "MONTHLY",
      "YEARLY",
      "ON_DEMAND",
    ]
    region_name        = "US_WEST_1"
    zone_id            = mongodbatlas_advanced_cluster.automated_backup_test_cluster[each.key].replication_specs.*.zone_id[0]
    should_copy_oplogs = true
  }
  policy_item_daily {
    frequency_interval = 1 # every day; accepted value = 1
    retention_unit     = "days"
    retention_value    = 7 # retain for 7 days
  }
  policy_item_weekly {
    frequency_interval = 7 # every Sunday; accepted values = 1 to 7 (1 = Monday, ..., 6 = Saturday, 7 = Sunday)
    retention_unit     = "weeks"
    retention_value    = 4 # retain for 4 weeks
  }
  policy_item_monthly {
    # accepted values = 1 to 28 (nth day of the month) or 40 (last day of the month)
    frequency_interval = 28
    retention_unit     = "months"
    retention_value    = 3 # retain for 3 months
  }
  depends_on = [
    mongodbatlas_advanced_cluster.automated_backup_test_cluster
  ]
}

Use the following to configure a cloud backup snapshot and a PIT restore job.

# Create a project
resource "mongodbatlas_project" "project_test" {
  name   = var.project_name
  org_id = var.org_id
}

# Create a cluster with 3 nodes
resource "mongodbatlas_advanced_cluster" "cluster_test" {
  project_id             = mongodbatlas_project.project_test.id
  name                   = var.cluster_name
  cluster_type           = "REPLICASET"
  backup_enabled         = true # enable cloud provider snapshots
  pit_enabled            = true # enable continuous cloud backup (point-in-time restore)
  retain_backups_enabled = true # keep the backup snapshots after the cluster is deleted
  replication_specs {
    region_configs {
      priority      = 7
      provider_name = "AWS"
      region_name   = "US_EAST_1"
      electable_specs {
        instance_size = "M10"
        node_count    = 3
      }
    }
  }
}

# Take an on-demand snapshot and specify the number of days to retain it
resource "mongodbatlas_cloud_backup_snapshot" "test" {
  project_id        = mongodbatlas_advanced_cluster.cluster_test.project_id
  cluster_name      = mongodbatlas_advanced_cluster.cluster_test.name
  description       = "My description"
  retention_in_days = 1
}

# Specify the snapshot ID to use to restore.
# The restore job is created only when point_in_time_utc_seconds is nonzero.
resource "mongodbatlas_cloud_backup_snapshot_restore_job" "test" {
  count        = (var.point_in_time_utc_seconds == 0 ? 0 : 1)
  project_id   = mongodbatlas_cloud_backup_snapshot.test.project_id
  cluster_name = mongodbatlas_cloud_backup_snapshot.test.cluster_name
  snapshot_id  = mongodbatlas_cloud_backup_snapshot.test.id
  delivery_type_config {
    point_in_time             = true
    target_cluster_name       = mongodbatlas_advanced_cluster.cluster_test.name
    target_project_id         = mongodbatlas_advanced_cluster.cluster_test.project_id
    point_in_time_utc_seconds = var.point_in_time_utc_seconds
  }
}
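Because the restore job's `count` is 0 while `point_in_time_utc_seconds` keeps its default of 0, a plain apply creates only the project, cluster, and snapshot. The sketch below shows one way to trigger the PIT restore afterwards by passing a nonzero timestamp. The timestamp is a sample value, and the `run` wrapper is an illustrative assumption that only prints the command unless DRY_RUN=0 is set.

```shell
# Illustrative wrapper: prints each command unless DRY_RUN=0 is set.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Sample target: 2024-05-01T12:30:00Z as seconds since the Unix epoch.
# With GNU date you could derive it via: date -u -d "2024-05-01T12:30:00Z" +%s
pit_seconds=1714566600

# A nonzero value sets count to 1, so Terraform creates the restore job.
run terraform apply -var "point_in_time_utc_seconds=${pit_seconds}"
```

Choose a timestamp that falls inside the cluster's restore window; otherwise the restore job cannot be fulfilled.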
