
Import Archives

Note

This feature is not available for M0 Free clusters and Flex clusters. To learn more about which features are unavailable, see Atlas M0 (Free Cluster) Limits.

You can restore data archived to S3, Azure Blob Storage, or Google Cloud Storage using mongoimport and mongorestore. This page provides a sample procedure that imports the archived data and rebuilds indexes using the MongoDB Database Tools together with the AWS CLI, azcopy, or the Google Cloud CLI, depending on where the data is stored. The sections below give one procedure for each storage provider.

Before you begin, you must:

  • Have the archived data available in an S3 bucket, an Azure Blob Storage container, or a Google Cloud Storage bucket.

  • Install the CLI for the cloud provider that stores your archive: the AWS CLI, azcopy, or the Google Cloud CLI.

  • Install the MongoDB Database Tools, which provide mongoimport and mongorestore.

The version checks after this list offer a quick way to confirm that the tools are installed.
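A minimal sanity check, assuming the tools are already on your PATH (run only the cloud CLI that matches your storage provider):

# cloud provider CLI
aws --version
azcopy --version
gsutil version

# MongoDB Database Tools used by this procedure
mongoimport --version
mongorestore --version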

AWS S3

1

Copy the archived data from the S3 bucket to a local folder and extract it.
aws s3 cp s3://<bucketName>/<prefix> <downloadFolder> --recursive
gunzip -r <downloadFolder>

where:

<bucketName>

Name of the AWS S3 bucket.

<prefix>

Path to archived data in the bucket. The path has the following format:

/exported_snapshots/<orgId>/<projectId>/<clusterName>/<initiationDateOfSnapshot>/<timestamp>/

<downloadFolder>

Path to the local folder where you want to copy the archived data.

For example, run a command similar to the following:

Example

aws s3 cp s3://export-test-bucket/exported_snapshots/1ab2cdef3a5e5a6c3bd12de4/12ab3456c7d89d786feba4e7/myCluster/2021-04-24T0013/1619224539 mybucket --recursive
gunzip -r mybucket
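The restore script in the next step assumes that, after extraction, every data file sits under a <database>/<collection>/ directory pair, because it derives the database and collection names from the last two directory components of each file path. A hypothetical layout (the database, collection, and file names below are illustrative only):

mybucket/
    sample_mflix/                  (database)
        movies/                    (collection)
            <...>.json             (exported documents, imported with mongoimport)
            <...>metadata.json     (index metadata, restored with mongorestore)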
2

Copy the following script and save it as massimport.sh.
#!/bin/bash
regex='/(.+)/(.+)/.+'
dir=${1%/}
connstr=$2
# iterate through the subdirectories of the downloaded and
# extracted snapshot export and restore the docs with mongoimport
find "$dir" -type f -not -path '*/\.*' -not -path '*metadata\.json' | while read -r line ; do
  [[ $line =~ $regex ]]
  db_name=${BASH_REMATCH[1]}
  col_name=${BASH_REMATCH[2]}
  mongoimport --uri "$connstr" --mode=upsert -d "$db_name" -c "$col_name" --file "$line" --type json
done
# create the required directory structure and copy/rename files
# as needed for mongorestore to rebuild indexes on the collections
# from exported snapshot metadata files and feed them to mongorestore
find "$dir" -type f -name '*metadata\.json' | while read -r line ; do
  [[ $line =~ $regex ]]
  db_name=${BASH_REMATCH[1]}
  col_name=${BASH_REMATCH[2]}
  mkdir -p "${dir}/metadata/${db_name}/"
  cp "$line" "${dir}/metadata/${db_name}/${col_name}.metadata.json"
done
mongorestore "$connstr" "${dir}/metadata/"
# remove the metadata directory because it is no longer needed; this returns
# the snapshot directory to the state it was in before the import
rm -rf "${dir}/metadata/"

Here:

  • --mode=upsert enables mongoimport to handle duplicate documents from an archive.

  • --uri specifies the connection string for the Atlas cluster. An expanded example of the call the script runs is shown after this list.
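For reference, this is roughly what one of the mongoimport calls issued by the script looks like once the loop variables are filled in. The cluster host, credentials, database, collection, and file path below are hypothetical placeholders:

mongoimport --uri "mongodb+srv://myUser:myPassword@mycluster.ab1cd.mongodb.net" \
    --mode=upsert -d sample_mflix -c movies \
    --file mybucket/sample_mflix/movies/1.json --type json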

3

Run the massimport.sh utility to import the archived data into the Atlas cluster.
sh massimport.sh <downloadFolder> "mongodb+srv://<connectionString>"

where:

<downloadFolder>

Path to the local folder where you copied the archived data.

<connectionString>

Connection string for the Atlas cluster.

For example, run a command similar to the following:

Example

sh massimport.sh mybucket "mongodb+srv://<myConnString>"
Azure Blob Storage

1

Copy the archived data from the Azure Blob Storage container to a local folder.
azcopy copy "https://<storageAccountName>.blob.core.windows.net/<containerName>/<prefix>/*" "<downloadFolder>" --recursive

where:

<storageAccountName>

Name of the Azure storage account to which the blob storage container belongs.

<containerName>

Name of the Azure blob storage container.

<prefix>

Path to the archived data in the container.

<downloadFolder>

Path to the local folder where you want to copy the archived data.

Example

azcopy copy "https://mystorageaccount.blob.core.windows.net/mycontainer/myTextFile.txt" ~/downloads --recursive
2

Copy the following script and save it as massimport.sh.
#!/bin/bash
regex='/(.+)/(.+)/.+'
dir=${1%/}
connstr=$2
# iterate through the subdirectories of the downloaded and
# extracted snapshot export and restore the docs with mongoimport
find "$dir" -type f -not -path '*/\.*' -not -path '*metadata\.json' | while read -r line ; do
  [[ $line =~ $regex ]]
  db_name=${BASH_REMATCH[1]}
  col_name=${BASH_REMATCH[2]}
  mongoimport --uri "$connstr" --mode=upsert -d "$db_name" -c "$col_name" --file "$line" --type json
done
# create the required directory structure and copy/rename files
# as needed for mongorestore to rebuild indexes on the collections
# from exported snapshot metadata files and feed them to mongorestore
find "$dir" -type f -name '*metadata\.json' | while read -r line ; do
  [[ $line =~ $regex ]]
  db_name=${BASH_REMATCH[1]}
  col_name=${BASH_REMATCH[2]}
  mkdir -p "${dir}/metadata/${db_name}/"
  cp "$line" "${dir}/metadata/${db_name}/${col_name}.metadata.json"
done
mongorestore "$connstr" "${dir}/metadata/"
# remove the metadata directory because it is no longer needed; this returns
# the snapshot directory to the state it was in before the import
rm -rf "${dir}/metadata/"

Here:

  • --mode=upsert enables mongoimport to handle duplicate documents from an archive.

  • --uri specifies the connection string for the Atlas cluster.

3

Run the massimport.sh utility to import the archived data into the Atlas cluster.
sh massimport.sh <downloadFolder> "mongodb+srv://<connectionString>"

where:

<downloadFolder>

Path to the local folder where you copied the archived data.

<connectionString>

Connection string for the Atlas cluster.

Example

sh massimport.sh ~/downloads "mongodb+srv://<myConnString>"
Google Cloud Storage

1

Copy the archived data from the Google Cloud Storage bucket to a local folder and extract it.
gsutil -m cp -r "gs://<bucketName>/<prefix>" <downloadFolder>
gunzip -r <downloadFolder>

where:

<bucketName>

Name of the Google Cloud bucket.

<prefix>

Path to archived data in the bucket. The path has the following format:

/exported_snapshots/<orgId>/<projectId>/<clusterName>/<initiationDateOfSnapshot>/<timestamp>/

<downloadFolder>

Path to the local folder where you want to copy the archived data.

Example

gsutil -m cp -r gs://export-test-bucket/exported_snapshots/1ab2cdef3a5e5a6c3bd12de4/12ab3456c7d89d786feba4e7/myCluster/2021-04-24T0013/1619224539 mybucket
gunzip -r mybucket
2

Copy the following script and save it as massimport.sh.
#!/bin/bash
regex='/(.+)/(.+)/.+'
dir=${1%/}
connstr=$2
# iterate through the subdirectories of the downloaded and
# extracted snapshot export and restore the docs with mongoimport
find "$dir" -type f -not -path '*/\.*' -not -path '*metadata\.json' | while read -r line ; do
  [[ $line =~ $regex ]]
  db_name=${BASH_REMATCH[1]}
  col_name=${BASH_REMATCH[2]}
  mongoimport --uri "$connstr" --mode=upsert -d "$db_name" -c "$col_name" --file "$line" --type json
done
# create the required directory structure and copy/rename files
# as needed for mongorestore to rebuild indexes on the collections
# from exported snapshot metadata files and feed them to mongorestore
find "$dir" -type f -name '*metadata\.json' | while read -r line ; do
  [[ $line =~ $regex ]]
  db_name=${BASH_REMATCH[1]}
  col_name=${BASH_REMATCH[2]}
  mkdir -p "${dir}/metadata/${db_name}/"
  cp "$line" "${dir}/metadata/${db_name}/${col_name}.metadata.json"
done
mongorestore "$connstr" "${dir}/metadata/"
# remove the metadata directory because it is no longer needed; this returns
# the snapshot directory to the state it was in before the import
rm -rf "${dir}/metadata/"

Here:

  • --mode=upsert enables mongoimport to handle duplicate documents from an archive.

  • --uri specifies the connection string for the Atlas cluster.

3

Run the massimport.sh utility to import the archived data into the Atlas cluster.

sh massimport.sh <downloadFolder> "mongodb+srv://<connectionString>"

where:

<downloadFolder>

Path to the local folder where you copied the archived data.

<connectionString>

Connection string for the Atlas cluster.

Example

sh massimport.sh mybucket "mongodb+srv://<myConnString>"
