Atlas on Day One, Importing Data

MongoDB

Update: We recently released a live migration tool for MongoDB Atlas called mongomirror. Learn more about mongomirror in our documentation.

MongoDB Atlas frees you, the end user, from the day-to-day system administration of your MongoDB cluster. Like many database services, Atlas exists to keep your data always available with little overhead to your organization.

On day one you may be wondering how to import your existing data and take Atlas for a test drive. There are numerous ways to copy your data from one MongoDB service to another; today we’ll focus on a simple export and import using mongodump and mongorestore.

mongodump

The mongodump binary is a utility for creating a binary export of the contents of a database. mongodump can export data from either mongod or mongos instances.

Exporting your data with mongodump takes a single command, which writes the export to the system you run it on (by default, into a dump/ directory under the current working directory).

For today’s example, let’s assume we are working with a standalone instance that we’ve been testing on our local laptop for a while.

MongoDB shell version: 3.2.7
connecting to: test
> show databases
local  0.000GB
test   0.070GB
> show collections
testData

We’re going to export the test database that contains our testData collection. My local computer has enough disk space to handle this export, but when working with large datasets you should check your available disk space first.

bash-3.2$ df -h
Filesystem      Size   Used  Avail Capacity  iused    ifree %iused  Mounted on
/dev/disk1     465Gi   96Gi  369Gi    21% 25216209 96623405   21%   /
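
If you would rather script this check than eyeball the df output, a minimal sketch might look like the following. The function name has_space and its arguments are our own illustration, not part of any MongoDB tool, and the threshold you pass is an estimate you choose yourself.

```shell
#!/bin/sh
# has_space DIR REQUIRED_KB   (hypothetical helper, names are illustrative)
# Succeeds (exit status 0) when the filesystem holding DIR has at least
# REQUIRED_KB kilobytes available, and fails otherwise.
has_space() {
    dir=$1
    required_kb=$2
    # -P forces the portable one-line-per-filesystem format;
    # -k reports sizes in 1024-byte blocks. Column 4 is "Available".
    avail_kb=$(df -Pk "$dir" | awk 'NR == 2 { print $4 }')
    [ "$avail_kb" -ge "$required_kb" ]
}
```

For example, `has_space . 102400 && mongodump -d test` would only start the dump if roughly 100 MB is free in the current directory. That 100 MB figure is an assumed margin for this example, not a rule.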

Indeed we have the space, so let’s go ahead and export this database:

bash-3.2$ mongodump -d test
2016-06-13T10:43:52.147-0400    writing test.testData to
2016-06-13T10:43:55.147-0400    [##########..............]  test.testData  1326267/2900790  (45.7%)
2016-06-13T10:43:58.147-0400    [######################..]  test.testData  2666589/2900790  (91.9%)
2016-06-13T10:43:58.670-0400    [########################]  test.testData  2900790/2900790  (100.0%)
2016-06-13T10:43:58.670-0400    done dumping test.testData (2900790 documents)

We are now left with two files: the binary document data in BSON format, and a JSON file containing metadata about the collection:

bash-3.2$ cd dump/test/
bash-3.2$ ls -al
total 186976
drwxr-xr-x  4 jaygordon  staff       136 Jun 13 10:43 .
drwxr-xr-x  3 jaygordon  staff       102 Jun 13 10:43 ..
-rw-r--r--  1 jaygordon  staff  95726070 Jun 13 10:43 testData.bson
-rw-r--r--  1 jaygordon  staff        85 Jun 13 10:43 testData.metadata.json
bash-3.2$ cat testData.metadata.json
{"options":{},"indexes":[{"v":1,"key":{"_id":1},"name":"_id_","ns":"test.testData"}]}

Important note: Since MongoDB Atlas will be managing your users for you from here on, make sure to remove any files called system.users.bson and system.users.metadata.json from your dump to prevent any issues with your import.
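
One way to clear those files out of a dump directory is sketched below. The helper name strip_user_files is hypothetical, and the sketch assumes the files follow the system.users.* naming described above. It prints each file it removes so you can review what was dropped.

```shell
#!/bin/sh
# strip_user_files DUMP_DIR   (hypothetical helper, not a MongoDB tool)
# Deletes every regular file named system.users.* anywhere under DUMP_DIR
# and prints the path of each file it removes.
strip_user_files() {
    dump_dir=$1
    find "$dump_dir" -type f -name 'system.users.*' -print -exec rm -f {} +
}
```

Running `strip_user_files dump` from the directory where you ran mongodump would clean the default dump/ tree. A dump of a single application database, like the one in this post, may not contain these files at all, in which case the command simply prints nothing.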

mongorestore

The mongorestore program writes data from a binary database dump created by mongodump to a MongoDB instance.

With mongorestore we should only need to create our Atlas cluster and then ensure our IP address is whitelisted to connect. Here’s a quick one-line command to confirm which IP address you are currently using (including on DHCP/NAT networks) as seen by the rest of the world. (The IP below is just an example.)

bash-3.2$ curl icanhazip.com
1.2.3.4

Now that we know our IP, we can add it to the list of IP addresses allowed to reach our cluster. Go to the Security tab and click “ADD IP ADDRESS”:

https://webassets.mongodb.com/_com_assets/blog/tblr/67.media.tumblr.com--0d2f0d76c05d8f89db2bbc3c0a9d8e9a--tumblr_o9wa3o86Ok1sdaytmo1_1280.png

Now let’s validate that we can connect to our Atlas cluster from the laptop containing our export. Go to your Atlas cluster deployment page and find your connection string:

https://webassets.mongodb.com/_com_assets/blog/tblr/67.media.tumblr.com--57f1878e5d68172aae3dd7312b254412--tumblr_o9wa3o86Ok1sdaytmo2_1280.png

Click on Connect:

https://webassets.mongodb.com/_com_assets/blog/tblr/67.media.tumblr.com--0569ebb31fb06ca3e942da70e223bba2--tumblr_o9wa3o86Ok1sdaytmo4_1280.png

We have our connection info; let’s see if it works:

bash-3.2$ mongo mongodb://cluster0-shard-00-00-cbei2.mongodb.net:27017,cluster0-shard-00-01-cbei2.mongodb.net:27017,cluster0-shard-00-02-cbei2.mongodb.net:27017/admin?replicaSet=Cluster0-shard-0 --ssl --username jay --password
MongoDB shell version: 3.2.7
Enter password:
connecting to: mongodb://cluster0-shard-00-00-cbei2.mongodb.net:27017,cluster0-shard-00-01-cbei2.mongodb.net:27017,cluster0-shard-00-02-cbei2.mongodb.net:27017/admin?replicaSet=Cluster0-shard-0
2016-06-13T11:34:53.235-0400 I NETWORK  [thread1] Starting new replica set monitor for Cluster0-shard-0/cluster0-shard-00-00-cbei2.mongodb.net:27017,cluster0-shard-00-01-cbei2.mongodb.net:27017,cluster0-shard-00-02-cbei2.mongodb.net:27017
2016-06-13T11:34:53.235-0400 I NETWORK  [ReplicaSetMonitorWatcher] starting
Cluster0-shard-0:PRIMARY>

Great, we are ready to import into Atlas!

Let’s make sure we have a user ready for the admin database:

https://webassets.mongodb.com/_com_assets/blog/tblr/67.media.tumblr.com--c22d63580ace89123092e3f7403ebe72--tumblr_o9wa3o86Ok1sdaytmo3_1280.png

We’ll modify our connection string so that our restore command looks something like this (note that the --host option uses a different format than the connection string: the replica set name, a slash, then the comma-separated list of hosts):

bash-3.2$ mongorestore --ssl --host Cluster0-shard-0/cluster0-shard-00-00-cbei2.mongodb.net:27017,cluster0-shard-00-01-cbei2.mongodb.net:27017,cluster0-shard-00-02-cbei2.mongodb.net:27017 --authenticationDatabase admin \
  --dir=dump/test -u jay --password $PASSWORD
2016-06-13T11:46:00.071-0400    building a list of collections to restore from dump/test dir
2016-06-13T11:46:00.081-0400    reading metadata for test.testData from dump/test/testData.metadata.json
2016-06-13T11:46:00.099-0400    restoring test.testData from dump/test/testData.bson

The restore will continue until it reaches 100% and will notify you when it’s done:

2016-06-13T11:48:36.073-0400    [#######################.]  test.testData  88.3 MB/91.3 MB  (96.8%)
2016-06-13T11:48:39.075-0400    [#######################.]  test.testData  90.3 MB/91.3 MB  (98.9%)
2016-06-13T11:48:40.701-0400    [########################]  test.testData  91.3 MB/91.3 MB  (100.0%)
2016-06-13T11:48:40.701-0400    restoring indexes for collection test.testData from metadata
2016-06-13T11:48:40.710-0400    finished restoring test.testData (2900790 documents)
2016-06-13T11:48:40.710-0400    done
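
The --host value used in the restore command carries the same information as the connection string, rearranged: the replica set name, a slash, then the host list. As a rough sketch, a small shell function can do the rearranging for you. The name uri_to_host is our own invention, and the sketch assumes a mongodb:// URI whose only query parameter of interest is replicaSet.

```shell
#!/bin/sh
# uri_to_host URI   (hypothetical helper, not a MongoDB tool)
# Prints "replicaSetName/host1:port,host2:port,..." for a mongodb:// URI.
# If the URI has no replicaSet parameter, prints only the host list.
uri_to_host() {
    uri=$1
    rest=${uri#mongodb://}       # drop the scheme
    authority=${rest%%/*}        # keep everything before the first "/"
    hosts=${authority##*@}       # drop "user:password@" if present
    case $uri in
        *replicaSet=*)
            rs=${uri#*replicaSet=}   # text after "replicaSet="
            rs=${rs%%&*}             # stop at the next query parameter
            printf '%s/%s\n' "$rs" "$hosts"
            ;;
        *)
            printf '%s\n' "$hosts"
            ;;
    esac
}
```

Passing the connection string from the earlier mongo command to uri_to_host prints the Cluster0-shard-0/... value that the restore command supplies to --host. Remember to quote the URI on the command line so the shell does not interpret the ? and & characters.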

Great, let’s log into Atlas and verify our data made it into our cluster:

Cluster0-shard-0:PRIMARY> use test
switched to db test
Cluster0-shard-0:PRIMARY> show databases
admin  0.000GB
local  0.098GB
test   0.070GB
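
Matching database sizes are a good sign, but comparing document counts is a tighter check. Both tools print a "(N documents)" summary line, as seen in the output earlier, so a small helper can pull that number out of a saved log. The function name doc_count and the log file names in the usage note are assumptions for illustration.

```shell
#!/bin/sh
# doc_count   (hypothetical helper, not a MongoDB tool)
# Reads tool log output on standard input and prints the N from every
# line containing "(N documents)". Progress-bar lines are ignored because
# they do not contain that phrase.
doc_count() {
    sed -n 's/.*(\([0-9][0-9]*\) documents).*/\1/p'
}
```

Assuming you saved each tool’s log output to dump.log and restore.log (the tools write their progress to standard error, so something like `mongodump -d test 2> dump.log` should capture it), the test `[ "$(doc_count < dump.log)" = "$(doc_count < restore.log)" ]` succeeds when the counts agree. For the single-collection dump in this post, both sides should report 2900790.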

Now you’re ready to start using your application along with MongoDB Atlas! Start building something GIANT today!

Jay Gordon is a Technical Account Manager with MongoDB and is available via our chat to discuss MongoDB Cloud Products at https://cloud.mongodb.com.