ELT MongoDB Data Using Airbyte

Robert Walters
Published Nov 16, 2022 • Updated Nov 16, 2022
Airbyte is an open source data integration platform that provides a quick and easy way to extract, load, and transform (ELT) data between a wide range of data sources. Airbyte can be used as part of a workflow orchestration solution like Apache Airflow to address data movement. In this post, we will install Airbyte and replicate the sample database, “sample_restaurants,” found in MongoDB Atlas out to a CSV file.

Getting started

Airbyte is available as a cloud service or can be self-hosted using Docker containers. In this post, we will deploy Airbyte locally using Docker.
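As a minimal sketch of a local deployment, assuming the Docker Compose workflow that the Airbyte project documented around the time this post was published, the steps could look like the following. The repository URL and the `docker-compose up` step are assumptions rather than commands taken from this post, and newer Airbyte releases may use a different installer, so check the current Airbyte documentation first. The steps are wrapped in a hypothetical `start_airbyte` function so that nothing runs until you call it.

```shell
# Sketch only: assumes git, Docker, and docker-compose are installed, and that
# the Airbyte repository ships a docker-compose file at its root (an assumption).
start_airbyte() {
  git clone https://github.com/airbytehq/airbyte.git &&
    cd airbyte &&
    docker-compose up
}

# Uncomment to clone the repository and start the containers:
# start_airbyte
```

Running the compose step in the foreground keeps the container logs attached to the terminal, which makes it easy to watch for the startup banner.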
When the containers are ready, you will see the logo printed in the compose logs as follows:
[Image: command-line output showing the Airbyte logo]
Navigate to http://localhost:8000 to launch the Airbyte portal. Note that the default username is “admin” and the password is “password.”

Creating a connection

To create a source connector, click on the Sources menu item on the left side of the portal and then the “Connect to your first source” button. This will launch the New Source page as follows:
[Image: New Source dialog showing MongoDB as an option for the source type]
Type “mongodb” and select “MongoDb.”
The MongoDB Connector can be used with both self-hosted and MongoDB Atlas clusters.
[Image: New Source dialog showing the MongoDB instance type options]
Select the appropriate MongoDB instance type and fill out the rest of the configuration information. In this post, we will be using MongoDB Atlas and have set our configuration as follows:
MongoDB Instance Type: MongoDB Atlas
Cluster URL: demo.ikyil.mongodb.net
Database Name: sample_restaurants
Username: ab_user
Password: **********
Authentication Source: admin
Note: If you’re using MongoDB Atlas, be sure to create the database user and allow network access from the machine running Airbyte. By default, MongoDB Atlas does not accept remote connections.
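To confirm that the Atlas user and network access rules work independently of Airbyte, one option is to connect with the same settings from a shell. The sketch below builds a connection string from the configuration values above. The `atlas_uri` helper name is hypothetical, and the commented `mongosh` call assumes that mongosh is installed and that the sample dataset's `restaurants` collection has been loaded.

```shell
# Values taken from the source configuration above.
CLUSTER_URL="demo.ikyil.mongodb.net"
DATABASE="sample_restaurants"
AUTH_SOURCE="admin"

# Hypothetical helper: prints an SRV connection string for the cluster.
atlas_uri() {
  printf 'mongodb+srv://%s/%s?authSource=%s\n' "$CLUSTER_URL" "$DATABASE" "$AUTH_SOURCE"
}

# To try the connection (mongosh prompts for the password):
#   mongosh "$(atlas_uri)" --username ab_user \
#     --eval 'db.restaurants.countDocuments({})'
```

If the connection times out, the network access list is the likely cause. If authentication fails, check the user name, password, and authentication source.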
Click “Set up source” and Airbyte will test the connection. If it’s successful, you’ll be sent to the Add destination page. Click the “Add destination” button and select “Local CSV” from the drop-down.
Next, provide a destination name, “restaurant-samples,” and destination path, “/local.” The Airbyte portal provides a setup guide for the Local CSV connector on the right side of the page. This is useful for a quick reference on connector configuration.
[Image: New destination dialog showing the Local CSV connector setup guide]
Click “Set up destination” and Airbyte will test the connection with the destination. Upon success, you’ll be redirected to a page where you can define the details of the stream you’d like to sync.
Setting up a connection
Airbyte provides a variety of sync options, including full refresh and incremental.
[Image: Source sync options dialog]
Select “Full Refresh | Overwrite” and then click “Set up sync.”
Airbyte will kick off the sync process and, if it succeeds, you’ll see the Sync Succeeded message.
[Image: Sync succeeded dialog box]

Exploring the data

Let’s take a look at the CSV files created. The Local CSV connector writes to the /local Docker mount on the Airbyte server. By default, this mount maps to /tmp/airbyte_local and can be changed by setting the LOCAL_ROOT Docker environment variable.
To view the CSV files, open a bash shell in the server container with docker exec as follows:
docker exec -it airbyte-server bash
Once connected, navigate to the mounted folder (/tmp/airbyte_local by default) and list the CSV files:
bash-4.2# cd /tmp/airbyte_local/
bash-4.2# ls
_airbyte_raw_neighborhoods.csv _airbyte_raw_restaurants.csv
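For a quick sanity check of the sync, a small helper can report how many records each file holds. This is a sketch under a few assumptions: the `summarize_csvs` name is hypothetical, each file is assumed to start with a single header line, and each record is assumed to occupy one line. At the time of writing, the Local CSV destination was documented as writing the columns `_airbyte_ab_id`, `_airbyte_emitted_at`, and `_airbyte_data`, with the last one holding each document as a JSON blob. Column names may differ in other Airbyte versions.

```shell
# Print each CSV file in a directory with an approximate data-row count.
# Assumes one header line per file and one record per line.
summarize_csvs() {
  dir="${1:-/tmp/airbyte_local}"
  for f in "$dir"/*.csv; do
    [ -e "$f" ] || continue   # skip when no CSV files match the pattern
    lines=$(wc -l < "$f")
    printf '%s\t%d rows\n' "$(basename "$f")" "$((lines - 1))"
  done
}

# Inside the container:
#   summarize_csvs /tmp/airbyte_local
#   head -n 2 /tmp/airbyte_local/_airbyte_raw_restaurants.csv
```

Comparing these counts with the document counts of the corresponding collections in Atlas gives a rough indication that the full refresh copied everything.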

Summary

In today’s data-rich world, building data pipelines to collect and transform heterogeneous data is an essential part of many business processes. Whether the goal is deriving business insights through analytics or creating a single view of the customer, Airbyte makes it easy to move data between MongoDB and many other data sources.
