Deploy a Data Lake for an Atlas Cluster Data Store
This page describes how to deploy a Data Lake for accessing data in an Atlas cluster.
Prerequisites
Before you begin, you will need to:
- Create a MongoDB Atlas account, if you do not have one already.
- Create an Atlas Cluster, if you do not have one already. Atlas Data Lake supports Atlas clusters deployed to AWS, Azure, or GCP.

  **Note:** To use your Atlas cluster as a data store, you must deploy it to the same project as your Data Lake.
- Add data to at least one collection on your Atlas cluster if you have not already.
Procedure
Log in to MongoDB Atlas.
Create the virtual databases, collections, and views and map the databases, collections, and views to your data store.
(Optional) Click the edit icon for the:

- **Data Lake** to specify a name for your Data Lake. Defaults to `Data Lake[n]`.
- **Database** to edit the database name. Defaults to `Database[n]`. Corresponds to the `databases.[n].name` JSON configuration setting.
- **Collection** to edit the collection name. Defaults to `Collection[n]`. Corresponds to the `databases.[n].collections.name` JSON configuration setting.
- **View** to edit the view name.
You can click:

- **Create Database** to add databases and collections.
- The add icon associated with the database to add collections to the database.
- The add icon associated with the collection to add views on the collection. To create a view, you must specify:
  - The name of the view.
  - The pipeline to apply to the view.

  **Note:** The view definition pipeline cannot include the `$out` or the `$merge` stage. If the view definition includes nested pipeline stages such as `$lookup` or `$facet`, this restriction applies to those nested pipelines as well.

  To learn more, see the MongoDB documentation on views.
- The delete icon associated with the database, collection, or view to remove it.
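A view definition pipeline is an aggregation pipeline expressed in JSON. A minimal sketch of a valid pipeline that filters and reshapes documents without using the restricted `$out` or `$merge` stages (the field names here are illustrative, not part of any sample dataset):

```json
[
  { "$match": { "status": "active" } },
  { "$project": { "_id": 0, "name": 1, "status": 1 } }
]
```

A pipeline like this makes the view return only active documents, projected down to the `name` and `status` fields.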
Drag and drop the data store to map it to the collection. Corresponds to the `databases.[n].collections.[n].dataSources` JSON configuration setting.
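Taken together, the settings referenced above form the Data Lake storage configuration. A minimal sketch of what such a configuration might look like for an Atlas cluster data store (all names here — the store name, cluster name, database, and collection names — are illustrative placeholders, not values from this procedure):

```json
{
  "stores": [
    {
      "name": "atlasClusterStore",
      "provider": "atlas",
      "clusterName": "myCluster",
      "projectId": "<project-id>"
    }
  ],
  "databases": [
    {
      "name": "sampleDatabase",
      "collections": [
        {
          "name": "sampleCollection",
          "dataSources": [
            {
              "storeName": "atlasClusterStore",
              "database": "sourceDb",
              "collection": "sourceColl"
            }
          ]
        }
      ]
    }
  ]
}
```

Each entry in `dataSources` points a virtual collection at a store defined in `stores`, which is the mapping the drag-and-drop step performs in the UI.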