Atlas Data Lake

MongoDB Atlas Data Lake allows you to natively query, transform, and move data across AWS S3 and MongoDB Atlas clusters. You can query your richly structured data stored in JSON, BSON, CSV, TSV, Avro, ORC, and Parquet formats using the mongo shell, MongoDB Compass, or any MongoDB driver.

You can use Atlas Data Lake to:

  • Convert richly structured MongoDB data into columnar Parquet or CSV files.
  • Query across multiple Atlas clusters to get a holistic view of your data.
  • Materialize aggregations from MongoDB or S3 data.
  • Automatically import data from your S3 bucket into an Atlas cluster.
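As a rough illustration of the first use above, the following sketch builds an aggregation pipeline whose final stage writes results to S3 as Parquet or CSV. The helper name s3_out_stage, the bucket name, the filename, and the $match filter are hypothetical, and the field layout of the $out stage is an assumption that you should check against the $out reference before relying on it.

```python
def s3_out_stage(bucket, region, filename, file_format="parquet"):
    """Build a $out stage that writes pipeline results to an S3 bucket.

    The nested field names (s3, bucket, region, filename, format.name)
    are an assumption about the $out syntax, not a confirmed reference.
    """
    if file_format not in {"parquet", "csv"}:
        raise ValueError(f"unsupported output format: {file_format!r}")
    return {
        "$out": {
            "s3": {
                "bucket": bucket,
                "region": region,
                "filename": filename,
                "format": {"name": file_format},
            }
        }
    }


# Hypothetical pipeline: filter documents, then write them out as Parquet.
pipeline = [
    {"$match": {"status": "shipped"}},
    s3_out_stage("my-archive-bucket", "us-east-1", "orders/shipped"),
]
```

With a MongoDB driver such as PyMongo, a pipeline like this would typically be passed to a collection's aggregate() method on a Data Lake connection.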

When you create a Data Lake, you grant Atlas either read-only or read-and-write access to S3 buckets in your AWS account. To access your Atlas clusters, Atlas uses your existing Role-Based Access Controls. You can view and edit the generated data storage configuration that maps data from your S3 buckets and Atlas clusters to virtual databases and collections.

A database user must have one of the following roles to query an Atlas Data Lake:

Privilege actions define the operations that you can perform on your Data Lake. You can grant the Atlas Data Lake privileges listed below in either of the following ways:

  • In the Atlas User Interface, when you create or modify custom roles
  • In the actions.action request body parameter, when you create or update a custom role from the Atlas API

sqlGetSchema

Retrieve the schema stored for a collection or view using the sqlGetSchema command.

sqlSetSchema

Set or delete the schema for a collection or view using the sqlSetSchema command.

viewAllHistory

Retrieve details about the queries that were run in the past 24 hours using $queryHistory.

outToS3

Write data from one or more of the supported data stores to your S3 bucket using $out.

storageGetConfig

Retrieve your Data Lake storage configuration using the storageGetConfig command.

storageSetConfig

Set or update your Data Lake storage configuration using the storageSetConfig command.
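The commands and stage named in the privileges above can be expressed as plain documents. The sketch below only builds those documents. The helper names are hypothetical, and the argument shapes (a collection or view name for sqlGetSchema, the value 1 for storageGetConfig, an allUsers flag for $queryHistory) are assumptions to verify against each command's reference.

```python
def sql_get_schema(name):
    """Command document for the sqlGetSchema command (requires sqlGetSchema)."""
    return {"sqlGetSchema": name}


def storage_get_config():
    """Command document for storageGetConfig (requires storageGetConfig)."""
    return {"storageGetConfig": 1}


def storage_set_config(config):
    """Command document for storageSetConfig (requires storageSetConfig)."""
    return {"storageSetConfig": config}


def query_history_stage(all_users=False):
    """Aggregation stage for $queryHistory (viewAllHistory covers other users)."""
    return {"$queryHistory": {"allUsers": all_users}}
```

With PyMongo, a command document is usually sent with Database.command(), and a stage such as $queryHistory is run with Database.aggregate().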

Data Lake uses SCRAM-SHA or X.509 for authentication. It doesn't support LDAP.
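A minimal sketch of how the two supported options might translate into driver settings follows. The helper client_options is hypothetical. It returns keyword options that PyMongo's MongoClient accepts, and the use of admin as the SCRAM authentication database is an assumption.

```python
def client_options(method, username=None, password=None, cert_key_file=None):
    """Return MongoClient keyword options for a supported auth method.

    Only "scram" and "x509" are accepted, mirroring the options above.
    Anything else, including LDAP, is rejected.
    """
    if method == "scram":
        if not username or not password:
            raise ValueError("SCRAM requires a username and a password")
        return {
            "username": username,
            "password": password,
            "authSource": "admin",  # assumption: users are defined in admin
            "tls": True,
        }
    if method == "x509":
        if not cert_key_file:
            raise ValueError("X.509 requires a client certificate key file")
        return {
            "authMechanism": "MONGODB-X509",
            "authSource": "$external",
            "tls": True,
            "tlsCertificateKeyFile": cert_key_file,
        }
    raise ValueError(f"unsupported authentication method: {method!r}")
```

The returned dictionary can be unpacked into the client constructor, for example MongoClient(host, **client_options("scram", "user", "secret")), where host is your own Data Lake connection string.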

Note

To prevent excessive charges on your bill, create your Atlas Data Lake in the same AWS region as your S3 data source.

Atlas Data Lake routes your Data Lake requests through one of the following regions:

Data Lake Regions       AWS Regions
Virginia, USA           us-east-1
Oregon, USA             us-west-2
Sao Paulo, Brazil       sa-east-1
Ireland                 eu-west-1
London, England         eu-west-2
Frankfurt, Germany      eu-central-1
Mumbai, India           ap-south-1
Sydney, Australia       ap-southeast-2

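To follow the advice about keeping your Data Lake in the same AWS region as your S3 data, a small lookup over the regions listed above can help. This is only a sketch. The function name data_lake_region_for is hypothetical, and the mapping simply restates the list.

```python
# Data Lake region name -> AWS region code, as listed above.
DATA_LAKE_REGIONS = {
    "Virginia, USA": "us-east-1",
    "Oregon, USA": "us-west-2",
    "Sao Paulo, Brazil": "sa-east-1",
    "Ireland": "eu-west-1",
    "London, England": "eu-west-2",
    "Frankfurt, Germany": "eu-central-1",
    "Mumbai, India": "ap-south-1",
    "Sydney, Australia": "ap-southeast-2",
}


def data_lake_region_for(bucket_region):
    """Return the Data Lake region that matches an S3 bucket's AWS region.

    Returns None when no listed Data Lake region shares the bucket's region.
    """
    wanted = bucket_region.strip().lower()
    for name, code in DATA_LAKE_REGIONS.items():
        if code == wanted:
            return name
    return None
```

A result of None means the bucket's region has no matching entry in the list, so requests cannot be routed through a region co-located with that bucket.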
Note

You will incur charges when running Atlas Data Lake queries. For more information, see Atlas Data Lake Costs.

© 2022 MongoDB, Inc.
