Atlas Data Federation Overview
About Atlas Data Federation
Atlas Data Federation is a distributed query engine that allows you to natively query, transform, and move data across various sources inside and outside of MongoDB Atlas.
Sample Uses
You can use Atlas Data Federation to:
- Copy Atlas cluster data into Parquet or CSV files written to AWS S3 buckets.
- Query across multiple Atlas clusters to get a holistic view of your data.
- Materialize aggregations across Atlas clusters and AWS S3 data.
- Read and import data from your AWS S3 buckets into an Atlas cluster.
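As an illustration of the first use above, the following minimal sketch builds an aggregation pipeline that ends in a `$out` stage writing Parquet files to S3. The helper name, bucket, filename prefix, and filter are hypothetical, and the `$out` field names reflect the S3 options as generally documented; verify them against the `$out` reference before relying on them.

```python
def build_s3_export_pipeline(bucket, region, filename_prefix, match=None):
    """Return an aggregation pipeline that ends in a $out-to-S3 stage.

    Hypothetical helper: it only builds the pipeline document and does not
    connect to Atlas.
    """
    pipeline = []
    if match:
        # Optionally narrow the documents before they are written out.
        pipeline.append({"$match": match})
    pipeline.append({
        "$out": {
            "s3": {
                "bucket": bucket,             # target S3 bucket (assumed name)
                "region": region,             # AWS region of the bucket
                "filename": filename_prefix,  # prefix for the written files
                "format": {"name": "parquet"},
            }
        }
    })
    return pipeline


pipeline = build_s3_export_pipeline(
    "my-export-bucket", "us-east-1", "exports/orders/",
    match={"status": "shipped"},
)
```

With PyMongo, a pipeline like this would typically be passed to `collection.aggregate(pipeline)` on a connection to the federated database instance rather than to the cluster directly.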
Key Concepts
Federated Database Instance
A federated database instance is a deployment of Atlas Data Federation. Each federated database instance contains virtual databases and collections that map to data in your data stores.
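The mapping between virtual collections and data stores can be pictured as a storage configuration document. The sketch below is illustrative only: the store, bucket, cluster, and database names are made up, the field names follow the storage configuration format as generally documented, and `undefined_store_refs` is a hypothetical helper that checks that every data source points at a defined store.

```python
# Illustrative storage configuration: one S3 store and one Atlas cluster store,
# both mapped into a single virtual collection. All names are assumptions.
storage_config = {
    "stores": [
        {"name": "s3store", "provider": "s3", "bucket": "my-bucket",
         "region": "us-east-1", "delimiter": "/"},
        {"name": "atlasstore", "provider": "atlas", "clusterName": "Cluster0",
         "projectId": "<project-id>"},
    ],
    "databases": [
        {"name": "sales", "collections": [
            {"name": "orders", "dataSources": [
                {"storeName": "s3store", "path": "/orders/*"},
                {"storeName": "atlasstore", "database": "sales",
                 "collection": "orders"},
            ]},
        ]},
    ],
}


def undefined_store_refs(config):
    """Return (database, collection, storeName) tuples for unknown stores."""
    defined = {store["name"] for store in config.get("stores", [])}
    missing = []
    for db in config.get("databases", []):
        for coll in db.get("collections", []):
            for src in coll.get("dataSources", []):
                if src["storeName"] not in defined:
                    missing.append((db["name"], coll["name"], src["storeName"]))
    return missing
```

Here the virtual collection `sales.orders` reads from both stores, so a single query against it spans S3 files and cluster data.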
Atlas Data Federation Regions
To prevent excessive charges on your bill, create your federated database instance in the same AWS region as your S3 data source.
Atlas Data Federation routes your federated database requests through one of the following regions:
Data Federation Regions | AWS Regions |
---|---|
Virginia, USA | us-east-1 |
Oregon, USA | us-west-2 |
Sao Paulo, Brazil | sa-east-1 |
Ireland | eu-west-1 |
London, England | eu-west-2 |
Frankfurt, Germany | eu-central-1 |
Mumbai, India | ap-south-1 |
Sydney, Australia | ap-southeast-2 |
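One way to use the table above is a small lookup that checks whether an S3 bucket's region has a matching Data Federation region. This is a minimal sketch: the dictionary copies the table, and the function name is hypothetical.

```python
# AWS region -> Data Federation region, copied from the table above.
DATA_FEDERATION_REGIONS = {
    "us-east-1": "Virginia, USA",
    "us-west-2": "Oregon, USA",
    "sa-east-1": "Sao Paulo, Brazil",
    "eu-west-1": "Ireland",
    "eu-west-2": "London, England",
    "eu-central-1": "Frankfurt, Germany",
    "ap-south-1": "Mumbai, India",
    "ap-southeast-2": "Sydney, Australia",
}


def federation_region_for_bucket(bucket_region):
    """Return the matching Data Federation region name, or None if the
    bucket's AWS region is not in the table."""
    return DATA_FEDERATION_REGIONS.get(bucket_region)
```

A `None` result means that no listed region matches the bucket's region, so requests would be routed through a different region than the one holding the data.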
You incur charges when you run federated queries. For more information, see Atlas Data Federation Costs.
Atlas Data Federation Costs
You incur Atlas Data Federation costs for the following items:
- Data processed by the federated database instance
- Data returned by the federated database instance
Total Data Processed
Atlas charges for the total number of bytes that Atlas Data Federation processes from your AWS S3 buckets, rounded up to the nearest megabyte. Atlas charges $5.00 per TB of processed data, with a minimum of 10 MB or $0.00005 per query.
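The rounding rule can be sketched as follows. This covers only the round-up-to-the-nearest-megabyte step and leaves pricing out. It assumes decimal megabytes (1 MB = 1,000,000 bytes), which the text does not specify, and the function name is hypothetical.

```python
BYTES_PER_MB = 1_000_000  # assumption: decimal megabytes


def billable_megabytes(bytes_processed):
    """Round a processed byte count up to the next whole megabyte."""
    if bytes_processed < 0:
        raise ValueError("bytes_processed must be non-negative")
    # Ceiling division using integers only, so large counts stay exact.
    return -(-bytes_processed // BYTES_PER_MB)
```

Under this assumption, a query that processes 1,000,001 bytes is billed as 2 MB, while one that processes exactly 1,000,000 bytes is billed as 1 MB.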
In addition to the "Data Returned" cost for the data that Atlas Data Federation returns, you incur a "Data Processed" cost for the data that Atlas Data Federation reads to produce your query results. For example, a query against a 10 GB file incurs one of the following "Data Processed" costs:
- If you have no partitions, Atlas Data Federation reads the entire file to return results for the query. Therefore, you incur 10 GB of "Data Processed" cost.
- If you have 10 partitions of 1 GB each and the query targets a single partition, Atlas Data Federation reads only that partition. Therefore, you incur 1 GB of "Data Processed" cost.
You can use partitioning strategies and compression in AWS S3 to reduce the amount of processed data.
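The 10 GB example above reduces to simple arithmetic. The sketch below assumes equally sized partitions and ignores compression. The function and parameter names are hypothetical.

```python
def data_processed_gb(file_gb, partitions=1, partitions_read=1):
    """Estimate the GB read for a query, assuming equal-size partitions.

    With a single partition (no partitioning), the whole file is read.
    """
    if partitions < 1 or not 1 <= partitions_read <= partitions:
        raise ValueError("need 1 <= partitions_read <= partitions")
    return file_gb * partitions_read / partitions


unpartitioned = data_processed_gb(10)                              # whole file
one_of_ten = data_processed_gb(10, partitions=10, partitions_read=1)
```

Here `unpartitioned` is 10.0 GB and `one_of_ten` is 1.0 GB, matching the two cases in the list. A query that touches three of the ten partitions would process 3.0 GB.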
Total Data Returned and Transferred
Atlas charges for the total number of bytes returned and transferred by your federated database instance. This total is the sum of all the following data transfers:
- The number of bytes returned and transferred within the AWS region
- The number of bytes returned to the client
The cost of data transfer depends on the Cloud Service Provider charges for same-region, region-to-region, or region-to-internet data transfer. AWS charges $0.01 per GB for the number of bytes returned and transferred within the AWS region and for the number of bytes returned to the client.
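As a rough worked example, the transfer charge can be estimated by summing both byte counts and applying the $0.01 per GB rate quoted above. The sketch assumes decimal gigabytes (1 GB = 1,000,000,000 bytes) and a single flat rate for both kinds of transfer. The function name is hypothetical, and actual provider pricing may differ.

```python
from decimal import Decimal

RATE_PER_GB = Decimal("0.01")  # rate quoted in the text above
BYTES_PER_GB = 1_000_000_000   # assumption: decimal gigabytes


def data_returned_cost(bytes_within_region, bytes_to_client,
                       rate_per_gb=RATE_PER_GB):
    """Estimate the "Data Returned" charge in USD for one set of transfers."""
    total_bytes = bytes_within_region + bytes_to_client
    # Decimal arithmetic avoids binary floating-point noise in money values.
    return Decimal(total_bytes) / BYTES_PER_GB * rate_per_gb


estimate = data_returned_cost(3_000_000_000, 2_000_000_000)
```

With 3 GB transferred within the region and 2 GB returned to the client, `estimate` equals `Decimal("0.05")`, that is, five cents.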