Create One Data Lake
Groups and projects are synonymous terms. Your {GROUP-ID} is the same as your project ID. For existing groups, your group/project ID remains the same. The resource and corresponding endpoints use the term groups.
The Atlas API uses HTTP Digest Authentication. Provide a programmatic API public key and corresponding private key as the username and password when constructing the HTTP request.
For complete documentation on configuring API access for an Atlas project, see Configure Atlas API Access.
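As a rough illustration of the authentication step, the following sketch sets up HTTP Digest Authentication with Python's standard `urllib.request` module. The helper name `make_digest_opener`, the `ATLAS_HOST` constant, and any key values passed to it are assumptions made for this example; they are not part of the Atlas API.

```python
import urllib.request

# Host that serves the Atlas API; credentials are registered for this URL prefix.
ATLAS_HOST = "https://cloud.mongodb.com/"


def make_digest_opener(public_key: str, private_key: str) -> urllib.request.OpenerDirector:
    """Return an opener that answers Digest challenges with an API key pair.

    Hypothetical helper: the public key acts as the username and the
    private key as the password.
    """
    password_manager = urllib.request.HTTPPasswordMgrWithDefaultRealm()
    # realm=None applies the credentials to whatever realm the server announces.
    password_manager.add_password(None, ATLAS_HOST, public_key, private_key)
    digest_handler = urllib.request.HTTPDigestAuthHandler(password_manager)
    return urllib.request.build_opener(digest_handler)
```

The returned opener can then be used in place of `urllib.request.urlopen` for requests to the Atlas API.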
Base URL
https://cloud.mongodb.com/api/atlas/v1.0
Usage
Use this endpoint to create an Atlas Data Lake associated with an Atlas project. To create a Data Lake, specify a name for your Data Lake, the unique identifier of the role that Data Lake can use to access your AWS data store, and the S3 bucket where the data is stored.
Syntax
POST /groups/{GROUP-ID}/dataLakes
Request Path Parameters
Path Element | Required/Optional | Description |
---|---|---|
GROUP-ID | Required | The unique identifier for the project. |
Request Query Parameters
The following query parameters are optional:
Query Parameter | Type | Description | Default |
---|---|---|---|
pretty | boolean | Displays response in a prettyprint format. | false |
envelope | boolean | Specifies whether or not to wrap the response in an envelope. | false |
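The path and query parameters above can be combined into a request URL along these lines. This is a minimal sketch: the function name `build_data_lakes_url` is hypothetical, and it assumes that omitting a query parameter is equivalent to sending its default value of `false`.

```python
from urllib.parse import quote, urlencode

BASE_URL = "https://cloud.mongodb.com/api/atlas/v1.0"


def build_data_lakes_url(group_id: str, pretty: bool = False, envelope: bool = False) -> str:
    """Build the POST target for a project, adding only non-default query flags."""
    # Percent-encode the project ID so it is always a single path segment.
    url = f"{BASE_URL}/groups/{quote(group_id, safe='')}/dataLakes"
    params = {}
    if pretty:
        params["pretty"] = "true"
    if envelope:
        params["envelope"] = "true"
    return f"{url}?{urlencode(params)}" if params else url
```

For example, `build_data_lakes_url("{GROUP-ID}", pretty=True)` would be called with a real project ID in place of the placeholder.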
Request Body Parameters
Field | Required/Optional | Description |
---|---|---|
name | Required | Name of the Atlas Data Lake. |
cloudProviderConfig | Optional | Configuration information related to the cloud service where Atlas Data Lake source data is stored. |
cloudProviderConfig.<provider> | Required | Name of the provider of the cloud service where Data Lake can access the S3 bucket. Atlas Data Lake supports only aws. Required if specifying cloudProviderConfig. |
cloudProviderConfig.aws.roleId | Required | Unique identifier of the role that Data Lake can use to access the data stores. If necessary, use the Atlas API to retrieve the role ID. You must also specify the testS3Bucket. Required if specifying cloudProviderConfig. |
cloudProviderConfig.aws.testS3Bucket | Required | Name of the S3 data bucket that the provided role ID is authorized to access. You must also specify the roleId. Required if specifying cloudProviderConfig. |
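The request body rules in the table can be sketched as a small builder. The function name `build_create_body` is hypothetical. The check that `roleId` and `testS3Bucket` must be supplied together is an assumption drawn from the table's "You must also specify" notes, not a confirmed server-side rule.

```python
import json
from typing import Optional


def build_create_body(name: str,
                      role_id: Optional[str] = None,
                      test_s3_bucket: Optional[str] = None) -> dict:
    """Assemble the JSON body for the create request as a plain dict."""
    if not name:
        raise ValueError("name is required")
    # Assumption: the two AWS fields are only valid as a pair.
    if (role_id is None) != (test_s3_bucket is None):
        raise ValueError("roleId and testS3Bucket must be specified together")
    body = {"name": name}
    if role_id is not None:
        # cloudProviderConfig is optional; include it only when AWS details are given.
        body["cloudProviderConfig"] = {
            "aws": {"roleId": role_id, "testS3Bucket": test_s3_bucket}
        }
    return body


if __name__ == "__main__":
    # Values taken from the example request on this page.
    print(json.dumps(build_create_body("UserMetricData",
                                       "1a234bcd5e67f89a12b345c6",
                                       "user-metric-data-bucket"), indent=2))
```

Validating the pair on the client side only saves a round trip; the API remains the authority on what it accepts.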
Response
Name | Type | Description |
---|---|---|
cloudProviderConfig | object | Configuration information related to the cloud service where Atlas Data Lake source data is stored. |
cloudProviderConfig.<provider> | object | Name of the provider of the cloud service where Data Lake can access the S3 bucket data stores. Data Lake only supports aws. |
cloudProviderConfig.aws.externalId | string | Unique identifier associated with the IAM Role that Data Lake assumes when accessing the data stores. |
cloudProviderConfig.aws.iamAssumedRoleARN | string | Amazon Resource Name (ARN) of the IAM Role that Data Lake assumes when accessing S3 Bucket data stores. The IAM Role must support the following actions against each S3 bucket: For more information on S3 actions, see Actions, Resources, and Condition Keys for Amazon S3. |
cloudProviderConfig.aws.iamUserARN | string | Amazon Resource Name (ARN) of the user that Data Lake assumes when accessing S3 Bucket data stores. |
cloudProviderConfig.aws.roleId | string | Unique identifier of the role that Data Lake uses to access the data stores. |
dataProcessRegion | object | The cloud provider region to which Atlas Data Lake routes client connections for data processing. |
dataProcessRegion.cloudProvider | string | Name of the cloud service provider. Atlas Data Lake only supports AWS. |
dataProcessRegion.region | string | Name of the region to which Atlas Data Lake routes client connections for data processing. Atlas Data Lake only supports the following regions: |
groupId | string | The unique identifier for the project. |
hostnames | array | The list of hostnames assigned to the Atlas Data Lake. Each string in the array is a hostname assigned to the Atlas Data Lake. |
name | string | Name of the Atlas Data Lake. |
state | string | Current state of the Atlas Data Lake: |
storage | object | Configuration details for each data store and its mapping to MongoDB database(s) and collection(s). |
storage.databases | object | Configuration details for mapping each data store to queryable databases and collections. For complete documentation on this object and its nested fields, see the Data Lake configuration documentation. An empty object indicates that the Data Lake has no mapping configuration for any data store. |
storage.stores | array | Each object in the array represents a data store. An empty object indicates that the Data Lake has no configured data stores. |
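To make the response fields easier to work with, a client might reduce the response document to the handful of values it needs. The sketch below is one way to do that; the `DataLakeSummary` class and `summarize_data_lake` function are hypothetical names. It assumes `dataProcessRegion` may be `null` and that the storage collections may be empty.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DataLakeSummary:
    name: str
    state: str
    group_id: str
    hostnames: List[str]
    role_id: Optional[str]
    region: Optional[str]
    store_count: int


def summarize_data_lake(response: dict) -> DataLakeSummary:
    """Pick out commonly used fields, tolerating null or missing sub-objects."""
    aws = (response.get("cloudProviderConfig") or {}).get("aws") or {}
    region_info = response.get("dataProcessRegion") or {}  # may be null
    storage = response.get("storage") or {}
    return DataLakeSummary(
        name=response["name"],
        state=response["state"],
        group_id=response["groupId"],
        hostnames=list(response.get("hostnames") or []),
        role_id=aws.get("roleId"),
        region=region_info.get("region"),
        store_count=len(storage.get("stores") or []),
    )
```

A summary with `store_count == 0` corresponds to a Data Lake that has no configured data stores yet.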
Example
    curl -u "{PUBLIC-KEY}:{PRIVATE-KEY}" --digest \
      --header "Accept: application/json" \
      --header "Content-Type: application/json" \
      --request POST "https://cloud.mongodb.com/api/atlas/v1.0/groups/{GROUP-ID}/dataLakes?pretty=true" \
      --data '{
        "name" : "UserMetricData",
        "cloudProviderConfig" : {
          "aws" : {
            "roleId" : "1a234bcd5e67f89a12b345c6",
            "testS3Bucket" : "user-metric-data-bucket"
          }
        }
      }'
The preceding request returns the following:
    {
      "cloudProviderConfig": {
        "aws": {
          "externalId": "12a3bc45-de6f-7890-12gh-3i45jklm6n7o",
          "iamAssumedRoleARN": "arn:aws:iam::123456789012:role/ReadS3BucketRole",
          "iamUserARN": "arn:aws:iam::1234567890123:root",
          "roleId": "1a234bcd5e67f89a12b345c6"
        }
      },
      "dataProcessRegion": null,
      "groupId": "1ab23c4567def890gh12ij34",
      "hostnames": [
        "hardwaremetricdata.mongodb.example.net"
      ],
      "name": "UserMetricData",
      "state": "ACTIVE",
      "storage": {
        "databases": [],
        "stores": []
      }
    }
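For readers who prefer a script to curl, the same request could be issued from Python roughly as follows. This is a sketch using only the standard library. `build_create_request` and `create_data_lake` are hypothetical names, and the `opener` argument is assumed to be an `urllib.request.OpenerDirector` already configured with an `HTTPDigestAuthHandler` holding the API key pair.

```python
import json
import urllib.request

BASE_URL = "https://cloud.mongodb.com/api/atlas/v1.0"


def build_create_request(group_id: str, body: dict) -> urllib.request.Request:
    """Build the POST request without sending it."""
    return urllib.request.Request(
        f"{BASE_URL}/groups/{group_id}/dataLakes",
        data=json.dumps(body).encode("utf-8"),
        headers={"Accept": "application/json", "Content-Type": "application/json"},
        method="POST",
    )


def create_data_lake(opener, group_id: str, body: dict, timeout: float = 30.0) -> dict:
    """Send the request through a digest-capable opener and decode the JSON reply."""
    request = build_create_request(group_id, body)
    # HTTP error statuses surface as urllib.error.HTTPError from opener.open().
    with opener.open(request, timeout=timeout) as response:
        return json.load(response)
```

Keeping request construction separate from sending makes it possible to inspect the method, URL, and headers before any network call is made.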