Create One Federated Database Instance in One Project

POST /api/atlas/v2/groups/{groupId}/dataFederation

Creates one federated database instance in the specified project. To use this resource, the requesting Service Account or API Key must have the Project Owner or Project Charts Admin role.

Path parameters

  • groupId string Required

    Unique 24-hexadecimal digit string that identifies your project. Use the /groups endpoint to retrieve all projects to which the authenticated user has access.

    NOTE: Groups and projects are synonymous terms. Your group id is the same as your project id. For existing groups, your group/project id remains the same. The resource and corresponding endpoints use the term groups.

    Format should match the following pattern: ^([a-f0-9]{24})$.
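
A client can check a project ID against this pattern before sending the request. The following is a minimal sketch in Go; the helper name isValidGroupID is hypothetical and is not part of any Atlas SDK.

```go
package main

import (
	"fmt"
	"regexp"
)

// groupIDPattern is the pattern quoted in this reference for groupId.
var groupIDPattern = regexp.MustCompile(`^([a-f0-9]{24})$`)

// isValidGroupID (hypothetical helper) reports whether id consists of
// exactly 24 lowercase hexadecimal characters.
func isValidGroupID(id string) bool {
	return groupIDPattern.MatchString(id)
}

func main() {
	fmt.Println(isValidGroupID("32b6e34b3d91647abb20e7b8")) // true
	fmt.Println(isValidGroupID("32B6E34B3D91647ABB20E7B8")) // false: uppercase is outside the pattern
	fmt.Println(isValidGroupID("abc123"))                   // false: too short
}
```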

Query parameters

  • envelope boolean

    Flag that indicates whether Application wraps the response in an envelope JSON object. Some API clients cannot access the HTTP response headers or status code. To remediate this, set envelope=true in the query. Endpoints that return a list of results use the results object as an envelope. Application adds the status parameter to the response body.

    Default value is false.

  • pretty boolean

    Flag that indicates whether the response body should be in the prettyprint format.

    Default value is false.

  • skipRoleValidation boolean

    Flag that indicates whether this request should check if the requesting IAM role can read from the S3 bucket. AWS checks if the role can list the objects in the bucket before writing to it. Some IAM roles only need write permissions. This flag allows you to skip that check.

    Default value is false.
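
As an illustration of how the three query parameters attach to the endpoint, here is a sketch that assembles the request URL with Go's standard library. The function name buildCreateURL is hypothetical, and the host is assumed to be the cloud.mongodb.com host used in the curl samples of this reference.

```go
package main

import (
	"fmt"
	"net/url"
	"strconv"
)

// buildCreateURL (hypothetical helper) returns the endpoint URL for groupID
// with the three optional query flags set explicitly.
func buildCreateURL(groupID string, envelope, pretty, skipRoleValidation bool) string {
	u := url.URL{
		Scheme: "https",
		Host:   "cloud.mongodb.com", // assumption: host taken from the curl samples
		Path:   "/api/atlas/v2/groups/" + groupID + "/dataFederation",
	}
	q := url.Values{}
	q.Set("envelope", strconv.FormatBool(envelope))
	q.Set("pretty", strconv.FormatBool(pretty))
	q.Set("skipRoleValidation", strconv.FormatBool(skipRoleValidation))
	u.RawQuery = q.Encode() // Encode sorts the keys alphabetically
	return u.String()
}

func main() {
	fmt.Println(buildCreateURL("32b6e34b3d91647abb20e7b8", true, false, true))
}
```

All three flags default to false, so a caller could equally omit any flag it does not need to change.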

application/vnd.atlas.2023-01-01+json

Body Required

Details to create one federated database instance in the specified project.

  • cloudProviderConfig object

    Cloud provider where this Federated Database Instance is hosted.

    • aws object

      Configuration for running Data Federation in AWS.

      • roleId string Required

        Unique identifier of the role that the data lake can use to access the data stores. Required if specifying cloudProviderConfig.

        Format should match the following pattern: ^([a-f0-9]{24})$.

      • testS3Bucket string Required

        Name of the S3 data bucket that the provided role ID is authorized to access. Required if specifying cloudProviderConfig.

    • azure object

      Configuration for running Data Federation in Azure.

      • roleId string Required

        Unique identifier of the role that Data Federation can use to access the data stores. Required if specifying cloudProviderConfig.

        Format should match the following pattern: ^([a-f0-9]{24})$.

    • gcp object

      Configuration for running Data Federation in GCP.

      • roleId string Required

        Unique identifier of the role that Data Federation can use to access the data stores. Required if specifying cloudProviderConfig.

        Format should match the following pattern: ^([a-f0-9]{24})$.

  • dataProcessRegion object

    Information about the cloud provider region to which the Federated Database Instance routes client connections.

    • cloudProvider string Required

      Name of the cloud service that hosts the Federated Database Instance's infrastructure.

      Values are AWS, AZURE, or GCP.

    • region string Required

      Name of the region to which the data lake routes client connections.

      One of:

      Atlas Data Federation AWS Regions.

      Values are SYDNEY_AUS, MUMBAI_IND, FRANKFURT_DEU, DUBLIN_IRL, LONDON_GBR, VIRGINIA_USA, OREGON_USA, SAOPAULO_BRA, MONTREAL_CAN, TOKYO_JPN, SEOUL_KOR, or SINGAPORE_SGP.

      Atlas Data Federation Azure Regions.

      Values are VIRGINIA_USA, AMSTERDAM_NLD, or SAOPAULO_BRA.

      Atlas Data Federation GCP Regions.

      Values are IOWA_USA or BELGIUM_EU.

  • name string

    Human-readable label that identifies the Federated Database Instance.

  • storage object

    Configuration information for each data store and its mapping to MongoDB Cloud databases.

    • databases array[object]

      Array that contains the queryable databases and collections for this data lake.

      Database associated with this data lake. Databases contain collections and views.

      • collections array[object]

        Array of collections and data sources that map to a data store in the stores array.

        A collection and its data sources that map to a data store in the stores array.

        • dataSources array[object]

          Array that contains the data stores that map to a collection for this data lake.

          Data store that maps to a collection for this data lake.

          • allowInsecure boolean

            Flag that validates the scheme in the specified URLs. If true, allows insecure HTTP scheme, doesn't verify the server's certificate chain and hostname, and accepts any certificate with any hostname presented by the server. If false, allows secure HTTPS scheme only.

            Default value is false.

          • collection string

            Human-readable label that identifies the collection in the database. For creating a wildcard (*) collection, you must omit this parameter.

          • collectionRegex string

            Regex pattern to use for creating the wildcard (*) collection. To learn more about the regex syntax, see Go programming language.

          • database string

            Human-readable label that identifies the database, which contains the collection in the cluster. You must omit this parameter to generate wildcard (*) collections for dynamically generated databases.

          • databaseRegex string

            Regex pattern to use for creating the wildcard (*) database. To learn more about the regex syntax, see Go programming language.

          • datasetName string

            Human-readable label that identifies the dataset that Atlas generates for an ingestion pipeline run or Online Archive.

          • datasetPrefix string

            Human-readable label that matches against the dataset names for ingestion pipeline runs or Online Archives.

          • defaultFormat string

            File format that MongoDB Cloud uses if it encounters a file without a file extension while searching storeName.

            Values are .avro, .avro.bz2, .avro.gz, .bson, .bson.bz2, .bson.gz, .bsonx, .csv, .csv.bz2, .csv.gz, .json, .json.bz2, .json.gz, .orc, .parquet, .tsv, .tsv.bz2, or .tsv.gz.

          • path string

            File path that controls how MongoDB Cloud searches for and parses files in the storeName before mapping them to a collection. Specify / to capture all files and folders from the prefix path.

          • provenanceFieldName string

            Name for the field that includes the provenance of the documents in the results. MongoDB Cloud returns different fields in the results for each supported provider.

          • storeName string

            Human-readable label that identifies the data store that MongoDB Cloud maps to the collection.

          • trimLevel integer(int32)

            Unsigned integer that specifies how many fields of the dataset name to trim from the left of the dataset name before mapping the remaining fields to a wildcard collection name.

          • urls array[string]

            URLs of the publicly accessible data files. You can't specify URLs that require authentication. Atlas Data Lake creates a partition for each URL. If empty or omitted, Data Lake uses the URLs from the store specified in the dataSources.storeName parameter.

        • name string

          Human-readable label that identifies the collection to which MongoDB Cloud maps the data in the data stores.

      • maxWildcardCollections integer(int32)

        Maximum number of wildcard collections in the database. This only applies to S3 data sources.

        Minimum value is 1, maximum value is 1000. Default value is 100.

      • name string

        Human-readable label that identifies the database to which the data lake maps data.

      • views array[object]

        Array of aggregation pipelines that apply to the collection. This only applies to S3 data sources.

        An aggregation pipeline that applies to the collection.

        • name string

          Human-readable label that identifies the view, which corresponds to an aggregation pipeline on a collection.

        • pipeline string

          Aggregation pipeline stages to apply to the source collection.

        • source string

          Human-readable label that identifies the source collection for the view.

    • stores array[object]

      Array that contains the data stores for the data lake.

      One of:

Responses

  • 200 application/vnd.atlas.2023-01-01+json

    OK

    • cloudProviderConfig object

      Cloud provider where this Federated Database Instance is hosted.

      • aws object

        Configuration for running Data Federation in AWS.

        • externalId string

          Unique identifier associated with the Identity and Access Management (IAM) role that the data lake assumes when accessing the data stores.

        • iamAssumedRoleARN string

          Amazon Resource Name (ARN) of the Identity and Access Management (IAM) role that the data lake assumes when accessing data stores.

          Minimum length is 20, maximum length is 2048.

        • iamUserARN string

          Amazon Resource Name (ARN) of the user that the data lake assumes when accessing data stores.

        • roleId string Required

          Unique identifier of the role that the data lake can use to access the data stores. Required if specifying cloudProviderConfig.

          Format should match the following pattern: ^([a-f0-9]{24})$.

      • azure object

        Configuration for running Data Federation in Azure.

        • atlasAppId string

          The App ID generated by Atlas for the Service Principal's access policy.

        • roleId string Required

          Unique identifier of the role that Data Federation can use to access the data stores. Required if specifying cloudProviderConfig.

          Format should match the following pattern: ^([a-f0-9]{24})$.

        • servicePrincipalId string

          The ID of the Service Principal for which there is an access policy for Atlas to access Azure resources.

        • tenantId string

          The Azure Active Directory / Entra ID tenant ID associated with the Service Principal.

      • gcp object

        Configuration for running Data Federation in GCP.

        • gcpServiceAccount string

          The email address of the Google Cloud Platform (GCP) service account created by Atlas which should be authorized to allow Atlas to access Google Cloud Storage.

        • roleId string Required

          Unique identifier of the role that Data Federation can use to access the data stores. Required if specifying cloudProviderConfig.

          Format should match the following pattern: ^([a-f0-9]{24})$.

    • dataProcessRegion object

      Information about the cloud provider region to which the Federated Database Instance routes client connections.

      • cloudProvider string Required

        Name of the cloud service that hosts the Federated Database Instance's infrastructure.

        Values are AWS, AZURE, or GCP.

      • region string Required

        Name of the region to which the data lake routes client connections.

        One of:

        Atlas Data Federation AWS Regions.

        Values are SYDNEY_AUS, MUMBAI_IND, FRANKFURT_DEU, DUBLIN_IRL, LONDON_GBR, VIRGINIA_USA, OREGON_USA, SAOPAULO_BRA, MONTREAL_CAN, TOKYO_JPN, SEOUL_KOR, or SINGAPORE_SGP.

        Atlas Data Federation Azure Regions.

        Values are VIRGINIA_USA, AMSTERDAM_NLD, or SAOPAULO_BRA.

        Atlas Data Federation GCP Regions.

        Values are IOWA_USA or BELGIUM_EU.

    • groupId string

      Unique 24-hexadecimal character string that identifies the project.

      Format should match the following pattern: ^([a-f0-9]{24})$.

    • hostnames array[string]

      List that contains the hostnames assigned to the Federated Database Instance.

    • name string

      Human-readable label that identifies the Federated Database Instance.

    • privateEndpointHostnames array[object]

      List that contains the sets of private endpoints and hostnames.

      Set of private endpoint and hostnames.

      • hostname string

        Human-readable label that identifies the hostname.

      • privateEndpoint string

        Human-readable label that identifies the private endpoint.

    • state string

      Label that indicates the status of the Federated Database Instance.

      Values are UNVERIFIED, ACTIVE, or DELETED.

    • storage object

      Configuration information for each data store and its mapping to MongoDB Cloud databases.

      • databases array[object]

        Array that contains the queryable databases and collections for this data lake.

        Database associated with this data lake. Databases contain collections and views.

        • collections array[object]

          Array of collections and data sources that map to a data store in the stores array.

          A collection and its data sources that map to a data store in the stores array.

          • dataSources array[object]

            Array that contains the data stores that map to a collection for this data lake.

            Data store that maps to a collection for this data lake.

            • allowInsecure boolean

              Flag that validates the scheme in the specified URLs. If true, allows insecure HTTP scheme, doesn't verify the server's certificate chain and hostname, and accepts any certificate with any hostname presented by the server. If false, allows secure HTTPS scheme only.

              Default value is false.

            • collection string

              Human-readable label that identifies the collection in the database. For creating a wildcard (*) collection, you must omit this parameter.

            • collectionRegex string

              Regex pattern to use for creating the wildcard (*) collection. To learn more about the regex syntax, see Go programming language.

            • database string

              Human-readable label that identifies the database, which contains the collection in the cluster. You must omit this parameter to generate wildcard (*) collections for dynamically generated databases.

            • databaseRegex string

              Regex pattern to use for creating the wildcard (*) database. To learn more about the regex syntax, see Go programming language.

            • datasetName string

              Human-readable label that identifies the dataset that Atlas generates for an ingestion pipeline run or Online Archive.

            • datasetPrefix string

              Human-readable label that matches against the dataset names for ingestion pipeline runs or Online Archives.

            • defaultFormat string

              File format that MongoDB Cloud uses if it encounters a file without a file extension while searching storeName.

              Values are .avro, .avro.bz2, .avro.gz, .bson, .bson.bz2, .bson.gz, .bsonx, .csv, .csv.bz2, .csv.gz, .json, .json.bz2, .json.gz, .orc, .parquet, .tsv, .tsv.bz2, or .tsv.gz.

            • path string

              File path that controls how MongoDB Cloud searches for and parses files in the storeName before mapping them to a collection. Specify / to capture all files and folders from the prefix path.

            • provenanceFieldName string

              Name for the field that includes the provenance of the documents in the results. MongoDB Cloud returns different fields in the results for each supported provider.

            • storeName string

              Human-readable label that identifies the data store that MongoDB Cloud maps to the collection.

            • trimLevel integer(int32)

              Unsigned integer that specifies how many fields of the dataset name to trim from the left of the dataset name before mapping the remaining fields to a wildcard collection name.

            • urls array[string]

              URLs of the publicly accessible data files. You can't specify URLs that require authentication. Atlas Data Lake creates a partition for each URL. If empty or omitted, Data Lake uses the URLs from the store specified in the dataSources.storeName parameter.

          • name string

            Human-readable label that identifies the collection to which MongoDB Cloud maps the data in the data stores.

        • maxWildcardCollections integer(int32)

          Maximum number of wildcard collections in the database. This only applies to S3 data sources.

          Minimum value is 1, maximum value is 1000. Default value is 100.

        • name string

          Human-readable label that identifies the database to which the data lake maps data.

        • views array[object]

          Array of aggregation pipelines that apply to the collection. This only applies to S3 data sources.

          An aggregation pipeline that applies to the collection.

          • name string

            Human-readable label that identifies the view, which corresponds to an aggregation pipeline on a collection.

          • pipeline string

            Aggregation pipeline stages to apply to the source collection.

          • source string

            Human-readable label that identifies the source collection for the view.

      • stores array[object]

        Array that contains the data stores for the data lake.

        One of:
  • 400 application/json

    Bad Request.

    • badRequestDetail object

      Bad request detail.

      • fields array[object]

        Describes all violations in a client request.

        • description string Required

          A description of why the request element is bad.

        • field string Required

          A path that leads to a field in the request body.

    • detail string

      Describes the specific conditions or reasons that cause each type of error.

    • error integer(int32) Required

      HTTP status code returned with this error.

    • errorCode string Required

      Application error code returned with this error.

    • parameters array[object]

      Parameters used to give more information about the error.

    • reason string

      Application error message returned with this error.

  • 401 application/json

    Unauthorized.

    • badRequestDetail object

      Bad request detail.

      • fields array[object]

        Describes all violations in a client request.

        • description string Required

          A description of why the request element is bad.

        • field string Required

          A path that leads to a field in the request body.

    • detail string

      Describes the specific conditions or reasons that cause each type of error.

    • error integer(int32) Required

      HTTP status code returned with this error.

    • errorCode string Required

      Application error code returned with this error.

    • parameters array[object]

      Parameters used to give more information about the error.

    • reason string

      Application error message returned with this error.

  • 403 application/json

    Forbidden.

    • badRequestDetail object

      Bad request detail.

      • fields array[object]

        Describes all violations in a client request.

        • description string Required

          A description of why the request element is bad.

        • field string Required

          A path that leads to a field in the request body.

    • detail string

      Describes the specific conditions or reasons that cause each type of error.

    • error integer(int32) Required

      HTTP status code returned with this error.

    • errorCode string Required

      Application error code returned with this error.

    • parameters array[object]

      Parameters used to give more information about the error.

    • reason string

      Application error message returned with this error.

  • 404 application/json

    Not Found.

    • badRequestDetail object

      Bad request detail.

      • fields array[object]

        Describes all violations in a client request.

        • description string Required

          A description of why the request element is bad.

        • field string Required

          A path that leads to a field in the request body.

    • detail string

      Describes the specific conditions or reasons that cause each type of error.

    • error integer(int32) Required

      HTTP status code returned with this error.

    • errorCode string Required

      Application error code returned with this error.

    • parameters array[object]

      Parameters used to give more information about the error.

    • reason string

      Application error message returned with this error.

  • 500 application/json

    Internal Server Error.

    • badRequestDetail object

      Bad request detail.

      • fields array[object]

        Describes all violations in a client request.

        • description string Required

          A description of why the request element is bad.

        • field string Required

          A path that leads to a field in the request body.

    • detail string

      Describes the specific conditions or reasons that cause each type of error.

    • error integer(int32) Required

      HTTP status code returned with this error.

    • errorCode string Required

      Application error code returned with this error.

    • parameters array[object]

      Parameters used to give more information about the error.

    • reason string

      Application error message returned with this error.

POST /api/atlas/v2/groups/{groupId}/dataFederation
atlas api dataFederation createDataFederation --help
package main

import (
	"context"
	"log"
	"os"

	sdk "go.mongodb.org/atlas-sdk/v20231001001/admin"
)

func main() {
	ctx := context.Background()
	clientID := os.Getenv("MONGODB_ATLAS_CLIENT_ID")
	clientSecret := os.Getenv("MONGODB_ATLAS_CLIENT_SECRET")

	// See https://dochub.mongodb.org/core/atlas-go-sdk-oauth
	client, err := sdk.NewClient(sdk.UseOAuthAuth(clientID, clientSecret))
	if err != nil {
		log.Fatalf("Error: %v", err)
	}

	// Populate params with the project ID and request body before sending.
	params := &sdk.CreateGroupDataFederationApiParams{}
	sdkResp, httpResp, err := client.DataFederationApi.
		CreateGroupDataFederationWithParams(ctx, params).
		Execute()
	if err != nil {
		log.Fatalf("Error: %v", err)
	}
	log.Printf("Status: %s, response: %+v", httpResp.Status, sdkResp)
}
curl --include --header "Authorization: Bearer ${ACCESS_TOKEN}" \
  --header "Accept: application/vnd.atlas.2023-10-01+json" \
  --header "Content-Type: application/json" \
  -X POST "https://cloud.mongodb.com/api/atlas/v2/groups/{groupId}/dataFederation" \
  -d '{ <Payload> }'
curl --user "${PUBLIC_KEY}:${PRIVATE_KEY}" \
  --digest --include \
  --header "Accept: application/vnd.atlas.2023-10-01+json" \
  --header "Content-Type: application/json" \
  -X POST "https://cloud.mongodb.com/api/atlas/v2/groups/{groupId}/dataFederation" \
  -d '{ <Payload> }'
Request examples
{
  "cloudProviderConfig": {
    "aws": {
      "roleId": "32b6e34b3d91647abb20e7b8",
      "testS3Bucket": "string"
    },
    "azure": {
      "roleId": "32b6e34b3d91647abb20e7b8"
    },
    "gcp": {
      "roleId": "32b6e34b3d91647abb20e7b8"
    }
  },
  "dataProcessRegion": {
    "cloudProvider": "AWS",
    "region": "SYDNEY_AUS"
  },
  "name": "string",
  "storage": {
    "databases": [
      {
        "collections": [
          {
            "dataSources": [
              {
                "allowInsecure": false,
                "collection": "string",
                "collectionRegex": "string",
                "database": "string",
                "databaseRegex": "string",
                "datasetName": "v1$atlas$snapshot$Cluster0$myDatabase$myCollection$19700101T000000Z",
                "datasetPrefix": "string",
                "defaultFormat": ".avro",
                "path": "string",
                "provenanceFieldName": "string",
                "storeName": "string",
                "trimLevel": 42,
                "urls": [
                  "string"
                ]
              }
            ],
            "name": "string"
          }
        ],
        "maxWildcardCollections": 100,
        "name": "string",
        "views": [
          {
            "name": "string",
            "pipeline": "string",
            "source": "string"
          }
        ]
      }
    ],
    "stores": [
      {
        "name": "string",
        "provider": "s3",
        "additionalStorageClasses": [
          "STANDARD"
        ],
        "bucket": "string",
        "delimiter": "string",
        "includeTags": false,
        "prefix": "string",
        "public": false,
        "region": "US_GOV_WEST_1"
      }
    ]
  }
}
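
The request example above lists every field. As a smaller illustration, the sketch below builds a body with only name and dataProcessRegion and checks the region against the per-provider values listed in this reference. The type and function names (CreateRequest, validateRegion, buildBody) are hypothetical, the instance name is a placeholder, and this reference does not state whether the API accepts a body this small.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// DataProcessRegion mirrors the dataProcessRegion object of the request body.
type DataProcessRegion struct {
	CloudProvider string `json:"cloudProvider"`
	Region        string `json:"region"`
}

// CreateRequest is a hypothetical, partial model of the request body.
type CreateRequest struct {
	DataProcessRegion *DataProcessRegion `json:"dataProcessRegion,omitempty"`
	Name              string             `json:"name,omitempty"`
}

// allowedRegions copies the region values listed per provider in this reference.
var allowedRegions = map[string][]string{
	"AWS": {"SYDNEY_AUS", "MUMBAI_IND", "FRANKFURT_DEU", "DUBLIN_IRL",
		"LONDON_GBR", "VIRGINIA_USA", "OREGON_USA", "SAOPAULO_BRA",
		"MONTREAL_CAN", "TOKYO_JPN", "SEOUL_KOR", "SINGAPORE_SGP"},
	"AZURE": {"VIRGINIA_USA", "AMSTERDAM_NLD", "SAOPAULO_BRA"},
	"GCP":   {"IOWA_USA", "BELGIUM_EU"},
}

// validateRegion returns an error when region is not listed for provider.
func validateRegion(provider, region string) error {
	regions, ok := allowedRegions[provider]
	if !ok {
		return fmt.Errorf("unknown cloud provider %q", provider)
	}
	for _, r := range regions {
		if r == region {
			return nil
		}
	}
	return fmt.Errorf("region %q is not listed for provider %q", region, provider)
}

// buildBody validates the region, then marshals the partial request body.
func buildBody(name, provider, region string) ([]byte, error) {
	if err := validateRegion(provider, region); err != nil {
		return nil, err
	}
	return json.Marshal(CreateRequest{
		DataProcessRegion: &DataProcessRegion{CloudProvider: provider, Region: region},
		Name:              name,
	})
}

func main() {
	body, err := buildBody("myFederatedInstance", "AWS", "SYDNEY_AUS")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(string(body))
}
```
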
Response examples (200)
{
  "cloudProviderConfig": {
    "aws": {
      "externalId": "string",
      "iamAssumedRoleARN": "arn:aws:iam::123456789012:root",
      "iamUserARN": "string",
      "roleId": "32b6e34b3d91647abb20e7b8"
    },
    "azure": {
      "atlasAppId": "string",
      "roleId": "32b6e34b3d91647abb20e7b8",
      "servicePrincipalId": "string",
      "tenantId": "string"
    },
    "gcp": {
      "gcpServiceAccount": "string",
      "roleId": "32b6e34b3d91647abb20e7b8"
    }
  },
  "dataProcessRegion": {
    "cloudProvider": "AWS",
    "region": "SYDNEY_AUS"
  },
  "groupId": "32b6e34b3d91647abb20e7b8",
  "hostnames": [
    "string"
  ],
  "name": "string",
  "privateEndpointHostnames": [
    {
      "hostname": "string",
      "privateEndpoint": "string"
    }
  ],
  "state": "UNVERIFIED",
  "storage": {
    "databases": [
      {
        "collections": [
          {
            "dataSources": [
              {
                "allowInsecure": false,
                "collection": "string",
                "collectionRegex": "string",
                "database": "string",
                "databaseRegex": "string",
                "datasetName": "v1$atlas$snapshot$Cluster0$myDatabase$myCollection$19700101T000000Z",
                "datasetPrefix": "string",
                "defaultFormat": ".avro",
                "path": "string",
                "provenanceFieldName": "string",
                "storeName": "string",
                "trimLevel": 42,
                "urls": [
                  "string"
                ]
              }
            ],
            "name": "string"
          }
        ],
        "maxWildcardCollections": 100,
        "name": "string",
        "views": [
          {
            "name": "string",
            "pipeline": "string",
            "source": "string"
          }
        ]
      }
    ],
    "stores": [
      {
        "name": "string",
        "provider": "s3",
        "additionalStorageClasses": [
          "STANDARD"
        ],
        "bucket": "string",
        "delimiter": "string",
        "includeTags": false,
        "prefix": "string",
        "public": false,
        "region": "US_GOV_WEST_1"
      }
    ]
  }
}
Response examples (400)
{
  "error": 400,
  "detail": "(This is just an example, the exception may not be related to this endpoint) No provider AWS exists.",
  "reason": "Bad Request",
  "errorCode": "VALIDATION_ERROR"
}
Response examples (401)
{
  "error": 401,
  "detail": "(This is just an example, the exception may not be related to this endpoint)",
  "reason": "Unauthorized",
  "errorCode": "NOT_ORG_GROUP_CREATOR"
}
Response examples (403)
{
  "error": 403,
  "detail": "(This is just an example, the exception may not be related to this endpoint)",
  "reason": "Forbidden",
  "errorCode": "CANNOT_CHANGE_GROUP_NAME"
}
Response examples (404)
{
  "error": 404,
  "detail": "(This is just an example, the exception may not be related to this endpoint) Cannot find resource AWS",
  "reason": "Not Found",
  "errorCode": "RESOURCE_NOT_FOUND"
}
Response examples (500)
{
  "error": 500,
  "detail": "(This is just an example, the exception may not be related to this endpoint)",
  "reason": "Internal Server Error",
  "errorCode": "UNEXPECTED_ERROR"
}
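
The error examples above share the same four top-level fields, so a client can decode them into a single type. The following is a minimal sketch; the APIError type and parseAPIError function are hypothetical names and do not come from the Atlas Go SDK.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// APIError is a hypothetical, partial model of the error responses shown above.
type APIError struct {
	Status    int    `json:"error"`
	ErrorCode string `json:"errorCode"`
	Detail    string `json:"detail"`
	Reason    string `json:"reason"`
}

// Error formats the decoded fields so APIError satisfies the error interface.
func (e APIError) Error() string {
	return fmt.Sprintf("%d %s (%s): %s", e.Status, e.Reason, e.ErrorCode, e.Detail)
}

// parseAPIError decodes an error response body into an APIError.
func parseAPIError(data []byte) (APIError, error) {
	var e APIError
	err := json.Unmarshal(data, &e)
	return e, err
}

func main() {
	// Body shaped like the 404 example, with the detail text shortened.
	body := []byte(`{
	  "error": 404,
	  "detail": "Cannot find resource AWS",
	  "reason": "Not Found",
	  "errorCode": "RESOURCE_NOT_FOUND"
	}`)
	apiErr, err := parseAPIError(body)
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Println(apiErr)
}
```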