Atlas Cluster
Overview
Atlas Data Federation supports Atlas clusters as federated database instance stores. You must define mappings in your federated database instance to your Atlas cluster to run queries against your data.
Important
Information in your storage configuration is visible internally at MongoDB and stored as operational data to monitor and improve the performance of Atlas Data Federation. Therefore, we recommend that you don't include personally identifiable information (PII) in your configurations.
Configuration Format
To define a federated database instance store for an Atlas cluster, you can specify the following JSON configuration parameters in your Federated Database Instance Configuration File. The configuration contains the Atlas cluster and maps it to virtual collections that you can query.
The following JSON configuration shows the format of the stores and databases configuration fields, which you must set in your Federated Database Instance configuration file to define an Atlas cluster as a federated database instance store:
{
  "stores" : [
    {
      "name" : "<string>",
      "provider": "<string>",
      "clusterName": "<string>",
      "projectId": "<string>",
      "readPreference": {
        "mode": "<string>",
        "tagSets": [
          [{"name": "<string>", "value": "<string>"},...],
          ...
        ],
        "maxStalenessSeconds": <int>
      }
    }
  ],
  "databases" : [
    {
      "name" : "<string>",
      "collections" : [
        {
          "name" : "<string>",
          "dataSources" : [
            {
              "storeName" : "<string>",
              "database" : "<string>",
              "databaseRegex": "<string>",
              "collection" : "<string>",
              "collectionRegex" : "<string>",
              "provenanceFieldName": "<string>"
            }
          ]
        }
      ],
      "views" : [
        {
          "name" : "<string>",
          "source" : "<string>",
          "pipeline" : "<string>"
        }
      ]
    }
  ]
}
The JSON configuration for an Atlas cluster contains two top-level objects: stores and databases.
stores
The stores object defines each data store associated with the federated database instance. This store captures documents in the Atlas cluster. Federated Database Instances can only access data stores defined in the stores object.
"stores" : [
  {
    "name" : "<string>",
    "provider" : "<string>",
    "clusterName" : "<string>",
    "projectId": "<string>",
    "readPreference": {
      "mode": "<string>",
      "tagSets": [
        [{"name": "<string>", "value": "<string>"},...],
        ...
      ],
      "maxStalenessSeconds": <int>
    },
    "readConcern": {
      "level": "<string>"
    }
  }
]
The following table describes the fields in the stores object:
| Field | Type | Necessity | Description |
|---|---|---|---|
| stores | array | required | Array of objects where each object represents a data store to associate with the federated database instance. The federated database instance store captures documents in the Atlas cluster. Atlas Data Federation can only access data stores defined in the stores object. |
| stores.[n].name | string | required | Name of the federated database instance store. The [...] |
| stores.[n].provider | string | required | Defines where the data is stored. Value must be atlas. |
| stores.[n].clusterName | string | required | Name of the Atlas cluster on which the store is based. The cluster must exist in the same project as your federated database instance. The [...] |
| stores.[n].projectId | string | required | Unique identifier of the project that contains the Atlas cluster on which the store is based. |
| stores.[n].readPreference | object | optional | Cluster read preference, which describes how to route read requests to the cluster. For example: The following [...] |
| stores.[n].readPreference.mode | string | optional | Read preference mode that specifies which replica set member to route the read requests to. Value can be one of the following: [...] If omitted, defaults to [...] |
| stores.[n].readPreference.tagSets | array | optional | Arrays of tag sets or tag specification documents that contain name and value pairs for the replica set member. If specified, Atlas Data Federation routes read requests to the replica set member or members that are associated with the specified tags. To learn more, see Read Preference Tag Sets. IMPORTANT: Atlas Data Federation doesn't support [...] |
| stores.[n].readPreference.maxStalenessSeconds | int | optional | Maximum replication lag, or "staleness", for reads from secondaries. To learn more about [...] |
| stores.[n].readConcern.level | string | optional | Consistency and isolation properties of the data read from an Atlas cluster. To learn more, see Read Concern. The value for the level of consistency and availability can be one of the following: [...] |
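As an illustration of the fields above, the following Python sketch assembles one entry of the stores array and prints it as JSON. The helper make_atlas_store is hypothetical (it is not part of any Atlas tooling), the store, cluster, and project values are borrowed from the example configuration later on this page, and the region tag is an invented placeholder.

```python
import json


def make_atlas_store(name, cluster_name, project_id, read_preference=None):
    """Build one entry for the "stores" array (hypothetical helper, not an Atlas API)."""
    store = {
        "name": name,
        "provider": "atlas",  # provider value used in this page's example configuration
        "clusterName": cluster_name,
        "projectId": project_id,
    }
    if read_preference is not None:
        # readPreference is optional, so add it only when the caller supplies it.
        store["readPreference"] = read_preference
    return store


store = make_atlas_store(
    name="atlasClusterStore",
    cluster_name="myDataCenter",
    project_id="5e2211c17a3e5a48f5497de3",
    read_preference={
        "mode": "secondary",
        "tagSets": [[{"name": "region", "value": "east"}]],  # assumed tag, for illustration only
        "maxStalenessSeconds": 120,
    },
)

print(json.dumps({"stores": [store]}, indent=2))
```

Building the entry in code keeps the required fields in one place; the printed JSON can then be pasted into the stores section of a storage configuration.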
databases
The databases object defines the mapping between each federated database instance store defined in stores and MongoDB collections in the databases.
"databases" : [
  {
    "name" : "<string>",
    "collections" : [
      {
        "name" : "<string>",
        "dataSources" : [
          {
            "storeName" : "<string>",
            "database" : "<string>",
            "databaseRegex": "<string>",
            "collection" : "<string>",
            "collectionRegex" : "<string>",
            "provenanceFieldName": "<string>"
          }
        ]
      }
    ]
  }
]
The following table describes the fields in the databases object:
| Field | Type | Necessity | Description |
|---|---|---|---|
| databases | array | required | Array of objects where each object represents a database, its collections, and, optionally, any views on the collections. Each database can have multiple [...] |
| databases.[n].name | string | required | Name of the database to which Atlas Data Federation maps the data contained in the data store. You can generate databases dynamically by specifying [...] |
| databases.[n].collections | array | required | Array of objects where each object represents a collection and data sources that map to a [...] |
| databases.[n].collections.[n].name | string | required | Name of the collection to which Atlas Data Federation maps the data contained in each [...] You can generate collection names dynamically by specifying [...] For wildcard ( [...] |
| databases.[n].collections.[n].dataSources | array | required | Array of objects where each object represents a [...] |
| databases.[n].collections.[n].dataSources.[n].storeName | string | required | |
| databases.[n].collections.[n].dataSources.[n].database | string | required | Name of the database on the Atlas cluster that contains the collection. You must omit this setting to: [...] |
| databases.[n].collections.[n].dataSources.[n].databaseRegex | string | optional | Regex pattern to use for globbing databases to combine multiple collections. If you specify this option, the federated database instance contains a single database with collections from multiple databases. For globbing databases, you must do the following: [...] Suppose you have 2 databases named [...] If you specify this option, you must specify the name of the collection. You can't specify this option for wildcard collections. |
| databases.[n].collections.[n].dataSources.[n].collection | string | required | Name of the collection in the Atlas cluster on which the federated database instance store is based. You must omit this setting for: [...] |
| databases.[n].collections.[n].dataSources.[n].collectionRegex | string | Optional for wildcard collections; required for combining collections in a database | Regex pattern to use for creating the wildcard ( [...] To use regex patterns for wildcard ( [...] If you specify this field for generating wildcard collections, the federated database instance only contains collections with names that match the specified regular expression. The collections in the federated database instance storage configuration use their original names in the Atlas cluster. To use regex patterns for combining multiple collections in a database, you must do the following: [...] If you specify this field for combining multiple collections, the collection in the federated database instance contains data from all the Atlas collections with names that match the specified regular expression. The collection in the federated database instance storage configuration uses the name that you specify as value for [...] To learn more about the regex syntax, see Go programming language. |
| databases.[n].collections.[n].dataSources.[n].provenanceFieldName | string | required | Name for the field that includes the provenance of the documents in the results. If you specify this setting in the storage configuration, Atlas Data Federation returns the following fields for each document in the result: [...] You can't configure this setting using the Visual Editor in the Atlas UI. |
| databases.[n].views | array | required | Array of objects where each object represents an aggregation pipeline on a collection. To learn more about views, see Views. |
| databases.[n].views.[n].name | string | required | Name of the view. |
| databases.[n].views.[n].source | string | required | Name of the source collection for the view. If you want to create a view with a $sql stage, you must omit this field because the SQL statement specifies the source collection. |
| databases.[n].views.[n].pipeline | string | required | Aggregation pipeline stage(s) to apply to the source collection. |
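Because a federated database instance can only access data stores defined in the stores object, every storeName under dataSources has to match the name of a defined store. The following Python sketch checks a configuration for mismatches before you upload it. The function undefined_store_refs and the misspelled sensors entry are hypothetical illustrations; this page doesn't describe what validation Atlas itself performs.

```python
def undefined_store_refs(config):
    """Return (database, collection, storeName) triples whose storeName is not defined in "stores"."""
    defined = {store["name"] for store in config.get("stores", [])}
    missing = []
    for db in config.get("databases", []):
        for coll in db.get("collections", []):
            for source in coll.get("dataSources", []):
                if source.get("storeName") not in defined:
                    missing.append((db["name"], coll["name"], source.get("storeName")))
    return missing


config = {
    "stores": [
        {"name": "atlasClusterStore", "provider": "atlas",
         "clusterName": "myDataCenter", "projectId": "<project-id>"},
    ],
    "databases": [
        {
            "name": "dataCenter",
            "collections": [
                {"name": "inventory",
                 "dataSources": [{"storeName": "atlasClusterStore",
                                  "database": "metrics", "collection": "hardware"}]},
                # Hypothetical entry with a misspelled store name, to show a failing case.
                {"name": "sensors",
                 "dataSources": [{"storeName": "atlasClusterStor",
                                  "database": "metrics", "collection": "sensors"}]},
            ],
        },
    ],
}

for db_name, coll_name, store_name in undefined_store_refs(config):
    print(f"{db_name}.{coll_name} references undefined store {store_name!r}")
```

The check is deliberately narrow: it only compares names and doesn't attempt to validate regex fields or views.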
Example Configuration for Atlas Data Store
Consider an M10 or higher Atlas cluster named myDataCenter containing data in the metrics.hardware collection. The metrics.hardware collection contains JSON documents with metrics derived from the hardware in a datacenter. The following configuration:
- Specifies the Atlas cluster named myDataCenter in the specified project as a federated database instance store.
- Maps documents from the metrics.hardware collection in the Atlas cluster to the dataCenter.inventory collection in the storage configuration.
{
  "stores" : [
    {
      "name" : "atlasClusterStore",
      "provider" : "atlas",
      "clusterName" : "myDataCenter",
      "projectId" : "5e2211c17a3e5a48f5497de3"
    }
  ],
  "databases" : [
    {
      "name" : "dataCenter",
      "collections" : [
        {
          "name" : "inventory",
          "dataSources" : [
            {
              "storeName" : "atlasClusterStore",
              "database" : "metrics",
              "collection" : "hardware"
            }
          ]
        }
      ]
    }
  ]
}
Atlas Data Federation maps all the documents in the metrics.hardware collection to the dataCenter.inventory collection in the storage configuration. Users connected to the federated database instance can use the MongoDB Query Language and supported aggregations to analyze data in the Atlas cluster through the dataCenter.inventory collection.
When you run queries, the query first goes to Atlas Data Federation. Therefore, if you run aggregation queries that are supported by your Atlas cluster but not by Atlas Data Federation, the queries fail. To learn more about supported and unsupported commands in Data Federation, see the MQL support reference (adf-mql-support).
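To make the mapping in this example concrete, the following Python sketch parses the example configuration and lists which source namespace on the cluster backs each virtual namespace. The function namespace_mappings is a hypothetical helper written for this illustration. It only handles data sources that name both a database and a collection, so regex and wildcard sources are out of scope.

```python
import json

# The example storage configuration from this page.
storage_config = json.loads("""
{
  "stores": [
    {
      "name": "atlasClusterStore",
      "provider": "atlas",
      "clusterName": "myDataCenter",
      "projectId": "5e2211c17a3e5a48f5497de3"
    }
  ],
  "databases": [
    {
      "name": "dataCenter",
      "collections": [
        {
          "name": "inventory",
          "dataSources": [
            {
              "storeName": "atlasClusterStore",
              "database": "metrics",
              "collection": "hardware"
            }
          ]
        }
      ]
    }
  ]
}
""")


def namespace_mappings(config):
    """Yield (virtual namespace, source namespace) pairs for explicitly named sources."""
    for db in config["databases"]:
        for coll in db["collections"]:
            for source in coll["dataSources"]:
                # Skip sources that rely on databaseRegex, collectionRegex, or wildcards.
                if "database" in source and "collection" in source:
                    virtual = f'{db["name"]}.{coll["name"]}'
                    origin = f'{source["database"]}.{source["collection"]}'
                    yield virtual, origin


for virtual, origin in namespace_mappings(storage_config):
    print(f"{origin} -> {virtual}")  # prints "metrics.hardware -> dataCenter.inventory"
```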