Using datalake data source name in Charts

We have numerous databases set up in Atlas, all using the same collection structure. I have a data lake set up that combines those databases so they can be queried together in Charts. Knowing which database a specific document originated from is important for analysis. I would like to use the data lake data source name as a variable, for example as the Series field on a bar chart.

I’ve tried using $collStats in a Charts data source pipeline, with the data lake set up both as one data lake collection per database and as a single data lake collection containing all of one collection type across all databases.

Is this possible?

Hi @Meag_Tessmann ,

It sounds like you might need to add a new field to each Charts data source that identifies its origin. For example:
database1

[{ $addFields : {origin : "database1"}}]

datalake

[{ $addFields : {origin : "datalake"}}]

Will that work for you?

Thanks
Pavel

The original databases are being combined in a data lake, so the Charts data source is a single data lake database. I’m having trouble finding a way to pull the source database names, which are already combined in the data lake, so I can add them within the Charts aggregation pipeline.

Is it possible to use $addFields within the data lake config to take the value of the data source’s database name?

@Meag_Tessmann ,

OK, so I think a possible solution is:

  1. Create a view in Atlas for each database that adds a field with an identifier.
  2. Read the view in the data lake for each database.
  3. Use it in Charts.
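
A minimal sketch of step 1, not a definitive implementation. The database names, the source collection name "orders", and the view name "orders_with_origin" are all hypothetical placeholders, and the helper functions are made up for illustration. The script builds one view definition per database, each with an $addFields stage that stamps the database name into an "origin" field. When run inside mongosh it creates the views with db.getSiblingDB() and createView(); anywhere else it only prints the definitions.

```javascript
// Hypothetical names: replace with your own databases and collection.
const databases = ["database1", "database2", "database3"];
const sourceCollection = "orders";      // assumed source collection name
const viewName = "orders_with_origin";  // assumed view name

// Pipeline for one database: tag every document with its database of origin.
function originPipeline(dbName) {
  return [{ $addFields: { origin: dbName } }];
}

// Describe the view to create in each database.
function viewDefinitions(dbNames) {
  return dbNames.map((dbName) => ({
    database: dbName,
    view: viewName,
    source: sourceCollection,
    pipeline: originPipeline(dbName),
  }));
}

const views = viewDefinitions(databases);

if (typeof db !== "undefined" && typeof db.getSiblingDB === "function") {
  // Running in mongosh: create the view in each database.
  for (const v of views) {
    db.getSiblingDB(v.database).createView(v.view, v.source, v.pipeline);
  }
} else {
  // Running elsewhere (for example Node.js): just show what would be created.
  console.log(JSON.stringify(views, null, 2));
}
```

If the data lake can read these views (step 2), every document should arrive in Charts with an "origin" field that can be used as the Series field.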

Thanks
Pavel

Thanks @Pavel_Duchovny, I’m trying to find documentation or examples of how to do this - are you aware of any by chance?

Hi @Meag_Tessmann ,

You mean create views?