Use relational migrator to decompose mysql table by column?

Hi folks,
I’ve started looking into Relational Migrator as an ETL tool to bring data from an existing MariaDB into an AtlasDB environment. The existing data in MariaDB is already somewhat denormalized: It all lives in one table. I would like to map that to a document schema where certain columns then form an embedded object.

As a fictional example, let’s say the relational table holds data of tactical units, including e.g. their geographical deployment. Like so:

id unit_name image_1 image_2 image_3 latitude longitude deployment_date
1 Alpha img1.jpg img2.jpg img3.jpg 34.05 -118.25 2024-08-04
2 Bravo img4.jpg img5.jpg img6.jpg 40.71 -74.01 2024-08-05

And I would like to group values of the columns pertaining to the geography details into an embedded object, and image related data into an embedded array.

So the target structure would resemble this:

{
  "_id": ObjectId("..."),
  "unit_name": "Alpha",
  "image_gallery": ["img1.jpg", "img2.jpg", "img3.jpg"],
  "deployment": {
    "location": {
      "latitude": 34.05,
      "longitude": -118.25
    },
    "date": "2024-08-04"
  }
}

It looks to me so far that the relational migrator allows such mappings, but only if the source data comes from different relational tables.

What would be the preferred approach to achieve this ETL task?

Thanks for some guidance
Patrick

Hi Patrick, thanks for using Relational Migrator!

Yes it should be possible to do what you are asking about, although the approach differs for embedded documents and embedded arrays.

To add structure to a mapping rule from a single table, you can rename the fields with a dotted path. For example you could change the name of the latitude field to deployment.location.latitude. You don’t need to create a separate Embedded Document mapping in this scenario.

To convert multiple fields into a single array, you can exclude the original fields and add a new calculated field (image_gallery) which returns an array result, i.e.:

new Array( columns["image_1"], columns["image_2"], columns["image_3"] )

Hi Tom, thanks for the clarification. I have configured this and will evaluate as soon as I establish the database connections.