Transforming and exporting data

I have a dataset of about 20,000 records that follow a similar, but not identical, data structure.

The documents are constantly added and updated from a third-party source, but they are not in the format I require for search indexing (they need to be flat).

Every document in the collection needs to be transformed, so I’m wondering what the best way to do this is.

Hi @NFTX_Tech and welcome to the MongoDB Community :muscle:!

Atlas Search doesn’t require flat data as far as I know and can leverage well-designed documents (arrays, subdocuments, etc.).

If you aren’t on Atlas and you want to go down that path, I can suggest two solutions that are pretty similar and, I think, simple to set up.

  1. You can use an aggregation pipeline with a $project stage to transform your collection and reshape the documents the way you want them. Then you can write the result to a new collection with $out and fix the data feed so that, going forward, it writes documents in the correct shape directly to this new collection.

  2. If you can’t fix the data import, you can also create a view using the same aggregation pipeline. You can then query whichever “version” of the collection you want: the one you currently have, or the view, which isn’t stored by default but returns the docs in the shape you want, according to your aggregation pipeline.
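To make the two options a bit more concrete, here is a rough sketch in Python. It assumes PyMongo and a `db` handle obtained elsewhere; the field names (`name`, `owner.name`, `tags`) and the collection names are made up for illustration, since the real document shape isn't shown in this thread, so adapt the `$project` stage to your own data.

```python
from typing import Any, Dict, List


def build_flatten_pipeline() -> List[Dict[str, Any]]:
    """Return a pipeline that lifts nested fields to the top level.

    The field names are hypothetical placeholders, not taken from the
    original dataset.
    """
    return [
        {
            "$project": {
                "_id": 1,
                "name": 1,
                # Promote a sub-document field to a top-level field.
                "owner_name": "$owner.name",
                # Keep only the first element of an array field.
                "first_tag": {"$arrayElemAt": ["$tags", 0]},
            }
        }
    ]


def materialize(db, source: str, target: str) -> None:
    """Option 1: write the reshaped documents to a new collection via $out.

    Note that $out replaces the target collection if it already exists.
    """
    pipeline = build_flatten_pipeline() + [{"$out": target}]
    db[source].aggregate(pipeline)


def create_flat_view(db, source: str, view_name: str) -> None:
    """Option 2: create a read-only view backed by the same pipeline."""
    db.create_collection(
        view_name,
        viewOn=source,
        pipeline=build_flatten_pipeline(),
    )
```

With a PyMongo database handle, for example `db = MongoClient(uri)["mydb"]`, you would call either `materialize(db, "raw", "flat")` or `create_flat_view(db, "raw", "flat_view")`. The view recomputes the pipeline on every read, so it always reflects the latest data from the feed, while the `$out` collection is a snapshot that goes stale until you re-run it or fix the feed.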

Cheers,
Maxime.