Connection between S3 and MongoDB Atlas

Harsh_Wani · March 14, 2022, 6:11am

Hi,
We'd like to export the data from an existing collection (most likely after a simple filter query) to CSV and upload the file to an AWS S3 bucket every night. The collection holds around 40,000 documents, but the solution should scale well, as the data is expected to grow.

Can anyone guide me on how to do this?

Benjamin_Flast · March 14, 2022, 10:23pm

Hey Harsh,

This is actually something you can do quite easily using Atlas Data Lake.

In fact we have a tutorial on how to do it right here: How to Automate Continuous Data Copying from MongoDB to S3 | MongoDB

In the tutorial we are writing the data to Parquet files, but in this case you can just use the type “CSV”.

The process will look a bit like this:

Create a data lake that points a virtual Data Lake Collection at the source collection in your Atlas Cluster
Create a $out aggregation that matches the appropriate documents and writes them out to the correct location on your bucket
Put that aggregation into an Atlas Scheduled Trigger that runs once a day
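
As a rough illustration of the second step, here is a minimal sketch of what the pipeline could look like, written in Python with pymongo purely for illustration. Inside an Atlas Scheduled Trigger the same pipeline would be passed to aggregate from a JavaScript function instead. The filter, bucket name, region, file path, and the VirtualDatabase0 / VirtualCollection0 names are all placeholders, not values from this thread, and the sketch assumes the bucket is already configured as a writable store in the data lake.

```python
from datetime import date
from typing import Any


def build_export_pipeline(
    match_filter: dict[str, Any],
    bucket: str,
    region: str,
    export_day: date,
) -> list[dict[str, Any]]:
    """Return a pipeline that filters documents and writes them to S3 as CSV."""
    return [
        # Step 1: keep only the documents that should be exported.
        {"$match": match_filter},
        # Step 2: write the matching documents to the bucket in CSV format.
        {
            "$out": {
                "s3": {
                    "bucket": bucket,
                    "region": region,
                    # Hypothetical date-stamped path so each nightly run is kept apart.
                    "filename": f"exports/{export_day.isoformat()}/orders",
                    "format": {"name": "csv"},
                }
            }
        },
    ]


def run_export(data_lake_uri: str) -> None:
    """Run the export using the data lake's connection string, not the cluster's."""
    from pymongo import MongoClient  # lazy import; needs `pip install pymongo`

    client = MongoClient(data_lake_uri)
    pipeline = build_export_pipeline(
        {"status": "active"},   # placeholder filter
        "my-export-bucket",     # placeholder bucket name
        "us-east-1",            # placeholder AWS region
        date.today(),
    )
    # Placeholder virtual database and collection names.
    collection = client["VirtualDatabase0"]["VirtualCollection0"]
    # The $out stage writes to S3; no documents come back to the client.
    collection.aggregate(pipeline)
```

Building the pipeline in a separate function keeps the filter and destination easy to change and lets you inspect the pipeline before running it against the data lake.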

That should be about it.

Feel free to reach out to me directly if you have any questions on this at Benjamin.Flast@mongodb.com

Best,
Ben
