Automate continuous copying from S3 to MongoDB

I have seen articles for the opposite but this is something I have struggled to find.
I have data in S3. I can query it on Atlas using a federated database.

Here was my plan:

  • Use a federated database to query the data.
  • Use this federated database to copy all the S3 data to a separate collection.
  • Automate this process with a trigger. However, a database trigger can't be used on a federated database, and a scheduled trigger would copy everything again on every run.

I wanted to ask if there is some sort of bookmark feature that records where the last copy stopped, so that the next run knows where to resume. My timestamp field is not distinct enough to serve as that bookmark.

Any help on this will be much appreciated. I am really stuck.
Thank you in advance!

Hey @Anirudh_S ,

As you noticed, you can definitely write data from S3 to Atlas on a recurring basis using Data Federation.
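
To make the recurring copy concrete, here is a minimal Python sketch of an aggregation pipeline that could be run against the federated collection on a schedule. It assumes a few things that are not in your post: each document has a unique key (`_id` here) and a `timestamp` field, and the names `Cluster0`, `mydb`, and `events` are placeholders. The `$merge` stage with an `atlasClusterName` target follows my understanding of the Data Federation syntax, so please check it against the current docs. The idea is to match with `$gte` on the last seen timestamp, so that documents sharing the boundary timestamp are read again, and let `$merge` skip the ones that already exist in the target.

```python
from datetime import datetime, timezone


def build_copy_pipeline(last_ts, cluster_name, db, coll,
                        key_field="_id", ts_field="timestamp"):
    """Build a pipeline to run against the federated (S3-backed) collection.

    last_ts is the bookmark: the newest timestamp already copied, or None
    on the first run. All field and cluster names are assumptions.
    """
    stages = []
    if last_ts is not None:
        # $gte rather than $gt: documents that share the bookmark timestamp
        # are re-read instead of being skipped.
        stages.append({"$match": {ts_field: {"$gte": last_ts}}})
    stages.append({
        "$merge": {
            # Target collection on the Atlas cluster (assumed syntax).
            "into": {"atlasClusterName": cluster_name, "db": db, "coll": coll},
            # key_field needs a unique index on the target; _id has one.
            "on": key_field,
            # Re-read documents that were already copied are left untouched.
            "whenMatched": "keepExisting",
            "whenNotMatched": "insert",
        }
    })
    return stages


if __name__ == "__main__":
    bookmark = datetime(2024, 1, 1, tzinfo=timezone.utc)
    for stage in build_copy_pipeline(bookmark, "Cluster0", "mydb", "events"):
        print(stage)
```

With a driver such as PyMongo, the returned list would be passed to `aggregate()` on the federated collection, and the next bookmark could be taken as the largest timestamp now present in the target collection. Because the merge is idempotent on the key, the overlap at the boundary costs a few extra reads but does not create duplicates.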

There are probably a few different ways to approach this, depending on how the data is structured in your S3 bucket. Some things that come to mind are using a timestamp or tag on the object itself, or having the file deleted or moved by some other function after it has been imported.
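
As a rough illustration of the object-level idea, here is a small Python sketch of a bookmark built from the object's own metadata. It assumes each object is described by a dict with `Key` and `LastModified`, which is the shape of the `Contents` entries that boto3's `list_objects_v2` returns. The function names are made up for this example. Pairing the timestamp with the key is one way to get a bookmark that stays distinct even when several objects share the same timestamp.

```python
from datetime import datetime, timezone


def select_new_objects(objects, bookmark=None):
    """Return the objects that sort strictly after the bookmark, oldest first.

    bookmark is a (last_modified, key) tuple for the last imported object,
    or None if nothing has been imported yet.
    """
    ordered = sorted(objects, key=lambda o: (o["LastModified"], o["Key"]))
    if bookmark is None:
        return ordered
    # Tuple comparison: the key breaks ties between equal timestamps.
    return [o for o in ordered if (o["LastModified"], o["Key"]) > bookmark]


def advance_bookmark(imported, bookmark=None):
    """Return the bookmark after importing a batch from select_new_objects."""
    if not imported:
        return bookmark
    last = imported[-1]
    return (last["LastModified"], last["Key"])


if __name__ == "__main__":
    t1 = datetime(2024, 1, 1, tzinfo=timezone.utc)
    t2 = datetime(2024, 1, 2, tzinfo=timezone.utc)
    listing = [
        {"Key": "b.json", "LastModified": t1},
        {"Key": "a.json", "LastModified": t1},
        {"Key": "c.json", "LastModified": t2},
    ]
    batch = select_new_objects(listing)[:2]
    mark = advance_bookmark(batch)
    print([o["Key"] for o in select_new_objects(listing, mark)])
```

The bookmark would need to be stored somewhere between runs, for example in a small collection on the cluster. One caveat: an object that lands later with the same timestamp but a key that sorts earlier would be missed, so moving or deleting imported files may be the more robust option if that can happen in your bucket.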

I think it might be easiest to brainstorm on a call, though. Would you like to put some time on my calendar here: Calendly - Benjamin Flast

Best,
Ben

@Benjamin_Flast I am facing a similar problem: we have tons of S3 docs in JSON format and are trying to write them to MongoDB. Any suggested solutions or best practices would be great.