From MongoDB cloud instance to strings

Hi!

I’m running several managed MongoDB cloud instances.

I now want to run some analysis scripts on the data. However, I cannot run them against the database. Instead, I want the scripts to run on files on my local computer. (Not that much data, should be feasible computing-wise.)

Hence, I need to extract all the documents in all collections of my instances as strings and save them to the file system of my local computer.

What is the most efficient way of doing so?

I tried downloading backups and running them locally via Docker Compose, but things are not that easy due to replica set setups, etc.
Was just wondering if there is an easier way.

Thanks!

Hi @Christian_Schwaderer and welcome to the MongoDB community forum!!

It would be very helpful for understanding the requirement if you could share the following details:

Could you share an example of how you would like your documents to look?

If transferring data between MongoDB instances is the goal, there are different methods to do the same irrespective of whether you are on a cloud instance or not.

For example, using standard tooling, you can use mongodump and mongorestore to dump and restore the data from a remote deployment into a local MongoDB deployment.
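As a rough sketch (the connection string, credentials, and paths below are placeholders you would need to adapt to your own deployment), a dump-and-restore round trip could look like this:

```shell
# Dump all databases from the remote deployment into ./dump/
# (the URI and credentials are placeholders).
mongodump --uri="mongodb+srv://user:password@cluster0.example.mongodb.net/" --out=./dump

# Restore the dump into a local deployment listening on the default port.
mongorestore --uri="mongodb://localhost:27017/" ./dump
```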

If you wish to store your documents in a different format, mongoexport and mongoimport are the way to go.
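For instance (again with placeholder URI, database, and collection names), an export to a JSON file and the matching import could look roughly like:

```shell
# Export one collection to a JSON array file
# (URI, database, and collection names are placeholders).
mongoexport --uri="mongodb+srv://user:password@cluster0.example.mongodb.net/" \
  --db=mydb --collection=mycoll --jsonArray --out=mycoll.json

# Import it back into another deployment, if ever needed.
mongoimport --uri="mongodb://localhost:27017/" \
  --db=mydb --collection=mycoll --jsonArray --file=mycoll.json
```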

The MongoDB deployment topology should not matter when transferring data is the main goal. You can dump data from one deployment type (e.g. a replica set) to another deployment type (e.g. a standalone or a sharded cluster) using those standard tools. If this is not the main goal, could you elaborate with more examples?

Let us know if you have any further queries.

Best regards
Aasawari


Hi!

Thanks for your reply!

No, the goal is not transferring data from one instance to another one.

Let me try to explain with an example.

Say I have in any collection a document like this:

{
  "_id": {
    "$oid": "5ea0350ed62b9e002019c53d"
  },
  "foo": "mÖeow_123"
}

Now we assume that having an Ö plus some digits that add up to 6 in any string in any document in any collection of any instance would somehow be problematic for my application.
So, my goal would now be to find out whether that’s the case anywhere. (Acting upon it would be a different problem, which we can leave aside for now.)

One way of solving that would be to run a script against all data. However, I do not want that.

So, my approach is: I want all the documents as JSON files on the file system. E.g. I would have a file called bla_whatever_5ea0350ed62b9e002019c53d.json with the content:

{
  "_id": {
    "$oid": "5ea0350ed62b9e002019c53d"
  },
  "foo": "mÖeow_123"
}

on my file system.

Then I could run a script on all those local files, checking for Ös and digits that add up to 6.

I do not intend to put the data back into MongoDB afterwards.
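For what it’s worth, once the documents are on disk, the local check could be sketched like this. This is only a simplification of the example condition, not an exact implementation: the function name is made up, the directory layout (one document per file, or one mongoexport file with one document per line) is assumed, and the digits are summed per Ö-containing line rather than per string.

```shell
# scan_exports DIR: print each .json file in DIR where the lines containing
# an "Ö" have digits that sum to 6 (the example condition from this thread).
scan_exports() {
  dir="$1"
  for f in "$dir"/*.json; do
    [ -e "$f" ] || continue
    # Collect the digits from all Ö-containing lines and sum them with awk.
    sum=$(grep 'Ö' "$f" | tr -cd '0-9' | fold -w1 | awk '{s+=$1} END {print s+0}')
    if [ "$sum" -eq 6 ]; then
      echo "match: $f"
    fi
  done
}
```

A file containing "mÖeow_123" (1 + 2 + 3 = 6) would be reported; a file with no Ö at all would not.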

Hi @Christian_Schwaderer

So, my approach is: I want all the documents as JSON files on the file system.

If I understand correctly, you want to dump all the data in your collections as JSON and then run some scripts on the files. If so, then I think mongoexport is the tool for the job (Aasawari mentioned this as well in her post). It can export whole collections into JSON or CSV format.
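By default mongoexport writes one document per line, so if you really want one file per document on disk, you could split such an export locally afterwards. As a rough sketch (the function name and numbered file names are my own; a fancier script could pull the _id out of each line instead):

```shell
# split_export FILE OUTDIR: write each line of FILE (one JSON document
# per line, as mongoexport produces by default) to its own numbered file.
split_export() {
  mkdir -p "$2"
  awk -v dir="$2" '{
    fn = dir "/doc_" NR ".json"
    print > fn
    close(fn)  # avoid running out of file descriptors on large exports
  }' "$1"
}
```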

However, I’m curious why you don’t want to execute those queries on the database itself. Is there a particular reason it’s undesirable? I mean, searching for things is what databases do :slight_smile:

Best regards
Kevin
