Retrieve Word Docs from MongoDB

I have a large MongoDB database of Word documents and PDFs that I would like to save from the database to my computer, retaining the directory structure if possible.

The website that stored these files and folders provided a MongoDB dump. I am completely new to this and am learning as I go. I have installed MongoDB on Ubuntu, used mongorestore to load the dumped files (fs.chunks.bson, fs.files etc.) into this local database, and I can see the collections in Compass.

I would appreciate any thoughts on how to get the Docs/PDFs etc. onto my hard disk so I can open and view them in their respective applications. Retaining the folder structure as it was on the website is vital, if that is possible.

Thanks.

Hello @GreenLeaf, welcome to the MongoDB Community forum!

The data you restored into the local MongoDB instance (on your computer) is stored as GridFS collections. GridFS is a way of storing large files in a MongoDB database. To get your DOC/PDF documents out of the database, you need to use the GridFS tools as described here: Use GridFS. You can work with the mongofiles command-line tool, or use the GridFS API of a programming language (like Python, Java, Node.js, etc.) through its respective MongoDB driver.
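As a rough illustration of the driver route, here is a minimal Python sketch using PyMongo's gridfs module. It assumes the default bucket name "fs" (which matches the fs.files and fs.chunks collections in the dump); the database name "mydb", the connection URI, the output folder "export", and the helper name safe_target are placeholders I made up. If the stored filenames happen to contain slashes, the helper turns them into subfolders; if they do not, everything still lands in one folder.

```python
from pathlib import Path


def safe_target(root: Path, filename: str) -> Path:
    """Map a GridFS filename to a path under root, dropping unsafe parts."""
    # Treat both slash styles as separators and discard "", "." and "..".
    parts = [p for p in filename.replace("\\", "/").split("/")
             if p not in ("", ".", "..")]
    if not parts:
        parts = ["unnamed"]
    return root.joinpath(*parts)


def export_bucket(uri="mongodb://localhost:27017", db_name="mydb",
                  out_dir="export"):
    """Write every file in the default GridFS bucket to out_dir."""
    # Imported here so safe_target can be used without PyMongo installed.
    from pymongo import MongoClient
    import gridfs

    db = MongoClient(uri)[db_name]   # "mydb" is a placeholder name
    fs = gridfs.GridFS(db)           # default bucket: fs.files / fs.chunks
    root = Path(out_dir)
    for grid_out in fs.find():
        # Fall back to the file's _id if no filename was stored.
        name = grid_out.filename or str(grid_out._id)
        target = safe_target(root, name)
        target.parent.mkdir(parents=True, exist_ok=True)
        # Note: files sharing the same name overwrite each other here.
        with open(target, "wb") as fh:
            fh.write(grid_out.read())
```

Calling export_bucket(db_name="mydb") with your real database name would write the files under ./export. Each file is read fully into memory, which should be acceptable for typical DOC/PDF sizes.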

Thank you for your response. I have figured out a way to save files to disk from a GridFS bucket using Studio 3T, but this writes 10,174 files (the DOC/PDF files) to one folder without retaining the folder structure.

Is there a way, using mongofiles or Studio 3T (or another way), to save the files to disk while maintaining the folder structure? If it is relevant, there is a collection in the DB called _hierarchy.

Like I said before, I am learning as I go and any simple explanations or example code would be greatly appreciated.

This collection is not part of the GridFS collections. If the directory structure is somehow stored in this collection, then you need to figure out how to use it to build the directories. What does a document in this _hierarchy collection look like? Also, the GridFS files collection (fs.files) has some metadata stored in it. You can query it and see whether it contains directory information.
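To make that concrete, here is a small Python sketch. The peek function prints one document from _hierarchy and one from fs.files so you can see their shape. The build_paths function shows how folder paths could be rebuilt if each hierarchy document points at its parent. The field names "name" and "parent", the database name "mydb", and both function names are pure assumptions on my part, since the real layout of _hierarchy is not known yet; adjust them to whatever peek shows.

```python
def peek(uri="mongodb://localhost:27017", db_name="mydb"):
    """Print one sample document from each collection of interest."""
    from pymongo import MongoClient  # imported lazily; needs PyMongo

    db = MongoClient(uri)[db_name]   # "mydb" is a placeholder name
    print(db["_hierarchy"].find_one())
    print(db["fs.files"].find_one())


def build_paths(nodes):
    """Return {node_id: "folder/sub/name"} by following parent links.

    Hypothetical node shape (an assumption, not the real schema):
    {"_id": ..., "name": "Reports", "parent": <parent _id or None>}
    """
    by_id = {n["_id"]: n for n in nodes}
    paths = {}

    def resolve(node_id, seen=()):
        if node_id in paths:
            return paths[node_id]
        node = by_id[node_id]
        parent = node.get("parent")
        chain = seen + (node_id,)
        # Treat missing, unknown, or cyclic parents as top level.
        if parent is None or parent not in by_id or parent in chain:
            path = node["name"]
        else:
            path = resolve(parent, chain) + "/" + node["name"]
        paths[node_id] = path
        return path

    for node_id in by_id:
        resolve(node_id)
    return paths
```

With real data you would load the nodes with something like list(db["_hierarchy"].find()) and then match the resulting paths to the GridFS files through whichever field links the two collections.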

Edit Add:

This Stack Overflow post says that “GridFS does not store files as a structure like file system hierarchy.”: Save file to GridFS with given path

Okay. Thank you. I will go through the other collection and the metadata in the hope of being able to rebuild the folder structure.