Can someone help me read PDF documents from GridFS using the Python library PyPDF2?
GridFS stores the name, contents, and optional metadata for a file and is agnostic to the file type, so storing and reading a PDF works the same as for any other file. To upload and read a file:
from gridfs import GridFSBucket
from pymongo import MongoClient

# Connect to the default local MongoDB server and use the "test" database.
my_db = MongoClient().test
fs = GridFSBucket(my_db)

# Upload a file:
with open('my.pdf', 'rb') as file:
    file_id = fs.upload_from_stream('my.pdf', file)

# Read file by _id:
with open('my-copy.pdf', 'wb+') as file:
    fs.download_to_stream(file_id, file)

# Read file by name:
with open('my-copy2.pdf', 'wb+') as file:
    fs.download_to_stream_by_name('my.pdf', file)
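Since the question mentions PyPDF2, here is a rough sketch of handing the stored PDF to it without writing a copy to disk. The helper names `gridfs_pdf_to_buffer` and `extract_pdf_text` are made up for illustration, and the sketch assumes a recent PyPDF2 release that provides `PdfReader` (older releases call it `PdfFileReader`).

```python
import io


def gridfs_pdf_to_buffer(bucket, file_id):
    """Download a GridFS file into an in-memory, seekable buffer.

    `bucket` is expected to be a GridFSBucket (anything exposing
    download_to_stream(file_id, destination) will do).
    """
    buffer = io.BytesIO()
    bucket.download_to_stream(file_id, buffer)
    buffer.seek(0)  # rewind so the PDF parser starts at the first byte
    return buffer


def extract_pdf_text(bucket, file_id):
    """Return the extracted text of each page (hypothetical helper)."""
    # Imported here so the download helper works even without PyPDF2.
    # Assumption: a PyPDF2 version that ships PdfReader.
    from PyPDF2 import PdfReader

    reader = PdfReader(gridfs_pdf_to_buffer(bucket, file_id))
    return [page.extract_text() for page in reader.pages]
```

With the `fs` and `file_id` from the snippet above, `extract_pdf_text(fs, file_id)` should give one string per page. PDF parsers need a seekable stream, which is why the file is buffered in a `BytesIO` first.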
You can also add tags via the “metadata” argument to the various GridFSBucket upload methods.
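As a minimal sketch of the tagging idea: the `metadata` argument accepts a free-form document, so the `tags` and `contentType` keys below are just one possible layout, not something GridFS requires. The helper names `upload_pdf_with_tags` and `find_by_tag` are hypothetical.

```python
def upload_pdf_with_tags(bucket, filename, stream, tags):
    """Upload a stream to GridFS and attach tags as metadata.

    The metadata layout (contentType + tags) is an assumption chosen
    for this example; any dict-like document can be stored.
    """
    metadata = {"contentType": "application/pdf", "tags": list(tags)}
    return bucket.upload_from_stream(filename, stream, metadata=metadata)


def find_by_tag(bucket, tag):
    """Return the filenames of stored files whose metadata.tags contains tag."""
    # GridFSBucket.find takes a normal MongoDB filter on the files collection.
    return [grid_out.filename for grid_out in bucket.find({"metadata.tags": tag})]
```

For example, `upload_pdf_with_tags(fs, 'my.pdf', file, ['invoice'])` followed by `find_by_tag(fs, 'invoice')` would list `my.pdf` among the matches.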
Thank you @Shane for the examples.