Storing Binary Data with MongoDB and C++

Rishabh Bisht • 6 min read • Published Sep 18, 2023 • Updated Sep 18, 2023
In modern applications, storing and retrieving binary files efficiently is a crucial requirement. MongoDB supports this through the binary data type in BSON, the binary serialization format used to store documents in MongoDB. A BSON binary value is a byte array with a subtype (such as generic binary, UUID, or MD5) that indicates how to interpret the binary data. See BSON Types in the MongoDB Manual for more information.
In this tutorial, we will write a console application in C++, using the MongoDB C++ driver to upload and download binary data.
Note:
  • When using this method, remember that the BSON document size limit in MongoDB is 16 MB. If your binary files are larger than this limit, consider using GridFS for more efficient handling of large files. See GridFS example in C++ for reference.
  • Storing binary data in MongoDB involves trade-offs. Before settling on this approach, make sure you have considered the alternative strategies for managing your binary data.
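As a rough illustration of the first note, the check below decides whether a file is small enough to be embedded in a single document. It is only a sketch: the function name fitsInSingleDocument and the kAssumedOverhead allowance are hypothetical and not part of the tutorial's source code. The 16 MB limit is taken as 16 * 1024 * 1024 bytes.

```cpp
#include <cstdint>

// Maximum size of a single BSON document in MongoDB (16 MB).
constexpr std::uintmax_t kMaxBsonDocumentSize = 16u * 1024u * 1024u;

// Assumption for this sketch: room reserved for the other fields
// (file name, path) and BSON framing. Not a value defined by MongoDB.
constexpr std::uintmax_t kAssumedOverhead = 4u * 1024u;

// Returns true if a file of the given size should fit in one document.
// Larger files are candidates for GridFS instead.
bool fitsInSingleDocument(std::uintmax_t fileSize) {
    return fileSize + kAssumedOverhead <= kMaxBsonDocumentSize;
}
```

A caller could pass the result of std::filesystem::file_size for a given path and fall back to GridFS when the function returns false.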

Prerequisites

  1. MongoDB Atlas account with a cluster created.
  2. An IDE (like Microsoft Visual Studio or Microsoft Visual Studio Code) set up with the MongoDB C and C++ drivers installed. Follow the instructions in Getting Started with MongoDB and C++ to install the MongoDB C/C++ drivers and set up the dev environment in Visual Studio. Installation instructions for other platforms are available.
  3. Compiler with C++17 support (for using std::filesystem operations).
  4. Your machine’s IP address whitelisted. Note: You can add 0.0.0.0/0 as the IP address, which should allow access from any machine. This setting is not recommended for production use.

Building the application

Source code available here.
Among its BSON types, the C++ driver provides the b_binary struct, which can be used to store a binary data value in a BSON document. See the API reference.
We start by defining the structure of our BSON document. We have defined three keys: name, path, and data. These contain the name of the file being uploaded, its full path on disk, and the actual file data, respectively. See a sample document below:
Sample document with binary data.
In the code, these are defined with a #define so that it’s easy to modify them from a single place.

Helper functions

Let’s add a helper function, upload, which accepts a file path and a MongoDB collection as inputs. Its primary purpose is to upload the file to the specified MongoDB collection by converting the file into a BSON binary value and constructing a BSON document to represent the file's metadata and content. Here are the key steps within the upload function:
  1. Open the file at the given path and get its size.
    1. The file's size is determined by moving the file pointer to the end of the file and then retrieving the current position, which corresponds to the file's size.
    2. The file pointer is then reset to the beginning of the file to read the content later.
  2. Read File Content into a Buffer: A std::vector<char> buffer is created with a size equal to the file's size to hold the file's binary data.
  3. Create the BSON binary value.
    1. To represent the file content as a BSON binary value, the code creates a bsoncxx::types::b_binary object.
    2. The b_binary object includes the binary subtype (set to bsoncxx::binary_sub_type::k_binary), the file's size, and data.
  4. Create a BSON document with three fields: name, path, and data.
  5. Insert the document into the collection.
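Steps 1 and 2 above can be sketched with the standard library alone. This is a minimal illustration, not the tutorial's original code: readFileToBuffer is a hypothetical helper name, and the BSON and insert steps (3 to 5) are left out because they require the driver.

```cpp
#include <cstddef>
#include <fstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: reads the whole file at filePath into a byte buffer.
std::vector<char> readFileToBuffer(const std::string& filePath) {
    std::ifstream file(filePath, std::ios::binary);
    if (!file) {
        throw std::runtime_error("Could not open file: " + filePath);
    }

    // Move to the end of the file; the current position is the file size.
    file.seekg(0, std::ios::end);
    const std::streamoff fileSize = file.tellg();
    if (fileSize < 0) {
        throw std::runtime_error("Could not determine size of: " + filePath);
    }

    // Reset the file pointer to the beginning before reading the content.
    file.seekg(0, std::ios::beg);

    // Buffer sized to the file; its data() and size() are what a
    // bsoncxx::types::b_binary value would refer to (not shown here).
    std::vector<char> buffer(static_cast<std::size_t>(fileSize));
    if (fileSize > 0 &&
        !file.read(buffer.data(), static_cast<std::streamsize>(fileSize))) {
        throw std::runtime_error("Could not read file: " + filePath);
    }
    return buffer;
}
```

Opening the stream with std::ios::binary matters here: without it, some platforms translate line endings and the stored bytes would no longer match the file on disk.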
Let’s write a similar helper function to perform the download. The code below takes the file name, destination folder, and a MongoDB collection as inputs. This function searches for a file by its name in the specified MongoDB collection, extracts its binary data, and saves it to the specified destination folder.
Here are the key steps within the download function:
  1. Create a filter query to find the file.
  2. Use the query to find the document in the collection.
  3. Extract the binary data: the document is accessed through a bsoncxx::document::view, and the binary value is retrieved from it using binaryDocView[FILE_DATA].get_binary().
  4. Create a file in the destination folder and write the binary content into the file.
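Step 4 can be sketched without the driver. In this hedged example, saveBinaryToFile is a hypothetical helper; its bytes and size parameters are meant to mirror the bytes and size members of the b_binary value returned by get_binary(). The query and extraction steps (1 to 3) are not shown.

```cpp
#include <cstddef>
#include <cstdint>
#include <filesystem>
#include <fstream>
#include <stdexcept>
#include <string>

// Hypothetical helper: writes size bytes into destinationFolder/fileName
// and returns the path of the file that was written.
std::filesystem::path saveBinaryToFile(const std::string& fileName,
                                       const std::string& destinationFolder,
                                       const std::uint8_t* bytes,
                                       std::size_t size) {
    const std::filesystem::path folder(destinationFolder);
    if (!folder.empty()) {
        // Create the destination folder if it does not exist yet.
        std::filesystem::create_directories(folder);
    }

    // Keep only the final path component so a stored name cannot
    // place the file outside the destination folder.
    const std::filesystem::path target =
        folder / std::filesystem::path(fileName).filename();

    std::ofstream out(target, std::ios::binary | std::ios::trunc);
    if (!out) {
        throw std::runtime_error("Could not create file: " + target.string());
    }
    out.write(reinterpret_cast<const char*>(bytes),
              static_cast<std::streamsize>(size));
    if (!out) {
        throw std::runtime_error("Could not write file: " + target.string());
    }
    return target;
}
```

Stripping directory components from the stored name is a design choice of this sketch, not something the tutorial prescribes.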

The main() function

With the helper functions in place to perform upload and download, let’s write the main function that will drive this application. Here are the key steps within the main function:
  1. Connect to MongoDB: Establish a connection to MongoDB by creating a mongocxx::client instance.
  2. Fetch the database (fileStorage) and collection (files) to store the files.
  3. Upload all files found in the specified uploadFolder: Recursively iterate through the folder using std::filesystem::recursive_directory_iterator. For each file found, call the upload function to upload the file to the MongoDB collection.
  4. Download specific files with known filenames (fileName1 and fileName2) by calling the download function to retrieve and save the files to the downloadFolder.
  5. Similarly, download all files in the collection by calling find({}) to get a cursor, iterating through each document in the collection, extracting the file name, and then calling the download function to save the file to the downloadFolder.
    Note: In a real-world situation, calling find({}) should be done with some kind of filtering/pagination to avoid issues with memory consumption and performance.
Make sure to get the connection string (URI), assign it to mongoURIStr, and set the paths and filenames to the ones on your disk.

Application in action

Before executing this application, add some files (like images or audio files) under the uploadFolder directory.
Files to be uploaded from local disk to MongoDB.
Execute the application and you’ll see output like this, indicating that the files were successfully uploaded and downloaded.
Application output showing successful uploads and downloads.
You can see the collection in Atlas or MongoDB Compass reflecting the files uploaded via the application.
Collection with binary data in MongoDB Compass.
You will observe the files getting downloaded into the specified downloadFolder directory.
Files downloaded from MongoDB to local disk.

Conclusion

In this article, we covered storing and retrieving binary data from a MongoDB database using the MongoDB C++ driver. MongoDB's robust capabilities, combined with the ease of use provided by the C++ driver, offer a powerful solution for handling file storage in C++ applications. We can't wait to see what you build next! Share your creation with the community and let us know how it turned out!
