Storing Binary Data with MongoDB and C++

Rishabh Bisht • 6 min read • Published Sep 18, 2023 • Updated Sep 18, 2023
In modern applications, storing and retrieving binary files efficiently is a crucial requirement. MongoDB supports this through the binary data type in BSON, the binary serialization format used to store documents in MongoDB. A BSON binary value is a byte array with a subtype (such as generic binary, UUID, or MD5) that indicates how to interpret the data. See the BSON Types page in the MongoDB Manual for more information.
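To make the "byte array plus subtype" idea concrete, here is a minimal sketch that lays out the payload of a BSON binary value by hand, following the BSON specification: a little-endian int32 byte count, one subtype byte, then the raw bytes. The name encodeBinaryPayload is hypothetical and exists only for illustration; in the application built in this tutorial, the C++ driver performs this encoding for you.

```cpp
#include <cstdint>
#include <vector>

// Subtype values defined by the BSON specification.
constexpr std::uint8_t kGenericBinary = 0x00;
constexpr std::uint8_t kUuid = 0x04;
constexpr std::uint8_t kMd5 = 0x05;

// Hypothetical helper (not part of the driver): builds the payload of a
// BSON binary value as int32 length (little-endian) + subtype + raw bytes.
// The element type byte and the key name that precede it are not included.
std::vector<std::uint8_t> encodeBinaryPayload(std::uint8_t subtype,
                                              const std::vector<std::uint8_t>& bytes)
{
    const auto length = static_cast<std::uint32_t>(bytes.size());

    std::vector<std::uint8_t> out;
    out.reserve(4 + 1 + bytes.size());

    // Number of raw bytes, least significant byte first.
    for (int shift = 0; shift < 32; shift += 8)
        out.push_back(static_cast<std::uint8_t>((length >> shift) & 0xFF));

    out.push_back(subtype);
    out.insert(out.end(), bytes.begin(), bytes.end());
    return out;
}
```

Calling encodeBinaryPayload(kGenericBinary, {0xDE, 0xAD, 0xBE}) yields eight bytes: the length 3 in four bytes, the subtype 0x00, and the three data bytes.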
In this tutorial, we will write a console application in C++, using the MongoDB C++ driver to upload and download binary data.
Note:
  • When using this method, remember that the BSON document size limit in MongoDB is 16 MB. If your binary files are larger than this limit, consider using GridFS, which splits large files into chunks. See the GridFS example in C++ for reference.
  • Embedding binary data in documents involves trade-offs in document size, memory use, and query performance. Before settling on this approach, weigh it against alternatives such as GridFS or keeping the files outside the database and storing only a reference to them.
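As a rough illustration of the size limit mentioned above, the following sketch checks whether a file is small enough to embed in a single document before you attempt an upload. The helper names and the reservedBytes allowance (space assumed for the other fields and BSON overhead) are hypothetical choices for this example, not values taken from the driver.

```cpp
#include <cstdint>
#include <filesystem>
#include <system_error>

// MongoDB's maximum BSON document size: 16 MB.
constexpr std::uintmax_t kMaxBsonDocumentSize = 16ull * 1024 * 1024;

// Hypothetical helper: true if a payload of fileSize bytes, plus an assumed
// allowance for the other fields, stays within the document size limit.
bool fitsInLimit(std::uintmax_t fileSize, std::uintmax_t reservedBytes = 1024)
{
    return fileSize + reservedBytes <= kMaxBsonDocumentSize;
}

// Hypothetical helper: same check for a file on disk. Returns false when the
// file is missing or its size cannot be determined.
bool fileFitsInSingleDocument(const std::filesystem::path& file)
{
    std::error_code ec;
    const std::uintmax_t size = std::filesystem::file_size(file, ec);
    if (ec)
        return false;
    return fitsInLimit(size);
}
```

Files that fail this check are candidates for GridFS rather than a single embedded binary value.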

Prerequisites

  1. MongoDB Atlas account with a cluster created.
  2. IDE (like Microsoft Visual Studio or Microsoft Visual Studio Code) set up with the MongoDB C and C++ drivers installed. Follow the instructions in Getting Started with MongoDB and C++ to install the MongoDB C/C++ drivers and set up the dev environment in Visual Studio. Installation instructions for other platforms are available.
  3. Compiler with C++17 support (for using std::filesystem operations).
  4. Your machine's IP address added to the cluster's IP access list. Note: You can add 0.0.0.0/0 as the IP address, which allows access from any machine. This setting is not recommended for production use.

Building the application

Source code available here.
Among its BSON types, the C++ driver provides the b_binary struct, which can be used to store a binary value in a BSON document. See the API reference.
We start by defining the structure of our BSON document. It has three keys: name, path, and data. These contain the name of the file being uploaded, its full path on disk, and the actual file data, respectively. See a sample document below:
[Image: Sample document with binary data.]
In the code, the key names are defined with #define so that it's easy to modify them from a single place.

Helper functions

Let’s add a helper function, upload, which accepts a file path and a MongoDB collection as inputs. Its primary purpose is to upload the file to the specified MongoDB collection by converting the file into a BSON binary value and constructing a BSON document to represent the file's metadata and content. Here are the key steps within the upload function:
  1. Open the file at the given path and get its size.
    1. The file's size is determined by moving the file pointer to the end of the file and then retrieving the current position, which corresponds to the file's size.
    2. The file pointer is then reset to the beginning of the file to read the content later.
  2. Read the file content into a buffer: a std::vector<char> sized to the file holds the file's binary data.
  3. Create the BSON binary value.
    1. To represent the file content as a BSON binary value, the code creates a bsoncxx::types::b_binary object.
    2. The b_binary object holds the binary subtype (set to bsoncxx::binary_sub_type::k_binary), the file's size, and a pointer to the data. It does not copy the bytes, so the buffer must stay alive until the document is built.
  4. Create a BSON document with three fields: name, path, and data.
  5. Insert the document into the collection.
    #include <bsoncxx/builder/basic/document.hpp>
    #include <bsoncxx/builder/basic/kvp.hpp>
    #include <bsoncxx/types.hpp>
    #include <mongocxx/client.hpp>
    #include <mongocxx/instance.hpp>
    #include <mongocxx/options/client.hpp>
    #include <mongocxx/options/server_api.hpp>
    #include <mongocxx/uri.hpp>

    #include <cstdint>
    #include <filesystem>
    #include <fstream>
    #include <iostream>
    #include <string>
    #include <vector>

    #define FILE_NAME "name"
    #define FILE_PATH "path"
    #define FILE_DATA "data"

    using bsoncxx::builder::basic::kvp;
    using bsoncxx::builder::basic::make_document;

    // Upload a file to the collection.
    bool upload(const std::string& filePath, mongocxx::collection& collection)
    {
        // Open the binary file with the read position at the end (std::ios::ate).
        std::ifstream file(filePath, std::ios::binary | std::ios::ate);
        if (!file)
        {
            std::cout << "Failed to open the file: " << filePath << std::endl;
            return false;
        }

        // The current position equals the file size; tellg() returns -1 on failure.
        const std::streamsize fileSize = file.tellg();
        if (fileSize < 0)
        {
            std::cout << "Failed to get the size of the file: " << filePath << std::endl;
            return false;
        }
        file.seekg(0, std::ios::beg);

        // Read the file content into a buffer.
        std::vector<char> buffer(static_cast<std::size_t>(fileSize));
        if (!file.read(buffer.data(), fileSize))
        {
            std::cout << "Failed to read the file: " << filePath << std::endl;
            return false;
        }

        // Create the binary object for bsoncxx. It only points at the buffer,
        // so the buffer must outlive the make_document call below.
        bsoncxx::types::b_binary data{bsoncxx::binary_sub_type::k_binary,
                                      static_cast<std::uint32_t>(fileSize),
                                      reinterpret_cast<const std::uint8_t*>(buffer.data())};

        // Create a document with the file name, file path, and file content.
        // filename() returns a path, so convert it to a string explicitly.
        auto doc = make_document(
            kvp(FILE_NAME, std::filesystem::path(filePath).filename().string()),
            kvp(FILE_PATH, filePath),
            kvp(FILE_DATA, data));

        // Insert the document into the collection.
        collection.insert_one(doc.view());

        std::cout << "Upload successful for: " << filePath << std::endl;
        return true;
    }
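As an alternative to the seek-and-tell approach in steps 1 and 2, the file size can be queried with std::filesystem::file_size before reading. The sketch below uses only the standard library; readFileBytes is a hypothetical helper name, and returning std::optional is an assumed design choice for reporting failure.

```cpp
#include <filesystem>
#include <fstream>
#include <optional>
#include <system_error>
#include <vector>

// Hypothetical helper: reads a whole file into memory.
// Returns std::nullopt if the file cannot be sized, opened, or fully read.
std::optional<std::vector<char>> readFileBytes(const std::filesystem::path& filePath)
{
    std::error_code ec;
    const auto size = std::filesystem::file_size(filePath, ec);
    if (ec)
        return std::nullopt;

    std::ifstream file(filePath, std::ios::binary);
    if (!file)
        return std::nullopt;

    std::vector<char> buffer(static_cast<std::size_t>(size));
    if (size > 0 && !file.read(buffer.data(), static_cast<std::streamsize>(size)))
        return std::nullopt;

    return buffer;
}
```

The returned buffer could then be wrapped in a b_binary value in the same way as in the upload function above.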
Let’s write a similar helper function to perform the download. The code below takes the file name, destination folder, and a MongoDB collection as inputs. This function searches for a file by its name in the specified MongoDB collection, extracts its binary data, and saves it to the specified destination folder.
Here are the key steps within the download function:
  1. Create a filter query to find the file.
  2. Use the query to find the document in the collection.
  3. Extract the binary data: the document is accessed through a bsoncxx::document::view, and the binary value is retrieved with binaryDocView[FILE_DATA].get_binary().
  4. Create a file in the destination folder and write the binary content into the file.
    // Download a file from a collection to a given folder.
    bool download(const std::string& fileName, const std::string& destinationFolder, mongocxx::collection& collection)
    {
        // Create a query to find the file by filename.
        auto filter = make_document(kvp(FILE_NAME, fileName));

        // Find the document in the collection.
        auto result = collection.find_one(filter.view());

        if (result)
        {
            // Get the binary data from the document.
            bsoncxx::document::view binaryDocView = result->view();
            auto binaryData = binaryDocView[FILE_DATA].get_binary();

            // Join folder and file name with std::filesystem so that the
            // folder does not need to end with a path separator.
            const auto destinationPath = std::filesystem::path(destinationFolder) / fileName;

            // Create a file to save the binary data.
            std::ofstream file(destinationPath, std::ios::binary);
            if (!file)
            {
                std::cout << "Failed to create the file: " << fileName << " at " << destinationFolder << std::endl;
                return false;
            }

            // Write the binary data to the file.
            file.write(reinterpret_cast<const char*>(binaryData.bytes), binaryData.size);

            std::cout << "Download successful for: " << fileName << " at " << destinationFolder << std::endl;
            return true;
        }
        else
        {
            std::cout << "File not found in the collection: " << fileName << std::endl;
            return false;
        }
    }
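The file-writing step can be hardened a little. The sketch below, which uses only the standard library, creates the destination folder if it is missing and keeps only the final component of the stored name, so a name containing directory components stays inside the folder. writeBytes is a hypothetical helper, not part of the tutorial's source code.

```cpp
#include <cstddef>
#include <cstdint>
#include <filesystem>
#include <fstream>
#include <string>
#include <system_error>

// Hypothetical helper: writes `size` bytes to destinationFolder/fileName.
bool writeBytes(const std::filesystem::path& destinationFolder,
                const std::string& fileName,
                const std::uint8_t* bytes, std::size_t size)
{
    // Create the folder (and any missing parents) if it does not exist yet.
    std::error_code ec;
    std::filesystem::create_directories(destinationFolder, ec);
    if (ec)
        return false;

    // Use only the last path component of the stored name.
    const auto target = destinationFolder / std::filesystem::path(fileName).filename();

    std::ofstream file(target, std::ios::binary | std::ios::trunc);
    if (!file)
        return false;

    file.write(reinterpret_cast<const char*>(bytes), static_cast<std::streamsize>(size));
    file.flush();
    return file.good();
}
```

In the download function, a call along the lines of writeBytes(destinationFolder, fileName, binaryData.bytes, binaryData.size) could stand in for the std::ofstream code.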

The main() function

With the helper functions in place to perform upload and download, let’s write the main function that will drive this application. Here are the key steps within the main function:
  1. Connect to MongoDB: Establish a connection to MongoDB by creating a mongocxx::client instance.
  2. Fetch the database (fileStorage) and collection (files) to store the files.
  3. Upload all files found in the specified uploadFolder: Iterate through the folder using std::filesystem::directory_iterator, skipping subdirectories. For each file found, call the upload function to upload the file to the MongoDB collection.
  4. Download specific files with known filenames (fileName1 and fileName2) by calling the download function to retrieve and save the files to the downloadFolder.
  5. Similarly, download all files in the collection by calling find({}) to get a cursor, iterating through each document in the collection, extracting the file name, and calling the download function to save the file to the downloadFolder.
    Note: In a real-world situation, calling find({}) should be done with some kind of filtering/pagination to avoid issues with memory consumption and performance.
Make sure to get your cluster's connection string (URI), assign it to mongoURIStr, and set the folder paths and filenames to ones that exist on your disk.
    int main()
    {
        try
        {
            // Create the driver instance before any other mongocxx object.
            mongocxx::instance inst{};

            const auto mongoURIStr = "<Insert MongoDB Connection String>";
            const mongocxx::uri mongoURI{ mongoURIStr };

            // Connect using version 1 of the server API.
            mongocxx::options::client client_options;
            auto api = mongocxx::options::server_api{ mongocxx::options::server_api::version::k_version_1 };
            client_options.server_api_opts(api);
            mongocxx::client conn{ mongoURI, client_options };

            const std::string dbName = "fileStorage";
            const std::string collName = "files";

            auto fileStorageDB = conn.database(dbName);
            auto filesCollection = fileStorageDB.collection(collName);
            // Drop previous data.
            filesCollection.drop();

            // Upload all files in the upload folder (subdirectories are skipped).
            const std::string uploadFolder = "/Users/bishtr/repos/fileStorage/upload/";
            for (const auto& filePath : std::filesystem::directory_iterator(uploadFolder))
            {
                if (std::filesystem::is_directory(filePath))
                    continue;

                if (!upload(filePath.path().string(), filesCollection))
                {
                    std::cout << "Upload failed for: " << filePath.path().string() << std::endl;
                }
            }

            // Download files to the download folder.
            const std::string downloadFolder = "/Users/bishtr/repos/fileStorage/download/";

            // Search for specific filenames and download them.
            const std::string fileName1 = "image-15.jpg", fileName2 = "Hi Seed Shaker 120bpm On Accents.wav";
            for (const auto& fileName : {fileName1, fileName2})
            {
                if (!download(fileName, downloadFolder, filesCollection))
                {
                    std::cout << "Download failed for: " << fileName << std::endl;
                }
            }

            // Download all files in the collection.
            auto cursor = filesCollection.find({});
            for (auto&& doc : cursor)
            {
                auto fileName = std::string(doc[FILE_NAME].get_string().value);
                if (!download(fileName, downloadFolder, filesCollection))
                {
                    std::cout << "Download failed for: " << fileName << std::endl;
                }
            }
        }
        catch (const std::exception& e)
        {
            std::cout << "Exception encountered: " << e.what() << std::endl;
        }

        return 0;
    }
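The upload loop in main() uses std::filesystem::directory_iterator, which visits only the top level of the folder. If you also want files from subfolders, std::filesystem::recursive_directory_iterator is the standard-library counterpart. The sketch below contrasts the two; listRegularFiles is a hypothetical helper introduced only for this comparison.

```cpp
#include <filesystem>
#include <vector>

// Hypothetical helper: collects the regular files in `folder`.
// With recursive == true, files in nested subfolders are included as well.
// Both iterators throw std::filesystem::filesystem_error if the folder
// cannot be opened.
std::vector<std::filesystem::path> listRegularFiles(const std::filesystem::path& folder,
                                                    bool recursive)
{
    std::vector<std::filesystem::path> files;
    if (recursive)
    {
        for (const auto& entry : std::filesystem::recursive_directory_iterator(folder))
            if (entry.is_regular_file())
                files.push_back(entry.path());
    }
    else
    {
        for (const auto& entry : std::filesystem::directory_iterator(folder))
            if (entry.is_regular_file())
                files.push_back(entry.path());
    }
    return files;
}
```

Keep in mind that the documents in this tutorial are looked up by file name alone, so uploading files with the same name from different subfolders would make the name-based download ambiguous.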

Application in action

Before executing this application, add some files (like images or audio files) under the uploadFolder directory.
[Image: Files to be uploaded from local disk to MongoDB.]
Execute the application and you’ll observe output like this, signifying that the files are successfully uploaded and downloaded.
[Image: Application output showing successful uploads and downloads.]
You can see the collection in Atlas or MongoDB Compass reflecting the files uploaded via the application.
[Image: Collection with binary data in MongoDB Compass.]
The downloaded files appear in the specified downloadFolder directory.
[Image: Files downloaded from MongoDB to local disk.]
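To confirm that a file survived the round trip unchanged, you can compare the original in uploadFolder with its copy in downloadFolder byte by byte. This is a minimal standard-library sketch; filesAreIdentical is a hypothetical helper that is not part of the tutorial's source code.

```cpp
#include <algorithm>
#include <filesystem>
#include <fstream>
#include <iterator>
#include <system_error>

// Hypothetical helper: true if both files exist and have identical content.
bool filesAreIdentical(const std::filesystem::path& a, const std::filesystem::path& b)
{
    // Compare sizes first; this is cheap and rules out most mismatches.
    std::error_code ecA, ecB;
    const auto sizeA = std::filesystem::file_size(a, ecA);
    const auto sizeB = std::filesystem::file_size(b, ecB);
    if (ecA || ecB || sizeA != sizeB)
        return false;

    std::ifstream fa(a, std::ios::binary);
    std::ifstream fb(b, std::ios::binary);
    if (!fa || !fb)
        return false;

    // Sizes match, so comparing the two byte streams pairwise is sufficient.
    return std::equal(std::istreambuf_iterator<char>(fa),
                      std::istreambuf_iterator<char>(),
                      std::istreambuf_iterator<char>(fb));
}
```

For large collections, comparing checksums would be a reasonable alternative to reading both files in full.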

Conclusion

In this article, we covered storing and retrieving binary data in a MongoDB database using the MongoDB C++ driver. BSON's binary type, together with the C++ driver, offers a straightforward way to handle file storage in C++ applications, as long as each file fits within the 16 MB document limit. We can't wait to see what you build next! Share your creation with the community and let us know how it turned out!
