Storing Binary Data with MongoDB and C++
In modern applications, storing and retrieving binary files efficiently is a crucial requirement. MongoDB supports this through the binary data type in BSON, the binary serialization format MongoDB uses to store documents. A BSON binary value is a byte array with a subtype (such as generic binary, UUID, or MD5) that indicates how to interpret the data. See BSON Types in the MongoDB Manual for more information.
In this tutorial, we will write a console application in C++, using the MongoDB C++ driver to upload and download binary data.
Note:
- When using this method, remember that the BSON document size limit in MongoDB is 16 MB. If your binary files are larger than this limit, consider using GridFS for more efficient handling of large files. See GridFS example in C++ for reference.
- Storing binary data directly in documents is only one of several strategies, each with its own trade-offs. Weigh the alternatives against your file sizes and access patterns before committing to an approach.
Prerequisites:
- MongoDB Atlas account with a cluster created.
- IDE (like Microsoft Visual Studio or Microsoft Visual Studio Code) set up with the MongoDB C and C++ drivers installed. Follow the instructions in Getting Started with MongoDB and C++ to install the MongoDB C/C++ drivers and set up the dev environment in Visual Studio. Installation instructions for other platforms are available.
- Compiler with C++17 support (for using std::filesystem operations).
- Your machine’s IP address whitelisted. Note: You can add 0.0.0.0/0 as the IP address, which should allow access from any machine. This setting is not recommended for production use.
Among its BSON types, the C++ driver provides the b_binary struct, which holds a binary data value in a BSON document. See the API reference.
We start by defining the structure of our BSON document. We have defined three keys: name, path, and data. These contain the name of the file being uploaded, its full path on disk, and the actual file data, respectively. A document therefore has the shape { name: <file name>, path: <full path>, data: <binary data> }.
In the code, these keys are defined with #define so that it’s easy to modify them from a single place.
Let’s add a helper function, upload, which accepts a file path and a MongoDB collection as inputs. It uploads the file to the specified collection by converting the file's content into a BSON binary value and constructing a BSON document that holds the file's metadata and content. Here are the key steps within the upload function:
- Open the file at the given path and get its size.
  - The file's size is determined by moving the file pointer to the end of the file and then retrieving the current position, which corresponds to the file's size.
  - The file pointer is then reset to the beginning of the file to read the content later.
- Read the file content into a buffer: A std::vector<char> buffer is created with a size equal to the file's size to hold the file's binary data.
- Create the BSON binary value.
  - To represent the file content as a BSON binary value, the code creates a bsoncxx::types::b_binary object.
  - The b_binary object includes the binary subtype (set to bsoncxx::binary_sub_type::k_binary), the file's size, and a pointer to the data.
- Create a BSON document with three fields: name, path, and data.
- Insert the document into the collection.
    #include <mongocxx/client.hpp>
    #include <bsoncxx/builder/basic/document.hpp>
    #include <bsoncxx/types.hpp>
    #include <mongocxx/uri.hpp>
    #include <mongocxx/instance.hpp>

    #include <cstdint>
    #include <iostream>
    #include <fstream>
    #include <string>
    #include <vector>
    #include <filesystem>

    #define FILE_NAME "name"
    #define FILE_PATH "path"
    #define FILE_DATA "data"

    using bsoncxx::builder::basic::kvp;
    using bsoncxx::builder::basic::make_document;

    // Upload a file to the collection.
    bool upload(const std::string& filePath, mongocxx::collection& collection)
    {
        // Open the binary file with the file pointer at the end (std::ios::ate).
        std::ifstream file(filePath, std::ios::binary | std::ios::ate);
        if (!file)
        {
            std::cout << "Failed to open the file: " << filePath << std::endl;
            return false;
        }

        // The current position is the file size. A negative value means tellg failed.
        const std::streamsize fileSize = file.tellg();
        if (fileSize < 0)
        {
            std::cout << "Failed to get the size of the file: " << filePath << std::endl;
            return false;
        }
        file.seekg(0, std::ios::beg);

        // Read the file content into a buffer.
        std::vector<char> buffer(static_cast<std::size_t>(fileSize));
        if (!file.read(buffer.data(), fileSize))
        {
            std::cout << "Failed to read the file: " << filePath << std::endl;
            return false;
        }

        // Create the binary object for bsoncxx. It only points into buffer,
        // so buffer must stay alive until the document has been built.
        bsoncxx::types::b_binary data{bsoncxx::binary_sub_type::k_binary,
                                      static_cast<std::uint32_t>(fileSize),
                                      reinterpret_cast<const std::uint8_t*>(buffer.data())};

        // Create a document with the file name, file path, and file content.
        // filename().string() passes a std::string on every platform.
        auto doc = make_document(
            kvp(FILE_NAME, std::filesystem::path(filePath).filename().string()),
            kvp(FILE_PATH, filePath),
            kvp(FILE_DATA, data));

        // Insert the document into the collection.
        collection.insert_one(doc.view());

        std::cout << "Upload successful for: " << filePath << std::endl;
        return true;
    }
Let’s write a similar helper function to perform the download. The code below takes the file name, destination folder, and a MongoDB collection as inputs. This function searches for a file by its name in the specified MongoDB collection, extracts its binary data, and saves it to the specified destination folder.
Here are the key steps within the download function:
- Create a filter query to find the file.
- Use the query to find the document in the collection.
- Extract the binary data: The document is accessed through a bsoncxx::document::view, and the binary value is retrieved with binaryDocView[FILE_DATA].get_binary().
- Create a file in the destination folder and write the binary content into the file.
    // Download a file from a collection to a given folder.
    bool download(const std::string& fileName, const std::string& destinationFolder, mongocxx::collection& collection)
    {
        // Create a query to find the file by filename.
        auto filter = make_document(kvp(FILE_NAME, fileName));

        // Find the document in the collection.
        auto result = collection.find_one(filter.view());
        if (!result)
        {
            std::cout << "File not found in the collection: " << fileName << std::endl;
            return false;
        }

        // Get the binary data from the document.
        bsoncxx::document::view binaryDocView = result->view();
        auto binaryData = binaryDocView[FILE_DATA].get_binary();

        // Create a file to save the binary data. Joining with std::filesystem::path
        // works whether or not destinationFolder ends with a separator.
        const std::filesystem::path destination = std::filesystem::path(destinationFolder) / fileName;
        std::ofstream file(destination, std::ios::binary);
        if (!file)
        {
            std::cout << "Failed to create the file: " << fileName << " at " << destinationFolder << std::endl;
            return false;
        }

        // Write the binary data to the file.
        if (!file.write(reinterpret_cast<const char*>(binaryData.bytes), binaryData.size))
        {
            std::cout << "Failed to write the file: " << fileName << " at " << destinationFolder << std::endl;
            return false;
        }

        std::cout << "Download successful for: " << fileName << " at " << destinationFolder << std::endl;
        return true;
    }
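The download function expects destinationFolder to exist already. A small helper can create the folder when it is missing and build the output path with std::filesystem, so a missing trailing separator does not matter. This is only a sketch: prepareDestination is a hypothetical name and is not part of the original code.

```cpp
#include <filesystem>
#include <string>
#include <system_error>

// Hypothetical helper: makes sure destinationFolder exists and returns the
// full path the file should be written to. Returns an empty path if the
// folder could not be created.
std::filesystem::path prepareDestination(const std::string& destinationFolder,
                                         const std::string& fileName)
{
    const std::filesystem::path folder(destinationFolder);

    std::error_code ec;
    std::filesystem::create_directories(folder, ec); // No error if it already exists.
    if (ec)
    {
        return {};
    }

    // operator/ inserts a separator only when one is needed.
    return folder / fileName;
}
```

The returned path can be passed directly to the std::ofstream constructor, which accepts a std::filesystem::path in C++17.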
With the helper functions in place to perform upload and download, let’s write the main function that will drive this application. Here are the key steps within the main function:
- Connect to MongoDB: Establish a connection to MongoDB by creating a mongocxx::client instance.
- Fetch the database (fileStorage) and collection (files) to store the files. Any data already in the collection is dropped first.
- Upload all files found in the specified uploadFolder: Iterate through the folder using std::filesystem::directory_iterator, skipping subdirectories. For each file found, call the upload function to upload the file to the MongoDB collection.
- Download specific files with known filenames (fileName1 and fileName2) by calling the download function to retrieve and save the files to the downloadFolder.
- Similarly, download all files in the collection by calling find({}) to get a cursor and iterating through each document in the collection, extracting the file name and then calling the download function to download and save the file to the downloadFolder.
Note: In a real-world situation, calling find({}) should be done with some kind of filtering/pagination to avoid issues with memory consumption and performance.
Make sure to get the connection string (URI), assign it to mongoURIStr, and set the paths and filenames to ones that exist on your disk.

    int main()
    {
        try
        {
            // Create the driver instance first. It must exist before any other
            // driver object is used and must outlive all of them.
            mongocxx::instance inst{};

            auto mongoURIStr = "<Insert MongoDB Connection String>";
            const mongocxx::uri mongoURI{ mongoURIStr };

            mongocxx::options::client client_options;
            auto api = mongocxx::options::server_api{ mongocxx::options::server_api::version::k_version_1 };
            client_options.server_api_opts(api);
            mongocxx::client conn{ mongoURI, client_options };

            const std::string dbName = "fileStorage";
            const std::string collName = "files";

            auto fileStorageDB = conn.database(dbName);
            auto filesCollection = fileStorageDB.collection(collName);
            // Drop previous data.
            filesCollection.drop();

            // Upload all files in the upload folder (subdirectories are skipped).
            const std::string uploadFolder = "/Users/bishtr/repos/fileStorage/upload/";
            for (const auto& filePath : std::filesystem::directory_iterator(uploadFolder))
            {
                if (std::filesystem::is_directory(filePath))
                    continue;

                if (!upload(filePath.path().string(), filesCollection))
                {
                    std::cout << "Upload failed for: " << filePath.path().string() << std::endl;
                }
            }

            // Download files to the download folder.
            const std::string downloadFolder = "/Users/bishtr/repos/fileStorage/download/";

            // Search for specific filenames and download them.
            const std::string fileName1 = "image-15.jpg", fileName2 = "Hi Seed Shaker 120bpm On Accents.wav";
            for (const auto& fileName : {fileName1, fileName2})
            {
                if (!download(fileName, downloadFolder, filesCollection))
                {
                    std::cout << "Download failed for: " << fileName << std::endl;
                }
            }

            // Download all files in the collection.
            auto cursor = filesCollection.find({});
            for (auto&& doc : cursor)
            {
                auto fileName = std::string(doc[FILE_NAME].get_string().value);
                if (!download(fileName, downloadFolder, filesCollection))
                {
                    std::cout << "Download failed for: " << fileName << std::endl;
                }
            }
        }
        catch (const std::exception& e)
        {
            std::cout << "Exception encountered: " << e.what() << std::endl;
        }

        return 0;
    }
Before executing this application, add some files (like images or audio files) under the uploadFolder directory.
Execute the application and you’ll see an "Upload successful for:" line for each uploaded file and a "Download successful for:" line for each downloaded file, signifying that the files are successfully uploaded and downloaded.
You can see the collection in Atlas or MongoDB Compass reflecting the files uploaded via the application.
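Beyond inspecting the collection visually, a byte-for-byte comparison between an original file and the copy written to the download folder is a quick way to confirm the round trip. The following sketch uses only the standard library. The helper name sameContent is hypothetical and not part of the original code.

```cpp
#include <algorithm>
#include <filesystem>
#include <fstream>
#include <iterator>
#include <system_error>

// Hypothetical helper: returns true if both files can be read and hold
// exactly the same bytes.
bool sameContent(const std::filesystem::path& a, const std::filesystem::path& b)
{
    // Compare sizes first; different sizes can never match.
    std::error_code ecA, ecB;
    const auto sizeA = std::filesystem::file_size(a, ecA);
    const auto sizeB = std::filesystem::file_size(b, ecB);
    if (ecA || ecB || sizeA != sizeB)
    {
        return false;
    }

    std::ifstream fileA(a, std::ios::binary);
    std::ifstream fileB(b, std::ios::binary);
    if (!fileA || !fileB)
    {
        return false;
    }

    // Sizes are equal, so walking the first stream to its end is enough.
    return std::equal(std::istreambuf_iterator<char>(fileA),
                      std::istreambuf_iterator<char>(),
                      std::istreambuf_iterator<char>(fileB));
}
```

For example, comparing a file in the upload folder with the file of the same name in the download folder should return true after a successful run.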
You will observe the files getting downloaded into the specified downloadFolder directory.
With this article, we covered storing and retrieving binary data from a MongoDB database using the MongoDB C++ driver. MongoDB's robust capabilities, combined with the ease of use provided by the C++ driver, offer a powerful solution for handling file storage in C++ applications. We can't wait to see what you build next! Share your creation with the community and let us know how it turned out!