Process of storing images in MongoDB

Welcome to the MongoDB Community @Maria_N_A!

Per the earlier suggestions, there are three common approaches for working with images and other binary assets:

  • GridFS: As suggested by @Sudhesh_Gnanasekaran, large images (or binary blobs) can be stored using the GridFS API, which is supported by official MongoDB drivers. GridFS splits large files into smaller chunks (255KiB by default) that are stored as separate documents in an fs.chunks collection, with a reference document including metadata in an fs.files collection (note: the default fs.* namespace can be changed). The GridFS API is a client-side implementation: a MongoDB deployment doesn’t have any special configuration for the underlying collection data. For more info on the implementation, see the GridFS spec on GitHub.

  • Inline: As suggested by @Prasad_Saya, smaller images (within the 16MB document size limit) can be stored directly in a MongoDB document using the BinData (binary data) BSON type.

  • Reference: As suggested by @Andrew_W, images can be saved to an API or filesystem, with only the image reference stored in the database.
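
To make the GridFS layout and the inline size limit more concrete, here is a minimal pure-Python sketch of how a file could be split into one fs.files-style document and several fs.chunks-style documents. It is an illustration of the layout only, not the driver's implementation: the helper names (choose_storage, to_gridfs_documents, reassemble), the headroom value, and the use of a UUID string in place of an ObjectId are all my own assumptions. In a real application you would let the driver's GridFS API do this work.

```python
from datetime import datetime, timezone
from uuid import uuid4

DEFAULT_CHUNK_SIZE = 255 * 1024            # 255 KiB, the GridFS default chunk size
MAX_DOCUMENT_SIZE = 16 * 1024 * 1024       # 16 MiB BSON document size limit


def choose_storage(size_bytes, headroom=64 * 1024):
    """Hypothetical helper: pick 'inline' or 'gridfs' from the payload size.

    The headroom (an assumed value) leaves room for the other fields
    stored in the same document as the binary data.
    """
    if size_bytes + headroom <= MAX_DOCUMENT_SIZE:
        return "inline"
    return "gridfs"


def to_gridfs_documents(data, filename, chunk_size=DEFAULT_CHUNK_SIZE):
    """Build documents shaped like those a GridFS upload produces."""
    file_id = uuid4().hex  # stand-in for the ObjectId a driver would generate
    files_doc = {
        "_id": file_id,
        "length": len(data),
        "chunkSize": chunk_size,
        "uploadDate": datetime.now(timezone.utc),
        "filename": filename,
    }
    # Each chunk references its parent file and records its position n.
    chunks = [
        {"files_id": file_id, "n": n, "data": data[offset:offset + chunk_size]}
        for n, offset in enumerate(range(0, len(data), chunk_size))
    ]
    return files_doc, chunks


def reassemble(chunks):
    """Rebuild the original bytes by concatenating chunks in order of n."""
    ordered = sorted(chunks, key=lambda chunk: chunk["n"])
    return b"".join(chunk["data"] for chunk in ordered)
```

Because the chunks are ordinary documents keyed by files_id and n, a reader can fetch them in order and stream the file back without loading it all at once, which is part of why no special server-side configuration is needed.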

Storing binary files in a database can be convenient for distributing across multiple locations (via replication), for working around file system limitations (e.g. limits on files per directory or file naming), for serving streaming or protected content, or for storing larger assets that aren’t going to be served directly to end users. Aside from the GridFS documentation page that has already been linked in an earlier comment, Building MongoDB Applications with Binary Files using GridFS (part 1 and part 2) may also be helpful reading.

If images or large binary assets are being served directly to end users, the Reference approach is usually the most suitable because files can be pushed out to an API and/or CDN (Content Delivery Network) and cached/resized for a better user experience. There is less overhead serving images directly from a web server than going through an application server and database server for every request. A downside of using references is that they can get out of sync with the source document.
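
One way to keep an eye on references drifting out of sync is a periodic consistency check. The sketch below assumes the references are local filesystem paths stored in a hypothetical image_path field, and it works on plain dictionaries rather than a live collection. For images held behind an API or CDN you would need a different existence check, so treat this as an outline of the idea only.

```python
import os


def find_dangling_references(documents, path_field="image_path"):
    """Return the _id of every document whose referenced file is missing.

    `documents` is any iterable of dict-like documents; `path_field` is a
    hypothetical field name holding a filesystem path to the image.
    """
    return [
        doc["_id"]
        for doc in documents
        if not os.path.isfile(doc.get(path_field, ""))
    ]
```

Documents with no path at all are reported too, since an empty path never resolves to a file.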

There are also hybrid use cases, such as storing large images (for example, raw images from a digital camera or phone) in the database and then passing those to an API or image processing library to create resized versions which will be served directly to end users.

Before deciding to store images in your database, I would make sure there is a clear benefit for the intended use case.

Regards,
Stennie
