Store Large Files
Overview
In this guide, you can learn how to store and retrieve large files in MongoDB by using GridFS. The GridFS storage system splits files into chunks when storing them and reassembles those files when retrieving them. The driver's implementation of GridFS is an abstraction that manages the operations and organization of the file storage.
Use GridFS if the size of any of your files exceeds the BSON document size limit of 16 MB. For more detailed information about whether GridFS is suitable for your use case, see GridFS in the MongoDB Server manual.
How GridFS Works
GridFS organizes files in a bucket, a group of MongoDB collections that contain the chunks of files and information describing them. The bucket contains the following collections:
- chunks: Stores the binary file chunks
- files: Stores the file metadata
The driver creates the GridFS bucket, if it doesn't already exist, when you first write data to it. The bucket contains the chunks and files collections prefixed with the default bucket name fs, unless you specify a different name. To ensure efficient retrieval of the files and related metadata, the driver creates an index on each collection. The driver ensures that these indexes exist before performing read and write operations on the GridFS bucket.
For more information about GridFS indexes, see GridFS Indexes in the MongoDB Server manual.
When using GridFS to store files, the driver splits the files into smaller chunks, each represented by a separate document in the chunks collection. It also creates a document in the files collection that contains a file ID, file name, and other file metadata.
The following diagram shows how GridFS splits files when they are uploaded to a bucket:
When retrieving files, GridFS fetches the metadata from the files collection in the specified bucket and uses the information to reconstruct the file from documents in the chunks collection.
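To make the chunking concrete, the following sketch computes how many chunk documents a file of a given length would occupy and how large the final chunk would be, assuming the default chunk size of 255 KB (261,120 bytes). The ChunkMath class and its method names are hypothetical helpers written for illustration. They are not part of the driver.

```csharp
using System;

// Hypothetical helper, not part of the driver: models how a file's bytes
// are divided into fixed-size chunks.
public static class ChunkMath
{
    // 255 KB, the default GridFS chunk size described on this page
    public const int DefaultChunkSizeBytes = 255 * 1024;

    // Number of chunk documents needed for a file of the given length
    public static long ChunkCount(long fileLength, int chunkSizeBytes = DefaultChunkSizeBytes)
    {
        Validate(fileLength, chunkSizeBytes);
        return (fileLength + chunkSizeBytes - 1) / chunkSizeBytes;
    }

    // Size in bytes of the final chunk, which can be smaller than the others
    public static long LastChunkSize(long fileLength, int chunkSizeBytes = DefaultChunkSizeBytes)
    {
        Validate(fileLength, chunkSizeBytes);
        if (fileLength == 0) return 0;
        long remainder = fileLength % chunkSizeBytes;
        return remainder == 0 ? chunkSizeBytes : remainder;
    }

    private static void Validate(long fileLength, int chunkSizeBytes)
    {
        if (fileLength < 0) throw new ArgumentOutOfRangeException(nameof(fileLength));
        if (chunkSizeBytes <= 0) throw new ArgumentOutOfRangeException(nameof(chunkSizeBytes));
    }
}
```

Under these assumptions, ChunkMath.ChunkCount(1_000_000) returns 4 and ChunkMath.LastChunkSize(1_000_000) returns 216640, because three full 261,120-byte chunks hold 783,360 bytes and the remainder goes into a fourth, smaller chunk.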
Create a GridFS Bucket
To begin using GridFS to store or retrieve files, create a new instance of the GridFSBucket class, passing in an IMongoDatabase object that represents your database. The resulting instance references an existing bucket or, if the bucket does not exist, a new bucket that the driver creates when you first write data to it.
The following example creates a new instance of the GridFSBucket class for the db database:

    var client = new MongoClient("<connection string>");
    var database = client.GetDatabase("db");

    // Creates a GridFS bucket or references an existing one
    var bucket = new GridFSBucket(database);

Customize the Bucket
You can customize the GridFS bucket configuration by passing an instance of the GridFSBucketOptions class to the GridFSBucket() constructor. The following table describes the properties in the GridFSBucketOptions class:
Property | Description |
---|---|
BucketName | The bucket name to use as a prefix for the files and chunks collections. The default value is "fs". Data type: string |
ChunkSizeBytes | The chunk size that GridFS splits files into. The default value is 255 KB. Data type: Int32 |
ReadConcern | The read concern to use for bucket operations. The default value is the database's read concern. Data type: ReadConcern |
ReadPreference | The read preference to use for bucket operations. The default value is the database's read preference. Data type: ReadPreference |
WriteConcern | The write concern to use for bucket operations. The default value is the database's write concern. Data type: WriteConcern |
The following example creates a bucket named "myCustomBucket" by passing an instance of the GridFSBucketOptions class to the GridFSBucket() constructor:

    var options = new GridFSBucketOptions { BucketName = "myCustomBucket" };
    var customBucket = new GridFSBucket(database, options);

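Several bucket options can be set together. The following sketch assumes the same db database and placeholder connection string used earlier on this page. The 1 MB chunk size and the specific write concern and read preference values are illustrative choices, not recommendations.

```csharp
using MongoDB.Driver;
using MongoDB.Driver.GridFS;

// Placeholder connection string, as in the preceding examples
var client = new MongoClient("<connection string>");
var database = client.GetDatabase("db");

var options = new GridFSBucketOptions
{
    BucketName = "myCustomBucket",
    // 1 MB chunks instead of the 255 KB default (illustrative value)
    ChunkSizeBytes = 1048576,
    // Illustrative choices that override the database-level settings
    WriteConcern = WriteConcern.WMajority,
    ReadPreference = ReadPreference.Primary
};

// Files written through this bucket are stored in collections
// prefixed with "myCustomBucket" instead of "fs"
var customBucket = new GridFSBucket(database, options);
```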
Upload Files
You can upload files to a GridFS bucket by using the following methods:
- OpenUploadStream() or OpenUploadStreamAsync(): Opens a new upload stream to which you can write file contents
- UploadFromStream() or UploadFromStreamAsync(): Uploads the contents of an existing stream to a GridFS file
The following sections describe how to use these methods.
Write to an Upload Stream
Use the OpenUploadStream() or OpenUploadStreamAsync() method to create an upload stream for a given file name. These methods accept the following parameters:
Parameter | Description |
---|---|
filename | The name of the file to upload. Data type: string |
options | Optional. An instance of the GridFSUploadOptions class that configures the upload stream. Data type: GridFSUploadOptions |
cancellationToken | Optional. A token that you can use to cancel the operation. Data type: CancellationToken |
This code example demonstrates how to open an upload stream by performing the following steps:
1. Calls the OpenUploadStream() method to open a writable GridFS stream for a file named "my_file"
2. Calls the Write() method to write data to my_file
3. Calls the Close() method to close the stream that points to my_file
The synchronous version of the code appears first, followed by the asynchronous version.

Synchronous:

    using (var uploader = bucket.OpenUploadStream("my_file"))
    {
        // ASCII for "HelloWorld"
        byte[] bytes = { 72, 101, 108, 108, 111, 87, 111, 114, 108, 100 };

        uploader.Write(bytes, 0, bytes.Length);
        uploader.Close();
    }

Asynchronous:

    using (var uploader = await bucket.OpenUploadStreamAsync("my_file"))
    {
        // ASCII for "HelloWorld"
        byte[] bytes = { 72, 101, 108, 108, 111, 87, 111, 114, 108, 100 };

        await uploader.WriteAsync(bytes, 0, bytes.Length);
        await uploader.CloseAsync();
    }

To customize the upload stream configuration, pass an instance of the GridFSUploadOptions class to the OpenUploadStream() or OpenUploadStreamAsync() method. The GridFSUploadOptions class contains the following properties:
Property | Description |
---|---|
BatchSize | The number of chunks to upload in each batch. The default value is 16 MB divided by the value of the ChunkSizeBytes property. Data type: Int32? |
ChunkSizeBytes | The size of each chunk except the last, which can be smaller. The default value is 255 KB. Data type: Int32? |
Metadata | Metadata to store with the file. The default value is null. Data type: BsonDocument |
To specify the size of each chunk for an upload, set the ChunkSizeBytes property on a GridFSUploadOptions instance and pass that instance to the OpenUploadStream() or OpenUploadStreamAsync() method. Apart from the extra argument, the steps are the same as in the preceding example.
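The following is a minimal sketch of an upload that sets ChunkSizeBytes, assuming the db database and placeholder connection string used earlier on this page. The value 1048576 (1 MB) is an illustrative choice that matches the chunkSize shown for my_file in the Find Files output later on this page.

```csharp
using MongoDB.Driver;
using MongoDB.Driver.GridFS;

// Placeholder connection string, as in the preceding examples
var client = new MongoClient("<connection string>");
var database = client.GetDatabase("db");
var bucket = new GridFSBucket(database);

var options = new GridFSUploadOptions
{
    // 1 MB chunks for this file only (illustrative value)
    ChunkSizeBytes = 1048576
};

using (var uploader = bucket.OpenUploadStream("my_file", options))
{
    // ASCII for "HelloWorld"
    byte[] bytes = { 72, 101, 108, 108, 111, 87, 111, 114, 108, 100 };

    uploader.Write(bytes, 0, bytes.Length);
    uploader.Close();
}
```

The asynchronous variant passes the same options object as the second argument to OpenUploadStreamAsync() and awaits WriteAsync() and CloseAsync() instead.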
Upload an Existing Stream
Use the UploadFromStream() or UploadFromStreamAsync() method to upload the contents of a stream to a new GridFS file. These methods accept the following parameters:
Parameter | Description |
---|---|
filename | The name of the file to upload. Data type: string |
source | The stream from which to read the file contents. Data type: Stream |
options | Optional. An instance of the GridFSUploadOptions class that configures the upload. Data type: GridFSUploadOptions |
cancellationToken | Optional. A token that you can use to cancel the operation. Data type: CancellationToken |
This code example demonstrates how to upload the contents of an existing stream by performing the following steps:
1. Opens a file located at /path/to/input_file as a stream in binary read mode
2. Calls the UploadFromStream() method to write the contents of the stream to a GridFS file named "new_file"
The synchronous version of the code appears first, followed by the asynchronous version.

Synchronous:

    using (var fileStream = new FileStream("/path/to/input_file", FileMode.Open, FileAccess.Read))
    {
        bucket.UploadFromStream("new_file", fileStream);
    }

Asynchronous:

    using (var fileStream = new FileStream("/path/to/input_file", FileMode.Open, FileAccess.Read))
    {
        await bucket.UploadFromStreamAsync("new_file", fileStream);
    }

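The upload methods also return the _id of the file they create, which can be kept for later downloads or deletes. The following sketch captures that value and attaches metadata through the options parameter. It assumes the db database and placeholder connection string used earlier on this page, and the metadata field names and values are hypothetical.

```csharp
using System;
using System.IO;
using MongoDB.Bson;
using MongoDB.Driver;
using MongoDB.Driver.GridFS;

// Placeholder connection string, as in the preceding examples
var client = new MongoClient("<connection string>");
var database = client.GetDatabase("db");
var bucket = new GridFSBucket(database);

var options = new GridFSUploadOptions
{
    // Hypothetical metadata fields chosen for illustration
    Metadata = new BsonDocument
    {
        { "owner", "docs-example" },
        { "contentType", "text/plain" }
    }
};

using (var fileStream = new FileStream("/path/to/input_file", FileMode.Open, FileAccess.Read))
{
    // UploadFromStream() returns the _id of the new GridFS file
    ObjectId fileId = bucket.UploadFromStream("new_file", fileStream, options);
    Console.WriteLine($"Uploaded file with _id {fileId}");
}
```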
Download Files
You can download files from a GridFS bucket by using the following methods:
- OpenDownloadStream() or OpenDownloadStreamAsync(): Opens a new download stream from which you can read file contents
- DownloadToStream() or DownloadToStreamAsync(): Writes the contents of a GridFS file to an existing stream
The following sections describe these methods in more detail.
Read From a Download Stream
Use the OpenDownloadStream() or OpenDownloadStreamAsync() method to create a download stream. These methods accept the following parameters:
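Parameter | Description |
---|---|
id | The _id value of the file to download. Data type: BsonValue |
options | Optional. An instance of the GridFSDownloadOptions class that configures the download stream. Data type: GridFSDownloadOptions |
cancellationToken | Optional. A token that you can use to cancel the operation. Data type: CancellationToken |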
The following code example demonstrates how to open a download stream by performing the following steps:
1. Retrieves the _id value of the GridFS file named "new_file"
2. Calls the OpenDownloadStream() method and passes the _id value to open the file as a readable GridFS stream
3. Creates a buffer byte array to store the file contents
4. Calls the Read() method to read the file contents from the downloader stream into the array
The synchronous version of the code appears first, followed by the asynchronous version.

Synchronous:

    var filter = Builders<GridFSFileInfo>.Filter.Eq(x => x.Filename, "new_file");
    var doc = bucket.Find(filter).FirstOrDefault();

    if (doc != null)
    {
        using (var downloader = bucket.OpenDownloadStream(doc.Id))
        {
            var buffer = new byte[downloader.Length];
            downloader.Read(buffer, 0, buffer.Length);

            // Process the buffer as needed
        }
    }

Asynchronous:

    var filter = Builders<GridFSFileInfo>.Filter.Eq(x => x.Filename, "new_file");
    var cursor = await bucket.FindAsync(filter);
    var fileInfoList = await cursor.ToListAsync();
    var doc = fileInfoList.FirstOrDefault();

    if (doc != null)
    {
        using (var downloader = await bucket.OpenDownloadStreamAsync(doc.Id))
        {
            var buffer = new byte[downloader.Length];
            await downloader.ReadAsync(buffer, 0, buffer.Length);

            // Process the buffer as needed
        }
    }

To customize the download stream configuration, pass an instance of the GridFSDownloadOptions class to the OpenDownloadStream() method. The GridFSDownloadOptions class contains the following property:
Property | Description |
---|---|
Seekable | Indicates whether the stream supports seeking, the ability to query and change the current position in a stream. The default value is false. Data type: bool? |
To make the download stream seekable, set the Seekable property to true on a GridFSDownloadOptions instance and pass that instance to the OpenDownloadStream() or OpenDownloadStreamAsync() method. Apart from the extra argument, the steps are the same as in the preceding example.
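The following is a minimal sketch of a seekable download, assuming the db database and placeholder connection string used earlier on this page and a previously uploaded file named "new_file". The rewind step at the end is an added illustration of what a seekable stream allows.

```csharp
using System;
using System.IO;
using MongoDB.Driver;
using MongoDB.Driver.GridFS;

// Placeholder connection string, as in the preceding examples
var client = new MongoClient("<connection string>");
var database = client.GetDatabase("db");
var bucket = new GridFSBucket(database);

var filter = Builders<GridFSFileInfo>.Filter.Eq(x => x.Filename, "new_file");
var doc = bucket.Find(filter).FirstOrDefault();

if (doc != null)
{
    var options = new GridFSDownloadOptions { Seekable = true };

    using (var downloader = bucket.OpenDownloadStream(doc.Id, options))
    {
        var buffer = new byte[downloader.Length];
        downloader.Read(buffer, 0, buffer.Length);

        // Because the stream is seekable, its position can be moved
        // back to the start so the contents can be read again
        downloader.Seek(0, SeekOrigin.Begin);
        Console.WriteLine($"CanSeek: {downloader.CanSeek}, Position: {downloader.Position}");
    }
}
```

The asynchronous variant passes the same options object to OpenDownloadStreamAsync() and awaits ReadAsync() instead of calling Read().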
Download to an Existing Stream
Use the DownloadToStream() or DownloadToStreamAsync() method to download the contents of a GridFS file to an existing stream. These methods accept the following parameters:
Parameter | Description |
---|---|
id | The _id value of the file to download. Data type: BsonValue |
destination | The stream that the .NET/C# Driver downloads the GridFS file to. This parameter's value must be an object that derives from the Stream class. Data type: Stream |
options | Optional. An instance of the GridFSDownloadOptions class that configures the download. Data type: GridFSDownloadOptions |
cancellationToken | Optional. A token that you can use to cancel the operation. Data type: CancellationToken |
The following code example demonstrates how to download to an existing stream by performing the following actions:
1. Opens a file located at /path/to/output_file as a stream in binary write mode
2. Retrieves the _id value of the GridFS file named "new_file"
3. Calls the DownloadToStream() method and passes the _id value to download the contents of "new_file" to a stream
The synchronous version of the code appears first, followed by the asynchronous version.

Synchronous:

    var filter = Builders<GridFSFileInfo>.Filter.Eq(x => x.Filename, "new_file");
    var doc = bucket.Find(filter).FirstOrDefault();

    if (doc != null)
    {
        using (var outputFile = new FileStream("/path/to/output_file", FileMode.Create, FileAccess.Write))
        {
            bucket.DownloadToStream(doc.Id, outputFile);
        }
    }

Asynchronous:

    var filter = Builders<GridFSFileInfo>.Filter.Eq(x => x.Filename, "new_file");
    var cursor = await bucket.FindAsync(filter);
    var fileInfoList = await cursor.ToListAsync();
    var doc = fileInfoList.FirstOrDefault();

    if (doc != null)
    {
        using (var outputFile = new FileStream("/path/to/output_file", FileMode.Create, FileAccess.Write))
        {
            await bucket.DownloadToStreamAsync(doc.Id, outputFile);
        }
    }

Find Files
To find files in a GridFS bucket, call the Find() or FindAsync() method on your GridFSBucket instance. These methods accept the following parameters:
Parameter | Description |
---|---|
filter | A query filter that specifies the entries to match in the files collection. Data type: FilterDefinition<GridFSFileInfo> |
options | Optional. An instance of the GridFSFindOptions class that configures the find operation. Data type: GridFSFindOptions |
cancellationToken | Optional. A token that you can use to cancel the operation. Data type: CancellationToken |
The following code example shows how to retrieve and print file metadata from files in a GridFS bucket. The Find() method returns an IAsyncCursor<GridFSFileInfo> instance from which you can access the results. The synchronous example uses a foreach loop to iterate through the returned cursor and display the metadata of the files uploaded in the Upload Files examples.
The synchronous version of the code appears first, followed by the asynchronous version.

Synchronous:

    var filter = Builders<GridFSFileInfo>.Filter.Empty;
    var files = bucket.Find(filter);

    foreach (var file in files.ToEnumerable())
    {
        Console.WriteLine(file.ToJson());
    }

Asynchronous:

    var filter = Builders<GridFSFileInfo>.Filter.Empty;
    var files = await bucket.FindAsync(filter);

    await files.ForEachAsync(file => Console.Out.WriteLineAsync(file.ToJson()));

Both versions print output that resembles the following:

    { "_id" : { "$oid" : "..." }, "length" : 13, "chunkSize" : 261120, "uploadDate" : { "$date" : ... }, "filename" : "new_file" }
    { "_id" : { "$oid" : "..." }, "length" : 50, "chunkSize" : 1048576, "uploadDate" : { "$date" : ... }, "filename" : "my_file" }

To customize the find operation, pass an instance of the GridFSFindOptions class to the Find() or FindAsync() method. The GridFSFindOptions class contains the following properties:
Property | Description |
---|---|
Sort | The sort order of the results. If you don't specify a sort order, the method returns the results in the order in which they were inserted. Data type: SortDefinition<GridFSFileInfo> |
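The following is a minimal sketch of a sorted find operation, assuming the db database and placeholder connection string used earlier on this page. Sorting by upload time in descending order is an illustrative choice.

```csharp
using System;
using MongoDB.Driver;
using MongoDB.Driver.GridFS;

// Placeholder connection string, as in the preceding examples
var client = new MongoClient("<connection string>");
var database = client.GetDatabase("db");
var bucket = new GridFSBucket(database);

var filter = Builders<GridFSFileInfo>.Filter.Empty;

var options = new GridFSFindOptions
{
    // Return the most recently uploaded files first (illustrative choice)
    Sort = Builders<GridFSFileInfo>.Sort.Descending(x => x.UploadDateTime)
};

var files = bucket.Find(filter, options);

foreach (var file in files.ToEnumerable())
{
    Console.WriteLine($"{file.Filename} uploaded at {file.UploadDateTime}");
}
```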
Delete Files
To delete files from a GridFS bucket, call the Delete() or DeleteAsync() method on your GridFSBucket instance. These methods remove a file's metadata document and its associated chunks from your bucket.
The Delete() and DeleteAsync() methods accept the following parameters:
Parameter | Description |
---|---|
id | The _id value of the file to delete. Data type: BsonValue |
cancellationToken | Optional. A token that you can use to cancel the operation. Data type: CancellationToken |
The following code example shows how to delete a file named "my_file" by passing its _id value to the Delete() method. The example performs the following steps:
1. Uses the Builders class to create a filter that matches the file named "my_file"
2. Uses the Find() method to find the file named "my_file"
3. Passes the _id value of the file to the Delete() method to delete the file

The synchronous version of the code appears first, followed by the asynchronous version.

Synchronous:

    var filter = Builders<GridFSFileInfo>.Filter.Eq(x => x.Filename, "my_file");
    var doc = bucket.Find(filter).FirstOrDefault();

    if (doc != null)
    {
        bucket.Delete(doc.Id);
    }

Asynchronous:

    var filter = Builders<GridFSFileInfo>.Filter.Eq(x => x.Filename, "my_file");
    var cursor = await bucket.FindAsync(filter);
    var fileInfoList = await cursor.ToListAsync();
    var doc = fileInfoList.FirstOrDefault();

    if (doc != null)
    {
        await bucket.DeleteAsync(doc.Id);
    }

Note
File Revisions
The Delete() and DeleteAsync() methods support deleting only one file at a time. If you want to delete each file revision, or files with different upload times that share the same file name, collect the _id values of each revision. Then, pass each _id value in separate calls to the Delete() or DeleteAsync() method.
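The approach in the preceding note can be sketched as follows, assuming the db database and placeholder connection string used earlier on this page and that every revision shares the file name "my_file":

```csharp
using System;
using MongoDB.Driver;
using MongoDB.Driver.GridFS;

// Placeholder connection string, as in the preceding examples
var client = new MongoClient("<connection string>");
var database = client.GetDatabase("db");
var bucket = new GridFSBucket(database);

// Match every revision stored under the same file name
var filter = Builders<GridFSFileInfo>.Filter.Eq(x => x.Filename, "my_file");

// Collect the matching file documents before deleting anything
var revisions = bucket.Find(filter).ToList();

foreach (var revision in revisions)
{
    // Each call removes one revision and its associated chunks
    bucket.Delete(revision.Id);
}

Console.WriteLine($"Deleted {revisions.Count} revision(s) of my_file");
```

Materializing the results with ToList() before the loop is a design choice in this sketch, so that the deletes do not run while the cursor is still being read.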
API Documentation
To learn more about the classes used on this page, see the following API documentation:
To learn more about the methods in the GridFSBucket class used on this page, see the following API documentation: