Store Large Files

On this page

  • Overview
  • How GridFS Works
  • Create a GridFS Bucket
  • Customize the Bucket
  • Upload Files
  • Write to an Upload Stream
  • Upload an Existing Stream
  • Download Files
  • Read From a Download Stream
  • Download to an Existing Stream
  • Find Files
  • Delete Files
  • API Documentation

In this guide, you can learn how to store and retrieve large files in MongoDB by using GridFS. The GridFS storage system splits files into chunks when storing them and reassembles those files when retrieving them. The driver's implementation of GridFS is an abstraction that manages the operations and organization of the file storage.

Use GridFS if the size of any of your files exceeds the BSON document size limit of 16 MB. For more detailed information about whether GridFS is suitable for your use case, see GridFS in the MongoDB Server manual.

GridFS organizes files in a bucket, a group of MongoDB collections that contain the chunks of files and information describing them. The bucket contains the following collections:

  • chunks: Stores the binary file chunks

  • files: Stores the file metadata

The driver creates the GridFS bucket, if it doesn't already exist, when you first write data to it. The bucket contains the chunks and files collections prefixed with the default bucket name fs, unless you specify a different name. To ensure efficient retrieval of the files and related metadata, the driver creates an index on each collection. The driver ensures that these indexes exist before performing read and write operations on the GridFS bucket.

For more information about GridFS indexes, see GridFS Indexes in the MongoDB Server manual.

When using GridFS to store files, the driver splits the files into smaller chunks, each represented by a separate document in the chunks collection. It also creates a document in the files collection that contains a file ID, file name, and other file metadata.

The following diagram shows how GridFS splits files when they are uploaded to a bucket:

A diagram that shows how GridFS uploads a file to a bucket

When retrieving files, GridFS fetches the metadata from the files collection in the specified bucket and uses the information to reconstruct the file from documents in the chunks collection.
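The splitting described above can be illustrated with a little arithmetic. The following is a minimal sketch, not driver code: the ChunkMath class and its methods are hypothetical names introduced here, and the sketch assumes that the 255 KB default chunk size equals 255 * 1024 bytes and that an empty file needs no chunks.

```csharp
using System;

// Hypothetical helper, not part of the driver: estimates how a file of a
// given length is divided into fixed-size chunks.
public static class ChunkMath
{
    // Assumption: the 255 KB default chunk size means 255 * 1024 bytes.
    public const int DefaultChunkSizeBytes = 255 * 1024;

    // Number of chunk documents needed to hold fileLength bytes.
    public static long ChunkCount(long fileLength, int chunkSizeBytes = DefaultChunkSizeBytes)
    {
        if (fileLength < 0) throw new ArgumentOutOfRangeException(nameof(fileLength));
        if (chunkSizeBytes <= 0) throw new ArgumentOutOfRangeException(nameof(chunkSizeBytes));
        // Ceiling division without floating point.
        return (fileLength + chunkSizeBytes - 1) / chunkSizeBytes;
    }

    // Size in bytes of the final chunk, or 0 for an empty file.
    public static int LastChunkSize(long fileLength, int chunkSizeBytes = DefaultChunkSizeBytes)
    {
        if (ChunkCount(fileLength, chunkSizeBytes) == 0) return 0;
        long remainder = fileLength % chunkSizeBytes;
        return remainder == 0 ? chunkSizeBytes : (int)remainder;
    }
}
```

Under these assumptions, a 1,000,000-byte file occupies four chunk documents: three full chunks of 261,120 bytes and a final chunk of 216,640 bytes.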

To begin using GridFS to store or retrieve files, create a new instance of the GridFSBucket class, passing in an IMongoDatabase object that represents your database. The resulting object references the existing bucket or, if the bucket does not exist yet, a bucket that the driver creates when you first write data to it.

The following example creates a new instance of the GridFSBucket class for the db database:

using MongoDB.Driver;
using MongoDB.Driver.GridFS;

var client = new MongoClient("<connection string>");
var database = client.GetDatabase("db");
// Creates a GridFS bucket or references an existing one
var bucket = new GridFSBucket(database);

You can customize the GridFS bucket configuration by passing an instance of the GridFSBucketOptions class to the GridFSBucket() constructor. The following table describes the properties in the GridFSBucketOptions class:

Field
Description

BucketName

The bucket name to use as a prefix for the files and chunks collections. The default value is "fs".

Data type: string

ChunkSizeBytes

The chunk size that GridFS splits files into. The default value is 255 KB.

Data type: int

ReadConcern

The read concern to use for bucket operations. The default value is the database's read concern.

Data type: ReadConcern

ReadPreference

The read preference to use for bucket operations. The default value is the database's read preference.

Data type: ReadPreference

WriteConcern

The write concern to use for bucket operations. The default value is the database's write concern.

Data type: WriteConcern

The following example creates a bucket named "myCustomBucket" by passing an instance of the GridFSBucketOptions class to the GridFSBucket() constructor:

var options = new GridFSBucketOptions { BucketName = "myCustomBucket" };
var customBucket = new GridFSBucket(database, options);
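Several of the properties in the preceding table can be combined in one GridFSBucketOptions instance. The following sketch is illustrative only: the chosen values (a 1 MB chunk size, majority read and write concerns) are assumptions made for the example, not recommendations, and the connection string is a placeholder.

```csharp
using System;
using MongoDB.Driver;
using MongoDB.Driver.GridFS;

var client = new MongoClient("<connection string>");
var database = client.GetDatabase("db");

// Illustrative values only; adjust them for your deployment.
var options = new GridFSBucketOptions
{
    BucketName = "myCustomBucket",
    ChunkSizeBytes = 1048576, // 1 MB
    ReadConcern = ReadConcern.Majority,
    WriteConcern = WriteConcern.WMajority
};
var customBucket = new GridFSBucket(database, options);

// The bucket name is the prefix of the two backing collections.
Console.WriteLine($"{options.BucketName}.files");
Console.WriteLine($"{options.BucketName}.chunks");
```

With this configuration, the file metadata is stored in myCustomBucket.files and the chunks in myCustomBucket.chunks.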

You can upload files to a GridFS bucket by using the following methods:

  • OpenUploadStream() or OpenUploadStreamAsync(): Opens a new upload stream to which you can write file contents

  • UploadFromStream() or UploadFromStreamAsync(): Uploads the contents of an existing stream to a GridFS file

The following sections describe how to use these methods.

Use the OpenUploadStream() or OpenUploadStreamAsync() method to create an upload stream for a given file name. These methods accept the following parameters:

Parameter
Description

filename

The name of the file to upload.

Data type: string

options

Optional. An instance of the GridFSUploadOptions class that specifies the configuration for the upload stream. The default value is null.

Data type: GridFSUploadOptions

cancellationToken

Optional. A token that you can use to cancel the operation.

Data type: CancellationToken

This code example demonstrates how to open an upload stream by performing the following steps:

  • Calls the OpenUploadStream() method to open a writable GridFS stream for a file named "my_file"

  • Calls the Write() method to write data to my_file

  • Calls the Close() method to close the stream that points to my_file

The following code shows the synchronous version, followed by the asynchronous version:

Synchronous:

using (var uploader = bucket.OpenUploadStream("my_file"))
{
    // ASCII for "HelloWorld"
    byte[] bytes = { 72, 101, 108, 108, 111, 87, 111, 114, 108, 100 };
    uploader.Write(bytes, 0, bytes.Length);
    uploader.Close();
}

Asynchronous:

using (var uploader = await bucket.OpenUploadStreamAsync("my_file"))
{
    // ASCII for "HelloWorld"
    byte[] bytes = { 72, 101, 108, 108, 111, 87, 111, 114, 108, 100 };
    await uploader.WriteAsync(bytes, 0, bytes.Length);
    await uploader.CloseAsync();
}

To customize the upload stream configuration, pass an instance of the GridFSUploadOptions class to the OpenUploadStream() or OpenUploadStreamAsync() method. The GridFSUploadOptions class contains the following properties:

Property
Description

BatchSize

The number of chunks to upload in each batch. The default value is 16 MB divided by the value of the ChunkSizeBytes property.

Data type: int?

ChunkSizeBytes

The size of each chunk, in bytes. The last chunk of a file can be smaller. The default value is 255 KB.

Data type: int?

Metadata

Metadata to store with the file, including the following elements:

  • The _id of the file

  • The name of the file

  • The length and size of the file

  • The upload date and time

  • A metadata document in which you can store other information

The default value is null.

Data type: BsonDocument

The following example performs the same steps as the preceding example, but also uses the ChunkSizeBytes option to specify the size of each chunk. The synchronous version appears first, followed by the asynchronous version.

Synchronous:

var options = new GridFSUploadOptions
{
    ChunkSizeBytes = 1048576 // 1 MB
};
using (var uploader = bucket.OpenUploadStream("my_file", options))
{
    // ASCII for "HelloWorld"
    byte[] bytes = { 72, 101, 108, 108, 111, 87, 111, 114, 108, 100 };
    uploader.Write(bytes, 0, bytes.Length);
    uploader.Close();
}

Asynchronous:

var options = new GridFSUploadOptions
{
    ChunkSizeBytes = 1048576 // 1 MB
};
using (var uploader = await bucket.OpenUploadStreamAsync("my_file", options))
{
    // ASCII for "HelloWorld"
    byte[] bytes = { 72, 101, 108, 108, 111, 87, 111, 114, 108, 100 };
    await uploader.WriteAsync(bytes, 0, bytes.Length);
    await uploader.CloseAsync();
}
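The BatchSize default described in the GridFSUploadOptions properties depends on the chunk size, so changing ChunkSizeBytes also changes how many chunks are uploaded per batch. The following sketch works through that arithmetic. The UploadBatching class is a hypothetical name, and the sketch assumes that 16 MB means 16 * 1024 * 1024 bytes and that the division rounds down.

```csharp
using System;

// Hypothetical helper, not part of the driver: evaluates the default
// batch size rule stated in this guide (16 MB divided by ChunkSizeBytes).
public static class UploadBatching
{
    // Assumption: 16 MB is interpreted as 16 * 1024 * 1024 bytes.
    public const int SixteenMegabytes = 16 * 1024 * 1024;

    public static int DefaultBatchSize(int chunkSizeBytes)
    {
        if (chunkSizeBytes <= 0) throw new ArgumentOutOfRangeException(nameof(chunkSizeBytes));
        // Assumption: integer division, so any remainder is discarded.
        return SixteenMegabytes / chunkSizeBytes;
    }
}
```

Under these assumptions, the default 255 KB chunk size gives 64 chunks per batch, and the 1 MB chunk size used in the preceding example gives 16.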

Use the UploadFromStream() or UploadFromStreamAsync() method to upload the contents of a stream to a new GridFS file. These methods accept the following parameters:

Parameter
Description

filename

The name of the file to upload.

Data type: string

source

The stream from which to read the file contents.

Data type: Stream

options

Optional. An instance of the GridFSUploadOptions class that specifies the configuration for the upload stream. The default value is null.

Data type: GridFSUploadOptions

cancellationToken

Optional. A token that you can use to cancel the operation.

Data type: CancellationToken

This code example demonstrates how to upload the contents of an existing stream by performing the following steps:

  • Opens a file located at /path/to/input_file as a stream in binary read mode

  • Calls the UploadFromStream() method to write the contents of the stream to a GridFS file named "new_file"

The synchronous version appears first, followed by the asynchronous version.

Synchronous:

using (var fileStream = new FileStream("/path/to/input_file", FileMode.Open, FileAccess.Read))
{
    bucket.UploadFromStream("new_file", fileStream);
}

Asynchronous:

using (var fileStream = new FileStream("/path/to/input_file", FileMode.Open, FileAccess.Read))
{
    await bucket.UploadFromStreamAsync("new_file", fileStream);
}
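The source stream does not have to be a file. As a minimal sketch, the following code uploads from an in-memory stream, attaches a Metadata document through GridFSUploadOptions, and keeps the _id that UploadFromStream() returns. The file name "in_memory_file" and the contentType metadata field are assumptions chosen for illustration.

```csharp
using System;
using System.IO;
using System.Text;
using MongoDB.Bson;
using MongoDB.Driver;
using MongoDB.Driver.GridFS;

var client = new MongoClient("<connection string>");
var bucket = new GridFSBucket(client.GetDatabase("db"));

// Hypothetical metadata field; store whatever describes your file.
var options = new GridFSUploadOptions
{
    Metadata = new BsonDocument { { "contentType", "text/plain" } }
};

using (var source = new MemoryStream(Encoding.ASCII.GetBytes("HelloWorld")))
{
    // UploadFromStream() returns the _id of the newly stored file.
    ObjectId id = bucket.UploadFromStream("in_memory_file", source, options);
    Console.WriteLine($"Stored file with _id {id}");
}
```

Keeping the returned _id avoids a later lookup by file name when you download or delete the file.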

You can download files from a GridFS bucket by using the following methods:

  • OpenDownloadStream() or OpenDownloadStreamAsync(): Opens a new download stream from which you can read file contents

  • DownloadToStream() or DownloadToStreamAsync(): Writes the contents of a GridFS file to an existing stream

The following sections describe these methods in more detail.

Use the OpenDownloadStream() or OpenDownloadStreamAsync() method to create a download stream. These methods accept the following parameters:

Parameter
Description

id

The _id value of the file to download.

Data type: BsonValue

options

Optional. An instance of the GridFSDownloadOptions class that specifies the configuration for the download stream. The default value is null.

Data type: GridFSDownloadOptions

cancellationToken

Optional. A token that you can use to cancel the operation.

Data type: CancellationToken

The following code example demonstrates how to open a download stream by performing the following steps:

  • Retrieves the _id value of the GridFS file named "new_file"

  • Calls the OpenDownloadStream() method and passes the _id value to open the file as a readable GridFS stream

  • Creates a byte array to use as a buffer for the file contents

  • Calls the Read() method to read the file contents from the downloader stream into the buffer

The synchronous version appears first, followed by the asynchronous version.

Synchronous:

var filter = Builders<GridFSFileInfo>.Filter.Eq(x => x.Filename, "new_file");
var doc = bucket.Find(filter).FirstOrDefault();
if (doc != null)
{
    using (var downloader = bucket.OpenDownloadStream(doc.Id))
    {
        var buffer = new byte[downloader.Length];
        downloader.Read(buffer, 0, buffer.Length);
        // Process the buffer as needed
    }
}

Asynchronous:

var filter = Builders<GridFSFileInfo>.Filter.Eq(x => x.Filename, "new_file");
var cursor = await bucket.FindAsync(filter);
var fileInfoList = await cursor.ToListAsync();
var doc = fileInfoList.FirstOrDefault();
if (doc != null)
{
    using (var downloader = await bucket.OpenDownloadStreamAsync(doc.Id))
    {
        var buffer = new byte[downloader.Length];
        await downloader.ReadAsync(buffer, 0, buffer.Length);
        // Process the buffer as needed
    }
}

To customize the download stream configuration, pass an instance of the GridFSDownloadOptions class to the OpenDownloadStream() or OpenDownloadStreamAsync() method. The GridFSDownloadOptions class contains the following property:

Property
Description

Seekable

Indicates whether the stream supports seeking, the ability to query and change the current position in a stream. The default value is false.

Data type: bool?

The following example performs the same steps as the preceding example, but also sets the Seekable option to true to specify that the stream is seekable.

The synchronous version appears first, followed by the asynchronous version.

Synchronous:

var filter = Builders<GridFSFileInfo>.Filter.Eq(x => x.Filename, "new_file");
var doc = bucket.Find(filter).FirstOrDefault();
if (doc != null)
{
    var options = new GridFSDownloadOptions
    {
        Seekable = true
    };
    using (var downloader = bucket.OpenDownloadStream(doc.Id, options))
    {
        var buffer = new byte[downloader.Length];
        downloader.Read(buffer, 0, buffer.Length);
        // Process the buffer as needed
    }
}

Asynchronous:

var filter = Builders<GridFSFileInfo>.Filter.Eq(x => x.Filename, "new_file");
var cursor = await bucket.FindAsync(filter);
var fileInfoList = await cursor.ToListAsync();
var doc = fileInfoList.FirstOrDefault();
if (doc != null)
{
    var options = new GridFSDownloadOptions
    {
        Seekable = true
    };
    using (var downloader = await bucket.OpenDownloadStreamAsync(doc.Id, options))
    {
        var buffer = new byte[downloader.Length];
        await downloader.ReadAsync(buffer, 0, buffer.Length);
        // Process the buffer as needed
    }
}
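A seekable stream lets you read part of a file without reading everything before it. The following sketch shows the idea against the general System.IO.Stream API, so it runs without a database. The StreamReads class and its methods are hypothetical helpers, not driver members. The read loop is included because the general Stream.Read() contract allows a call to return fewer bytes than requested; this guide does not state whether a GridFS download stream ever does so, so treat the loop as a defensive assumption.

```csharp
using System;
using System.IO;

// Hypothetical helpers that work with any System.IO.Stream.
public static class StreamReads
{
    // Reads until the buffer is full or the stream ends.
    // Returns the number of bytes actually read.
    public static int ReadFully(Stream source, byte[] buffer)
    {
        int total = 0;
        while (total < buffer.Length)
        {
            int read = source.Read(buffer, total, buffer.Length - total);
            if (read == 0) break; // end of stream
            total += read;
        }
        return total;
    }

    // Moves to the given offset and reads up to count bytes from there.
    public static byte[] ReadRange(Stream source, long offset, int count)
    {
        if (!source.CanSeek)
            throw new NotSupportedException("The stream does not support seeking.");
        source.Seek(offset, SeekOrigin.Begin);
        var buffer = new byte[count];
        int read = ReadFully(source, buffer);
        Array.Resize(ref buffer, read); // trim if the stream ended early
        return buffer;
    }
}
```

For a stream that contains the ASCII bytes of "HelloWorld", StreamReads.ReadRange(stream, 5, 5) returns the bytes for "World". A download stream opened with Seekable set to true could be passed to the same helper, on the assumption that it then reports CanSeek as true.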

Use the DownloadToStream() or DownloadToStreamAsync() method to download the contents of a GridFS file to an existing stream. These methods accept the following parameters:

Parameter
Description

id

The _id value of the file to download.

Data type: BsonValue

destination

The stream to which the .NET/C# Driver writes the contents of the GridFS file. The value must be an instance of a class that derives from Stream.

Data type: Stream

options

Optional. An instance of the GridFSDownloadOptions class that specifies the configuration for the download stream. The default value is null.

Data type: GridFSDownloadOptions

cancellationToken

Optional. A token that you can use to cancel the operation.

Data type: CancellationToken

The following code example demonstrates how to download to an existing stream by performing the following actions:

  • Opens a file located at /path/to/output_file as a stream in binary write mode

  • Retrieves the _id value of the GridFS file named "new_file"

  • Calls the DownloadToStream() method and passes the _id value to download the contents of "new_file" to a stream

The synchronous version appears first, followed by the asynchronous version.

Synchronous:

var filter = Builders<GridFSFileInfo>.Filter.Eq(x => x.Filename, "new_file");
var doc = bucket.Find(filter).FirstOrDefault();
if (doc != null)
{
    using (var outputFile = new FileStream("/path/to/output_file", FileMode.Create, FileAccess.Write))
    {
        bucket.DownloadToStream(doc.Id, outputFile);
    }
}

Asynchronous:

var filter = Builders<GridFSFileInfo>.Filter.Eq(x => x.Filename, "new_file");
var cursor = await bucket.FindAsync(filter);
var fileInfoList = await cursor.ToListAsync();
var doc = fileInfoList.FirstOrDefault();
if (doc != null)
{
    using (var outputFile = new FileStream("/path/to/output_file", FileMode.Create, FileAccess.Write))
    {
        await bucket.DownloadToStreamAsync(doc.Id, outputFile);
    }
}
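Because the destination parameter accepts any Stream, the same call can write into memory instead of a file. The following is a minimal sketch under that assumption. It reuses the "new_file" name from the preceding example and is reasonable only for files small enough to hold in memory.

```csharp
using System;
using System.IO;
using MongoDB.Driver;
using MongoDB.Driver.GridFS;

var client = new MongoClient("<connection string>");
var bucket = new GridFSBucket(client.GetDatabase("db"));

var filter = Builders<GridFSFileInfo>.Filter.Eq(x => x.Filename, "new_file");
var doc = bucket.Find(filter).FirstOrDefault();
if (doc != null)
{
    using (var memory = new MemoryStream())
    {
        // Writes the whole GridFS file into the in-memory stream.
        bucket.DownloadToStream(doc.Id, memory);
        byte[] contents = memory.ToArray();
        Console.WriteLine($"Downloaded {contents.Length} bytes");
    }
}
```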

To find files in a GridFS bucket, call the Find() or FindAsync() method on your GridFSBucket instance. These methods accept the following parameters:

Parameter
Description

filter

A query filter that specifies the entries to match in the files collection.

Data type: FilterDefinition<GridFSFileInfo>. For more information, see the API documentation for the Find() method.

options

Optional. An instance of the GridFSFindOptions class that specifies the configuration for the find operation. The default value is null.

Data type: GridFSFindOptions

cancellationToken

Optional. A token that you can use to cancel the operation.

Data type: CancellationToken

The following code example shows how to retrieve and print file metadata from files in a GridFS bucket. The Find() method returns an IAsyncCursor<GridFSFileInfo> instance from which you can access the results. The example iterates through the returned cursor and prints the metadata of the files uploaded in the Upload Files examples.

The synchronous version appears first, followed by the asynchronous version.

Synchronous:

var filter = Builders<GridFSFileInfo>.Filter.Empty;
var files = bucket.Find(filter);
foreach (var file in files.ToEnumerable())
{
    Console.WriteLine(file.ToJson());
}

Asynchronous:

var filter = Builders<GridFSFileInfo>.Filter.Empty;
var files = await bucket.FindAsync(filter);
await files.ForEachAsync(file => Console.Out.WriteLineAsync(file.ToJson()));

Both versions print output similar to the following:

{ "_id" : { "$oid" : "..." }, "length" : 13, "chunkSize" : 261120, "uploadDate" :
{ "$date" : ... }, "filename" : "new_file" }
{ "_id" : { "$oid" : "..." }, "length" : 50, "chunkSize" : 1048576, "uploadDate" :
{ "$date" : ... }, "filename" : "my_file" }

To customize the find operation, pass an instance of the GridFSFindOptions class to the Find() or FindAsync() method. The GridFSFindOptions class contains the following properties:

Property
Description

Sort

The sort order of the results. If you don't specify a sort order, the method returns the results in the order in which they were inserted.

Data type: SortDefinition<GridFSFileInfo>. For more information, see the API documentation for the Sort property.
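As a sketch of how the Sort property might be used, the following code finds the most recently uploaded file named "my_file" by sorting on the upload date in descending order and taking the first result. The file name comes from the upload examples. Choosing the newest revision is an assumption made for illustration.

```csharp
using System;
using MongoDB.Driver;
using MongoDB.Driver.GridFS;

var client = new MongoClient("<connection string>");
var bucket = new GridFSBucket(client.GetDatabase("db"));

var filter = Builders<GridFSFileInfo>.Filter.Eq(x => x.Filename, "my_file");
var options = new GridFSFindOptions
{
    // Newest upload first
    Sort = Builders<GridFSFileInfo>.Sort.Descending(x => x.UploadDateTime)
};

var newest = bucket.Find(filter, options).FirstOrDefault();
if (newest != null)
{
    Console.WriteLine($"{newest.Filename} uploaded at {newest.UploadDateTime}");
}
```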

To delete files from a GridFS bucket, call the Delete() or DeleteAsync() method on your GridFSBucket instance. These methods remove a file's metadata document from the files collection and its associated chunks from the chunks collection.

The Delete() and DeleteAsync() methods accept the following parameters:

Parameter
Description

id

The _id of the file to delete.

Data type: BsonValue

cancellationToken

Optional. A token that you can use to cancel the operation.

Data type: CancellationToken

The following code example shows how to delete a file named "my_file" by passing its _id value to the Delete() method. The example performs the following steps:

  • Uses the Builders class to create a filter that matches the file named "my_file"

  • Uses the Find() method to find the file named "my_file"

  • Passes the _id value of the file to the Delete() method to delete the file

The synchronous version appears first, followed by the asynchronous version.

Synchronous:

var filter = Builders<GridFSFileInfo>.Filter.Eq(x => x.Filename, "my_file");
var doc = bucket.Find(filter).FirstOrDefault();
if (doc != null)
{
    bucket.Delete(doc.Id);
}

Asynchronous:

var filter = Builders<GridFSFileInfo>.Filter.Eq(x => x.Filename, "my_file");
var cursor = await bucket.FindAsync(filter);
var fileInfoList = await cursor.ToListAsync();
var doc = fileInfoList.FirstOrDefault();
if (doc != null)
{
    await bucket.DeleteAsync(doc.Id);
}

Note

File Revisions

The Delete() and DeleteAsync() methods support deleting only one file at a time. If you want to delete each file revision, or files with different upload times that share the same file name, collect the _id values of each revision. Then, pass each _id value in separate calls to the Delete() or DeleteAsync() method.
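The approach in the preceding note can be sketched as follows: find every file document that shares a file name, then delete each one by its _id. The sketch assumes the "my_file" name from the upload examples and collects the matches into a list before deleting.

```csharp
using System;
using MongoDB.Driver;
using MongoDB.Driver.GridFS;

var client = new MongoClient("<connection string>");
var bucket = new GridFSBucket(client.GetDatabase("db"));

// Collect every revision that shares the file name
var filter = Builders<GridFSFileInfo>.Filter.Eq(x => x.Filename, "my_file");
var revisions = bucket.Find(filter).ToList();

// Delete() removes one file per call, so call it once per revision
foreach (var revision in revisions)
{
    bucket.Delete(revision.Id);
}
Console.WriteLine($"Deleted {revisions.Count} revision(s)");
```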

To learn more about the classes and methods used on this page, see the API documentation for the GridFSBucket, GridFSBucketOptions, GridFSUploadOptions, GridFSDownloadOptions, GridFSFindOptions, and GridFSFileInfo classes.
