Encrypted Collections
Field level encryption comes with performance and storage costs. Every field you choose to encrypt:
Adds writes to insert and update operations.
Requires additional storage, because MongoDB maintains an index of encrypted fields to improve query performance.
This section summarizes the storage and write impact of encrypting collections, and explains how to compact encrypted collection indexes to minimize these costs. If you want to encrypt fields and configure them for querying, see Encrypted Fields and Enabled Queries.
Overview
Queryable Encryption introduces the ability to encrypt sensitive fields in your documents using randomized encryption, while still being able to query the encrypted fields.
With Queryable Encryption, a given plaintext value always encrypts to a different ciphertext, while still remaining queryable. To enable this functionality, Queryable Encryption uses three data structures:
Two metadata collections
A field called __safeContent__ in every document in the encrypted collection
Warning
Do not modify or delete these data structures. If you do, query results will be incorrect.
Metadata Collections
When you create an encrypted collection, MongoDB creates two metadata collections:
enxcol_.<collectionName>.esc, referred to as ESC
enxcol_.<collectionName>.ecoc, referred to as ECOC
Example
If you create a collection called "patients", MongoDB creates the following metadata collections:
enxcol_.patients.esc
enxcol_.patients.ecoc
When you insert documents with a queryable encrypted field, MongoDB updates the metadata collections to maintain an index that enables querying. The field becomes an "indexed field". This comes at a cost in storage and write speed for every such field.
Important
When you drop an encrypted collection, drop the associated metadata collections immediately afterwards:
enxcol_.<collectionName>.esc
enxcol_.<collectionName>.ecoc
Otherwise, re-creating the collection with the same name puts the metadata collections in a conflicted state that consumes excess storage space and degrades CRUD performance.
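As a minimal sketch of the naming rule above, the helper below derives both metadata collection names from an encrypted collection name. The function name metadataCollectionNames is hypothetical and is not part of mongosh or any driver. The commented lines show how the names might be used in mongosh to drop the metadata collections after dropping the encrypted collection, assuming a connected db object.

```javascript
// Hypothetical helper (not a mongosh API): builds the ESC and ECOC
// collection names for a given encrypted collection name.
function metadataCollectionNames(collectionName) {
  return {
    esc: `enxcol_.${collectionName}.esc`,
    ecoc: `enxcol_.${collectionName}.ecoc`,
  };
}

const names = metadataCollectionNames("patients");
console.log(names.esc);  // enxcol_.patients.esc
console.log(names.ecoc); // enxcol_.patients.ecoc

// Assumed usage in mongosh against a live deployment:
// db.getCollection("patients").drop()
// db.getCollection(names.esc).drop()
// db.getCollection(names.ecoc).drop()
```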
Equality and Range Query Impact
Equality queries carry fixed additional costs for storage and write operations. Range query costs depend on the queryable field's parameters. Tightly bounding these queries greatly decreases their performance impact.
Write Costs for Equality Queryable Fields
Insert Operations
When inserting a document, each indexed field requires two additional writes to metadata collections.
One write to ESC
One write to ECOC
Example
Inserting a document with two indexed fields requires:
One write to the encrypted collection.
Four writes to the metadata collections.
Update Operations
When updating a document, each indexed field requires two additional writes to metadata collections.
One write to ESC
One write to ECOC
Example
Updating a document with two indexed fields requires:
One write to the encrypted collection.
Four writes to the metadata collections.
Delete Operations
When deleting a document, indexed fields do not require any additional writes.
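The write counts described in this section can be tallied with a small calculation. The sketch below is illustrative only: estimateWrites is a hypothetical helper, and it assumes every indexed field in the document is equality queryable and is written by the operation.

```javascript
// Hypothetical helper: estimates the writes for one operation on a document
// with `indexedFields` equality-queryable encrypted fields.
function estimateWrites(operation, indexedFields) {
  if (!["insert", "update", "delete"].includes(operation)) {
    throw new Error(`unknown operation: ${operation}`);
  }
  // Inserts and updates add one ESC write and one ECOC write per indexed
  // field. Deletes add no metadata writes.
  const perField = operation === "delete" ? 0 : 1;
  const esc = perField * indexedFields;
  const ecoc = perField * indexedFields;
  return { encryptedCollection: 1, esc, ecoc, metadataTotal: esc + ecoc };
}

console.log(estimateWrites("insert", 2).metadataTotal); // 4
console.log(estimateWrites("update", 2).metadataTotal); // 4
console.log(estimateWrites("delete", 2).metadataTotal); // 0
```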
Storage Costs for Equality Queryable Fields
Before any metadata compaction, the ESC and ECOC contain one metadata document for every field/value pair of every indexed field. Indexing one encrypted field/value pair across 1000 documents requires 1000 documents in the ESC and 1000 documents in the ECOC.
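To make the storage arithmetic concrete, the sketch below counts the metadata documents that accumulate before any compaction. The helper name metadataDocsBeforeCompaction is hypothetical, and the calculation assumes every document carries a value for each indexed field.

```javascript
// Hypothetical helper: counts metadata documents created before compaction,
// assuming one ESC and one ECOC document per indexed field/value pair written.
function metadataDocsBeforeCompaction(indexedFieldsPerDocument, documents) {
  const pairs = indexedFieldsPerDocument * documents;
  return { esc: pairs, ecoc: pairs };
}

// One indexed field across 1000 documents, as in the text above.
const counts = metadataDocsBeforeCompaction(1, 1000);
console.log(counts.esc);  // 1000
console.log(counts.ecoc); // 1000
```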
Note
A Queryable Encryption collection that encrypts every field can take up to 2-3 times the storage space, to account for metadata collections. For example, a 1 GB collection may have a storage requirement of 2-3 GB.
Best Practices
Don't encrypt fields that don't need it. Most data only requires a small subset of fields to be encrypted, such as those that contain Personally Identifiable Information.
Don't enable range queries on a field if equality queries are sufficient for your users.
For encrypted fields with range queries enabled, review the field configuration, especially the min, max, and precision parameters. Setting tight bounds for range queries greatly decreases the performance impact of those fields.
Model your data and prototype it to determine the actual storage and write increases for your deployment.
Metadata Collection Compaction
As you insert or update documents, the metadata collections change and grow.
Metadata collection compaction empties the ECOC and reduces the size of the ESC.
Scheduling Metadata Compaction
Important
In the worst case, running metadata compaction reveals the number of unique field/value pairs inserted across all documents since the previous compaction.
For details on the exact information revealed by metadata compaction, see "Section 6: Theoretical Analysis" and "Section 9: Guidelines" in MongoDB's Queryable Encryption Technical Paper.
As a best practice, keep track of the following information:
encfields: the number of encrypted fields per document.
docinserts: the number of documents inserted since the last compaction.
valinserts: the number of unique field/value pairs inserted since the last compaction.
To reduce the size of the ESC metadata collection by at least t documents, run metadata compaction when the following formula is satisfied:
(encfields · docinserts) - valinserts ≥ t
The number of encrypted fields per document multiplied by the number of insertions since last compaction, minus the number of unique field/value pairs inserted across all documents since the last compaction, should be greater than or equal to the number of documents to remove from the ESC.
Example
In a collection with six encrypted fields, you can reduce the size of the ESC by at least 1000 documents when the total documents inserted and the unique field/value pairs inserted since the last compaction meet the following condition:
(6 · docinserts) - valinserts ≥ 1000
The formula is satisfied, for example, if 200 documents were inserted since the last compaction with a total of 200 unique field/value pairs across all documents (6 · 200 - 200 = 1000), or if 400 documents were inserted since the last compaction with 700 unique field/value pairs (6 · 400 - 700 = 1700).
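The scheduling formula can be checked programmatically once you track the three counters. The sketch below is one possible way to do so. The function names escReductionLowerBound and shouldCompact are hypothetical, and how you collect encfields, docinserts, and valinserts is left to your application.

```javascript
// Hypothetical helper: lower bound on the number of ESC documents that
// compaction removes, computed as (encfields * docinserts) - valinserts.
function escReductionLowerBound(encfields, docinserts, valinserts) {
  return encfields * docinserts - valinserts;
}

// Returns true when compaction would shrink the ESC by at least t documents.
function shouldCompact(encfields, docinserts, valinserts, t) {
  return escReductionLowerBound(encfields, docinserts, valinserts) >= t;
}

console.log(shouldCompact(6, 200, 200, 1000)); // true  (1200 - 200 = 1000)
console.log(shouldCompact(6, 400, 700, 1000)); // true  (2400 - 700 = 1700)
console.log(shouldCompact(6, 100, 200, 1000)); // false (600 - 200 = 400)
```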
Running Metadata Compaction
You must run metadata compaction manually. Use mongosh and run the db.collection.compactStructuredEncryptionData() command:
Example
const eDB = "encryption"
const eKV = "__keyVault"
const secretDB = "records"
const secretCollection = "patients"
const localKey = fs.readFileSync("master-key.txt")
const localKeyProvider = { key: localKey }
const queryableEncryptionOpts = {
  kmsProviders: { local: localKeyProvider },
  keyVaultNamespace: `${eDB}.${eKV}`,
}
const encryptedClient = Mongo("localhost:27017", queryableEncryptionOpts)
const encryptedDB = encryptedClient.getDB(secretDB)
const encryptedCollection = encryptedDB.getCollection(secretCollection)
encryptedCollection.compactStructuredEncryptionData()
{ "stats": { ... }, "ok": 1, ... }
You can check the size of a metadata collection using mongosh and running the db.collection.totalSize() command.
Example
In this example, the encrypted collection is named "patients".
db.enxcol_.patients.esc.totalSize()
1407960328
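To watch both metadata collections at once, for instance before and after compaction, the size check can be wrapped in a small function. This is a sketch under assumptions: metadataSizes is a hypothetical helper, and db is assumed to be a mongosh database object that exposes getCollection() and whose collections expose totalSize().

```javascript
// Hypothetical helper: returns the totalSize() of both metadata collections
// for an encrypted collection, plus their sum, in bytes.
// `db` is assumed to be a mongosh database object.
function metadataSizes(db, collectionName) {
  const esc = db.getCollection(`enxcol_.${collectionName}.esc`).totalSize();
  const ecoc = db.getCollection(`enxcol_.${collectionName}.ecoc`).totalSize();
  return { esc, ecoc, total: esc + ecoc };
}

// Assumed usage in mongosh against a live deployment:
// metadataSizes(db, "patients")
```

Comparing the result before and after running compaction gives a direct measure of how much space the compaction reclaimed.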