The Difference a (Field) Name Makes: Reduce Document Size and Increase Performance
It sometimes feels like I have the perfect job for a software geek who loves to travel.
I was recently at a customer event in Greece. During one of the breaks, I joined a group of developers who were having an animated debate on the terrace.
The big question was whether it's better to use camelCase or snake_case for field names in MongoDB documents. The first part of my response was that it's mainly a style decision, likely influenced by the conventions of the programming language you're using for your application. The second part was that, where performance is concerned, the decision does have a right answer.
This article will answer that question and demonstrate how other design decisions regarding data representation within a document impact the performance of your application.
So which is it: camelCase or snake_case?
How MongoDB stores documents in its cache
The MongoDB database has a built-in LRU (Least Recently Used) cache that holds an in-memory copy of recently accessed documents. The documents are stored in BSON format.
Let's take this document as an example:
{
  _id: 81873,
  color: "Red",
  size: "Small",
  shape: "Cylinder",
  props: {
    edge: 2,
    face: 3
  },
  coords: [ 2.2, 5.1 ]
}
This is how that document is stored in the cache:
Figure 1.
A table with a row for each of the fields in the document seen above.
For each field in the document, the BSON contains:
The type of the field (int32, string, document, etc.).
The name of the field.
The length of the value (if the field type doesn't have a fixed size).
The field's value.
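The layout above can be turned into a rough size estimate. The following is a minimal sketch, not MongoDB's serializer: bsonSize and valueSize are hypothetical helpers that cover only the value types in the example document, and they assume that whole numbers in the 32-bit range are stored as int32 and all other numbers as 8-byte doubles.

```javascript
// Rough BSON size estimator (hypothetical helper, not an official API).
// Assumptions: integers in the 32-bit range are stored as int32, other
// numbers as doubles; only strings, numbers, null, arrays, and nested
// documents are handled.
const utf8Length = (text) => new TextEncoder().encode(text).length;

function valueSize(value) {
  if (value === null) return 0; // null: the type byte says it all
  if (typeof value === "string") {
    return 4 + utf8Length(value) + 1; // int32 length + bytes + NUL
  }
  if (typeof value === "number") {
    const isInt32 =
      Number.isInteger(value) && value >= -(2 ** 31) && value < 2 ** 31;
    return isInt32 ? 4 : 8;
  }
  if (typeof value === "object") return bsonSize(value); // array or nested document
  throw new TypeError(`Unsupported value type: ${typeof value}`);
}

function bsonSize(doc) {
  let size = 4 + 1; // int32 total length + terminating NUL
  // For an array, Object.entries yields the indexes "0", "1", ... as names,
  // which is also how BSON names array elements.
  for (const [name, value] of Object.entries(doc)) {
    size += 1 + utf8Length(name) + 1 + valueSize(value); // type + name + NUL + value
  }
  return size;
}

const example = {
  _id: 81873,
  color: "Red",
  size: "Small",
  shape: "Cylinder",
  props: { edge: 2, face: 3 },
  coords: [2.2, 5.1],
};

console.log(bsonSize(example)); // 132
```

Under these assumptions, the field names and their terminators account for 34 of the 132 bytes, even in a document whose names are already short.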
Why should you care about document size?
The purpose of the database cache is to speed up queries. If the requested document is already in the cache, then the query isn't slowed down by having to fetch the data from disk.
The smaller the size of each document, the more documents can fit in the cache (memory isn't infinite).
The more documents that fit in the cache, the higher the probability that the document(s) your application requests are already in the cache.
Smaller documents also reduce the volume of data that needs to be sent over the network between the database and your application.
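To make the link between document size and cache hits concrete, here is a toy simulation. It is a sketch under loud assumptions: LruCache, hitRate, and makeRng are hypothetical names, the cache budget, document sizes, and uniform random access pattern are invented for illustration, and the code does not model how MongoDB's cache is actually implemented.

```javascript
// Toy LRU cache with a byte budget. Illustrative only.
class LruCache {
  constructor(maxBytes) {
    this.maxBytes = maxBytes;
    this.usedBytes = 0;
    this.entries = new Map(); // a Map iterates in insertion order: oldest first
  }

  // Returns true on a cache hit and marks the entry as most recently used.
  touch(id) {
    if (!this.entries.has(id)) return false;
    const size = this.entries.get(id);
    this.entries.delete(id);
    this.entries.set(id, size);
    return true;
  }

  // Simulates loading a document from disk, evicting the least recently
  // used entries until the new document fits.
  load(id, sizeBytes) {
    while (this.usedBytes + sizeBytes > this.maxBytes && this.entries.size > 0) {
      const [oldestId, oldestSize] = this.entries.entries().next().value;
      this.entries.delete(oldestId);
      this.usedBytes -= oldestSize;
    }
    this.entries.set(id, sizeBytes);
    this.usedBytes += sizeBytes;
  }
}

// Small seeded pseudo-random generator so that runs are repeatable.
function makeRng(seed) {
  let state = seed >>> 0;
  return () => {
    state = (Math.imul(state, 1664525) + 1013904223) >>> 0;
    return state / 2 ** 32;
  };
}

// Reads random documents and returns the fraction served from the cache.
function hitRate(docSizeBytes, { cacheBytes, docCount, reads, seed = 42 }) {
  const cache = new LruCache(cacheBytes);
  const random = makeRng(seed);
  let hits = 0;
  for (let i = 0; i < reads; i++) {
    const id = Math.floor(random() * docCount);
    if (cache.touch(id)) hits++;
    else cache.load(id, docSizeBytes);
  }
  return hits / reads;
}

// Made-up workload: a 1 MB cache, 20 documents, 2,000 reads.
const workload = { cacheBytes: 1_000_000, docCount: 20, reads: 2000 };
console.log(hitRate(70_000, workload)); // only 14 of the 20 documents fit at once
console.log(hitRate(35_000, workload)); // all 20 documents fit
```

With these made-up numbers, the smaller documents all fit in the cache, so only the first read of each one misses. The larger documents keep evicting one another, so a noticeable share of reads has to go back to "disk".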
Optimizing document size
I start with a baseline document and find its size. I then step through a number of optimizations, and for each one, I measure how the document size changes. In none of the steps do I reduce how much information is held in the document.
Baseline
Our initial document has this form:
{
  ...
  "top_level_name_1_middle_level_name_1_bottom_level_name_1": "Your data goes here",
  "top_level_name_1_middle_level_name_1_bottom_level_name_2": "",
  "top_level_name_1_middle_level_name_1_bottom_level_name_3": "Your data goes here",
  "top_level_name_1_middle_level_name_1_bottom_level_name_4": "",
  ...
  "top_level_name_2_middle_level_name_5_bottom_level_name_5": "Your data goes here",
  "top_level_name_2_middle_level_name_5_bottom_level_name_6": "",
  "top_level_name_2_middle_level_name_5_bottom_level_name_7": "Your data goes here",
  "top_level_name_2_middle_level_name_5_bottom_level_name_8": "",
  "top_level_name_2_middle_level_name_5_bottom_level_name_9": "Your data goes here",
  ...
  "top_level_name_10_middle_level_name_10_bottom_level_name_9": "Your data goes here",
  "top_level_name_10_middle_level_name_10_bottom_level_name_10": ""
}
The document contains 1,000 fields, all at the top level of the document. Half of the fields contain the string "Your data goes here", while the other half contain an empty string.
My collection contains a single document with this structure. I use MongoDB Compass to check the size of this document in the cache (the copy on disk, shown as Storage Size in the capture below, is smaller because it's compressed):
Figure 2.
MongoDB Compass showing that schema 1 contains 1 document with a document size of 72.82 KB.
Compass shows that the document size (in memory/cache) is 72.82 KB. This is our baseline measurement.
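The 72.82 KB figure can be cross-checked by hand. The sketch below rebuilds the baseline field names and adds up the bytes that BSON needs for them. It assumes that each field costs one type byte plus a null-terminated name, that each string value costs a 4-byte length plus a null terminator, and that the automatically added ObjectId _id costs 17 bytes. The variable names are illustrative only.

```javascript
const utf8Length = (text) => new TextEncoder().encode(text).length;

let nameBytes = 0;  // type byte + field name + NUL, summed over all fields
let valueBytes = 0; // int32 length + string bytes + NUL, summed over all fields

for (let outer = 1; outer <= 10; outer++) {
  for (let middle = 1; middle <= 10; middle++) {
    for (let inner = 1; inner <= 10; inner++) {
      const name = `top_level_name_${outer}_middle_level_name_${middle}_bottom_level_name_${inner}`;
      const value = (inner % 2 !== 0) ? "Your data goes here" : "";
      nameBytes += 1 + utf8Length(name) + 1;
      valueBytes += 4 + utf8Length(value) + 1;
    }
  }
}

const ID_BYTES = 1 + 4 + 12;     // type byte + "_id" with NUL + 12-byte ObjectId (assumed)
const DOCUMENT_OVERHEAD = 4 + 1; // int32 total length + terminating NUL
const totalBytes = DOCUMENT_OVERHEAD + ID_BYTES + nameBytes + valueBytes;

console.log({ nameBytes, valueBytes, totalBytes });
// { nameBytes: 58300, valueBytes: 14500, totalBytes: 72822 }
```

Under these assumptions, field names account for about 80% of the document, and 72,822 bytes agrees with the Compass figure if a KB is taken as 1,000 bytes.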
Adding hierarchy
For the first optimization, I added extra structure to the document. Rather than having 1,000 fields at the top level of the document, I have 10 fields that each contain 10 sub-fields. Those sub-fields in turn contain 10 fields, all of which are strings:
{
  ...
  "top_level_name_1": {
    "middle_level_name_1": {
      "bottom_level_name_1": "Your data goes here",
      "bottom_level_name_2": "",
      "bottom_level_name_3": "Your data goes here",
      "bottom_level_name_4": "",
      "bottom_level_name_5": "Your data goes here",
      "bottom_level_name_6": "",
      "bottom_level_name_7": "Your data goes here",
      "bottom_level_name_8": "",
      "bottom_level_name_9": "Your data goes here",
      "bottom_level_name_10": ""
    },
    ...
    "middle_level_name_10": {
      ...
      "bottom_level_name_9": "Your data goes here",
      "bottom_level_name_10": ""
    }
  },
  ...
  "top_level_name_10": {
    ...
  }
}
Note that we haven't lost any information from the field names. Instead of having a field named top_level_name_1_middle_level_name_1_bottom_level_name_1, we have one named top_level_name_1.middle_level_name_1.bottom_level_name_1 (note the "dot notation" used to indicate field levels within the document hierarchy).
A quick check with Compass shows that this more structured document has reduced the document size:
Figure 3.
MongoDB Compass showing that schema 2 contains 1 document with a document size of 38.46 KB.
The more organized document uses 38.46 KB of memory. That's almost a 50% reduction in the size of the original document. That means almost twice as many documents will fit in the database cache.
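The document has shrunk because we're storing shorter field names: each top-level and middle-level name is now stored once, rather than being repeated in the name of every field beneath it.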
No context or information has been lost, and we have documents that are easier for people to understand.
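A quick way to see where the saving comes from is to count the characters spent on field names in each layout. This is a minimal sketch with illustrative variable names; it counts only the name characters and ignores the values and BSON's per-field overhead.

```javascript
const indexes = Array.from({ length: 10 }, (_, i) => i + 1); // 1..10

let flatNameChars = 0;
let nestedNameChars = 0;

for (const outer of indexes) {
  // In the nested layout, each top-level name is stored once.
  nestedNameChars += `top_level_name_${outer}`.length;
  for (const middle of indexes) {
    // Each middle-level name is stored once per top-level field.
    nestedNameChars += `middle_level_name_${middle}`.length;
    for (const inner of indexes) {
      nestedNameChars += `bottom_level_name_${inner}`.length;
      // In the flat layout, every field repeats all three levels.
      flatNameChars += `top_level_name_${outer}_middle_level_name_${middle}_bottom_level_name_${inner}`.length;
    }
  }
}

console.log({ flatNameChars, nestedNameChars });
// { flatNameChars: 56300, nestedNameChars: 21171 }
```

The nested layout gives a little of that back: assuming each of the 110 sub-documents costs 7 extra bytes (a type byte, a name terminator, a 4-byte length, and a closing null byte), the net saving is 35,129 - 770 = 34,359 bytes, which lines up with the drop from 72.82 KB to 38.46 KB.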
Replace empty strings with null
50% of the lowest-level fields contain an empty string. What happens if we store null instead?
{
  ...
  "top_level_name_1": {
    "middle_level_name_1": {
      "bottom_level_name_1": "Your data goes here",
      "bottom_level_name_2": null,
      "bottom_level_name_3": "Your data goes here",
      "bottom_level_name_4": null,
      "bottom_level_name_5": "Your data goes here",
      "bottom_level_name_6": null,
      "bottom_level_name_7": "Your data goes here",
      "bottom_level_name_8": null,
      "bottom_level_name_9": "Your data goes here",
      "bottom_level_name_10": null
    },
    ...
    "middle_level_name_10": {
      ...
      "bottom_level_name_9": "Your data goes here",
      "bottom_level_name_10": null
    }
  },
  ...
  "top_level_name_10": {
    ...
  }
}
Compass shows us a further small reduction in document size:
Figure 4.
MongoDB Compass showing that schema 3 contains 1 document with a document size of 35.96 KB.
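The size has been reduced from 38.46 KB to 35.96 KB.
The saving comes because we no longer need to store the length or value of the empty strings. A BSON string stores a 4-byte length and a terminating null byte even when it's empty, whereas a null field stores no value at all.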
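The arithmetic behind that saving is short enough to check directly. This sketch assumes the BSON layout described earlier; stringValueBytes and the constants are illustrative names, not part of any API.

```javascript
const utf8Length = (text) => new TextEncoder().encode(text).length;

// Bytes used by the value part of a field (type byte and name excluded).
const stringValueBytes = (text) => 4 + utf8Length(text) + 1; // int32 length + bytes + NUL
const NULL_VALUE_BYTES = 0; // the type byte alone marks the field as null

const EMPTY_FIELD_COUNT = 500; // half of the 1,000 lowest-level fields
const savedBytes = EMPTY_FIELD_COUNT * (stringValueBytes("") - NULL_VALUE_BYTES);

console.log(stringValueBytes("")); // 5
console.log(savedBytes);           // 2500
```

Those 2,500 bytes match the 2.50 KB difference between the two Compass readings.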
Removing null fields
The polymorphic nature of MongoDB collections means that different documents in the same collection can contain different fields. Rather than storing fields that contain null values, we can remove those fields altogether (note that a query such as { field: null } matches documents where the field is missing as well as those where it's set to null, although operators such as $exists and $type can still tell the two apart):
{
  ...
  "top_level_name_1": {
    "middle_level_name_1": {
      "bottom_level_name_1": "Your data goes here",
      "bottom_level_name_3": "Your data goes here",
      "bottom_level_name_5": "Your data goes here",
      "bottom_level_name_7": "Your data goes here",
      "bottom_level_name_9": "Your data goes here"
    },
    ...
    "middle_level_name_10": {
      ...
      "bottom_level_name_9": "Your data goes here"
    }
  },
  ...
  "top_level_name_10": {
    ...
  }
}
This means that we no longer have to store those empty fields (including the field names).
Compass confirms a notable saving:
Figure 5.
MongoDB Compass showing that schema 4 contains 1 document with a document size of 25.36 KB.
The new document consumes 25.36 KB of memory compared to 35.96 KB for the document with null fields.
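This saving is larger than the previous one because a null field, although it has no value bytes, still pays for its name. A small sketch of the arithmetic, assuming one type byte and one name terminator per field (nullFieldBytes is an illustrative helper name):

```javascript
const utf8Length = (text) => new TextEncoder().encode(text).length;

// A null field still costs a type byte, its name, and the name's NUL terminator.
const nullFieldBytes = (name) => 1 + utf8Length(name) + 1;

let savedPerInnerDocument = 0;
for (let inner = 2; inner <= 10; inner += 2) { // the even-numbered fields held null
  savedPerInnerDocument += nullFieldBytes(`bottom_level_name_${inner}`);
}

const INNER_DOCUMENT_COUNT = 10 * 10; // 10 top-level fields x 10 middle-level fields

console.log(savedPerInnerDocument);                        // 106
console.log(savedPerInnerDocument * INNER_DOCUMENT_COUNT); // 10600
```

The 10,600 bytes agree with the 10.60 KB drop from 35.96 KB to 25.36 KB.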
camelCase vs. snake_case
Finally, we get to answer the question of which is more performant, camelCase or snake_case:
{
  ...
  "topLevelName1": {
    "middleLevelName1": {
      "bottomLevelName1": "Your data goes here",
      "bottomLevelName3": "Your data goes here",
      "bottomLevelName5": "Your data goes here",
      "bottomLevelName7": "Your data goes here",
      "bottomLevelName9": "Your data goes here"
    },
    ...
    "middleLevelName10": {
      ...
      "bottomLevelName9": "Your data goes here"
    }
  },
  ...
  "topLevelName10": {
    ...
  }
}
Compass shows that camelCase has it!
Figure 6.
MongoDB Compass showing that schema 5 contains 1 document with a document size of 23.53 KB.
The new document uses 23.53 KB, which is a 7% saving over the 25.36 KB when using snake_case. In total, the schema changes have reduced the document size by 67.7%.
As you've probably figured out by now, camelCase results in smaller documents because the field names are shorter: we no longer have to waste cache space storing the 1,830 _ characters that appear in this document's field names.
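The number of underscores is easy to count for the snake_case version of the document. The sketch below regenerates the field names that remain after the null fields were removed; countUnderscores is an illustrative helper.

```javascript
const countUnderscores = (name) => name.split("_").length - 1;

let underscores = 0;
for (let outer = 1; outer <= 10; outer++) {
  underscores += countUnderscores(`top_level_name_${outer}`);
  for (let middle = 1; middle <= 10; middle++) {
    underscores += countUnderscores(`middle_level_name_${middle}`);
    for (let inner = 1; inner <= 9; inner += 2) { // only the odd-numbered fields remain
      underscores += countUnderscores(`bottom_level_name_${inner}`);
    }
  }
}

console.log(underscores); // 1830
```

One byte per underscore gives 1,830 bytes, which matches the 1.83 KB difference between the 25.36 KB and 23.53 KB readings.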
Taking it too far
In the steps taken so far, no information has been lost. The field names are just as instructive as they were in our original document. If you structure your hierarchy around the natural structure of your data (e.g., have a field representing an address with sub-fields for street number, street name, city...), then the final document is easier for humans to understand than the original. The hierarchy also makes it faster to find documents using unindexed keys.
It's a win-win-win.
What about taking things to the extreme? If shorter field names mean we can fit more documents into cache, then should we opt for this?
{
  ...
  "a1": {
    "b1": {
      "c1": "Your data goes here",
      "c3": "Your data goes here",
      "c5": "Your data goes here",
      "c7": "Your data goes here",
      "c9": "Your data goes here"
    },
    ...
    "b10": {
      ...
      "c9": "Your data goes here"
    }
  },
  ...
  "a10": {
    ...
  }
}
Compass confirms that it delivers a significant saving:
Figure 7.
MongoDB Compass showing that schema 6 contains 1 document with a document size of 15.02 KB.
At 15.02 KB, this is a 36% saving over the already-optimized document.
However, this has come at a cost. The shorter field names have made it much more difficult for people to read the document and understand its contents.
Shortening field names can produce smaller documents, which can speed up your application. However, you need to be able to maintain your application, so it doesn't make sense to go too far. Pick names that are concise but still convey the meaning of the field's value.
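To see all of the steps side by side, the percentages can be recomputed from the sizes that Compass reported. This is a small sketch: the step labels are my own shorthand, and percentSaved is an illustrative helper.

```javascript
// Document sizes in KB, as reported by Compass in Figures 2 to 7.
const sizes = [
  ["baseline (flat snake_case)", 72.82],
  ["nested", 38.46],
  ["empty strings -> null", 35.96],
  ["null fields removed", 25.36],
  ["camelCase", 23.53],
  ["single-letter prefixes", 15.02],
];

// Percentage reduction from `before` to `after`, rounded to one decimal place.
const percentSaved = (before, after) =>
  (((before - after) / before) * 100).toFixed(1);

const summary = sizes.map(([step, kb], i) => ({
  step,
  kb,
  vsPrevious: i === 0 ? "0.0" : percentSaved(sizes[i - 1][1], kb),
  vsBaseline: percentSaved(sizes[0][1], kb),
}));

console.table(summary);
```

The table shows each step's saving over the previous step and over the baseline. The final, unreadable version is 79.4% smaller than the baseline, compared with 67.7% for the readable camelCase version.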
Summary
If you'd like to reproduce the results and perform your own experiments, then you can use this MongoDB for VS Code playground:
// This is a playground for MongoDB for VS Code Extension.
// It creates 6 collections that each contain 1 document. All 6 collections
// contain the same information but with different field names. This is to
// demonstrate the different ways that MongoDB can store the same data and what
// impact your design decisions have on document size (and therefore how many
// documents can fit in the database cache → performance).
use('FieldNames');
const schema1 = db.getCollection('schema1');
const schema2 = db.getCollection('schema2');
const schema3 = db.getCollection('schema3');
const schema4 = db.getCollection('schema4');
const schema5 = db.getCollection('schema5');
const schema6 = db.getCollection('schema6');
schema1.drop();
schema2.drop();
schema3.drop();
schema4.drop();
schema5.drop();
schema6.drop();
// Schema 1: 1,000 top-level fields with long snake_case names.
let doc = {};
for (let outer = 1; outer <= 10; outer++) {
  for (let middle = 1; middle <= 10; middle++) {
    for (let inner = 1; inner <= 10; inner++) {
      doc[`top_level_name_${outer}_middle_level_name_${middle}_bottom_level_name_${inner}`] = (inner % 2 !== 0) ? "Your data goes here" : "";
    }
  }
}
schema1.insertOne(doc);
// Schema 2: the same fields arranged in a three-level hierarchy.
doc = {};
for (let outer = 1; outer <= 10; outer++) {
  let middleLevel = {};
  for (let middle = 1; middle <= 10; middle++) {
    let innerLevel = {};
    for (let inner = 1; inner <= 10; inner++) {
      innerLevel[`bottom_level_name_${inner}`] = (inner % 2 !== 0) ? "Your data goes here" : "";
    }
    middleLevel[`middle_level_name_${middle}`] = innerLevel;
  }
  doc[`top_level_name_${outer}`] = middleLevel;
}
schema2.insertOne(doc);
// Schema 3: empty strings replaced with null.
doc = {};
for (let outer = 1; outer <= 10; outer++) {
  let middleLevel = {};
  for (let middle = 1; middle <= 10; middle++) {
    let innerLevel = {};
    for (let inner = 1; inner <= 10; inner++) {
      innerLevel[`bottom_level_name_${inner}`] = (inner % 2 !== 0) ? "Your data goes here" : null;
    }
    middleLevel[`middle_level_name_${middle}`] = innerLevel;
  }
  doc[`top_level_name_${outer}`] = middleLevel;
}
schema3.insertOne(doc);
// Schema 4: null fields removed altogether.
doc = {};
for (let outer = 1; outer <= 10; outer++) {
  let middleLevel = {};
  for (let middle = 1; middle <= 10; middle++) {
    let innerLevel = {};
    for (let inner = 1; inner <= 10; inner++) {
      if (inner % 2 !== 0) {
        innerLevel[`bottom_level_name_${inner}`] = "Your data goes here";
      }
    }
    middleLevel[`middle_level_name_${middle}`] = innerLevel;
  }
  doc[`top_level_name_${outer}`] = middleLevel;
}
schema4.insertOne(doc);
// Schema 5: camelCase field names instead of snake_case.
doc = {};
for (let outer = 1; outer <= 10; outer++) {
  let middleLevel = {};
  for (let middle = 1; middle <= 10; middle++) {
    let innerLevel = {};
    for (let inner = 1; inner <= 10; inner++) {
      if (inner % 2 !== 0) {
        innerLevel[`bottomLevelName${inner}`] = "Your data goes here";
      }
    }
    middleLevel[`middleLevelName${middle}`] = innerLevel;
  }
  doc[`topLevelName${outer}`] = middleLevel;
}
schema5.insertOne(doc);
// Schema 6: field names shortened to a single letter plus a number.
doc = {};
for (let outer = 1; outer <= 10; outer++) {
  let middleLevel = {};
  for (let middle = 1; middle <= 10; middle++) {
    let innerLevel = {};
    for (let inner = 1; inner <= 10; inner++) {
      if (inner % 2 !== 0) {
        innerLevel[`c${inner}`] = "Your data goes here";
      }
    }
    middleLevel[`b${middle}`] = innerLevel;
  }
  doc[`a${outer}`] = middleLevel;
}
schema6.insertOne(doc);
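If you'd rather read the sizes from the playground than from Compass, one option is the $bsonSize aggregation operator (available from MongoDB 4.4). The sketch below is a convenience of my own rather than part of the original playground: documentSizes is a hypothetical helper, it assumes that aggregate(...).toArray() returns an array directly (as it does in mongosh), and the final block only runs where a db object exists.

```javascript
// Hypothetical helper: returns the BSON size, in bytes, of every document
// in each of the named collections.
function documentSizes(database, collectionNames) {
  const report = [];
  for (const name of collectionNames) {
    const results = database
      .getCollection(name)
      .aggregate([{ $project: { _id: 0, bytes: { $bsonSize: "$$ROOT" } } }])
      .toArray();
    report.push({ collection: name, bytes: results.map((result) => result.bytes) });
  }
  return report;
}

// Only runs where a `db` object exists (mongosh or a MongoDB for VS Code playground).
if (typeof db !== "undefined") {
  const names = ["schema1", "schema2", "schema3", "schema4", "schema5", "schema6"];
  console.log(documentSizes(db, names));
}
```

The byte counts should correspond to the document sizes that Compass displays for each collection.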
Learn more about MongoDB design reviews
Design reviews are a chance for a design expert from MongoDB to advise you on how best to use MongoDB for your application. The reviews are focused on making you successful using MongoDB. It's never too early to request a review. By engaging us early (perhaps before you've even decided to use MongoDB), we can advise you when you have the best opportunity to act on it.
This article explained how designing a MongoDB schema that matches how your application works with data can meet your performance requirements without needing a cache layer. If you want help to come up with that schema, then a design review is how to get that help.
Would your application benefit from a review? Schedule your design review.
September 3, 2025