Docs Menu

Create Ranges in a Sharded Cluster

In most situations a sharded cluster will create, split, and distribute ranges automatically without user intervention. However, in some cases, MongoDB cannot create enough ranges or distribute data fast enough to support the required throughput.

For example, if you want to ingest a large volume of data into a cluster where you have inserts distributed across shards, pre-splitting the ranges of an empty sharded collection can improve throughput.

Note

Starting in MongoDB 6.0, the balancer does not distribute empty ranges. To pre-split the collection, use moveRange to distribute the empty ranges across the shards in the cluster. moveRange automatically splits the range to be moved, meaning moveRange performs both splitting and moving. You don't need to manually split the range with split.

Alternatively, by defining the zones and zone ranges before sharding an empty or a non-existing collection, the shard collection operation creates ranges for the defined zone ranges as well as any additional ranges to cover the entire range of the shard key values and performs an initial range distribution based on the zone ranges. For more information, see Empty Collection.

Warning

Only pre-split ranges for an empty collection. Manually splitting ranges for a populated collection can lead to unpredictable range ranges and sizes as well as inefficient or ineffective balancing behavior.

The following example shows how to manually generate and distribute ranges. The example uses a collection in the sample.documents namespace and shards that collection on the email field.

1

Create a function to define shard key ranges. This example creates ranges based on possible email addresses because email will be used as the shard key.

// Generate two character prefix email ranges.
function getRanges(shards) {
let ranges = [];
// The total number of prefix possibilities is 26 * 26 (aa to zz).
// We calculate the number of combinations to add in a range.
const totalCombinationsPerShard = 26 * 26 / shards.length;
let minKey = {
email: MinKey
};
let maxKey = {
email: MinKey
};
for(let i = 1; i <= shards.length; ++i) {
// 97 is lower case 'a' in ASCII.
let prefix = 97 + ((totalCombinationsPerShard*i)/26);
let suffix = 97 + ((totalCombinationsPerShard*i)%26);
let initialChars = String.fromCharCode(prefix) + String.fromCharCode(suffix);
minKey = maxKey;
maxKey = {
email: i !== shards.length ? initialChars : MaxKey
};
ranges.push({
min: minKey,
max: maxKey
});
}
return ranges;
}
2

To shard the sample.documents collection, run this command:

db.adminCommand( {
shardCollection: 'sample.documents',
key: {
email: 1
}
} );

Note

Because the collection is empty, the shardCollection command automatically creates an index on the email field to support the shard key.

3

To allocate the shards to the ranges defined in step 1, run the following command:

const shards = db.adminCommand({
listShards: 1
}).shards;
let ranges = getRanges(shards);
for (let i = 0; i < ranges.length; ++i) {
db.adminCommand({
moveRange: 'sample.documents',
min: ranges[i].min,
max: ranges[i].max,
toShard: shards[i]._id
});
}

The moveRange command distributes the empty ranges across the shards in the cluster. The cluster is now optimized for bulk inserts.

To further improve performance, create additional indexes to support your application's common queries.