Manage Sharded Collections
Important
The Managed Sharded Collections UI is deprecated. Ops Manager 7.0.0 will not include this feature.
Overview
Sharding distributes data across multiple machines. MongoDB uses sharding to support deployments with very large data sets and high throughput operations. Ops Manager can create sharded clusters and sharded collections on those clusters.
This page explains how Ops Manager can manage sharded collections, including how documents are distributed within them.
Sharding involves defining a shard key which is then used to partition documents within a collection. See the MongoDB manual for a more detailed explanation of sharding.
A shard key consists of one or more indexed fields that exist in every document within a collection. A shard key on a compound index is known as a compound shard key. Each collection can only have one shard key. You cannot change the shard key once you shard a collection. A sharded cluster can support both sharded and unsharded collections. See the MongoDB manual for best practices on choosing a shard key.
The sharded cluster attempts to distribute the documents in a sharded collection evenly among the shards in the cluster. You can use sharding zones to manage the distribution of documents within the collection.
Zone sharding associates ranges of a collection's shard key values with a zone, which is a named set of one or more shards in the cluster. MongoDB eventually routes documents within a given range to the associated zone. This allows for targeted data distribution. Ops Manager supports both zoned and default sharding. See the MongoDB manual for a more detailed explanation of zone sharding.
Note
The terms Tag Aware Sharding and Zone Sharding are interchangeable. Starting with MongoDB 3.4, Tag Aware Sharding is called Zone Sharding.
The following procedures explain how Ops Manager can:
Manage your sharded collections
Change when the sharded cluster Balancer runs
Create new sharded collections
Import your sharded collections into Ops Manager
Define zones for sharded clusters
Define ranges for sharded collections
Each procedure assumes you have clicked the Deployment button to display the Deployment page first.
Enable Sharded Collection Management
You can use Ops Manager to manage sharded collections. To do so, you must first run the import process in Ops Manager. This ensures that no collections have their configurations overridden accidentally.
Troubleshoot any failed imports. (Optional)
The following errors can occur when importing sharded collections.
- Overlapping ranges
Ops Manager does not support defined ranges that overlap.
Example
A compound shard key may appear to have overlapped ranges when it does not. This example explains the difference.
A simple compound shard key comprises two integers with values between 1 and 10. The chunk ranges for a collection where each chunk is approximately 64 MB are:
min | max |
---|---|
[$min, $min] | [1, 8] |
[1, 8] | [3, 1] |
[3, 1] | [5, 2] |
[5, 2] | [5, 10] |
[5, 10] | [7, 3] |
[7, 3] | [$max, $max] |
The ranges are based on the two values combined (or compound) and not each value individually. The second value goes up and down in each chunk, but the combination always increases from minimum to maximum.
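The ordering described above can be illustrated with a minimal Python sketch. It assumes that Python tuples are an adequate stand-in for compound shard key values with integer fields, because tuples also compare field by field from left to right. The `boundaries` list holds the finite chunk boundaries from the example, and the variable names are illustrative only.

```python
# Finite chunk boundaries from the example, as (first_field, second_field) tuples.
# The $min and $max endpoints are left out because they have no integer equivalent.
boundaries = [(1, 8), (3, 1), (5, 2), (5, 10), (7, 3)]

# Tuples compare field by field, left to right, so the combined key is what
# decides the order (assumption: this mirrors compound key ordering for integers).
combined_increasing = all(a < b for a, b in zip(boundaries, boundaries[1:]))

# Looking at the second field alone, the values go up and down.
second_values = [pair[1] for pair in boundaries]
second_increasing = all(a < b for a, b in zip(second_values, second_values[1:]))

print(combined_increasing)  # True
print(second_increasing)    # False
```

The combined boundaries always increase even though the second field does not, which is why these ranges do not overlap.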
You can use mongosh to resolve this issue on the database directly:
Check the status of the shard ranges.
Review the ranges for possible overlap.
- Data type mismatch
For each range, Ops Manager requires the minimum and maximum values of each field in a shard key to be the same BSON data type. A compound shard key in a range can use a different BSON type for each field in the key. Ops Manager verifies this when the sharded collections are imported and when ranges are created.
Note
The Min key and Max key are different data types and are the only exception to not mixing BSON data types in the range.
You can use mongosh to resolve this issue on the database directly:
Check the status of the shard ranges.
Check the data types of the minimum and maximum values.
- Data type invalid
The minimum and maximum values for a range can only use eight BSON data types:
String
Integer
Double
Long
Date
Timestamp
ObjectId
MinKey / MaxKey
You can use mongosh to resolve this issue on the database directly:
Check the status of the shard ranges.
Check the data types of the minimum and maximum values.
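The two type rules (matching types per field, and a restricted set of types) can be sketched as a small check. This is only an illustration and not how Ops Manager implements its validation. It uses Python types as stand-ins for BSON types, defines hypothetical `MinKey` and `MaxKey` placeholder classes, and omits Timestamp and ObjectId because they have no standard-library equivalent. Python's `int` stands in for both Integer and Long.

```python
from datetime import datetime


class MinKey:
    """Hypothetical placeholder for the BSON MinKey value."""


class MaxKey:
    """Hypothetical placeholder for the BSON MaxKey value."""


# Stand-ins for the allowed BSON types (Timestamp and ObjectId omitted).
ALLOWED_TYPES = (str, int, float, datetime, MinKey, MaxKey)


def check_range(minimum, maximum):
    """Return a list of problems for one range; an empty list means it passes."""
    problems = []
    for position, (low, high) in enumerate(zip(minimum, maximum), start=1):
        # Rule 1: every bound must use one of the allowed types.
        for value in (low, high):
            if type(value) not in ALLOWED_TYPES:
                problems.append(f"field {position}: invalid type {type(value).__name__}")
        # Rule 2: min and max of a field must share a type, except MinKey/MaxKey.
        has_sentinel = isinstance(low, (MinKey, MaxKey)) or isinstance(high, (MinKey, MaxKey))
        if not has_sentinel and type(low) is not type(high):
            problems.append(f"field {position}: type mismatch")
    return problems


print(check_range((1, "a"), (5, "m")))    # []
print(check_range((1, "a"), (5.0, "m")))  # ['field 1: type mismatch']
```

Each field of a compound key is checked independently, so the first field can be an integer while the second is a string, as long as the minimum and maximum of each field agree.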
Change when Sharded Cluster Balancer Runs
You can use Ops Manager to set when your sharded cluster balances data across shards.
Each sharded cluster has a process called a balancer that works to ensure an even distribution of chunks across each shard. Migrating chunks across your sharded cluster can impact performance. Balancer efficiency depends upon shard key selection. Use the Ops Manager Balancer manager interface to set specific windows during which the Balancer can run, such as scheduling balancing rounds during off-peak hours.
If you want to change the balancing window for your sharded cluster:
Configure a window of time when the Balancer runs.
To change when the Balancer runs:
Click the icon to the right of Schedule the Balancer.
In the Start box, type the time when the window should begin using 24-hour time.
In the Stop box, type the time when the window should end using 24-hour time.
Click Save.
Note
Values for Start and Stop can be between 00:00 and 23:59, but Stop can be a value that is earlier than Start. If Stop is earlier than Start, then Stop is treated as being on the next day.
Example
If you want your migration window to be between 11:00 pm and 2:00 am, you would set Start to 23:00 and Stop to 02:00. Because Stop is earlier than Start, Stop is treated as 02:00 on the next day.
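The wrap-around rule for Start and Stop can be sketched in a few lines of Python. The function name `in_balancer_window` is hypothetical, and treating Stop as an exclusive bound is an assumption made for this sketch, not documented Ops Manager behavior.

```python
from datetime import time


def in_balancer_window(now, start, stop):
    """Return True if `now` falls inside the window from start to stop.

    Assumption for this sketch: start is inclusive and stop is exclusive.
    """
    if start <= stop:
        # Window within a single day, for example 09:00 to 17:00.
        return start <= now < stop
    # Stop is earlier than Start, so the window runs past midnight.
    return now >= start or now < stop


start, stop = time(23, 0), time(2, 0)
print(in_balancer_window(time(23, 30), start, stop))  # True
print(in_balancer_window(time(1, 0), start, stop))    # True
print(in_balancer_window(time(12, 0), start, stop))   # False
```

With Start at 23:00 and Stop at 02:00, both 23:30 and 01:00 fall inside the window, while noon does not.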
Create a New Sharded Collection
You can create a new sharded collection using Ops Manager.
Important
If the field or fields chosen as the shard key are not indexed, the Automation creates the shard key index in the foreground. This operation may potentially impact production workloads. For more information on foreground index builds, see Index Build Operations on a Populated Collection.
Ops Manager does not support compound indexes that cover shard keys. To learn more about a compound index on the shard key, see shard key indexes.
Type the name for the Shard Key into the Shard Key 1 field.
There are two mutually exclusive options for shard keys:
Check hashed if you want to use a hashed shard key. You can expand Advanced Settings to optionally optimize the distribution of documents in your collection. To optimize, you can do the following:
Select the checkbox for presplitHashedZones to perform initial chunk creation and distribution for an empty or non-existing collection based on the defined zones and zone ranges for the collection.
Specify the minimum number of chunks to create initially when sharding an empty collection with a hashed shard key. We recommend 2 chunks, but you can specify up to 8192 per shard. This setting corresponds to the MongoDB numInitialChunks setting for sharded collections.
To learn more about these options, see sh.shardCollection().
Check Enforce Unique Key if you want to have unique key names.
A shard key cannot be both unique and hashed.
If you want to create a compound shard key, click + add another field.
You may hash up to one key in a compound shard key.
Check Enforce Unique Key if you want to have unique key names.
A compound shard key cannot include more than three keys.
Important
Hashing a compound shard key is supported starting in MongoDB version 4.4. If you hash a compound shard key and want to downgrade to FCV 4.2, you must first drop the sharded collection with a hashed key.
Click Set Up Ranges to zone shards. (Optional)
If you want to use zone sharding on this collection, follow the steps under Define how collections are sharded using ranges.
Configure Zoned Sharding
Note
Follow the next two procedures in this section if you intend to use zoned sharding for your sharded collections. Otherwise, you may skip this section.
Group Shards into Zones
A zone is a named group of one or more shards. After creating one or more zones, you can assign a range of shard key values and their corresponding documents to a zone. MongoDB eventually routes documents within a given range to the associated zone. Each zone can include multiple ranges and multiple shards. Each shard can belong to more than one zone. Each shard displays its zone(s) to the right of its name under Deployment.
Click Review and Deploy.
If you try to delete a shard zone that has a tagged range associated with it, the deletion fails. If you try to remove the last shard from a zone that has ranges tagged to it, that also fails. You must move all tagged ranges to another zone before you can remove the last shard from that zone.
Define How Collections are Sharded using Ranges
Ranges specify minimum and maximum values for each field in a shard key. Each defined range is associated with a single zone. MongoDB eventually routes documents within a given range to the associated zone. The minimum value is an inclusive lower bound of the shard key values. The maximum value is an exclusive upper bound of the shard key values. A range can belong to only one zone, but a zone can have multiple ranges.
Documents are routed based on the configured zones and ranges once the balancer moves the range into the desired zone. Once that occurs, documents within a range are routed to the associated zone and those outside a range may be routed to any shard in the cluster.
For each shard key, enter the minimum and maximum values and select the associated zone.
Compound shard keys have one range per component shard key but together are associated with only one zone.
Note
If your shard key is a compound shard key with a hashed field, valid range value types for the hashed field are:
NumberLong
minKey
maxKey
A range's minimum value is inclusive and the maximum value is exclusive.
Example
The following two ranges do not overlap:
min | max | zone |
---|---|---|
1 | 10 | A |
10 | 20 | B |
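Because the minimum is inclusive and the maximum is exclusive, two ranges that share a boundary value do not overlap. The following Python sketch checks this for the two ranges in the table and looks up the zone for a key. The helper names `ranges_overlap` and `zone_for` are hypothetical, and plain integers stand in for shard key values.

```python
# Ranges from the table as (min, max, zone); min is inclusive, max is exclusive.
RANGES = [(1, 10, "A"), (10, 20, "B")]


def ranges_overlap(first, second):
    """Two half-open ranges overlap only if each starts before the other ends."""
    return first[0] < second[1] and second[0] < first[1]


def zone_for(key, ranges):
    """Return the zone whose range contains key, or None if no range matches."""
    for minimum, maximum, zone in ranges:
        if minimum <= key < maximum:
            return zone
    return None


print(ranges_overlap(RANGES[0], RANGES[1]))  # False
print(zone_for(10, RANGES))                  # B
print(zone_for(25, RANGES))                  # None
```

A key of 10 belongs to zone B only, because 10 is excluded from the first range. A key outside every range, such as 25, has no associated zone and may be routed to any shard in the cluster.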
Note
Min and Max are absolute bounds: they represent the lowest and highest possible shard key values, so a range can extend to either extreme without listing a specific value.
Each range can be associated with only one zone. You cannot assign the same range to more than one zone.
Disable Sharded Collection Management
Click Unmanage.
Important
Unmanaging a sharded collection does not delete your sharded collections or zones. However, you can no longer manage these collections and zones from the Ops Manager interface.