shardCollection

Shards a collection to distribute its documents across shards. The shardCollection command must be run against the admin database.

Note

Changed in version 6.0

Starting in MongoDB 6.0, sharding a collection does not require you to first run the enableSharding command to configure the database.

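Tip

In mongosh, this command can also be run through the sh.shardCollection() helper method.

Helper methods are convenient for mongosh users, but they may not return the same level of information as database commands. If you do not need the convenience, or you need the additional return fields, use the database command.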
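As a rough illustration of the two invocation styles, the sketch below sends the same request either through the database command or through the helper. The records.people namespace and the zipcode key are borrowed from the example at the end of this page; the useHelper flag is a hypothetical switch that exists only in this sketch. The guard keeps the snippet from calling anything when it is loaded outside mongosh.

```javascript
// Namespace and shard key shared by both invocation styles.
const targetNs = "records.people";
const targetKey = { zipcode: 1 };

// Hypothetical switch for this sketch: false = database command, true = helper.
const useHelper = false;

// `db` and `sh` are defined only inside mongosh, so nothing runs elsewhere.
if (typeof db !== "undefined" && typeof sh !== "undefined") {
  const result = useHelper
    ? sh.shardCollection(targetNs, targetKey)
    : db.adminCommand({ shardCollection: targetNs, key: targetKey });
  printjson(result);
}
```

Comparing the two results on a test cluster is a simple way to see which return fields each style provides.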

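This command is available in deployments hosted in the following environments:

Important

This command is not supported in serverless instances. For more information, see Unsupported Commands.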

To run shardCollection, use the db.adminCommand( { <command> } ) method, which runs the command against the admin database.

The command has the following form:

db.adminCommand(
   {
     shardCollection: "<database>.<collection>",
     key: { <field1>: <1|"hashed">, ... },
     unique: <boolean>,
     numInitialChunks: <integer>,
     presplitHashedZones: <boolean>,
     collation: { locale: "simple" },
     timeseries: <object>
   }
)

The command accepts the following fields:

Field
Type
Description

shardCollection

string

The namespace of the collection to shard, in the form <database>.<collection>.

key

document

The document that specifies the field or fields to use as the shard key.

{ <field1>: <1|"hashed">, ... }

Set the field value to one of the following:

  • 1 for range-based sharding

  • "hashed" to specify a hashed shard key

The shard key must be supported by an index. If the collection is not empty, the index must exist before you run the shardCollection command. If the collection is empty and no index that can support the shard key exists, MongoDB creates the index before sharding the collection.

See also Shard Key Indexes.

unique

boolean

Specify true to ensure that the underlying index enforces a unique constraint. Defaults to false.

You cannot specify true when using hashed shard keys.

numInitialChunks

integer

Specifies the initial number of chunks to create across all shards in the cluster when sharding an empty collection with a hashed shard key. MongoDB then creates and balances the chunks across the cluster. The value of numInitialChunks must result in fewer than 8192 chunks per shard.

If the collection is not empty or the shard key does not contain a hashed field, the operation returns an error.

  • If sharding with presplitHashedZones: true, MongoDB attempts to evenly distribute the specified number of chunks across the zones in the cluster.

  • If sharding with presplitHashedZones: false or omitted and no zones and zone ranges are defined for the empty collection, MongoDB attempts to evenly distribute the specified number of chunks across the shards in the cluster.

  • If sharding with presplitHashedZones: false or omitted and zones and zone ranges have been defined for the empty collection, numInitialChunks has no effect.
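The following sketch shows a numInitialChunks request on a hashed shard key together with a sanity check for the per-shard limit. The check reflects one reading of the rule above (chunks spread evenly, with each shard's share below 8192); the helper name chunksPerShardWithinLimit and the shard count of 2 are assumptions made for illustration.

```javascript
// Assumed reading of the limit: chunks are spread evenly across shards,
// and the largest per-shard share must stay below 8192.
function chunksPerShardWithinLimit(numInitialChunks, shardCount) {
  return Math.ceil(numInitialChunks / shardCount) < 8192;
}

const hashedCmd = {
  shardCollection: "records.people",
  key: { zipcode: "hashed" }, // numInitialChunks requires a hashed field
  numInitialChunks: 8
};

// Hypothetical cluster size used only for the check below.
const assumedShardCount = 2;

if (!chunksPerShardWithinLimit(hashedCmd.numInitialChunks, assumedShardCount)) {
  throw new Error("numInitialChunks is too large for this cluster");
}

// Runs only inside mongosh, where `db` is defined.
if (typeof db !== "undefined") {
  printjson(db.adminCommand(hashedCmd));
}
```

The command succeeds only when the collection is empty, as described above.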

collation

document

Optional. If the collection specified to shardCollection has a default collation, you must include a collation document with { locale : "simple" }, or the shardCollection command fails. At least one of the indexes whose fields support the shard key pattern must have the simple collation.

presplitHashedZones

boolean

Optional. Specify true to perform initial chunk creation and distribution for an empty or non-existing collection based on the defined zones and zone ranges for the collection. For hashed sharding only.

shardCollection with presplitHashedZones: true returns an error if any of the following are true:

timeseries

object

Optional. Specify this option to create a new sharded time series collection.

To shard an existing time series collection, omit this parameter.

When the collection specified to shardCollection is a time series collection and the timeseries option is not specified, MongoDB uses the values that define the existing time series collection to populate the timeseries field.

For detailed syntax, see Time Series Options.

New in version 5.1.

To create a new time series collection that is sharded, specify the timeseries option to shardCollection.

The timeseries option takes the following fields:

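Field
Type
Description

timeField

string

Required. The name of the field that contains the date in each time series document. Documents in a time series collection must have a valid BSON date as the value of the timeField.

Starting in MongoDB 8.0, using timeField as a shard key in a time series collection is deprecated.

metaField

string

Optional. The name of the field that contains metadata in each time series document. The metadata in the specified field should be data that labels a unique series of documents. The metadata should rarely, if ever, change. The name of the specified field may not be _id or the same as timeseries.timeField. The field can be of any data type.

Although the metaField field is optional, using metadata can improve query optimization. For example, MongoDB automatically creates a compound index on the metaField and timeField fields for new collections. If you do not provide a value for this field, the data is bucketed solely by time.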

granularity

string

Optional. Possible values are:

  • "seconds"

  • "minutes"

  • "hours"

By default, MongoDB sets the granularity to "seconds" for high-frequency ingestion.

Manually set the granularity parameter to improve performance by optimizing how data in the time series collection is stored internally. To select a value for granularity, choose the closest match to the time span between consecutive incoming measurements.

If you specify the timeseries.metaField, consider the time span between consecutive incoming measurements that have the same unique value for the metaField field. Measurements often have the same unique value for the metaField field if they come from the same source.

If you do not specify timeseries.metaField, consider the time span between all measurements that are inserted in the collection.

If you set the granularity parameter, you can't set the bucketMaxSpanSeconds and bucketRoundingSeconds parameters.
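One way to turn the "closest match" guidance into code is sketched below. The helper closestGranularity is hypothetical, and comparing intervals on a logarithmic scale is an assumption of this sketch rather than a rule taken from MongoDB; treat the result as a starting point to validate against your own workload.

```javascript
// Length of each granularity unit in seconds.
const GRANULARITY_SECONDS = { seconds: 1, minutes: 60, hours: 3600 };

// Picks the granularity whose unit is nearest to the typical interval
// between consecutive measurements. Distance is measured on a log scale,
// which is an assumption of this sketch.
function closestGranularity(intervalSeconds) {
  if (!(intervalSeconds > 0)) {
    throw new RangeError("intervalSeconds must be a positive number");
  }
  let best = "seconds";
  let bestDistance = Infinity;
  for (const [name, unit] of Object.entries(GRANULARITY_SECONDS)) {
    const distance = Math.abs(Math.log(intervalSeconds / unit));
    if (distance < bestDistance) {
      best = name;
      bestDistance = distance;
    }
  }
  return best;
}

// Example: a source that reports every five minutes.
const suggested = closestGranularity(300);
```

When a metaField is set, pass the interval between measurements from the same source; otherwise pass the interval between all inserts into the collection.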

While you can change your shard key later, it is important to carefully consider your shard key choice to avoid scalability and performance issues.

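When sharding a time series collection, you can specify only the following fields in the shard key:

  • metaField

  • Sub-fields of metaField

  • timeField

You may specify combinations of these fields in the shard key. No other fields, including _id, are allowed in the shard key pattern.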
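The field restriction above can be sketched as a small check placed in front of the command. The helper isValidTimeSeriesShardKey, the metrics.readings namespace, and the timestamp, metadata, and sensorId field names are all hypothetical and chosen only for illustration.

```javascript
// Returns true when every shard key field is the timeField, the metaField,
// or a sub-field of the metaField (dot notation).
function isValidTimeSeriesShardKey(key, timeseries) {
  const { timeField, metaField } = timeseries;
  return Object.keys(key).every(
    (field) =>
      field === timeField ||
      (metaField !== undefined &&
        (field === metaField || field.startsWith(metaField + ".")))
  );
}

const tsCmd = {
  shardCollection: "metrics.readings", // hypothetical namespace
  key: { "metadata.sensorId": 1 },     // sub-field of the metaField
  timeseries: {
    timeField: "timestamp",
    metaField: "metadata",
    granularity: "hours"
  }
};

if (!isValidTimeSeriesShardKey(tsCmd.key, tsCmd.timeseries)) {
  throw new Error("shard key uses a field that time series sharding does not allow");
}

// Runs only inside mongosh, where `db` is defined.
if (typeof db !== "undefined") {
  printjson(db.adminCommand(tsCmd));
}
```

The check covers field names only; it does not verify any ordering or hashing constraints on those fields.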

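When you specify the shard key:

Tip

Avoid specifying timeField as the shard key. Because timeField increases monotonically, all writes may land on a single chunk in the cluster. Ideally, data is distributed evenly across chunks.

To learn how to best choose a shard key, see:

Starting in MongoDB 8.0, using timeField as a shard key in a time series collection is deprecated.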

Hashed shard keys use a hashed index or a compound hashed index as the shard key.

Use the form field: "hashed" to specify a hashed shard key field.
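A minimal sketch of both hashed forms follows. The state field is hypothetical, and the check assumes that a shard key may contain at most one hashed field, which is the restriction that applies to compound hashed indexes.

```javascript
// Counts how many fields in a shard key pattern are marked "hashed".
function countHashedFields(key) {
  return Object.values(key).filter((value) => value === "hashed").length;
}

// Single-field hashed shard key.
const singleHashedKey = { zipcode: "hashed" };

// Compound hashed shard key: a ranged prefix plus one hashed field.
// `state` is a hypothetical field used for illustration.
const compoundHashedKey = { state: 1, zipcode: "hashed" };

const compoundCmd = {
  shardCollection: "records.people",
  key: compoundHashedKey
};

// Assumption of this sketch: at most one hashed field per shard key.
if (countHashedFields(compoundCmd.key) > 1) {
  throw new Error("a shard key may contain only one hashed field");
}

// Runs only inside mongosh, where `db` is defined.
if (typeof db !== "undefined") {
  printjson(db.adminCommand(compoundCmd));
}
```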

Note

If chunk migrations are in progress while creating a hashed shard key collection, the initial chunk distribution may be uneven until the balancer automatically balances the collection.

The shard collection operation (that is, the shardCollection command and the sh.shardCollection() helper) can perform initial chunk creation and distribution for an empty or a non-existing collection if zones and zone ranges have been defined for the collection. Initial chunk distribution allows for a faster setup of zoned sharding. After the initial distribution, the balancer manages chunk distribution as usual.

See Pre-Define Zones and Zone Ranges for an Empty or Non-Existing Collection for an example. If sharding a collection using a ranged or single-field hashed shard key, the numInitialChunks option has no effect if zones and zone ranges have been defined for the empty collection.

To shard a collection using a compound hashed index, see Zone Sharding and Compound Hashed Indexes.

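MongoDB supports sharding collections on compound hashed indexes. When sharding an empty or non-existing collection using a compound hashed shard key, additional requirements apply for MongoDB to perform initial chunk creation and distribution.

The numInitialChunks option has no effect if zones and zone ranges have been defined for the empty collection and presplitHashedZones is false.

For an example, see Pre-Define Zones and Zone Ranges for an Empty or Non-Existing Collection.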
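The command document for a zone-aware initial split might look like the sketch below. It assumes that zones and zone ranges for the collection have already been defined and that they satisfy the compound hashed requirements; the examples.metrics namespace and the facility field are hypothetical. The small check only confirms that the key contains a hashed field, since presplitHashedZones applies to hashed sharding only.

```javascript
// True when the shard key pattern contains a hashed field.
function hasHashedField(key) {
  return Object.values(key).includes("hashed");
}

const presplitCmd = {
  shardCollection: "examples.metrics",  // hypothetical namespace
  key: { facility: 1, _id: "hashed" },  // compound hashed shard key
  presplitHashedZones: true             // split by the zones defined beforehand
};

if (presplitCmd.presplitHashedZones && !hasHashedField(presplitCmd.key)) {
  throw new Error("presplitHashedZones requires a hashed shard key field");
}

// Runs only inside mongosh, where `db` is defined.
// Assumes zones and zone ranges already exist for examples.metrics.
if (typeof db !== "undefined") {
  printjson(db.adminCommand(presplitCmd));
}
```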

If specifying unique: true:

  • If the collection is empty, shardCollection creates the unique index on the shard key if such an index does not already exist.

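  • If the collection is not empty, you must create the index before using shardCollection.

Although you can have a unique compound index where the shard key is a prefix, if you use the unique parameter, the collection must have a unique index on the shard key itself.

See also Sharded Collection and Unique Indexes.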
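The unique rules can be sketched as follows. The records.accounts namespace and the accountId field are hypothetical, and the helper uniqueOptionAllowed simply encodes the restriction that unique: true cannot be combined with a hashed shard key. Creating the unique index first covers the non-empty case; if an identical index already exists, the createIndex call leaves it in place.

```javascript
// unique: true cannot be combined with a hashed shard key.
function uniqueOptionAllowed(cmd) {
  const hashed = Object.values(cmd.key).includes("hashed");
  return !(cmd.unique === true && hashed);
}

const uniqueCmd = {
  shardCollection: "records.accounts", // hypothetical namespace
  key: { accountId: 1 },               // hypothetical field
  unique: true
};

if (!uniqueOptionAllowed(uniqueCmd)) {
  throw new Error("unique: true is not allowed with a hashed shard key");
}

// Runs only inside mongosh, where `db` is defined.
if (typeof db !== "undefined") {
  // Needed up front when the collection already holds documents.
  db.getSiblingDB("records").accounts.createIndex(
    { accountId: 1 },
    { unique: true }
  );
  printjson(db.adminCommand(uniqueCmd));
}
```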

If the collection has a default collation, the shardCollection command must include a collation parameter with the value { locale: "simple" }. For non-empty collections with a default collation, you must have at least one index with the simple collation whose fields support the shard key pattern.

You do not need to specify the collation option for collections without a collation. If you do specify the collation option for a collection with no collation, it will have no effect.
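For a collection that has a default collation, the two steps described above might be sketched like this. The snippet reuses the records.people namespace and zipcode key from the example on this page and assumes, for illustration only, that the collection was created with a non-simple default collation.

```javascript
const simpleCollation = { locale: "simple" };

const collationCmd = {
  shardCollection: "records.people",
  key: { zipcode: 1 },
  collation: simpleCollation // required when the collection has a default collation
};

// Runs only inside mongosh, where `db` is defined.
if (typeof db !== "undefined") {
  // An index with the simple collation must support the shard key
  // when the collection is not empty.
  db.getSiblingDB("records").people.createIndex(
    { zipcode: 1 },
    { collation: simpleCollation }
  );
  printjson(db.adminCommand(collationCmd));
}
```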

mongos uses "majority" for the write concern of the shardCollection command, its helper sh.shardCollection(), and the sh.shardAndDistributeCollection() method.

The following operation enables sharding for the people collection in the records database and uses the zipcode field as the shard key:

db.adminCommand( { shardCollection: "records.people", key: { zipcode: 1 } } )
