Convert a Replica Set to a Sharded Cluster
On this page
Overview
This tutorial converts a single three-member replica set to a sharded cluster with two shards. Each shard is an independent three-member replica set. This tutorial is specific to MongoDB 6.0. For other versions of MongoDB, refer to the corresponding version of the MongoDB Manual.
The procedure is as follows:
Create the initial three-member replica set and insert data into a collection. See Set Up Initial Replica Set.
Start the config servers and a
mongos
. See Deploy Config Server Replica Set andmongos
.Add the initial replica set as a shard. See Add Initial Replica Set as a Shard.
Create a second shard and add to the cluster. See Add Second Shard.
Shard the desired collection. See Shard a Collection.
Considerations
Individual steps in these procedures note when downtime will occur.
Important
These procedures cause some downtime for your deployment.
Prerequisites
This tutorial uses a total of ten servers: one server for the
mongos
and three servers each for the first replica set, the second replica set, and the config server replica set.
Each server must have a resolvable domain, hostname, or IP address within your system.
The tutorial uses the default data directories (e.g. /data/db
and
/data/configdb
). Create the appropriate directories with
appropriate permissions. To use different paths, see
Configuration File Options .
Procedures
Set Up Initial Replica Set
This procedure creates the initial three-member replica set rs0
.
The replica set members are on the following hosts:
mongodb0.example.net
, mongodb1.example.net
, and
mongodb2.example.net
.
Start each member of the replica set with the appropriate options.
For each member, start a mongod
instance with the
following settings:
Set
replication.replSetName
option to the replica set name. If your application connects to more than one replica set, each set must have a distinct name.Set
net.bindIp
option to the hostname/ip or a comma-delimited list of hostnames/ips.Set any other settings as appropriate for your deployment.
In this tutorial, the three mongod
instances are
associated with the following hosts:
Replica Set Member | Hostname |
---|---|
Member 0 | mongodb0.example.net |
Member 1 | mongodb1.example.net |
Member 2 | mongodb2.example.net |
The following example specifies the replica set name and the ip
binding through the --replSet
and --bind_ip
command-line options:
Warning
Before binding to a non-localhost (e.g. publicly accessible) IP address, ensure you have secured your cluster from unauthorized access. For a complete list of security recommendations, see Security Checklist. At minimum, consider enabling authentication and hardening network infrastructure.
mongod --replSet "rs0" --bind_ip localhost,<hostname(s)|ip address(es)>
For <hostname(s)|ip address(es)>
, specify the hostname(s) and/or
ip address(es) for your mongod
instance that remote
clients (including the other members of the replica set) can use to
connect to the instance.
Alternatively, you can also specify the replica set name
and the ip addresses
in a configuration file:
replication: replSetName: "rs0" net: bindIp: localhost,<hostname(s)|ip address(es)>
To start mongod
with a configuration file, specify the
configuration file's path with the --config
option:
mongod --config <path-to-config>
In production deployments, you can configure a init script to manage this process. Init scripts are beyond the scope of this document.
Connect mongosh
to one of the mongod
instances.
From the same machine where one of the mongod
is running
(in this tutorial, mongodb0.example.net
), start
mongosh
. To connect to the mongod
listening to localhost on the default port of 27017
, simply issue:
mongosh
Depending on your path, you may need to specify the path to the
mongosh
binary.
If your mongod
is not running on the default port, specify the
--port
option for mongosh
.
Initiate the replica set.
From mongosh
, run rs.initiate()
on
replica set member 0.
Important
Run rs.initiate()
on just one and only one
mongod
instance for the replica set.
Important
To avoid configuration updates due to IP address changes, use DNS hostnames instead of IP addresses. It is particularly important to use a DNS hostname instead of an IP address when configuring replica set members or sharded cluster members.
Use hostnames instead of IP addresses to configure clusters across a split network horizon. Starting in MongoDB 5.0, nodes that are only configured with an IP address will fail startup validation and will not start.
rs.initiate( { _id : "rs0", members: [ { _id: 0, host: "mongodb0.example.net:27017" }, { _id: 1, host: "mongodb1.example.net:27017" }, { _id: 2, host: "mongodb2.example.net:27017" } ] })
MongoDB initiates a replica set, using the default replica set configuration.
Create and populate a new collection.
The following step adds one million documents to the collection
test_collection
and can take several minutes depending on
your system.
To determine the primary, use rs.status()
.
Issue the following operations on the primary of the replica set:
use test var bulk = db.test_collection.initializeUnorderedBulkOp(); people = ["Marc", "Bill", "George", "Eliot", "Matt", "Trey", "Tracy", "Greg", "Steve", "Kristina", "Katie", "Jeff"]; for(var i=0; i<1000000; i++){ user_id = i; name = people[Math.floor(Math.random()*people.length)]; number = Math.floor(Math.random()*10001); bulk.insert( { "user_id":user_id, "name":name, "number":number }); } bulk.execute();
For more information on deploying a replica set, see Deploy a Replica Set.
Deploy Config Server Replica Set and mongos
This procedure deploys the three-member replica set for the config
servers and the
mongos
.
The config servers use the following hosts:
mongodb7.example.net
,mongodb8.example.net
, andmongodb9.example.net
.The
mongos
usesmongodb6.example.net
.
Deploy the config servers as a three-member replica set.
Start a config server on mongodb7.example.net
,
mongodb8.example.net
, and mongodb9.example.net
. Specify the
same replica set name. The config servers use the default data
directory /data/configdb
and the default port 27019
.
Warning
Before binding to a non-localhost (e.g. publicly accessible) IP address, ensure you have secured your cluster from unauthorized access. For a complete list of security recommendations, see Security Checklist. At minimum, consider enabling authentication and hardening network infrastructure.
mongod --configsvr --replSet configReplSet --bind_ip localhost,<hostname(s)|ip address(es)>
To modify the default settings or to include additional options
specific to your deployment, see mongod
or
Configuration File Options.
Connect mongosh
to one of the config servers and
run rs.initiate()
to initiate the replica set.
Important
Run rs.initiate()
on just one and only one
mongod
instance for the replica set.
Important
To avoid configuration updates due to IP address changes, use DNS hostnames instead of IP addresses. It is particularly important to use a DNS hostname instead of an IP address when configuring replica set members or sharded cluster members.
Use hostnames instead of IP addresses to configure clusters across a split network horizon. Starting in MongoDB 5.0, nodes that are only configured with an IP address will fail startup validation and will not start.
rs.initiate( { _id: "configReplSet", configsvr: true, members: [ { _id: 0, host: "mongodb07.example.net:27019" }, { _id: 1, host: "mongodb08.example.net:27019" }, { _id: 2, host: "mongodb09.example.net:27019" } ] } )
Start a mongos
instance.
On mongodb6.example.net
, start the mongos
specifying
the config server replica set name followed by a slash /
and at least
one of the config server hostnames and ports.
mongos --configdb configReplSet/mongodb07.example.net:27019,mongodb08.example.net:27019,mongodb09.example.net:27019 --bind_ip localhost,<hostname(s)|ip address(es)>
Restart the Replica Set as a Shard
For sharded clusters, mongod
instances for
the shards must explicitly specify its role as a shardsvr
,
either via the configuration file setting
sharding.clusterRole
or via the command line option
--shardsvr
.
Note
Determine the primary and secondary members.
Connect mongosh
to one of the members and run
rs.status()
to determine the primary and secondary members.
Restart secondary members with the --shardsvr
option.
One secondary at a time, shut down
and restart each secondary
with the --shardsvr
option.
Warning
This step requires some downtime for applications connected to
secondary members of the replica set. Applications connected to a
secondary may error with CannotVerifyAndSignLogicalTime
after
restarting the secondary until you perform the steps in
Add Initial Replica Set as a Shard. Restarting your application will
also stop it from receiving CannotVerifyAndSignLogicalTime
errors.
To continue to use the same port, include the --port
option. Include additional options, such as --bind_ip
, as
appropriate for your deployment.
Warning
Before binding to a non-localhost (e.g. publicly accessible) IP address, ensure you have secured your cluster from unauthorized access. For a complete list of security recommendations, see Security Checklist. At minimum, consider enabling authentication and hardening network infrastructure.
mongod --replSet "rs0" --shardsvr --port 27017 --bind_ip localhost,<hostname(s)|ip address(es)>
Include any other options as appropriate for your deployment. Repeat this step for the other secondary.
Step down the primary.
Connect mongosh
to the primary and stepdown the primary.
Warning
This step requires some downtime. Applications may error with
CannotVerifyAndSignLogicalTime
after stepping down the primary
until you perform the steps in Add Initial Replica Set as a Shard.
Restarting your application will also stop it from receiving
CannotVerifyAndSignLogicalTime
errors.
rs.stepDown()
Restart the primary with the --shardsvr
option.
Shut down the primary and restart with the --shardsvr
option.
To continue to use the same port, include the --port
option.
mongod --replSet "rs0" --shardsvr --port 27017 --bind_ip localhost,<hostname(s)|ip address(es)>
Include any other options as appropriate for your deployment.
Add Initial Replica Set as a Shard
The following procedure adds the initial replica set rs0
as a shard.
Add the shard.
Add a shard to the cluster with the sh.addShard()
method:
sh.addShard( "rs0/mongodb0.example.net:27017,mongodb1.example.net:27017,mongodb2.example.net:27017" )
Warning
Once the new shard is active, mongosh
and other clients must
always connect to the mongos
instance. Do not connect
directly to the mongod
instances. If your clients connect to
shards directly, you may create data or metadata inconsistencies.
Add Second Shard
The following procedure deploys a new replica set rs1
for the
second shard and adds it to the cluster. The replica set members are on
the following hosts: mongodb3.example.net
,
mongodb4.example.net
, and mongodb5.example.net
.
For sharded clusters, mongod
instances for
the shards must explicitly specify its role as a shardsvr
,
either via the configuration file setting
sharding.clusterRole
or via the command line option
--shardsvr
.
Note
Start each member of the replica set with the appropriate options.
For each member, start a mongod
, specifying the replica
set name through the --replSet
option
and its role as a shard with the
--shardsvr
option. Specify additional
options, such as --bind_ip
, as
appropriate.
Warning
Before binding to a non-localhost (e.g. publicly accessible) IP address, ensure you have secured your cluster from unauthorized access. For a complete list of security recommendations, see Security Checklist. At minimum, consider enabling authentication and hardening network infrastructure.
For replication-specific parameters, see Replication Options.
mongod --replSet "rs1" --shardsvr --port 27017 --bind_ip localhost,<hostname(s)|ip address(es)>
Repeat this step for the other two members of the rs1
replica set.
Initiate the replica set.
From mongosh
, run rs.initiate()
to
initiate a replica set that consists of the current member.
Important
Run rs.initiate()
on just one and only one
mongod
instance for the replica set.
Important
To avoid configuration updates due to IP address changes, use DNS hostnames instead of IP addresses. It is particularly important to use a DNS hostname instead of an IP address when configuring replica set members or sharded cluster members.
Use hostnames instead of IP addresses to configure clusters across a split network horizon. Starting in MongoDB 5.0, nodes that are only configured with an IP address will fail startup validation and will not start.
rs.initiate( { _id : "rs1", members: [ { _id: 0, host: "mongodb3.example.net:27017" }, { _id: 1, host: "mongodb4.example.net:27017" }, { _id: 2, host: "mongodb5.example.net:27017" } ] })
Add the shard.
In mongosh
, when connected to the
mongos
, add the shard to the cluster with the
sh.addShard()
method:
sh.addShard( "rs1/mongodb3.example.net:27017,mongodb4.example.net:27017,mongodb5.example.net:27017" )
Shard a Collection
Determine the shard key.
For the collection to shard, determine the shard key. The shard key determines how MongoDB distributes the documents between shards. Good shard keys:
have values that are evenly distributed among all documents,
group documents that are often accessed at the same time into contiguous chunks, and
allow for effective distribution of activity among shards.
For more information, see Choose a Shard Key.
This procedure will use the number
field as the shard key for
test_collection
.
Create an index on the shard key.
Before sharding a non-empty collection, create an index on the shard key.
use test db.test_collection.createIndex( { number : 1 } )
Shard the collection.
In the test
database, shard the test_collection
,
specifying number
as the shard key.
use test sh.shardCollection( "test.test_collection", { "number" : 1 } )
mongos
uses "majority"
for the
write concern of the
shardCollection
command and its helper
sh.shardCollection()
.
The method returns the status of the operation:
{ "collectionsharded" : "test.test_collection", "ok" : 1 }
The balancer redistributes
chunks of documents when it next runs. As clients insert additional
documents into this collection, the mongos
routes the
documents to the appropriate shard.
Confirm the shard is balancing.
To confirm balancing activity, run db.stats()
or
db.printShardingStatus()
in the test
database.
use test db.stats() db.printShardingStatus()
Example output of the db.stats()
:
{ "raw" : { "rs0/mongodb0.example.net:27017,mongodb1.example.net:27017,mongodb2.example.net:27017" : { "db" : "test", "collections" : 1, "views" : 0, "objects" : 640545, "avgObjSize" : 70.83200339949052, "dataSize" : 45370913, "storageSize" : 50438144, "numExtents" : 0, "indexes" : 2, "indexSize" : 24502272, "ok" : 1, "$gleStats" : { "lastOpTime" : Timestamp(0, 0), "electionId" : ObjectId("7fffffff0000000000000003") } }, "rs1/mongodb3.example.net:27017,mongodb4.example.net:27017,mongodb5.example.net:27017" : { "db" : "test", "collections" : 1, "views" : 0, "objects" : 359455, "avgObjSize" : 70.83259935179647, "dataSize" : 25461132, "storageSize" : 8630272, "numExtents" : 0, "indexes" : 2, "indexSize" : 8151040, "ok" : 1, "$gleStats" : { "lastOpTime" : Timestamp(0, 0), "electionId" : ObjectId("7fffffff0000000000000001") } } }, "objects" : 1000000, "avgObjSize" : 70, "dataSize" : 70832045, "storageSize" : 59068416, "numExtents" : 0, "indexes" : 4, "indexSize" : 32653312, "fileSize" : 0, "extentFreeList" : { "num" : 0, "totalSize" : 0 }, "ok" : 1 }
Example output of the db.printShardingStatus()
:
--- Sharding Status --- sharding version: { "_id" : 1, "minCompatibleVersion" : 5, "currentVersion" : 6, "clusterId" : ObjectId("5be0a488039b1964a7208c60") } shards: { "_id" : "rs0", "host" : "rs0/mongodb0.example.net:27017,mongodb1.example.net:27017,mongodb2.example.net:27017", "state" : 1 } { "_id" : "rs1", "host" : "rs1/mongodb3.example.net:27017,mongodb4.example.net:27017,mongodb5.example.net:27017", "state" : 1 } active mongoses: "3.6.8" : 1 autosplit: Currently enabled: yes balancer: Currently enabled: yes Currently running: yes Collections with active migrations: test.test_collection started at Mon Nov 05 2018 15:16:45 GMT-0500 Failed balancer rounds in last 5 attempts: 0 Migration Results for the last 24 hours: 1 : Success databases: { "_id" : "test", "primary" : "rs0", "partitioned" : true } test.test_collection shard key: { "number" : 1 } unique: false balancing: true chunks: rs0 5 rs1 1 { "number" : { "$minKey" : 1 } } -->> { "number" : 1195 } on : rs1 Timestamp(2, 0) { "number" : 1195 } -->> { "number" : 2394 } on : rs0 Timestamp(2, 1) { "number" : 2394 } -->> { "number" : 3596 } on : rs0 Timestamp(1, 5) { "number" : 3596 } -->> { "number" : 4797 } on : rs0 Timestamp(1, 6) { "number" : 4797 } -->> { "number" : 9588 } on : rs0 Timestamp(1, 1) { "number" : 9588 } -->> { "number" : { "$maxKey" : 1 } } on : rs0 Timestamp(1, 2)
Run these commands for a second time to demonstrate that chunks are migrating from rs0
to rs1
.