
Convert a Replica Set to a Sharded Cluster

On this page

  • Overview
  • Considerations
  • Prerequisites
  • Procedures

This tutorial converts a single three-member replica set to a sharded cluster with two shards. Each shard is an independent three-member replica set. This tutorial is specific to MongoDB 6.0. For other versions of MongoDB, refer to the corresponding version of the MongoDB Manual.

The procedure is as follows:

  1. Create the initial three-member replica set and insert data into a collection. See Set Up Initial Replica Set.

  2. Start the config servers and a mongos. See Deploy Config Server Replica Set and mongos.

  3. Add the initial replica set as a shard. See Add Initial Replica Set as a Shard.

  4. Create a second shard and add to the cluster. See Add Second Shard.

  5. Shard the desired collection. See Shard a Collection.

Individual steps in these procedures note when downtime will occur.

Important

These procedures cause some downtime for your deployment.

This tutorial uses a total of ten servers: one server for the mongos and three servers each for the first replica set, the second replica set, and the config server replica set.

Each server must have a resolvable domain name, hostname, or IP address within your system.

The tutorial uses the default data directories (e.g. /data/db and /data/configdb). Create the appropriate directories with appropriate permissions. To use different paths, see Configuration File Options.

Set Up Initial Replica Set

This procedure creates the initial three-member replica set rs0. The replica set members are on the following hosts: mongodb0.example.net, mongodb1.example.net, and mongodb2.example.net.

1

For each member, start a mongod instance with the following settings:

  • Set the replication.replSetName option to the replica set name. If your application connects to more than one replica set, each set must have a distinct name.

  • Set the net.bindIp option to the hostname/IP address or a comma-delimited list of hostnames/IP addresses.

  • Set any other settings as appropriate for your deployment.

In this tutorial, the three mongod instances are associated with the following hosts:

Replica Set Member   Hostname
Member 0             mongodb0.example.net
Member 1             mongodb1.example.net
Member 2             mongodb2.example.net

The following example specifies the replica set name and the IP binding through the --replSet and --bind_ip command-line options:

Warning

Before binding to a non-localhost (e.g. publicly accessible) IP address, ensure you have secured your cluster from unauthorized access. For a complete list of security recommendations, see Security Checklist. At minimum, consider enabling authentication and hardening network infrastructure.

mongod --replSet "rs0" --bind_ip localhost,<hostname(s)|ip address(es)>

For <hostname(s)|ip address(es)>, specify the hostname(s) and/or ip address(es) for your mongod instance that remote clients (including the other members of the replica set) can use to connect to the instance.

Alternatively, you can specify the replica set name and the IP addresses in a configuration file:

replication:
  replSetName: "rs0"
net:
  bindIp: localhost,<hostname(s)|ip address(es)>

To start mongod with a configuration file, specify the configuration file's path with the --config option:

mongod --config <path-to-config>

In production deployments, you can configure an init system to manage this process. Init scripts are beyond the scope of this document.

2

From the machine where one of the mongod instances is running (in this tutorial, mongodb0.example.net), start mongosh. To connect to the mongod listening on localhost on the default port of 27017, issue:

mongosh

Depending on your path, you may need to specify the path to the mongosh binary.

If your mongod is not running on the default port, specify the --port option for mongosh.

3

From mongosh, run rs.initiate() on replica set member 0.

Important

Run rs.initiate() on one and only one mongod instance for the replica set.

Important

To avoid configuration updates due to IP address changes, use DNS hostnames instead of IP addresses. It is particularly important to use a DNS hostname instead of an IP address when configuring replica set members or sharded cluster members.

Use hostnames instead of IP addresses to configure clusters across a split network horizon. Starting in MongoDB 5.0, nodes that are only configured with an IP address will fail startup validation and will not start.

rs.initiate( {
  _id : "rs0",
  members: [
    { _id: 0, host: "mongodb0.example.net:27017" },
    { _id: 1, host: "mongodb1.example.net:27017" },
    { _id: 2, host: "mongodb2.example.net:27017" }
  ]
})

MongoDB initiates a replica set, using the default replica set configuration.

4

The following step adds one million documents to the collection test_collection and can take several minutes depending on your system.

To determine the primary, use rs.status().

Issue the following operations on the primary of the replica set:

use test
var bulk = db.test_collection.initializeUnorderedBulkOp();
var people = ["Marc", "Bill", "George", "Eliot", "Matt", "Trey", "Tracy", "Greg", "Steve", "Kristina", "Katie", "Jeff"];
for (var i = 0; i < 1000000; i++) {
  var user_id = i;
  var name = people[Math.floor(Math.random() * people.length)];
  var number = Math.floor(Math.random() * 10001);
  bulk.insert( { "user_id": user_id, "name": name, "number": number } );
}
bulk.execute();
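The loop above requires a live connection for the bulk object, but the document shape itself can be sketched as a standalone function. Here makeDocs is a hypothetical helper, shown only to illustrate the data this tutorial inserts; it is not part of MongoDB or of the tutorial's commands.

```javascript
// Standalone sketch of the documents inserted above; no server needed.
// makeDocs is a hypothetical helper for illustration only.
function makeDocs(count) {
  var people = ["Marc", "Bill", "George", "Eliot", "Matt", "Trey",
                "Tracy", "Greg", "Steve", "Kristina", "Katie", "Jeff"];
  var docs = [];
  for (var i = 0; i < count; i++) {
    docs.push({
      user_id: i,                                          // sequential id
      name: people[Math.floor(Math.random() * people.length)],
      number: Math.floor(Math.random() * 10001)            // 0..10000 inclusive
    });
  }
  return docs;
}

console.log(makeDocs(3).length);  // 3
```

The number field is deliberately spread uniformly over 0..10000, which is what makes it a reasonable range shard key later in the tutorial.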

For more information on deploying a replica set, see Deploy a Replica Set.

Deploy Config Server Replica Set and mongos

This procedure deploys the three-member replica set for the config servers and the mongos.

  • The config servers use the following hosts: mongodb7.example.net, mongodb8.example.net, and mongodb9.example.net.

  • The mongos uses mongodb6.example.net.

1

Start a config server on mongodb7.example.net, mongodb8.example.net, and mongodb9.example.net. Specify the same replica set name. The config servers use the default data directory /data/configdb and the default port 27019.

Warning

Before binding to a non-localhost (e.g. publicly accessible) IP address, ensure you have secured your cluster from unauthorized access. For a complete list of security recommendations, see Security Checklist. At minimum, consider enabling authentication and hardening network infrastructure.

mongod --configsvr --replSet configReplSet --bind_ip localhost,<hostname(s)|ip address(es)>

To modify the default settings or to include additional options specific to your deployment, see mongod or Configuration File Options.

Connect mongosh to one of the config servers and run rs.initiate() to initiate the replica set.

Important

Run rs.initiate() on one and only one mongod instance for the replica set.

Important

To avoid configuration updates due to IP address changes, use DNS hostnames instead of IP addresses. It is particularly important to use a DNS hostname instead of an IP address when configuring replica set members or sharded cluster members.

Use hostnames instead of IP addresses to configure clusters across a split network horizon. Starting in MongoDB 5.0, nodes that are only configured with an IP address will fail startup validation and will not start.

rs.initiate( {
  _id: "configReplSet",
  configsvr: true,
  members: [
    { _id: 0, host: "mongodb7.example.net:27019" },
    { _id: 1, host: "mongodb8.example.net:27019" },
    { _id: 2, host: "mongodb9.example.net:27019" }
  ]
} )
2

On mongodb6.example.net, start the mongos specifying the config server replica set name followed by a slash / and at least one of the config server hostnames and ports.

mongos --configdb configReplSet/mongodb7.example.net:27019,mongodb8.example.net:27019,mongodb9.example.net:27019 --bind_ip localhost,<hostname(s)|ip address(es)>
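The same mongos settings can also be kept in a configuration file and started with mongos --config <path-to-config>; sharding.configDB is the file-based equivalent of --configdb. A minimal sketch, using this tutorial's config server hostnames (the bind addresses remain placeholders):

```yaml
sharding:
  configDB: configReplSet/mongodb7.example.net:27019,mongodb8.example.net:27019,mongodb9.example.net:27019
net:
  bindIp: localhost,<hostname(s)|ip address(es)>
```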

For sharded clusters, mongod instances for the shards must explicitly specify their role as a shardsvr, either via the configuration file setting sharding.clusterRole or via the command-line option --shardsvr.

Note

The default port for mongod instances with the shardsvr role is 27018. To use a different port, specify the net.port setting or the --port option.
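For file-based deployments, the shardsvr role corresponds to the sharding.clusterRole setting. A minimal sketch for an rs0 member, mirroring the command-line options used in this procedure (port 27017 and the bind-address placeholder match the tutorial's commands):

```yaml
sharding:
  clusterRole: shardsvr
replication:
  replSetName: "rs0"
net:
  port: 27017
  bindIp: localhost,<hostname(s)|ip address(es)>
```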

1

Connect mongosh to one of the members and run rs.status() to determine the primary and secondary members.

2

Shut down and restart each secondary, one at a time, with the --shardsvr option.

Warning

This step requires some downtime for applications connected to secondary members of the replica set. Applications connected to a secondary may error with CannotVerifyAndSignLogicalTime after you restart the secondary, until you perform the steps in Add Initial Replica Set as a Shard. Restarting the application also stops these errors.

To continue to use the same port, include the --port option. Include additional options, such as --bind_ip, as appropriate for your deployment.

Warning

Before binding to a non-localhost (e.g. publicly accessible) IP address, ensure you have secured your cluster from unauthorized access. For a complete list of security recommendations, see Security Checklist. At minimum, consider enabling authentication and hardening network infrastructure.

mongod --replSet "rs0" --shardsvr --port 27017 --bind_ip localhost,<hostname(s)|ip address(es)>

Include any other options as appropriate for your deployment. Repeat this step for the other secondary.

3

Connect mongosh to the primary and step down the primary.

Warning

This step requires some downtime. Applications may error with CannotVerifyAndSignLogicalTime after you step down the primary, until you perform the steps in Add Initial Replica Set as a Shard. Restarting the application also stops these errors.

rs.stepDown()
4

Shut down the primary and restart with the --shardsvr option.

To continue to use the same port, include the --port option.

mongod --replSet "rs0" --shardsvr --port 27017 --bind_ip localhost,<hostname(s)|ip address(es)>

Include any other options as appropriate for your deployment.

Add Initial Replica Set as a Shard

The following procedure adds the initial replica set rs0 as a shard.

1
Connect mongosh to the mongos:

mongosh mongodb6.example.net:27017/admin
2

Add a shard to the cluster with the sh.addShard() method:

sh.addShard( "rs0/mongodb0.example.net:27017,mongodb1.example.net:27017,mongodb2.example.net:27017" )

Warning

Once the new shard is active, mongosh and other clients must always connect to the mongos instance. Do not connect directly to the mongod instances. If your clients connect to shards directly, you may create data or metadata inconsistencies.

Add Second Shard

The following procedure deploys a new replica set rs1 for the second shard and adds it to the cluster. The replica set members are on the following hosts: mongodb3.example.net, mongodb4.example.net, and mongodb5.example.net.

For sharded clusters, mongod instances for the shards must explicitly specify their role as a shardsvr, either via the configuration file setting sharding.clusterRole or via the command-line option --shardsvr.

Note

The default port for mongod instances with the shardsvr role is 27018. To use a different port, specify the net.port setting or the --port option.

1

For each member, start a mongod, specifying the replica set name through the --replSet option and its role as a shard with the --shardsvr option. Specify additional options, such as --bind_ip, as appropriate.

Warning

Before binding to a non-localhost (e.g. publicly accessible) IP address, ensure you have secured your cluster from unauthorized access. For a complete list of security recommendations, see Security Checklist. At minimum, consider enabling authentication and hardening network infrastructure.

For replication-specific parameters, see Replication Options.

mongod --replSet "rs1" --shardsvr --port 27017 --bind_ip localhost,<hostname(s)|ip address(es)>

Repeat this step for the other two members of the rs1 replica set.

2

Connect mongosh to one member of the replica set (for example, mongodb3.example.net):

mongosh mongodb3.example.net
3

From mongosh, run rs.initiate() to initiate a replica set that consists of the current member.

Important

Run rs.initiate() on one and only one mongod instance for the replica set.

Important

To avoid configuration updates due to IP address changes, use DNS hostnames instead of IP addresses. It is particularly important to use a DNS hostname instead of an IP address when configuring replica set members or sharded cluster members.

Use hostnames instead of IP addresses to configure clusters across a split network horizon. Starting in MongoDB 5.0, nodes that are only configured with an IP address will fail startup validation and will not start.

rs.initiate( {
  _id : "rs1",
  members: [
    { _id: 0, host: "mongodb3.example.net:27017" },
    { _id: 1, host: "mongodb4.example.net:27017" },
    { _id: 2, host: "mongodb5.example.net:27017" }
  ]
})
4
Connect mongosh to the mongos:

mongosh mongodb6.example.net:27017/admin
5

From mongosh connected to the mongos, add the shard to the cluster with the sh.addShard() method:

sh.addShard( "rs1/mongodb3.example.net:27017,mongodb4.example.net:27017,mongodb5.example.net:27017" )
Shard a Collection

1

Connect mongosh to the mongos:

mongosh mongodb6.example.net:27017/admin
2

Determine the shard key for the collection you want to shard. The shard key determines how MongoDB distributes the documents between shards. Good shard keys:

  • have values that are evenly distributed among all documents,

  • group documents that are often accessed at the same time into contiguous chunks, and

  • allow for effective distribution of activity among shards.

For more information, see Choose a Shard Key.

This tutorial uses the number field as the shard key for test_collection.
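To illustrate how a range-based shard key distributes documents, the following sketch routes a number value to the chunk that owns it. routeToShard is a hypothetical helper, not a MongoDB API, and the chunk boundaries are taken from the example db.printShardingStatus() output later in this tutorial.

```javascript
// Sketch of range-based chunk routing on the "number" shard key.
// Each chunk owns shard-key values in [min, max); MinKey/MaxKey are
// modeled here with -Infinity/Infinity. Boundaries mirror the example
// db.printShardingStatus() output in this tutorial.
const chunks = [
  { min: -Infinity, max: 1195,     shard: "rs1" },
  { min: 1195,      max: 2394,     shard: "rs0" },
  { min: 2394,      max: 3596,     shard: "rs0" },
  { min: 3596,      max: 4797,     shard: "rs0" },
  { min: 4797,      max: 9588,     shard: "rs0" },
  { min: 9588,      max: Infinity, shard: "rs0" }
];

// Hypothetical helper: find the chunk whose range contains the value.
function routeToShard(number) {
  const chunk = chunks.find(c => number >= c.min && number < c.max);
  return chunk.shard;
}

console.log(routeToShard(100));   // "rs1"
console.log(routeToShard(5000));  // "rs0"
```

This is the same lookup the mongos performs against the cluster metadata when it routes an insert or a targeted query on the shard key.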

3

Before sharding a non-empty collection, create an index on the shard key.

use test
db.test_collection.createIndex( { number : 1 } )
4

In the test database, shard the test_collection, specifying number as the shard key.

use test
sh.shardCollection( "test.test_collection", { "number" : 1 } )

mongos uses "majority" for the write concern of the shardCollection command and its helper sh.shardCollection().

The method returns the status of the operation:

{ "collectionsharded" : "test.test_collection", "ok" : 1 }

The balancer redistributes chunks of documents when it next runs. As clients insert additional documents into this collection, the mongos routes the documents to the appropriate shard.

5

To confirm balancing activity, run db.stats() or db.printShardingStatus() in the test database.

use test
db.stats()
db.printShardingStatus()

Example output of db.stats():

{
  "raw" : {
    "rs0/mongodb0.example.net:27017,mongodb1.example.net:27017,mongodb2.example.net:27017" : {
      "db" : "test",
      "collections" : 1,
      "views" : 0,
      "objects" : 640545,
      "avgObjSize" : 70.83200339949052,
      "dataSize" : 45370913,
      "storageSize" : 50438144,
      "numExtents" : 0,
      "indexes" : 2,
      "indexSize" : 24502272,
      "ok" : 1,
      "$gleStats" : {
        "lastOpTime" : Timestamp(0, 0),
        "electionId" : ObjectId("7fffffff0000000000000003")
      }
    },
    "rs1/mongodb3.example.net:27017,mongodb4.example.net:27017,mongodb5.example.net:27017" : {
      "db" : "test",
      "collections" : 1,
      "views" : 0,
      "objects" : 359455,
      "avgObjSize" : 70.83259935179647,
      "dataSize" : 25461132,
      "storageSize" : 8630272,
      "numExtents" : 0,
      "indexes" : 2,
      "indexSize" : 8151040,
      "ok" : 1,
      "$gleStats" : {
        "lastOpTime" : Timestamp(0, 0),
        "electionId" : ObjectId("7fffffff0000000000000001")
      }
    }
  },
  "objects" : 1000000,
  "avgObjSize" : 70,
  "dataSize" : 70832045,
  "storageSize" : 59068416,
  "numExtents" : 0,
  "indexes" : 4,
  "indexSize" : 32653312,
  "fileSize" : 0,
  "extentFreeList" : {
    "num" : 0,
    "totalSize" : 0
  },
  "ok" : 1
}

Example output of db.printShardingStatus():

--- Sharding Status ---
  sharding version: {
    "_id" : 1,
    "minCompatibleVersion" : 5,
    "currentVersion" : 6,
    "clusterId" : ObjectId("5be0a488039b1964a7208c60")
  }
  shards:
    { "_id" : "rs0", "host" : "rs0/mongodb0.example.net:27017,mongodb1.example.net:27017,mongodb2.example.net:27017", "state" : 1 }
    { "_id" : "rs1", "host" : "rs1/mongodb3.example.net:27017,mongodb4.example.net:27017,mongodb5.example.net:27017", "state" : 1 }
  active mongoses:
    "3.6.8" : 1
  autosplit:
    Currently enabled: yes
  balancer:
    Currently enabled: yes
    Currently running: yes
    Collections with active migrations:
      test.test_collection started at Mon Nov 05 2018 15:16:45 GMT-0500
    Failed balancer rounds in last 5 attempts: 0
    Migration Results for the last 24 hours:
      1 : Success
  databases:
    { "_id" : "test", "primary" : "rs0", "partitioned" : true }
      test.test_collection
        shard key: { "number" : 1 }
        unique: false
        balancing: true
        chunks:
          rs0 5
          rs1 1
        { "number" : { "$minKey" : 1 } } -->> { "number" : 1195 } on : rs1 Timestamp(2, 0)
        { "number" : 1195 } -->> { "number" : 2394 } on : rs0 Timestamp(2, 1)
        { "number" : 2394 } -->> { "number" : 3596 } on : rs0 Timestamp(1, 5)
        { "number" : 3596 } -->> { "number" : 4797 } on : rs0 Timestamp(1, 6)
        { "number" : 4797 } -->> { "number" : 9588 } on : rs0 Timestamp(1, 1)
        { "number" : 9588 } -->> { "number" : { "$maxKey" : 1 } } on : rs0 Timestamp(1, 2)

Run these commands a second time to see that chunks are migrating from rs0 to rs1.
