Good day, colleagues!
We use a highly loaded MongoDB database in a ReplicaSet configuration, which includes one primary node, 3 secondary nodes and one arbiter.
One of the secondary nodes entered the recovering state, and at the same time, another secondary node was stopped. Multiple “slow query” type errors were recorded. We are using version 4.4.3. What configuration should we use to avoid such problems, and are there any best practices for configuring the database?
We use a highly loaded MongoDB database in a ReplicaSet configuration, which includes one primary node, 3 secondary nodes and one arbiter.
So you have 5 nodes: PSSSA. Assuming all are voting members, you need 3 data-bearing nodes available to acknowledge majority commits. Your arbiter casts a vote in elections to help elect or sustain a primary, but cannot acknowledge writes.
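To make the arithmetic concrete, here is a small Python sketch. It is not MongoDB code: the function names and the one-letter-per-member notation are illustrative assumptions, and it assumes every member has exactly one vote.

```python
def write_majority(topology: str) -> int:
    """Number of voting members that make up a majority.

    `topology` uses one letter per voting member:
    P (primary), S (secondary), A (arbiter).
    Assumption: every member listed has one vote.
    """
    return len(topology) // 2 + 1


def data_bearing(topology: str) -> int:
    """Count members that hold data and can therefore acknowledge writes."""
    return sum(1 for member in topology if member in "PS")


print(write_majority("PSSSA"))  # 3
print(data_bearing("PSSSA"))    # 4
```

With all members healthy there are four data-bearing nodes for a majority of three, so the set can lose only one of them before majority acknowledgement is at risk.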
One of the secondary nodes entered the recovering state, and at the same time, another secondary node was stopped. Multiple “slow query” type errors were recorded.
In this degraded state, your configuration would be PS__A. That leaves you with only two nodes that can acknowledge writes, one short of the three needed for a majority. It will also create cache pressure because the majority commit point cannot advance until another secondary is available…
My recommendation would be to replace the arbiter with another data-bearing member.
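As a rough illustration of why this helps, the following Python sketch compares the degraded state with and without the arbiter. The function name and the topology notation are hypothetical, every member is assumed to have one vote, and the recovering member is treated as unavailable, as in the description above.

```python
def can_ack_majority(topology: str) -> bool:
    """True if enough data-bearing members are up to acknowledge majority writes.

    One character per voting member:
    P or S = available data-bearing member, A = arbiter,
    _ = unavailable member (stopped, or treated as unavailable while recovering).
    """
    majority = len(topology) // 2 + 1
    available_data_bearing = sum(1 for member in topology if member in "PS")
    return available_data_bearing >= majority


print(can_ack_majority("PS__A"))  # False: 2 data-bearing members up, 3 needed
print(can_ack_majority("PS__S"))  # True: the arbiter's slot now holds data
```

With a fifth data-bearing member in place of the arbiter, the same double failure still leaves three nodes able to acknowledge writes, so the majority commit point can keep advancing.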