Upgrade from 4.0 to 4.2 without downtime in PSA replicaset

Hi @Ishrat_Jahan,

After we stepped down the primary, there was an issue with the old primary, and it took a long time for this instance to be back up and running.

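The main issue here is that a PSA setup will always have issues with read and write majority when one of the data-bearing nodes goes offline. Possible symptoms include writes with "majority" write concern waiting or timing out, and increased cache pressure on the primary while majority read concern is enabled.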
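To make the majority arithmetic concrete, here is a small Python sketch. It does not talk to MongoDB at all, and the function names are hypothetical ones made up for illustration. It assumes each member has one vote, that the arbiter counts as a voting member but cannot acknowledge writes because it holds no data, and that the write majority is the smaller of "majority of voting members" and "number of data-bearing voting members".

```python
def voting_majority(voting_members: int) -> int:
    """Smallest number of members that is more than half of the voters."""
    return voting_members // 2 + 1


def write_majority(voting_members: int, data_bearing_voting: int) -> int:
    """Number of data-bearing members that must acknowledge a "majority" write."""
    return min(voting_majority(voting_members), data_bearing_voting)


def can_ack_majority_write(data_bearing_up: int,
                           voting_members: int,
                           data_bearing_voting: int) -> bool:
    """True if enough data-bearing members are online to acknowledge the write."""
    return data_bearing_up >= write_majority(voting_members, data_bearing_voting)


# PSA: 3 voting members, 2 of them data-bearing, so the write majority is 2.
print(write_majority(voting_members=3, data_bearing_voting=2))  # 2

# Primary and secondary both up: majority writes can be acknowledged.
print(can_ack_majority_write(2, voting_members=3, data_bearing_voting=2))  # True

# Secondary offline: only the primary holds data, so they cannot.
print(can_ack_majority_write(1, voting_members=3, data_bearing_voting=2))  # False
```

In other words, the arbiter keeps the set able to elect a primary, but it does not help a "majority" write get acknowledged.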

And after some time there was a performance issue on the new primary; some writes also started timing out, and the overall response time increased.

Can you confirm whether the SECONDARY was up while the new PRIMARY was up when this issue happened? I.e. PSA.
Or was it a case where the SECONDARY was offline and the new PRIMARY was up? I.e. PXA (where X is an offline node, specifically the SECONDARY in this scenario).
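If it helps to tell the two cases apart, the member list returned by rs.status() (the replSetGetStatus command) reports a stateStr and a health value for each member. Below is a rough Python sketch that turns such a document into the PSA / PXA shorthand used above. The function name classify_topology and the sample documents (host names included) are made up for illustration and contain only the fields the sketch reads; a real status document has many more.

```python
def classify_topology(status: dict) -> str:
    """Summarise a replSetGetStatus-style document as letters such as PSA or PXA.

    P = primary, S = secondary, A = arbiter, X = member reported as unhealthy,
    ? = any other state (for example STARTUP2 or RECOVERING).
    """
    letters = []
    for member in status["members"]:
        if member.get("health", 0) != 1:
            letters.append("X")
        elif member.get("stateStr") in ("PRIMARY", "SECONDARY", "ARBITER"):
            letters.append(member["stateStr"][0])
        else:
            letters.append("?")
    # Show the primary first and the arbiter last, everything else in between.
    position = {"P": 0, "A": 2}
    return "".join(sorted(letters, key=lambda letter: position.get(letter, 1)))


# Hypothetical sample: all three members healthy.
healthy = {"members": [
    {"name": "host1:27017", "stateStr": "PRIMARY", "health": 1},
    {"name": "host2:27017", "stateStr": "SECONDARY", "health": 1},
    {"name": "host3:27017", "stateStr": "ARBITER", "health": 1},
]}

# Hypothetical sample: the old primary has not come back yet.
degraded = {"members": [
    {"name": "host1:27017", "stateStr": "(not reachable/healthy)", "health": 0},
    {"name": "host2:27017", "stateStr": "PRIMARY", "health": 1},
    {"name": "host3:27017", "stateStr": "ARBITER", "health": 1},
]}

print(classify_topology(healthy))   # PSA
print(classify_topology(degraded))  # PXA
```

Note that an unhealthy member shows up as X regardless of its role, so this only distinguishes "all up" from "something down"; the replica set configuration is still needed to know which role the offline member had.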

In the next upgrade, i.e. from 4.0 to 4.2, we might face this issue again. Is there a way to do this upgrade without read and write failures?

Unfortunately, due to the nature of a PSA set, you will encounter this issue if any data-bearing node is offline for an extended period of time. However, there are workarounds for this, as shown in the following procedure.

Lastly, regarding the PSA setup, you may find Stennie’s response on this topic useful, specifically the part on “Should you add an Arbiter?”.

Regards,
Jason
