[ClusterLabs] host in standby causes havoc

Kadlecsik József kadlecsik.jozsef at wigner.hu
Thu Jun 15 06:58:07 EDT 2023


Hello,

We had a strange issue here: in a 7-node cluster, one node was put into 
standby mode to test a new iSCSI setting on it. While the machine was being 
configured it was rebooted, and after the reboot iSCSI didn't come up. That 
led to malformed communication with the cluster (atlas5 is the node in 
standby):

Jun 15 10:10:13 atlas0 pacemaker-schedulerd[7153]:  warning: Unexpected result (error) was recorded for probe of ocsi on atlas5 at Jun 15 10:09:32 2023
Jun 15 10:10:13 atlas0 pacemaker-schedulerd[7153]:  notice: If it is not possible for ocsi to run on atlas5, see the resource-discovery option for location constraints
Jun 15 10:10:13 atlas0 pacemaker-schedulerd[7153]:  error: Resource ocsi is active on 2 nodes (attempting recovery)

The resource was definitely not active on 2 nodes, yet this triggered a 
storm in which all virtual machines running as resources were killed.
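
To spot the same pattern in a longer log, something like the following 
Python sketch might help. It is only an illustration: the regular 
expressions are guesses based solely on the messages quoted above, not on 
any documented Pacemaker log format, and the names scan, PROBE_RE and 
MULTI_RE are made up for this example. It expects each log entry on a 
single, unwrapped line.

```python
import re

# ASSUMPTION: these patterns are derived only from the log lines quoted in
# this mail; they are not taken from any documented Pacemaker log format.
PROBE_RE = re.compile(
    r"Unexpected result \((?P<result>[^)]+)\) was recorded for probe of "
    r"(?P<resource>\S+) on (?P<node>\S+)"
)
MULTI_RE = re.compile(
    r"Resource (?P<resource>\S+) is active on (?P<count>\d+) nodes"
)


def scan(lines):
    """Return (failed_probes, multi_active) found in unwrapped log lines."""
    failed_probes = []  # (resource, node, result) for each failed probe
    multi_active = []   # (resource, node_count) for each "active on N nodes"
    for line in lines:
        m = PROBE_RE.search(line)
        if m:
            failed_probes.append((m["resource"], m["node"], m["result"]))
        m = MULTI_RE.search(line)
        if m:
            multi_active.append((m["resource"], int(m["count"])))
    return failed_probes, multi_active


# Two of the entries quoted above, joined back into single lines.
SAMPLE = [
    "Jun 15 10:10:13 atlas0 pacemaker-schedulerd[7153]:  warning: Unexpected "
    "result (error) was recorded for probe of ocsi on atlas5 at "
    "Jun 15 10:09:32 2023",
    "Jun 15 10:10:13 atlas0 pacemaker-schedulerd[7153]:  error: Resource ocsi "
    "is active on 2 nodes (attempting recovery)",
]

if __name__ == "__main__":
    probes, multi = scan(SAMPLE)
    print(probes)
    print(multi)
```

A failed probe on a standby node followed shortly by an "active on 2 nodes" 
error for the same resource is the combination seen here.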

How can such cases be prevented?

Best regards,
Jozsef
--
E-mail : kadlecsik.jozsef at wigner.hu
PGP key: https://wigner.hu/~kadlec/pgp_public_key.txt
Address: Wigner Research Centre for Physics
         H-1525 Budapest 114, POB. 49, Hungary

