[Pacemaker] Master/Slave failover during reboot

Eliot Gable egable at broadvox.com
Mon Jul 20 18:08:29 EDT 2009

I guess I lied. I was actually running 1.0.3, not 1.0.4 as I said in my original message below. After upgrading to 1.0.4, I do not seem to have this problem anymore. At least, my first test resulted in exactly what I would expect: the slave took over immediately when I issued a reboot command on the master. I will run some more tests, and if I see it happening again, I will E-mail the list.

From: Eliot Gable
Sent: Monday, July 20, 2009 3:27 PM
To: 'pacemaker at oss.clusterlabs.org'
Subject: Master/Slave failover during reboot

I have a resource that is configured as a Master/Slave resource. If I kill a resource it depends on, it properly fails over to the other node. However, if I reboot the master node, it does not fail over. What I see is that the master node switches to UNCLEAN - Offline, the master resource stops running (crm_mon shows only the slave node running), and then the cluster just sits there until the master node finishes booting. Once the rebooted node re-joins the cluster, the cluster works out which node should be master and which should be slave, and one of them gets promoted. The entire time, the partition says it has quorum. This is Pacemaker 1.0.4.
Is this expected behavior? Is there any rule or constraint I can add that would detect the reboot and cause a failover so that the slave is promoted to master before the other node finishes rebooting?

Thanks for any suggestions.

