[ClusterLabs] stonith in dual HMC environment
Digimer
lists at alteeve.ca
Tue Mar 21 03:09:19 EDT 2017
On 20/03/17 12:22 PM, Alexander Markov wrote:
> Hello guys,
>
> it looks like I'm missing something obvious, but I just don't get what
> happened.
>
> I've got a number of stonith-enabled clusters within my big POWER boxes.
> My stonith devices are two HMCs (hardware management consoles), one per
> datacenter. These are separate servers from IBM that can reboot
> individual LPARs (logical partitions) within the POWER boxes.
>
> So my definition for stonith devices was pretty straightforward:
>
> primitive st_dc2_hmc stonith:ibmhmc \
>     params ipaddr=10.1.2.9
> primitive st_dc1_hmc stonith:ibmhmc \
>     params ipaddr=10.1.2.8
> clone cl_st_dc2_hmc st_dc2_hmc
> clone cl_st_dc1_hmc st_dc1_hmc
>
> Everything was OK when we tested failover. But today, during a power
> outage, we lost one DC completely. Shortly after that, the cluster
> literally hung while trying to reboot the nonexistent node. No failover
> occurred. The nonexistent node was marked OFFLINE UNCLEAN and resources
> were marked "Started UNCLEAN" on the nonexistent node.
>
> UNCLEAN seems to flag a problem with the stonith configuration. So my
> question is: how can I avoid such behaviour?
>
> Thank you!
Please share your config along with the logs from the nodes that were
affected.
cheers,
digimer
--
Digimer
Papers and Projects: https://alteeve.com/w/
"I am, somehow, less interested in the weight and convolutions of
Einstein’s brain than in the near certainty that people of equal talent
have lived and died in cotton fields and sweatshops." - Stephen Jay Gould