[ClusterLabs] Cluster with two STONITH devices
Dejan Muhamedagic
dejanmm at fastmail.fm
Fri Apr 10 12:39:55 UTC 2015
Hi,
On Wed, Apr 08, 2015 at 04:54:09PM +0100, Jorge Lopes wrote:
> Hi all.
>
> I'm having difficulties orchestrating two STONITH devices in my cluster. I
> have been struggling with this for the past few days and I need some help,
> please.
>
> A simplified version of my cluster and its goals is as follows:
> - The cluster has two physical servers, each with two nodes (VMware virtual
> machines): overall, there are 4 nodes in this simplified version.
> - There are two resource groups: group-cluster-a and group-cluster-b.
> - To achieve a good CPU balance across the physical servers, the cluster is
> asymmetric, with one group running on one server and the other group
> running on the other server.
> - If the VM of one host becomes unusable, then its resources are started
> on its sister VM deployed on the other physical host.
> - If one physical host becomes unusable, then all resources are started
> on the other physical host.
> - Two STONITH levels are used to fence the problematic nodes.
>
> The resources have the following behavior:
> - If the resource monitor detects a problem, then Pacemaker tries to
> restart the resource on the same node.
> - If that fails, then STONITH takes place (vCenter reboots the VM) and
> Pacemaker starts the resource on the sister VM present on the other
> physical host.
> - If restarting the VM fails, I want the physical server to be powered off
> and Pacemaker to start all resources on the other physical host.
>
>
> The HA stack is:
> Ubuntu 14.04 (the node OS, which is a virtualized guest running on VMware
> ESXi 5.5)
> Pacemaker 1.1.12
> Corosync 2.3.4
> CRM 2.1.2
>
> The 4 nodes are:
> cluster-a-1
> cluster-a-2
> cluster-b-1
> cluster-b-2
>
> The relevant configuration is:
>
> property symmetric-cluster=false
> property stonith-enabled=true
> property no-quorum-policy=stop
>
> group group-cluster-a vip-cluster-a docker-web
> location loc-group-cluster-a-1 group-cluster-a inf: cluster-a-1
> location loc-group-cluster-a-2 group-cluster-a 500: cluster-a-2
>
> group group-cluster-b vip-cluster-b docker-srv
> location loc-group-cluster-b-1 group-cluster-b 500: cluster-b-1
> location loc-group-cluster-b-2 group-cluster-b inf: cluster-b-2
>
>
> # stonith vcenter definitions for host 1
> # runs on any of the host2 VMs
> primitive stonith-vcenter-host1 stonith:external/vcenter \
> params \
> VI_SERVER="192.168.40.20" \
> VI_CREDSTORE="/etc/vicredentials.xml" \
> HOSTLIST="cluster-a-1=cluster-a-1;cluster-a-2=cluster-a-2" \
> RESETPOWERON="1" \
> priority="2" \
priority is deprecated. Not sure if it will work.
> pcmk_host_check="static-list" \
> pcmk_host_list="cluster-a-1 cluster-a-2" \
Normally, you shouldn't need the pcmk_* attributes for LHA
stonith agents. Please try without them.
> op monitor interval="60s"
I'd suggest increasing the interval (60m or 2h).
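E.g. the first device would then be reduced to something like this (an
untested sketch, just with the above suggestions applied):

primitive stonith-vcenter-host1 stonith:external/vcenter \
    params \
        VI_SERVER="192.168.40.20" \
        VI_CREDSTORE="/etc/vicredentials.xml" \
        HOSTLIST="cluster-a-1=cluster-a-1;cluster-a-2=cluster-a-2" \
        RESETPOWERON="1" \
    op monitor interval="2h"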
> location loc1-stonith-vcenter-host1 stonith-vcenter-host1 500: cluster-b-1
> location loc2-stonith-vcenter-host1 stonith-vcenter-host1 501: cluster-b-2
>
> # stonith vcenter definitions for host 2
> # runs on any of the host1 VMs
> primitive stonith-vcenter-host2 stonith:external/vcenter \
> params \
> VI_SERVER="192.168.40.21" \
> VI_CREDSTORE="/etc/vicredentials.xml" \
> HOSTLIST="cluster-b-1=cluster-b-1;cluster-b-2=cluster-b-2" \
> RESETPOWERON="1" \
> priority="2" \
> pcmk_host_check="static-list" \
> pcmk_host_list="cluster-b-1 cluster-b-2" \
> op monitor interval="60s"
>
> location loc1-stonith-vcenter-host2 stonith-vcenter-host2 500: cluster-a-1
> location loc2-stonith-vcenter-host2 stonith-vcenter-host2 501: cluster-a-2
>
>
> # stonith IPMI definitions for host 1 (DELL with iDRAC 7 Enterprise
> # interface at 192.168.40.15)
> # runs on any of the host2 VMs
> primitive stonith-ipmi-host1 stonith:external/ipmi \
> params hostname="host1" ipaddr="192.168.40.15" userid="root"
> passwd="mypassword" interface="lanplus" \
> priority="1" \
> pcmk_host_check="static-list" \
> pcmk_host_list="cluster-a-1 cluster-a-2" \
> op start interval="0" timeout="60s" requires="nothing" \
> op monitor interval="3600s" timeout="20s" requires="nothing"
>
> location loc1-stonith-ipmi-host1 stonith-ipmi-host1 500: cluster-b-1
> location loc2-stonith-ipmi-host1 stonith-ipmi-host1 501: cluster-b-2
>
>
> # stonith IPMI definitions for host 2 (DELL with iDRAC 7 Enterprise
> # interface at 192.168.40.16)
> # runs on any of the host1 VMs
> primitive stonith-ipmi-host2 stonith:external/ipmi \
> params hostname="host2" ipaddr="192.168.40.16" userid="root"
> passwd="mypassword" interface="lanplus" \
> priority="1" \
> pcmk_host_check="static-list" \
> pcmk_host_list="cluster-b-1 cluster-b-2" \
> op start interval="0" timeout="60s" requires="nothing" \
> op monitor interval="3600s" timeout="20s" requires="nothing"
>
> location loc1-stonith-ipmi-host2 stonith-ipmi-host2 500: cluster-a-1
> location loc2-stonith-ipmi-host2 stonith-ipmi-host2 501: cluster-a-2
Try with something like this:
fencing_topology \
node1: stonith-resources-that-can-manage-node1 \
node2: stonith-resources-that-can-manage-node2 \
...
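For instance, with your node names and stonith resources that would be
something along these lines (untested; for each node the first device is
tried first, the next one only if that fails):

fencing_topology \
    cluster-a-1: stonith-vcenter-host1 stonith-ipmi-host1 \
    cluster-a-2: stonith-vcenter-host1 stonith-ipmi-host1 \
    cluster-b-1: stonith-vcenter-host2 stonith-ipmi-host2 \
    cluster-b-2: stonith-vcenter-host2 stonith-ipmi-host2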
Please see crm configure help fencing_topology
HTH,
Dejan
>
> What is working:
> - When an error is detected in one resource, the resource restarts on the
> same node, as expected.
> - With the STONITH external/ipmi resource *stopped*, a failure in one node
> makes vCenter reboot it and the resources start on the sister node.
>
>
> What is not so good:
> - When vCenter reboots one node, the resources start on the other node
> as expected, but they return to the original node as soon as it comes back
> online. This causes a bit of ping-pong, and I think it is a consequence of
> how the location constraints are defined. Any suggestion to avoid this? After
> a resource has been moved to another node, I would prefer that it stays there
> instead of returning to the original node. I can think of playing with
> the resource affinity scores - is this the way it should be done?
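Resource stickiness is usually what keeps resources where they are after a
failover. A minimal sketch, assuming the preference scores are made finite
so that stickiness can outweigh them (the actual values need tuning):

rsc_defaults resource-stickiness=1000
location loc-group-cluster-a-1 group-cluster-a 1000: cluster-a-1
location loc-group-cluster-a-2 group-cluster-a 500: cluster-a-2

With scores like these, a group that failed over to cluster-a-2 stays there,
because 500 plus the stickiness of 1000 beats the 1000 of the preferred node.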
>
> What is wrong:
> Let's consider this scenario.
> I have a set of resources provided by a docker agent. My test consists of
> stopping the docker service on node cluster-a-1, which makes the docker
> agent return OCF_ERR_INSTALLED to Pacemaker (this is a change I made to
> the docker agent, compared to the GitHub repository version). With the
> IPMI STONITH resource stopped, this leads to node cluster-a-1 being
> restarted, which is expected.
>
> But with the IPMI STONITH resource started, I notice erratic behavior:
> - Sometimes, the resources on node cluster-a-1 are stopped and no
> STONITH happens. Also, the resources are not moved to node cluster-a-2.
> In this situation, if I manually restart node cluster-a-1 (virtual
> machine restart), then the IPMI STONITH takes place and restarts the
> corresponding physical server.
> - Sometimes, the IPMI STONITH runs before the vCenter STONITH, which is
> not expected because the vCenter STONITH has higher priority.
>
> I have also noticed this pattern (with both STONITH resources running):
> 1. With the cluster running without errors, I run "stop docker" on node
> cluster-a-1.
> 2. This leads the vCenter STONITH to act as expected.
> 3. After the cluster is running again without errors, I run "stop
> docker" on node cluster-a-1 again.
> 4. Now, the vCenter STONITH doesn't run and, instead, it is the IPMI
> STONITH that runs. This is unexpected for me, as I was expecting the
> vCenter STONITH to run again.
>
> I might have something wrong in my stonith definitions, but I can't figure
> out what.
> Any idea how to correct this?
>
> And how can I set external/ipmi to power off the physical host, instead of
> rebooting it?
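One possibility, assuming your Pacemaker version supports the
pcmk_reboot_action stonith parameter, is to map the reboot request to "off"
for the IPMI devices only, e.g. (untested):

primitive stonith-ipmi-host1 stonith:external/ipmi \
    params hostname="host1" ipaddr="192.168.40.15" userid="root" \
        passwd="mypassword" interface="lanplus" \
        pcmk_reboot_action="off" \
    op monitor interval="3600s" timeout="20s"

There is also the cluster-wide stonith-action property, but that would apply
to the vCenter devices as well.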
> _______________________________________________
> Users mailing list: Users at clusterlabs.org
> http://clusterlabs.org/mailman/listinfo/users
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org