[ClusterLabs] Stonith configuration

Strahil Nikolov hunter86_bg at yahoo.com
Fri Feb 14 13:06:59 EST 2020


On February 14, 2020 4:44:53 PM GMT+02:00, "BASDEN, ALASTAIR G." <a.g.basden at durham.ac.uk> wrote:
>Hi Strahil,
>corosync-cfgtool -s
>Printing ring status.
>Local node ID 1
>RING ID 0
> 	id	= 172.17.150.20
> 	status	= ring 0 active with no faults
>RING ID 1
> 	id	= 10.0.6.20
> 	status	= ring 1 active with no faults
>
>corosync-quorumtool -s
>Quorum information
>------------------
>Date:             Fri Feb 14 14:41:11 2020
>Quorum provider:  corosync_votequorum
>Nodes:            2
>Node ID:          1
>Ring ID:          1/96
>Quorate:          Yes
>
>Votequorum information
>----------------------
>Expected votes:   2
>Highest expected: 2
>Total votes:      2
>Quorum:           1
>Flags:            2Node Quorate WaitForAll
>
>Membership information
>----------------------
>     Nodeid      Votes Name
>          1          1 node1.primary.network (local)
>          2          1 node2.primary.network
>
>
>On the surviving node, the 10.0.6.21 interface flipflopped (though
>nothing detected on the other node), and that is what started it all off.
>
>We have no firewall running.
>
>Cheers,
>Alastair.
>
>
>On Fri, 14 Feb 2020, Strahil Nikolov wrote:
>
>> On February 14, 2020 12:41:58 PM GMT+02:00, "BASDEN, ALASTAIR G."
>> <a.g.basden at durham.ac.uk> wrote:
>>> Hi,
>>> I wonder whether anyone could give me some advice about a stonith
>>> configuration.
>>>
>>> We have 2 nodes, which form a HA cluster.
>>>
>>> These have 3 networks:
>>> A generic network over which they are accessed (eg ssh)
>>> (node1.primary.network, node2.primary.network)
>>> A directly connected cable between them (10.0.6.20, 10.0.6.21).
>>> A management network, on which ipmi is (172.16.150.20,
>>> 172.16.150.21)
>>>
>>> We have done:
>>> pcs cluster setup --name hacluster node1.primary.network,10.0.6.20
>>> node2.primary.network,10.0.6.21 --token 20000
>>> pcs cluster start --all
>>> pcs property set no-quorum-policy=ignore
>>> pcs property set stonith-enabled=true
>>> pcs property set symmetric-cluster=true
>>> pcs stonith create node1_ipmi fence_ipmilan ipaddr="172.16.150.20"
>>> lanplus=true login="root" passwd="password"
>>> pcmk_host_list="node1.primary.network" power_wait=10
>>> pcs stonith create node2_ipmi fence_ipmilan ipaddr="172.16.150.21"
>>> lanplus=true login="root" passwd="password"
>>> pcmk_host_list="node2.primary.network" power_wait=10
>>>
>>> /etc/corosync/corosync.conf has:
>>> totem {
>>>     version: 2
>>>     cluster_name: hacluster
>>>     secauth: off
>>>     transport: udpu
>>>     rrp_mode: passive
>>>     token: 20000
>>> }
>>>
>>> nodelist {
>>>     node {
>>>         ring0_addr: node1.primary.network
>>>         ring1_addr: 10.0.6.20
>>>         nodeid: 1
>>>     }
>>>
>>>     node {
>>>         ring0_addr: node2.primary.network
>>>         ring1_addr: 10.0.6.21
>>>         nodeid: 2
>>>     }
>>> }
>>>
>>> quorum {
>>>     provider: corosync_votequorum
>>>     two_node: 1
>>> }
>>>
>>> logging {
>>>     to_logfile: yes
>>>     logfile: /var/log/cluster/corosync.log
>>>     to_syslog: no
>>> }
>>>
>>>
>>> What I find is that if there is a problem with the directly connected
>>> cable, the nodes stonith each other, even though the generic network
>>> is fine.
>>>
>>> What I would expect is that they would only shoot each other when
>>> both networks are down (generic and directly connected).
>>>
>>> Any ideas?
>>>
>>> Thanks,
>>> Alastair.
>>> _______________________________________________
>>> Manage your subscription:
>>> https://lists.clusterlabs.org/mailman/listinfo/users
>>>
>>> ClusterLabs home: https://www.clusterlabs.org/
>>
>> What is the output of:
>> corosync-cfgtool -s
>> corosync-quorumtool -s
>>
>> Also check the logs of the surviving node for clues.
>>
>> What about the firewall?
>> Have you enabled the 'high-availability' service in firewalld for all
>> zones covering your interfaces?
>>
>> Best Regards,
>> Strahil Nikolov
>>
>>

One thing that comes to mind is that you have a 20s token, but consensus is left at the default; it should be at least token * 1.2, i.e. 24000 (24s).
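
A minimal sketch of the totem section with an explicit consensus value (assuming you keep the 20s token; adjust to your environment):

totem {
    version: 2
    cluster_name: hacluster
    secauth: off
    transport: udpu
    rrp_mode: passive
    token: 20000
    consensus: 24000
}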

That can be done live with some caution. Set the cluster to maintenance mode, reload corosync (or, even better, stop and start the cluster stack), and then run 'crm_simulate' to verify what will happen when you remove the maintenance.
Last, remove the maintenance mode if the simulation doesn't show any pending action.
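
Roughly, the sequence could look like this (a sketch only; the exact crm_simulate options depend on your pacemaker version):

pcs property set maintenance-mode=true
# edit /etc/corosync/corosync.conf on both nodes, then restart the stack
pcs cluster stop --all && pcs cluster start --all
crm_simulate -sL    # -L uses the live CIB, -s shows allocation scores
pcs property set maintenance-mode=false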

Corosync seems OK, but you should consider whether you really need 'WaitForAll'.
If both nodes fail (a power failure, for example), you need to power up both before the cluster will start any resource.
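
The flag is enabled implicitly by two_node: 1; if you decide you do not want that behaviour, it can be overridden in the quorum section, for example:

quorum {
    provider: corosync_votequorum
    two_node: 1
    wait_for_all: 0
}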

There is a chance that the primary network had an issue at the same time, but that can only be confirmed from the logs.

If you feel you can share the logs, send a link; otherwise you will have to analyse them yourself.
Keep in mind that the DC node has the more comprehensive logs, but if the DC was the server that got fenced, check both servers.

Note: Check whether the fencing agent has an option for a delay, then decide which node hosts the more important resources and configure the delays so that the important node gets fenced second.
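
With fence_ipmilan that would be the 'delay' parameter. For example, if node1 hosts the more important resources, delaying the device that fences node1 gives node1 a head start in a fence race (a sketch; pick a value that suits your environment):

pcs stonith update node1_ipmi delay=15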

Note2: Consider adding a third node (for example a VM) or a qdevice on a separate machine (it can even sit on a separate network, so simple routing is the only requirement) and reconfigure the cluster so that you have 'Expected votes: 3'.
This will protect you from split brain and is highly recommended.
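
A rough outline with qdevice (assuming a third host called 'arbiter' reachable from both nodes; package names and options may differ between distributions):

# on the arbiter host (needs the corosync-qnetd package)
pcs qdevice setup model net --enable --start

# on one of the cluster nodes (corosync-qdevice must be installed on both nodes)
pcs quorum device add model net host=arbiter algorithm=ffsplit
pcs quorum status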

P.S.: Sorry for the long post :D


Best Regards,
Strahil Nikolov

