[ClusterLabs] How to unfence without reboot (fence_mpath)

Strahil Nikolov hunter86_bg at yahoo.com
Mon Feb 17 09:18:57 EST 2020


On February 17, 2020 3:36:27 PM GMT+02:00, Ondrej <ondrej-clusterlabs at famera.cz> wrote:
>Hello Strahil,
>
>On 2/17/20 3:39 PM, Strahil Nikolov wrote:
>> Hello Ondrej,
>> 
>> thanks for your reply. I really appreciate that.
>> 
>> I have picked fence_mpath as I'm preparing for my EX436 and I can't know which agent will be useful on the exam.
>> Also, according to https://access.redhat.com/solutions/3201072, there could be a race condition with fence_scsi.
>
>I believe the exam is about testing knowledge of configuration, not
>knowledge of which race-condition bugs are present and how to handle
>them :)
>If you have access to the learning materials for the EX436 exam, I would
>recommend trying those out - they have labs and comprehensive review
>exercises that are useful in preparation for the exam.
>
>> So, I've checked the cluster when fencing and the node immediately goes offline.
>> Last messages from pacemaker are:
>> <snip>
>> Feb 17 08:21:57 node1.localdomain stonith-ng[23808]:   notice: Client stonith_admin.controld.23888.b57ceee7 wants to fence (reboot) 'node1.localdomain' with device '(any)'
>> Feb 17 08:21:57 node1.localdomain stonith-ng[23808]:   notice: Requesting peer fencing (reboot) of node1.localdomain
>> Feb 17 08:21:57 node1.localdomain stonith-ng[23808]:   notice: FENCING can fence (reboot) node1.localdomain (aka. '1'): static-list
>> Feb 17 08:21:58 node1.localdomain stonith-ng[23808]:   notice: Operation reboot of node1.localdomain by node2.localdomain for stonith_admin.controld.23888@node1.localdomain.ede38ffb: OK
>- This part looks OK - meaning the fencing looks like a success.
>> Feb 17 08:21:58 node1.localdomain crmd[23812]:     crit: We were allegedly just fenced by node2.localdomain for node1.localdomai
>- this is also normal, as the node just announces that it was fenced by the other node
>
>> <snip>
>> 
>> Which for me means - node1 just got fenced again. Actually fencing works, as I/O is immediately blocked and the reservation is removed.
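>- side note on the subject line: with fence_mpath you should not need a
>reboot to get back. Once the cause is resolved, the key can be registered
>again either by the cluster itself (see below) or by hand. A rough sketch,
>run on the fenced node, using the device and the key '1' that your
>pcmk_host_map assigns to node1:
>
>  fence_mpath --action=on --key=1 \
>      --devices=/dev/mapper/36001405cb123d0000000000000000000
>
>(this assumes multipathd is running and the node's reservation_key is set
>in /etc/multipath.conf, as the solution you linked describes)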
>> 
>> I've used https://access.redhat.com/solutions/2766611 to set up fence_mpath, but I could have messed up something.
>- note related to the exam: you will not have Internet access on the exam,
>so I would expect that you would have to configure something that does not
>require access to this (and as Dan Swartzendruber pointed out in another
>email - we cannot* even see RH links without an account)
>
>* you can get a free developer account to read them, but ideally that
>should not be needed and it is certainly inconvenient for a wide public
>audience
>
>> 
>> Cluster config is:
>> [root@node3 ~]# pcs config show
>> Cluster Name: HACLUSTER2
>> Corosync Nodes:
>>   node1.localdomain node2.localdomain node3.localdomain
>> Pacemaker Nodes:
>>   node1.localdomain node2.localdomain node3.localdomain
>> 
>> Resources:
>>   Clone: dlm-clone
>>    Meta Attrs: interleave=true ordered=true
>>    Resource: dlm (class=ocf provider=pacemaker type=controld)
>>     Operations: monitor interval=30s on-fail=fence (dlm-monitor-interval-30s)
>>                 start interval=0s timeout=90 (dlm-start-interval-0s)
>>                 stop interval=0s timeout=100 (dlm-stop-interval-0s)
>>   Clone: clvmd-clone
>>    Meta Attrs: interleave=true ordered=true
>>    Resource: clvmd (class=ocf provider=heartbeat type=clvm)
>>     Operations: monitor interval=30s on-fail=fence (clvmd-monitor-interval-30s)
>>                 start interval=0s timeout=90s (clvmd-start-interval-0s)
>>                 stop interval=0s timeout=90s (clvmd-stop-interval-0s)
>>   Clone: TESTGFS2-clone
>>    Meta Attrs: interleave=true
>>    Resource: TESTGFS2 (class=ocf provider=heartbeat type=Filesystem)
>>     Attributes: device=/dev/TEST/gfs2 directory=/GFS2 fstype=gfs2 options=noatime run_fsck=no
>>     Operations: monitor interval=15s on-fail=fence OCF_CHECK_LEVEL=20 (TESTGFS2-monitor-interval-15s)
>>                 notify interval=0s timeout=60s (TESTGFS2-notify-interval-0s)
>>                 start interval=0s timeout=60s (TESTGFS2-start-interval-0s)
>>                 stop interval=0s timeout=60s (TESTGFS2-stop-interval-0s)
>> 
>> Stonith Devices:
>>   Resource: FENCING (class=stonith type=fence_mpath)
>>    Attributes: devices=/dev/mapper/36001405cb123d0000000000000000000 pcmk_host_argument=key pcmk_host_map=node1.localdomain:1;node2.localdomain:2;node3.localdomain:3 pcmk_monitor_action=metadata pcmk_reboot_action=off
>>    Meta Attrs: provides=unfencing
>>    Operations: monitor interval=60s (FENCING-monitor-interval-60s)
>> Fencing Levels:
>> 
>> Location Constraints:
>> Ordering Constraints:
>>    start dlm-clone then start clvmd-clone (kind:Mandatory) (id:order-dlm-clone-clvmd-clone-mandatory)
>>    start clvmd-clone then start TESTGFS2-clone (kind:Mandatory) (id:order-clvmd-clone-TESTGFS2-clone-mandatory)
>> Colocation Constraints:
>>    clvmd-clone with dlm-clone (score:INFINITY) (id:colocation-clvmd-clone-dlm-clone-INFINITY)
>>    TESTGFS2-clone with clvmd-clone (score:INFINITY) (id:colocation-TESTGFS2-clone-clvmd-clone-INFINITY)
>> Ticket Constraints:
>> 
>> Alerts:
>>   No alerts defined
>> 
>> Resources Defaults:
>>   No defaults set
>> 
>> [root@node3 ~]# crm_mon -r1
>> Stack: corosync
>> Current DC: node3.localdomain (version 1.1.20-5.el7_7.2-3c4c782f70) - partition with quorum
>> Last updated: Mon Feb 17 08:39:30 2020
>> Last change: Sun Feb 16 18:44:06 2020 by root via cibadmin on node1.localdomain
>> 
>> 3 nodes configured
>> 10 resources configured
>> 
>> Online: [ node2.localdomain node3.localdomain ]
>> OFFLINE: [ node1.localdomain ]
>> 
>> Full list of resources:
>> 
>>   FENCING        (stonith:fence_mpath):  Started node2.localdomain
>>   Clone Set: dlm-clone [dlm]
>>       Started: [ node2.localdomain node3.localdomain ]
>>       Stopped: [ node1.localdomain ]
>>   Clone Set: clvmd-clone [clvmd]
>>       Started: [ node2.localdomain node3.localdomain ]
>>       Stopped: [ node1.localdomain ]
>>   Clone Set: TESTGFS2-clone [TESTGFS2]
>>       Started: [ node2.localdomain node3.localdomain ]
>>       Stopped: [ node1.localdomain ]
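>- side note: once the reason for the fencing is addressed, the OFFLINE node
>can simply be started again, e.g. (sketch):
>
>  pcs cluster start node1.localdomain
>
>Because FENCING has 'provides=unfencing', pacemaker runs the 'on' action
>(key registration) for the node when it rejoins, so no reboot is required
>for that part either.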
>> 
>> 
>> 
>> 
>> In the logs, I've noticed that the node is first unfenced and later it is fenced again... For the unfence, I believe "meta provides=unfencing" is 'guilty', yet I'm not sure about the action from node2.
>
>'Unfencing' is exactly the expected behavior when provides=unfencing is
>present (and it should be present with fence_scsi and fence_mpath).
>
>Here the important part is "first unfenced and later it is fenced
>again". If everything is in a normal state, the node should not simply
>be fenced again, so it would make sense to investigate that 'fencing'
>after the unfencing. I would expect one of the nodes to have more
>verbose logs that give an idea of why the fencing was ordered. (my lucky
>guess would be a failed 'monitor' operation on one of the resources, as
>all of them have 'on-fail=fence', but this would need support from the
>logs to be sure)
>Also, the logs from the fenced node can provide some information about
>what happened there - if that was the cause of the fencing.
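>A quick way to find which node ordered the fencing and why is to grep for
>the scheduler's decision on the surviving nodes (a sketch; the log location
>may differ, e.g. /var/log/cluster/corosync.log on some setups):
>
>  grep -E "will be fenced|Fence \(reboot\)" /var/log/messages
>
>The matching pengine lines name the resource or condition that triggered it.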
>
>> So far I have used SCSI reservations only with ServiceGuard (and SBD on SUSE), and I was wondering if the setup is correctly done.
>I don't see anything particularly bad from a configuration point of
>view. The best place to look for the reason now is the logs from the
>other nodes between the 'unfencing' and the 'fencing again'.
>
>> Storage in this test setup is a Highly Available iSCSI Cluster on top of DRBD /RHEL 7 again/, and it seems that SCSI Reservations Support is OK.
>From the logs you have provided so far, the reservation keys work, as fencing is happening and reports OK.
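>If you want to double-check the registrations directly, mpathpersist can
>read them from the device (sketch, using the device from your FENCING
>resource):
>
>  # keys currently registered (one per healthy node, per pcmk_host_map)
>  mpathpersist --in -k /dev/mapper/36001405cb123d0000000000000000000
>  # the current reservation and its holder
>  mpathpersist --in -r /dev/mapper/36001405cb123d0000000000000000000
>
>After node1 is fenced its key should disappear from that list, and it
>should reappear after unfencing.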
>
>> Best Regards,
>> Strahil Nikolov
>
>Example of fencing because 'monitor' operation of resource 'testtest' failed, from the logs:
>
>Feb 17 22:32:15 [1289] fastvm-centos-7-7-174    pengine:  warning: pe_fence_node:       Cluster node fastvm-centos-7-7-175 will be fenced: testtest failed there
>Feb 17 22:32:15 [1289] fastvm-centos-7-7-174    pengine:   notice: LogNodeActions:       * Fence (reboot) fastvm-centos-7-7-175 'testtest failed there'
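>If a failed monitor really is the trigger, the fail counts should confirm
>it as well (sketch, using the resource names from your config):
>
>  crm_mon -1 -f     # -f prints resource fail counts
>  pcs resource failcount show TESTGFS2
>
>and 'pcs resource cleanup TESTGFS2' clears the failure once the root cause
>is fixed.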
>
>--
>Ondrej
>_______________________________________________
>Manage your subscription:
>https://lists.clusterlabs.org/mailman/listinfo/users
>
>ClusterLabs home: https://www.clusterlabs.org/

Hey Ondrej,

Sadly, the lab in the training uses a customized fencing mechanism that cannot be reproduced outside of the Red Hat training lab.
As I don't know what the environment will be (Red Hat prevents any disclosure of that), I have to pick a fencing mechanism that will work in any environment, and 'fence_mpath' meets those criteria.

Sadly, Red Hat expects the engineer to be able to deal with bugs (a Red Hat CEO interview from several years ago confirmed that), so if I know that fence_scsi can have issues, it is better to play it safe and avoid it.

I'm sorry for quoting Red Hat's Solutions. The one I followed mentions that each node should have a unique reservation_key (in /etc/multipath.conf), and my stonith agent is not defined with the otherwise mandatory 'key' option, since the per-node keys are supplied through pcmk_host_map (with pcmk_host_argument=key).
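For completeness, this is roughly what the relevant pieces look like here (a sketch reconstructed from the config shown above - the per-node key values follow pcmk_host_map):

# /etc/multipath.conf on node1 (reservation_key 2 on node2, 3 on node3)
defaults {
    reservation_key 1
}

# stonith device, matching the 'pcs config show' output above
pcs stonith create FENCING fence_mpath \
    devices=/dev/mapper/36001405cb123d0000000000000000000 \
    pcmk_host_argument=key \
    pcmk_host_map="node1.localdomain:1;node2.localdomain:2;node3.localdomain:3" \
    pcmk_monitor_action=metadata pcmk_reboot_action=off \
    op monitor interval=60s \
    meta provides=unfencing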

Best Regards,
Strahil Nikolov


