[ClusterLabs] pcs stonith fence - Error: unable to fence

Strahil Nikolov hunter86_bg at yahoo.com
Mon Jan 20 23:41:04 EST 2020


On January 20, 2020 6:06:05 PM GMT+02:00, Ken Gaillot <kgaillot at redhat.com> wrote:
>On Sat, 2020-01-18 at 22:20 +0000, Strahil Nikolov wrote:
>> Sorry for the spam.
>> I figured out that I forgot to specify the domain for 'drbd1', and
>> that is why it reacted like that.
>> The strange thing is that pcs allows me to fence a node that is not
>> in the cluster :)
>> 
>> Do you think that this behaviour is a bug?
>> If yes, I can open an issue upstream.
>> 
>> 
>> Best Regards,
>> Strahil Nikolov
>
>Leaving pcs out of the picture for a moment, from pacemaker's view the
>stonith_admin command is just passing along what the user requested,
>and the fencing daemon determines whether it's a valid request or not
>and fails the request appropriately. So technically it's not a bug.
>
>However I see two possible areas of improvement:
>
>- The status display should show not just that the request failed, but
>why. There is a project already planned to show why fencing was
>initiated, so this would be a good addition to that. It's just a matter
>of having developer time to do it.
>
>- Since pcs is at a higher level than stonith_admin, it could
>require "--force" if a given node isn't in the cluster configuration.
>Feel free to file an upstream request for that.
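
For illustration only, here is a minimal sketch of the kind of node-membership pre-check such a --force option implies. This is not existing pcs behaviour; the wrapper script, its argument handling and the crm_node output parsing are all assumptions:

  #!/bin/sh
  # Hypothetical wrapper around "pcs stonith fence": refuse to fence a name
  # that is not a configured cluster node unless --force is given.
  target="$1"; force="$2"
  # crm_node -l lists configured nodes; the node name is assumed to be in column 2
  if ! crm_node -l | awk '{print $2}' | grep -qx "$target"; then
      if [ "$force" != "--force" ]; then
          echo "refusing to fence '$target': not a cluster node (use --force)" >&2
          exit 1
      fi
  fi
  exec pcs stonith fence "$target"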
>
>
>> On Sunday, 19 January 2020 at 00:01:11 GMT+2, Strahil Nikolov <
>> hunter86_bg at yahoo.com> wrote:
>>
>> Hi All,
>> 
>> 
>> I am building a test cluster with fence_rhevm stonith agent on RHEL
>> 7.7 and oVirt 4.3.
>> When I fenced drbd3 from drbd1 using 'pcs stonith fence drbd3', the
>> fence action was successful.
>> 
>> So then I decided to test fencing in the opposite direction, and it
>> partially failed.
>> 
>> 
>> 1. In oVirt the machine was powered off and then powered on properly,
>> so the communication with the engine is OK.
>> 2. The command on drbd3 to fence drbd1 got stuck and was then reported
>> as a failure, even though the VM was reset.
>> 
>> 
>> 
>> Now 'pcs status' is reporting the following:
>> Failed Fencing Actions:
>> * reboot of drbd1 failed: delegate=drbd3.localdomain,
>> client=stonith_admin.1706, origin=drbd3.localdomain,
>>    last-failed='Sat Jan 18 23:18:24 2020'
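
The fencer also keeps its own record of fencing operations, which can carry more detail than the pcs status summary; it can be queried like this (stonith_admin should be available on these RHEL 7.7 nodes, though how much history it reports depends on the pacemaker version):

  # recorded fencing operations for one node, or '*' for every node
  stonith_admin --history drbd1.localdomain
  stonith_admin --history '*'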
>>
>> My stonith is configured as follows:
>> Stonith Devices: 
>> Resource: ovirt_FENCE (class=stonith type=fence_rhevm) 
>>  Attributes: ipaddr=engine.localdomain login=fencerdrbd@internal passwd=I_have_replaced_that
>>   pcmk_host_map=drbd1.localdomain:drbd1;drbd2.localdomain:drbd2;drbd3.localdomain:drbd3
>>   power_wait=3 ssl=1 ssl_secure=1
>>  Operations: monitor interval=60s (ovirt_FENCE-monitor-interval-60s) 
>> Fencing Levels:
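
For reference, a device like the one listed above would typically be created along these lines; the values are taken from the listing, and the exact invocation is only a sketch, since option handling can differ between pcs and fence_rhevm versions:

  pcs stonith create ovirt_FENCE fence_rhevm \
      ipaddr=engine.localdomain login=fencerdrbd@internal passwd='I_have_replaced_that' \
      pcmk_host_map='drbd1.localdomain:drbd1;drbd2.localdomain:drbd2;drbd3.localdomain:drbd3' \
      power_wait=3 ssl=1 ssl_secure=1 \
      op monitor interval=60s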
>> 
>> 
>> 
>> Do I need to add some other settings to the fence_rhevm stonith agent?
>> 
>> 
>> Manually running the status command from drbd2/drbd3 is OK:
>> 
>> 
>> [root at drbd3 ~]# fence_rhevm -o status --ssl --ssl-secure -a
>> engine.localdomain --username='fencerdrbd@internal'
>> --password=I_have_replaced_that -n drbd1 
>> Status: ON
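
When the status action works but a reboot through the cluster hangs, it can also help to run the reboot action by hand and time it, using the same credentials as the status check above (-o reboot is a standard fence agent action):

  time fence_rhevm -o reboot --ssl --ssl-secure -a engine.localdomain \
      --username='fencerdrbd@internal' --password=I_have_replaced_that -n drbd1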
>> 
>> I'm attaching the logs from drbd2 (the DC) and drbd3.
>> 
>> 
>> Thanks in advance for your suggestions.
>> 
>> 
>> Best Regards,
>> Strahil Nikolov
>-- 
>Ken Gaillot <kgaillot at redhat.com>
>
>_______________________________________________
>Manage your subscription:
>https://lists.clusterlabs.org/mailman/listinfo/users
>
>ClusterLabs home: https://www.clusterlabs.org/

Hi Ken,

Thanks for the clarification. The strange thing is that today I managed to fence drbd3 using the short name instead of the FQDN.

As my pacemaker nodes are registered with their FQDNs, I was expecting it to fail. I guess the node-to-VM mapping is what allows it.
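
One way to check how the fencer resolves the two spellings is to ask which devices claim to be able to fence each name, assuming stonith_admin is run on one of the cluster nodes:

  # which devices claim the FQDN vs. the short VM name from pcmk_host_map?
  stonith_admin --list drbd3.localdomain
  stonith_admin --list drbd3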

In fact, the status enhancement would help a lot.

Best Regards,
Strahil Nikolov
-------------- next part --------------
A non-text attachment was scrubbed...
Name: corosync.log
Type: application/octet-stream
Size: 28513 bytes
Desc: not available
URL: <https://lists.clusterlabs.org/pipermail/users/attachments/20200121/8357d5fb/attachment-0001.obj>

