[ClusterLabs] pcs stonith fence - Error: unable to fence
Ken Gaillot
kgaillot at redhat.com
Mon Jan 20 11:06:05 EST 2020
On Sat, 2020-01-18 at 22:20 +0000, Strahil Nikolov wrote:
> Sorry for the spam.
> I figured out that I forgot to specify the domain for 'drbd1', and
> that is why it reacted like that.
> The strange thing is that pcs allows me to fence a node that is not
> in the cluster :)
>
> Do you think that this behaviour is a bug?
> If yes, I can open an issue upstream
>
>
> Best Regards,
> Strahil Nikolov
Leaving pcs out of the picture for a moment: from pacemaker's point of
view, the stonith_admin command just passes along what the user
requested, and the fencing daemon decides whether the request is valid,
failing it appropriately if not. So technically it's not a bug.
However, I see two possible areas of improvement:
- The status display should show not just that the request failed, but
why. There is a project already planned to show why fencing was
initiated, so this would be a good addition to that. It's just a matter
of having developer time to do it.
- Since pcs is at a higher level than stonith_admin, it could
require "--force" if a given node isn't in the cluster configuration;
a rough sketch of such a check follows below. Feel free to file an
upstream request for that.
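
As a minimal sketch of what such a guard could look like (hypothetical;
pcs does not do this today), a wrapper could compare the requested name
against the configured cluster nodes before fencing:

    # Hypothetical pre-fencing guard: only fence names that crm_node -l
    # reports as configured cluster nodes (output: "id name [state]").
    node="drbd1.localdomain"
    if crm_node -l | awk '{print $2}' | grep -qx "$node"; then
        pcs stonith fence "$node"
    else
        echo "refusing to fence unknown node $node; use --force to override" >&2
        exit 1
    fi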
> On Sunday, 19 January 2020 at 00:01:11 GMT+2, Strahil Nikolov <
> hunter86_bg at yahoo.com> wrote:
>
> Hi All,
>
>
> I am building a test cluster with fence_rhevm stonith agent on RHEL
> 7.7 and oVirt 4.3.
> When I fenced drbd3 from drbd1 using 'pcs stonith fence drbd3', the
> fence action was successful.
>
> So then I decided to test fencing in the opposite direction, and it
> partially failed.
>
>
> 1. In oVirt the machine was powered off and then powered on properly,
> so the communication with the engine is OK.
> 2. The command on drbd3 to fence drbd1 got stuck and was then reported
> as a failure, even though the VM was reset.
>
>
>
> Now 'pcs status' is reporting the following:
> Failed Fencing Actions:
> * reboot of drbd1 failed: delegate=drbd3.localdomain,
> client=stonith_admin.1706, origin=drbd3.localdomain,
> last-failed='Sat Jan 18 23:18:24 2020'
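
For what it's worth, the fencer keeps its own per-target record of such
attempts, which can be queried with stonith_admin (assuming a pacemaker
recent enough to record it; the target name below is just the one from
the status output above):

    # Show the fencing history recorded for drbd1; '*' lists all targets.
    stonith_admin --history drbd1
    stonith_admin --history '*'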
>
> My stonith is configured as follows:
> Stonith Devices:
> Resource: ovirt_FENCE (class=stonith type=fence_rhevm)
> Attributes: ipaddr=engine.localdomain login=fencerdrbd at internal
> passwd=I_have_replaced_that
> pcmk_host_map=drbd1.localdomain:drbd1;drbd2.localdomain:drbd2;drbd3.localdomain:drbd3
> power_wait=3 ssl=1 ssl_secure=1
> Operations: monitor interval=60s (ovirt_FENCE-monitor-interval-60s)
> Fencing Levels:
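
For reference, a device with the attributes shown above would have been
created with something along these lines (reconstructed from the output;
the password placeholder is kept as in the original):

    # Quoting the host map keeps the shell from splitting on semicolons.
    pcs stonith create ovirt_FENCE fence_rhevm \
        ipaddr=engine.localdomain login='fencerdrbd@internal' \
        passwd=I_have_replaced_that \
        pcmk_host_map='drbd1.localdomain:drbd1;drbd2.localdomain:drbd2;drbd3.localdomain:drbd3' \
        power_wait=3 ssl=1 ssl_secure=1 \
        op monitor interval=60s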
>
>
>
> Do I need to add some other settings to the fence_rhevm stonith
> agent?
>
>
> Manually running the status command from drbd2/drbd3 is OK:
>
>
> [root at drbd3 ~]# fence_rhevm -o status --ssl --ssl-secure -a
> engine.localdomain --username='fencerdrbd at internal'
> --password=I_have_replaced_that -n drbd1
> Status: ON
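
Since the status action works, the reboot path can be exercised
directly with the same agent and credentials, which helps separate
agent problems from fencer problems; this is just the status command
above with the action changed:

    # Power-cycle drbd1 through the oVirt engine, bypassing pacemaker.
    fence_rhevm -o reboot --ssl --ssl-secure -a engine.localdomain \
        --username='fencerdrbd@internal' \
        --password=I_have_replaced_that -n drbd1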
>
> I'm attaching the logs from drbd2 (the DC) and drbd3.
>
>
> Thanks in advance for your suggestions.
>
>
> Best Regards,
> Strahil Nikolov
--
Ken Gaillot <kgaillot at redhat.com>