[ClusterLabs] Re: Trying to Understand crm-fence-peer.sh
Ulrich Windl
Ulrich.Windl at rz.uni-regensburg.de
Wed Jan 16 10:07:36 EST 2019
Hi!
I guess we need more logs, especially some events from storage2 from before
the fencing was triggered.
Regards,
Ulrich
>>> "Bryan K. Walton" <bwalton+1546953805 at leepfrog.com> schrieb am 16.01.2019
um
16:03 in Nachricht
<20190116150321.3j2f2upz67ethxox at mygeeto.inside.leepfrog.com>:
> I posted this question on the drbd-user list but didn't receive a
> response.
>
> I'm using DRBD 8.4 with Pacemaker in a two-node cluster, with a
> single primary and fabric fencing.
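>
> For reference, the fencing-related part of the DRBD configuration
> follows the usual 8.4 pattern, roughly like this (a sketch; the
> fencing policy line is an assumption and could equally be
> resource-only):
>
>   resource r0 {
>     disk {
>       # tell DRBD to call the fence-peer handler (assumed policy)
>       fencing resource-and-stonith;
>     }
>     handlers {
>       # places a Pacemaker constraint against the peer
>       fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
>       # removes that constraint after a successful resync
>       after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";
>     }
>   }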
>
> Almost all of my STONITH testing has worked as I would expect. I get
> the expected results when I use iptables to sever the replication
> link, when I force a kernel panic, and when I trigger an unclean
> shutdown/reboot with the sysrq trigger (rough commands below). The
> fact that my iptables test results in a fenced node would seem to
> suggest that crm-fence-peer.sh is working as expected.
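>
> Roughly, those tests looked like this (a sketch; port 7789 is an
> assumption, substitute whatever port r0 actually replicates on):
>
>   # sever the replication link (run on the primary)
>   iptables -A INPUT  -p tcp --dport 7789 -j DROP
>   iptables -A OUTPUT -p tcp --dport 7789 -j DROP
>
>   # force a kernel panic
>   echo c > /proc/sysrq-trigger
>
>   # force an unclean reboot, bypassing a clean shutdown
>   echo b > /proc/sysrq-trigger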
>
> However, if I issue a simple reboot command on my current primary
> node (storage1), Pacemaker successfully fails over to the secondary
> node, but the logs on storage2 show the following:
>
> Jan 11 08:49:53 storage2 kernel: drbd r0: helper command: /sbin/drbdadm
> fence-peer r0
> Jan 11 08:49:53 storage2 crm-fence-peer.sh[15594]:
> DRBD_CONF=/etc/drbd.conf DRBD_DONT_WARN_ON_VERSION_MISMATCH=1
> DRBD_MINOR=1 DRBD_PEER=storage1 DRBD_PEERS=storage1
> DRBD_PEER_ADDRESS=192.168.0.2 DRBD_PEER_AF=ipv4 DRBD_RESOURCE=r0
> UP_TO_DATE_NODES='' /usr/lib/drbd/crm-fence-peer.sh
> Jan 11 08:49:53 storage2 crm-fence-peer.sh[15594]: INFO peer is
> reachable, my disk is UpToDate: placed constraint
> 'drbd-fence-by-handler-r0-StorageClusterClone'
> Jan 11 08:49:53 storage2 kernel: drbd r0: helper command: /sbin/drbdadm
> fence-peer r0 exit code 4 (0x400)
> Jan 11 08:49:53 storage2 kernel: drbd r0: fence-peer helper returned 4
> (peer was fenced)
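>
> For what it's worth, the constraint named in that log line is visible
> in the CIB, so the handler's effect can be checked directly:
>
>   cibadmin --query --xpath \
>     "//rsc_location[@id='drbd-fence-by-handler-r0-StorageClusterClone']"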
>
> Exit code 4 would seem to suggest that storage1 should be fenced,
> but the switch ports connected to storage1 are still enabled.
>
> Am I misreading the logs here? Since this is a clean reboot, maybe
> fencing isn't supposed to happen in this situation? But the logs seem
> to suggest otherwise.
>
> Thanks!
> Bryan Walton
>
> --
> Bryan K. Walton 319-337-3877
> Linux Systems Administrator Leepfrog Technologies, Inc
>
> ----- End forwarded message -----
>