[ClusterLabs] SBD & Failed Peer

Mon Sep 7 03:27:30 EDT 2015

On 06/09/15 09:28 PM, Jorge Fábregas wrote:
> On 09/06/2015 04:23 PM, Jorge Fábregas wrote:
>> Assume an active/active cluster using OCFS2 and SBD with shared storage.
>> Then one node explodes (the hardware watchdog is gone as well
>> obviously).  
> 
> Ok, I did two tests with this setup in my KVM lab (one with SBD on
> shared storage and the other with hypervisor-based STONITH
> (external/libvirt)) while actively writing to an OCFS2 filesystem.
> 
> 
> ## SBD with shared-storage
> 
> Shut off one node abruptly (VM power-off). Result: DLM/OCFS2 blocked
> for about 30 to 40 seconds and then it resumed.  That's nice!  I think
> at this moment (when resuming) the assumptions were:

And this is why I am nervous. It is always ideal to have a primary fence
method that can confirm the 'off' state. IPMI fencing can do this, as
can hypervisor-based fence agents like fence_virsh and fence_xvm.

Now, I say this as someone who uses PDU-based fencing as a backup, which
has the same problem. It can't verify the 'off' state, and a human could
potentially change the plugs around (I personally deal with this by
mechanically strapping the cables in place).

> - if the peer were alive, it would have swallowed the poison pill we
>   just placed
> - if the peer is frozen, the watchdog would have taken care of it
> - we just wait a little extra bit before continuing...
> 
> (I really don't know whether checking when your partner last updated
> the SBD disk is part of the role of the SBD daemon)
> 
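
The "wait a little extra bit" above comes down to how the SBD timeouts
relate to each other. Here is a minimal sketch of that sanity check,
assuming the commonly recommended relationships (msgwait at least twice
the watchdog timeout, and the cluster's stonith timeout longer than
msgwait). The class, the function and the numbers are hypothetical and
only for illustration; they are not taken from Jorge's setup.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SbdTimeouts:
    """Hypothetical container for the three timeouts, in seconds."""
    watchdog: int         # a node that stops feeding its watchdog is reset after this
    msgwait: int          # how long the sender waits after writing the poison pill
    stonith_timeout: int  # how long the cluster waits for the fence action


def check_timeouts(t: SbdTimeouts) -> List[str]:
    """Return a list of problems; an empty list means the values are consistent."""
    problems = []
    # A frozen peer must have been reset by its watchdog before the
    # survivor stops waiting, so msgwait needs a comfortable margin.
    if t.msgwait < 2 * t.watchdog:
        problems.append("msgwait should be at least 2 x watchdog")
    # The cluster must not give up on the fence action before the
    # poison pill has had time to take effect.
    if t.stonith_timeout <= t.msgwait:
        problems.append("stonith_timeout should be greater than msgwait")
    return problems


# Illustrative values only.
print(check_timeouts(SbdTimeouts(watchdog=15, msgwait=30, stonith_timeout=40)))
print(check_timeouts(SbdTimeouts(watchdog=15, msgwait=20, stonith_timeout=20)))
```

Note that even with sane timeouts, the survivor is still assuming the
peer is gone once the wait is over; nothing here confirms it.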
> 
> ## External/Libvirt
> 
> I shut off one node but then disabled SSH on the KVM host (so that
> fencing via qemu+ssh couldn't work).  Result: it blocked FOREVER.

Right. Fencing is not allowed to make assumptions. If the fence action
can't be confirmed to have succeeded, it's better to lock up than to
risk corruption.

When fence_virsh works, you *know* it worked, so it's ideal as a primary
fence method.

> Am I right in thinking that SBD is the way to go when using OCFS2
> filesystems?  (compared to hypervisor-based fencing or management-boards
> like iLO, DRAC etc)?

I would use IPMI (iLO, DRAC, etc) as the primary fence method and
something else as a secondary, backup method. You can use SBD + watchdog
as the backup method or, as I do, a pair of switched PDUs (I find APC
brand to be very fast in fencing).
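
To make the primary/backup idea concrete, here is a rough sketch of the
logic in Python. It is not how Pacemaker or any fence agent is actually
implemented; the names (FenceResult, fence_node, the stand-in agents)
are made up for illustration. The point is only that each level is
tried in order and the cluster may resume only after some level reports
success.

```python
from enum import Enum
from typing import Callable, Optional, Sequence, Tuple


class FenceResult(Enum):
    CONFIRMED_OFF = "confirmed_off"
    FAILED = "failed"


# Hypothetical agent signature: takes a node name, reports the outcome.
FenceAgent = Callable[[str], FenceResult]


def fence_node(node: str,
               levels: Sequence[Tuple[str, FenceAgent]]) -> Optional[str]:
    """Try each fence level in order.

    Returns the name of the level that succeeded, or None if every
    level failed. None means the caller must keep blocking.
    """
    for name, agent in levels:
        try:
            result = agent(node)
        except Exception:
            # An agent that cannot reach its device has not fenced anything.
            result = FenceResult.FAILED
        if result is FenceResult.CONFIRMED_OFF:
            return name
    return None


# Stand-in agents for the example: the primary is unreachable, the
# backup works.
def ipmi_unreachable(node: str) -> FenceResult:
    raise ConnectionError("BMC lost power along with the node")


def pdu_ok(node: str) -> FenceResult:
    return FenceResult.CONFIRMED_OFF


print(fence_node("node2", [("ipmi", ipmi_unreachable), ("pdu", pdu_ok)]))
print(fence_node("node2", [("ipmi", ipmi_unreachable)]))
```

The second call returning None is the "blocked FOREVER" case from the
external/libvirt test: with a single level and no way to reach it, the
only safe thing left to do is wait.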

> Now, the only thing I don't like about SBD is that when it loses contact
> with the shared disk, both nodes commit suicide.  I found out there's
> the "-P" option to SBD (that's supposed to prevent that as long as
> there's cluster communication) but it doesn't work in my SLES 11 SP4
> setup.  Maybe it's on SLES 12.
> 
> 
> Thanks,
> Jorge

-- 
Digimer
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without
access to education?