[ClusterLabs] DLM recovery stuck

Thu Aug 9 14:17:21 EDT 2018

David Teigland <teigland at redhat.com> writes:

> On Thu, Aug 09, 2018 at 06:11:48PM +0200, Ferenc Wágner wrote:
> 
>> Almost ten years ago you requested more info in a similar case, let's
>> see if we can get further now!
>
> Hi, the usual cause is that a network message from the dlm has been
> lost/dropped/missed.  The dlm can't recover from that, which is clearly a
> weak point in the design.  There may be some new development coming along
> to finally improve that.

Hi David,

Good to hear!  Can you share any more info about this development?

> One way you can confirm this is to check if the dlm on one or more nodes
> is waiting for a message that's not arriving.  Often you'll see an entry
> in the dlm "waiters" debugfs file corresponding to a response that's being
> waited on.

If you mean dlm/clvmd_waiters, it's empty on all nodes.  Is there
anything else to check?
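
In case it helps others reproduce the check, here is a minimal sketch
of how the waiters files could be scanned on a node.  It assumes
debugfs is mounted at /sys/kernel/debug and that each lockspace has a
<lockspace>_waiters file under dlm/ (as with dlm/clvmd_waiters above).
The function name is my own invention, and reading debugfs normally
requires root.

```python
from pathlib import Path


def nonempty_waiters(debug_dir="/sys/kernel/debug/dlm"):
    """Return {lockspace: [lines]} for every *_waiters file that has content.

    debug_dir is an assumption: the usual debugfs mount point plus "dlm".
    A missing directory simply yields an empty result.
    """
    result = {}
    for path in sorted(Path(debug_dir).glob("*_waiters")):
        # Keep only non-blank lines; an empty file means nothing is pending.
        lines = [ln for ln in path.read_text().splitlines() if ln.strip()]
        if lines:
            lockspace = path.name[: -len("_waiters")]
            result[lockspace] = lines
    return result


if __name__ == "__main__":
    pending = nonempty_waiters()
    if not pending:
        print("no pending waiters")
    for lockspace, lines in pending.items():
        print(f"{lockspace}: {len(lines)} waiter(s)")
        for ln in lines:
            print("  " + ln)
```

Running this on every node and comparing the output would show whether
any lockspace is still waiting for a reply.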

> Another red flag is kernel messages from a driver indicating some network
> hiccup at the time things hung.  I can't say if these messages you sent
> happened at the right time, or if they even correspond to the dlm
> interface, but it's worth checking as a possible explanation:
>
> [  137.207059] be2net 0000:05:00.0 enp5s0f0: Link is Up
> [  137.252901] be2net 0000:05:00.1 enp5s0f1: Link is Up

Hard to say...  This is an iSCSI offload card with two physical ports,
each of which the card virtualizes into 4 logical ports.  3 logical
ports per physical port (6 in total) are passed to the OS as separate
PCI functions, while the remaining two are used for iSCSI traffic.  The
DLM traffic goes through a Linux bond made of enp5s0f4 and enp5s0f5,
which is started at 112.393798 and used for Corosync traffic first.
The two lines above are signs of Open vSwitch starting up for unrelated
purposes.  It should be totally independent, but it's the same device
after all, so I can't rule out some "crosstalk".
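
To judge whether such messages fall in the relevant window, the kernel
log can be filtered by timestamp and keyword.  The following is only a
rough sketch: the function name and the window bounds are made up for
illustration, the sample lines are the ones quoted in this thread, and
it assumes the default "[seconds.micros] message" dmesg format.

```python
import re

# Matches the default dmesg prefix, e.g. "[  137.207059] message".
LINE_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s+(.*)$")

# The three kernel lines quoted in this thread.
SAMPLE = [
    "[  137.207059] be2net 0000:05:00.0 enp5s0f0: Link is Up",
    "[  137.252901] be2net 0000:05:00.1 enp5s0f1: Link is Up",
    "[  153.886619]  connection2:0: detected conn error (1011)",
]


def kernel_events(lines, start, end, keywords=()):
    """Return (timestamp, message) pairs with start <= timestamp <= end.

    If keywords is non-empty, keep only messages containing at least one
    of them.  Lines without a timestamp prefix are skipped.
    """
    events = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue
        ts, msg = float(match.group(1)), match.group(2)
        if start <= ts <= end and (not keywords or any(k in msg for k in keywords)):
            events.append((ts, msg))
    return events


if __name__ == "__main__":
    # Hypothetical window: from the bond start (112.393798) to 160 s.
    for ts, msg in kernel_events(SAMPLE, 112.393798, 160.0,
                                 keywords=("enp5s0f4", "enp5s0f5")):
        print(f"{ts:.6f} {msg}")
```

With the sample above, filtering on the bond members enp5s0f4 and
enp5s0f5 returns nothing, since the quoted lines concern enp5s0f0,
enp5s0f1 and the iSCSI connection only.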

> [  153.886619]  connection2:0: detected conn error (1011)

See above: iSCSI traffic is offloaded, so it is not visible at the OS
level, and these connection failures are expected at the moment because
some of the targets are inaccessible.  *But* it uses the same wire in
the end, just different VLANs, and the virtualization (in the card
itself) may not provide perfect separation.
-- 
Thanks,
Feri