[ClusterLabs] Strange Corosync (TOTEM) logs, Pacemaker OK but DLM stuck

Mon Sep 11 10:32:00 UTC 2017

Ferenc,

> wferi at niif.hu (Ferenc Wágner) writes:
>
>> Jan Friesse <jfriesse at redhat.com> writes:
>>
>>> wferi at niif.hu writes:
>>>
>>>> In a 6-node cluster (vhbl03-08) the following happens 1-5 times a day
>>>> (in August; in May, it happened 0-2 times a day only, it's slowly
>>>> ramping up):
>>>>
>>>> vhbl08 corosync[3687]:   [TOTEM ] A processor failed, forming new configuration.
>>>> vhbl03 corosync[3890]:   [TOTEM ] A processor failed, forming new configuration.
>>>> vhbl07 corosync[3805]:   [MAIN  ] Corosync main process was not scheduled for 4317.0054 ms (threshold is 2400.0000 ms). Consider token timeout increase.
>>>
>>> ^^^ This is main problem you have to solve. It usually means that
>>> machine is too overloaded. It is happening quite often when corosync
>>> is running inside VM where host machine is unable to schedule regular
>>> VM running.
>>
>> After some extensive tracing, I think the problem lies elsewhere: my
>> IPMI watchdog device is slow beyond imagination.
>
> Confirmed: setting watchdog_device: off cluster wide got rid of the
> above warnings.
>

Yep, good that you found the issue. This is perfectly possible if the ioctl blocks.

>> Its ioctl operations can take seconds, starving all other functions.
>> At least, it seems to block the main thread of Corosync.  Is this a
>> plausible scenario?  Corosync has two threads, what are their roles?

The first (main) thread does almost everything. It runs the main loop
(epoll) I described in my previous mail.

The second thread is created by libqb and is used only for logging. This
prevents corosync from blocking when a syslog/file log write blocks for
some reason. It means some messages may be lost, but that's still better
than blocking.
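
To illustrate the trade-off described above, here is a minimal sketch of a bounded log queue whose producer side drops messages instead of waiting for the writer. This is not libqb's actual implementation; all names (logq_push, logq_pop, logq_writer, LOGQ_CAP) are hypothetical, and stderr stands in for syslog or a log file.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define LOGQ_CAP     4      /* tiny on purpose, to make drops easy to see */
#define LOGQ_MSG_MAX 128

struct logq {
    pthread_mutex_t lock;
    pthread_cond_t  nonempty;
    char            msgs[LOGQ_CAP][LOGQ_MSG_MAX];
    unsigned        head, count;
    unsigned long   dropped;    /* messages lost because the queue was full */
    bool            closing;
};

void logq_init(struct logq *q)
{
    memset(q, 0, sizeof(*q));
    pthread_mutex_init(&q->lock, NULL);
    pthread_cond_init(&q->nonempty, NULL);
}

/* Main-thread side: holds the lock only for a short copy and never
 * waits for log I/O. Returns false if the message was dropped. */
bool logq_push(struct logq *q, const char *msg)
{
    bool ok = false;

    assert(q != NULL && msg != NULL);
    pthread_mutex_lock(&q->lock);
    if (q->count < LOGQ_CAP) {
        unsigned tail = (q->head + q->count) % LOGQ_CAP;
        snprintf(q->msgs[tail], LOGQ_MSG_MAX, "%s", msg);
        q->count++;
        ok = true;
        pthread_cond_signal(&q->nonempty);
    } else {
        q->dropped++;           /* full: lose the message, do not block */
    }
    pthread_mutex_unlock(&q->lock);
    return ok;
}

/* Logging-thread side: waits for a message. Returns false once the
 * queue has been closed and drained. */
bool logq_pop(struct logq *q, char out[LOGQ_MSG_MAX])
{
    bool ok;

    pthread_mutex_lock(&q->lock);
    while (q->count == 0 && !q->closing)
        pthread_cond_wait(&q->nonempty, &q->lock);
    ok = q->count > 0;
    if (ok) {
        memcpy(out, q->msgs[q->head], LOGQ_MSG_MAX);
        q->head = (q->head + 1) % LOGQ_CAP;
        q->count--;
    }
    pthread_mutex_unlock(&q->lock);
    return ok;
}

void logq_close(struct logq *q)
{
    pthread_mutex_lock(&q->lock);
    q->closing = true;
    pthread_cond_broadcast(&q->nonempty);
    pthread_mutex_unlock(&q->lock);
}

/* Thread body for pthread_create(): the possibly slow write happens
 * outside the lock, so it cannot stall logq_push(). */
void *logq_writer(void *p)
{
    struct logq *q = p;
    char buf[LOGQ_MSG_MAX];

    while (logq_pop(q, buf))
        fprintf(stderr, "%s\n", buf);
    return NULL;
}
```

If the writer stalls, the queue fills up and further logq_push() calls return false right away, which is the "lose messages rather than block" behaviour.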

Back to the problem you have. It's definitely a HW issue, but I'm thinking
about how to solve it in software. Right now, I can see two ways:
1. Set dog FD to be non-blocking right at the end of setup_watchdog.
This is preferred, but I'm not sure it's really going to work.
2. Create a thread which makes sure to tickle wd regularly.
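
A rough sketch of both options, under stated assumptions: the helper names (set_nonblocking, tickler_start, tickler_stop, count_tickle) are hypothetical and not taken from Corosync's code. Whether O_NONBLOCK has any effect on a given watchdog driver's ioctl is exactly the open question in option 1. For option 2, a real callback would issue the keepalive ioctl (WDIOC_KEEPALIVE from <linux/watchdog.h>) on the watchdog descriptor; a counter stands in for it here so the sketch runs without watchdog hardware.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <fcntl.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <time.h>

/* Option 1: put an already-open descriptor into non-blocking mode.
 * Returns 0 on success, -1 on error (errno is set by fcntl). */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);

    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}

/* Option 2: a helper thread that tickles the watchdog periodically,
 * so a slow ioctl stalls this thread instead of the main loop. */
struct tickler {
    pthread_t       thread;
    atomic_bool     stop;
    struct timespec interval;
    void          (*tickle)(void *arg);
    void           *arg;
};

static void *tickler_main(void *p)
{
    struct tickler *t = p;

    for (;;) {
        t->tickle(t->arg);              /* may be slow; only this thread waits */
        if (atomic_load(&t->stop))
            break;
        nanosleep(&t->interval, NULL);
    }
    return NULL;
}

/* Returns 0 on success, or the pthread_create() error number. */
int tickler_start(struct tickler *t, unsigned interval_ms,
                  void (*tickle)(void *), void *arg)
{
    assert(t != NULL && tickle != NULL);
    atomic_init(&t->stop, false);
    t->interval.tv_sec = interval_ms / 1000;
    t->interval.tv_nsec = (long)(interval_ms % 1000) * 1000000L;
    t->tickle = tickle;
    t->arg = arg;
    return pthread_create(&t->thread, NULL, tickler_main, t);
}

/* Asks the thread to stop and waits for it; this can take up to one
 * interval plus the duration of one tickle call. */
void tickler_stop(struct tickler *t)
{
    atomic_store(&t->stop, true);
    pthread_join(t->thread, NULL);
}

/* Stand-in callback: counts tickles instead of touching a device. */
void count_tickle(void *arg)
{
    atomic_fetch_add((atomic_int *)arg, 1);
}
```

One thing to keep in mind with option 2: if the main thread really hangs, the helper thread would keep tickling the watchdog, so it would need some liveness signal from the main loop before each tickle. That part is left out of the sketch.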

Regards,
   Honza