[ClusterLabs] Our 2-Node Cluster with a Separate Qdevice Went Down Anyway?

Andrei Borzenkov arvidjaar at gmail.com
Mon Mar 1 05:43:49 EST 2021


On 01.03.2021 12:26, Jan Friesse wrote:
>>
> 
> Thanks for digging into logs. I believe Eric is hitting
> https://github.com/corosync/corosync-qdevice/issues/10 (already fixed,
> but may take some time to get into distributions) - it also contains
> workaround.
> 

I tested corosync-qnetd at df3c672, which should include these fixes. It
changed the behavior, but I still cannot explain it.

Again: ha1+ha2+qnetd, ha2 is the current DC, I disconnect ha1 (blocking
everything with ha1's source MAC), stonith disabled. corosync and
corosync-qdevice on the nodes are still 2.4.5, if it matters.
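
For anyone reproducing this: blocking by source MAC can be done along
these lines (the MAC below is just a placeholder, not the real one); run
it on ha2 and on the qnetd host so that nothing from ha1 gets through:

# drop everything arriving from ha1's MAC address (placeholder value)
iptables -I INPUT -m mac --mac-source 52:54:00:aa:bb:01 -j DROP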

ha2:

Mar 01 13:23:27 ha2 corosync[1576]:   [TOTEM ] A processor failed,
forming new configuration.
Mar 01 13:23:28 ha2 corosync[1576]:   [VOTEQ ] waiting for quorum device
Qdevice poll (but maximum for 30000 ms)
Mar 01 13:23:28 ha2 corosync[1576]:   [TOTEM ] A new membership
(192.168.1.2:3632) was formed. Members left: 1
Mar 01 13:23:28 ha2 corosync[1576]:   [TOTEM ] Failed to receive the
leave message. failed: 1
Mar 01 13:23:28 ha2 corosync[1576]:   [CPG   ] downlist left_list: 1
received
Mar 01 13:23:28 ha2 pacemaker-based[2032]:  notice: Node ha1 state is
now lost
Mar 01 13:23:28 ha2 pacemaker-based[2032]:  notice: Purged 1 peer with
id=1 and/or uname=ha1 from the membership cache
Mar 01 13:23:28 ha2 pacemaker-attrd[2035]:  notice: Lost attribute
writer ha1
Mar 01 13:23:28 ha2 pacemaker-attrd[2035]:  notice: Node ha1 state is
now lost
Mar 01 13:23:28 ha2 pacemaker-attrd[2035]:  notice: Removing all ha1
attributes for peer loss
Mar 01 13:23:28 ha2 pacemaker-attrd[2035]:  notice: Purged 1 peer with
id=1 and/or uname=ha1 from the membership cache
Mar 01 13:23:28 ha2 pacemaker-fenced[2033]:  notice: Node ha1 state is
now lost
Mar 01 13:23:28 ha2 pacemaker-fenced[2033]:  notice: Purged 1 peer with
id=1 and/or uname=ha1 from the membership cache
Mar 01 13:23:28 ha2 pacemaker-controld[2037]:  warning: Stonith/shutdown
of node ha1 was not expected
Mar 01 13:23:28 ha2 pacemaker-controld[2037]:  notice: State transition
S_IDLE -> S_POLICY_ENGINE
Mar 01 13:23:33 ha2 pacemaker-controld[2037]:  notice: High CPU load
detected: 1.200000
Mar 01 13:23:35 ha2 corosync[1576]:   [QUORUM] Members[1]: 2
Mar 01 13:23:35 ha2 corosync[1576]:   [MAIN  ] Completed service
synchronization, ready to provide service.
Mar 01 13:23:35 ha2 pacemaker-attrd[2035]:  notice: Recorded local node
as attribute writer (was unset)
Mar 01 13:23:35 ha2 pacemaker-controld[2037]:  notice: Node ha1 state is
now lost
Mar 01 13:23:35 ha2 pacemaker-controld[2037]:  warning: Stonith/shutdown
of node ha1 was not expected
Mar 01 13:23:36 ha2 pacemaker-schedulerd[2036]:  notice:  * Promote
p_drbd0:0        (   Slave -> Master ha2 )
Mar 01 13:23:36 ha2 pacemaker-schedulerd[2036]:  notice:  * Start
p_fs_clust01     (                   ha2 )
Mar 01 13:23:36 ha2 pacemaker-schedulerd[2036]:  notice:  * Start
p_mysql_001      (                   ha2 )


So it was pretty fast to react (8 seconds).
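
The "but maximum for 30000 ms" in the VOTEQ messages corresponds, as far
as I understand, to the default quorum.device.sync_timeout. For
reference, the quorum section on the nodes looks roughly like this (the
host name and algorithm below are illustrative, not copied from my
config):

quorum {
    provider: corosync_votequorum
    device {
        model: net
        votes: 1
        # defaults shown explicitly: regular poll and sync-phase timeouts
        timeout: 10000
        sync_timeout: 30000
        net {
            host: qnetd.example.com   # placeholder qnetd host
            algorithm: ffsplit
            tls: on
        }
    }
}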

ha1:

Mar 01 13:23:27 ha1 corosync[1552]:   [TOTEM ] A processor failed,
forming new configuration.
Mar 01 13:23:30 ha1 corosync[1552]:   [VOTEQ ] waiting for quorum device
Qdevice poll (but maximum for 30000 ms)
Mar 01 13:23:30 ha1 corosync[1552]:   [TOTEM ] A new membership
(192.168.1.1:3640) was formed. Members left: 2
Mar 01 13:23:30 ha1 corosync[1552]:   [TOTEM ] Failed to receive the
leave message. failed: 2
Mar 01 13:23:30 ha1 corosync[1552]:   [CPG   ] downlist left_list: 1
received
Mar 01 13:23:30 ha1 pacemaker-attrd[1738]:  notice: Node ha2 state is
now lost
Mar 01 13:23:30 ha1 pacemaker-attrd[1738]:  notice: Removing all ha2
attributes for peer loss
Mar 01 13:23:30 ha1 pacemaker-attrd[1738]:  notice: Purged 1 peer with
id=2 and/or uname=ha2 from the membership cache
Mar 01 13:23:30 ha1 pacemaker-based[1735]:  notice: Node ha2 state is
now lost
Mar 01 13:23:30 ha1 pacemaker-based[1735]:  notice: Purged 1 peer with
id=2 and/or uname=ha2 from the membership cache
Mar 01 13:23:30 ha1 pacemaker-controld[1740]:  notice: Our peer on the
DC (ha2) is dead
Mar 01 13:23:30 ha1 pacemaker-controld[1740]:  notice: State transition
S_NOT_DC -> S_ELECTION
Mar 01 13:23:30 ha1 pacemaker-fenced[1736]:  notice: Node ha2 state is
now lost
Mar 01 13:23:30 ha1 pacemaker-fenced[1736]:  notice: Purged 1 peer with
id=2 and/or uname=ha2 from the membership cache
Mar 01 13:23:32 ha1 corosync[1552]:   [VOTEQ ] waiting for quorum device
Qdevice poll (but maximum for 30000 ms)
Mar 01 13:23:32 ha1 corosync[1552]:   [TOTEM ] A new membership
(192.168.1.1:3644) was formed. Members
Mar 01 13:23:32 ha1 corosync[1552]:   [CPG   ] downlist left_list: 0
received
Mar 01 13:23:33 ha1 corosync[1552]:   [VOTEQ ] waiting for quorum device
Qdevice poll (but maximum for 30000 ms)
Mar 01 13:23:33 ha1 corosync[1552]:   [TOTEM ] A new membership
(192.168.1.1:3648) was formed. Members
Mar 01 13:23:33 ha1 corosync[1552]:   [CPG   ] downlist left_list: 0
received
Mar 01 13:23:35 ha1 corosync[1552]:   [VOTEQ ] waiting for quorum device
Qdevice poll (but maximum for 30000 ms)
...
Mar 01 13:24:05 ha1 corosync-qdevice[1563]: Can't connect to qnetd host.
(-5986): Network address not available (in use?)
Mar 01 13:24:05 ha1 corosync-qdevice[1563]: Mar 01 13:24:05 error
Can't connect to qnetd host. (-5986): Network address not available (in
use?)
Mar 01 13:24:05 ha1 corosync[1552]:   [VOTEQ ] waiting for quorum device
Qdevice poll (but maximum for 30000 ms)
Mar 01 13:24:05 ha1 corosync[1552]:   [TOTEM ] A new membership
(192.168.1.1:3736) was formed. Members
Mar 01 13:24:05 ha1 corosync[1552]:   [CPG   ] downlist left_list: 0
received
Mar 01 13:24:05 ha1 corosync[1552]:   [QUORUM] This node is within the
non-primary component and will NOT provide any services.
Mar 01 13:24:05 ha1 corosync[1552]:   [QUORUM] Members[1]: 1
Mar 01 13:24:05 ha1 corosync[1552]:   [MAIN  ] Completed service
synchronization, ready to provide service.
Mar 01 13:24:05 ha1 pacemaker-controld[1740]:  warning: Quorum lost
Mar 01 13:24:05 ha1 pacemaker-controld[1740]:  notice: Node ha2 state is
now lost
Mar 01 13:24:05 ha1 pacemaker-controld[1740]:  notice: State transition
S_ELECTION -> S_INTEGRATION
Mar 01 13:24:05 ha1 pacemaker-controld[1740]:  notice: Updating quorum
status to false (call=56)
Mar 01 13:24:05 ha1 pacemaker-schedulerd[1739]:  warning: Blind faith:
not fencing unseen nodes
Mar 01 13:24:05 ha1 pacemaker-schedulerd[1739]:  warning: Fencing and
resource management disabled due to lack of quorum
Mar 01 13:24:05 ha1 pacemaker-schedulerd[1739]:  notice:  * Stop
p_drbd0:0     ( Master ha1 )   due to no quorum
Mar 01 13:24:05 ha1 pacemaker-schedulerd[1739]:  notice:  * Stop
p_drbd1:0     (  Slave ha1 )   due to no quorum
Mar 01 13:24:05 ha1 pacemaker-schedulerd[1739]:  notice:  * Stop
p_fs_clust01     (        ha1 )   due to no quorum
Mar 01 13:24:05 ha1 pacemaker-schedulerd[1739]:  notice:  * Start
p_fs_clust02     (        ha1 )   due to no quorum (blocked)
Mar 01 13:24:05 ha1 pacemaker-schedulerd[1739]:  notice:  * Stop
p_mysql_001      (        ha1 )   due to no quorum
Mar 01 13:24:05 ha1 pacemaker-schedulerd[1739]:  notice:  * Start
p_mysql_006      (        ha1 )   due to no quorum (blocked)
Mar 01 13:24:05 ha1 pacemaker-schedulerd[1739]:  notice:  * Start
p_mysql_666      (        ha1 )   due to no quorum (blocked)
Mar 01 13:24:05 ha1 pacemaker-controld[1740]:  notice: Processing graph
0 (ref=pe_calc-dc-16145


So it took almost 40 seconds to make a decision. Somehow this is exactly
the opposite of what I observed before - the disconnected node was fast
and the connected node was slow.

While I can understand why the behavior changed for the connected node, I
still do not understand why the disconnected node now needs so much time.

And what is worse, this is not reliable - the next time I tested, both
nodes reacted almost immediately (the disconnected node took just 3
seconds to decide it was out of quorum). That is the most irritating
part, as one expects consistent behavior here.

That is something inside corosync/corosync-qdevice. At least it does seem
to improve the situation with qnetd response timing for surviving nodes.
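
When the behavior differs between runs, it may help to capture the
qdevice/qnetd view on both sides right after the cut, e.g. with the
standard tools:

corosync-quorumtool -s      # on ha1/ha2: quorum state and Qdevice flags
corosync-qdevice-tool -sv   # on ha1/ha2: qdevice state, connection to qnetd
corosync-qnetd-tool -lv     # on the qnetd host: connected clients and votes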

