[ClusterLabs] I_DC_TIMEOUT and node fenced when it joins the cluster
Strahil Nikolov
hunter86_bg at yahoo.com
Sat Apr 16 01:04:35 EDT 2022
Set the corosync token to 10000 milliseconds, adjust the consensus as described in corosync.conf(5) (man 5 corosync.conf), and give it a try.
Don't forget to sync the corosync settings among the cluster.
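As a rough sketch of what that change could look like, assuming consensus is simply kept at the 1.2 * token value that corosync.conf(5) gives as its default and minimum (12000 is an illustrative figure, not a tested recommendation), only these two keys in the existing totem section would change:

```
totem {
    # Existing totem options (version, cluster_name, transport, ...) stay as they are.

    # Token timeout in milliseconds.
    token: 10000

    # Assumption: kept at 1.2 * token, which corosync.conf(5) documents as the
    # default that is calculated when consensus is not set explicitly.
    consensus: 12000
}
```

The file has to be identical on both nodes, and corosync has to reload or restart before the new timeouts take effect.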
Best Regards,
Strahil Nikolov
On Fri, Apr 15, 2022 at 15:27, vitaly <vitaly at unitc.com> wrote:
Hello Everybody.
I am occasionally seeing the following behavior on a two-node cluster.
1. Both nodes of the cluster are rebooted abruptly (using "reboot").
2. Both nodes start to come up. Node d18-3-left (2) comes up first
Apr 13 23:56:09 d18-3-left corosync[11465]: [MAIN ] Corosync Cluster Engine ('2.4.4'): started and ready to provide service.
3. Second node d18-3-right (1) joins the cluster
Apr 13 23:56:58 d18-3-left corosync[11466]: [TOTEM ] A new membership (172.16.1.1:60) was formed. Members joined: 1
Apr 13 23:56:58 d18-3-left corosync[11466]: [QUORUM] This node is within the primary component and will provide service.
Apr 13 23:56:58 d18-3-left corosync[11466]: [QUORUM] Members[2]: 1 2
Apr 13 23:56:58 d18-3-left corosync[11466]: [MAIN ] Completed service synchronization, ready to provide service.
Apr 13 23:56:58 d18-3-left pacemakerd[11717]: notice: Quorum acquired
Apr 13 23:56:58 d18-3-left crmd[11763]: notice: Quorum acquired
4. Two seconds later, node d18-3-left shows I_DC_TIMEOUT and starts fencing the newly joined node.
Apr 13 23:57:00 d18-3-left crmd[11763]: warning: Input I_DC_TIMEOUT received in state S_PENDING from crm_timer_popped
After that we get:
Apr 13 23:57:00 d18-3-left crmd[11763]: notice: State transition S_ELECTION -> S_INTEGRATION
Apr 13 23:57:00 d18-3-left crmd[11763]: warning: Input I_ELECTION_DC received in state S_INTEGRATION from do_election_check
and the node is fenced:
Apr 13 23:57:01 d18-3-left pengine[11762]: warning: Scheduling Node d18-3-right.lab.archivas.com for STONITH
Apr 13 23:57:01 d18-3-left pengine[11762]: notice: * Fence (reboot) d18-3-right.lab.archivas.com 'node is unclean'
5. After this, the node that was fenced comes up again and joins the cluster without any issues.
Any idea on what is going on here?
Thanks,
_Vitaly