[ClusterLabs] maximum token value (knet)
Klaus Wenninger
kwenning at redhat.com
Mon Mar 15 09:05:52 EDT 2021
On 3/13/21 12:55 AM, Strahil Nikolov wrote:
> I will try to get into the details on monday, when I have access to
> the cluster again.
> I guess the /var/log/cluster/corosync.log and
> /etc/corosync/corosync.conf are the most interesting.
>
> So far, I have a 6-node cluster with separate VLANs for HANA
> replication, prod and backup.
> Initially, I used pcs to create the corosync.conf with 2 IPs per node,
> token 40000, consensus 48000 and wait_for_all=1.
> Later I expanded the cluster to 3 links and added qnet to the
> setup (only after I got it running with token 29000), so I'm ruling it out.
qdevice isn't using knet - right?
And VOTEQUORUM_QDEVICE_DEFAULT_SYNC_TIMEOUT is 30s. Unrelated coincidence?
Klaus
> I updated the cluster nodes from RHEL 8.1 to 8.2, removed the
> consensus setting and enabled debug.
>
> knet uses UDP by default, and the problem hits me with both UDP
> (default settings) and SCTP, so the transport protocol is not the
> cause.
>
> I've also enabled pacemaker blackbox, although I doubt that has any
> effect on corosync.
>
> How can I enable trace logs for corosync only?
>
> Best Regards,
> Strahil Nikolov
>
>
>
> On Fri, Mar 12, 2021 at 17:01, Jan Friesse
> <jfriesse at redhat.com> wrote:
> Strahil,
>
> > Interesting...
> > Yet, this doesn't explain why a token of 30000 causes the nodes to
> > never assemble a cluster (waiting for half an hour, using
> > wait_for_all=1), while setting it to 29000 works like a charm.
>
> Definitely.
>
> Could you please provide a bit more info about your setup
> (config/logs/how many nodes the cluster has/...)? Because I've just
> briefly tested a two-node setup with a 30 sec token timeout and it was
> working perfectly fine.
>
> >
> > Thankfully we have an RH subscription, so RH devs will provide more
> > detailed output on the issue.
>
> As Jehan correctly noted, if it really got to RH devs it would
> probably get to me ;) But before that, GSS will take care of checking
> configs/hw/logs/... and they are really good at finding problems with
> setup/hw/...
>
> >
> > I was hoping that I had missed something in the documentation about
> > the maximum token size...
>
> Nope.
>
> No matter what, if you can send config/logs/... we may try to find
> out what the root of the problem is here on the ML, or you can really
> try GSS, but as Jehan said, it would be nice if you could post the
> result so other people (me included) know what the main problem was.
>
> Thanks and regards,
> Honza
>
>
> >
> > Best Regards,
> > Strahil Nikolov
> >
> > On Thursday, March 11, 2021, 19:12:58 GMT+2, Jan Friesse
> > <jfriesse at redhat.com> wrote:
> >
> > Strahil,
> >> Hello all,
> >> I'm building a test cluster on RHEL 8.2 and I have noticed that
> >> the cluster fails to assemble (nodes stay inquorate as if the
> >> network is not working) if I set the token at 30000 or more (30s+).
> >
> > Knet waits for enough pong replies from other nodes before it marks
> > them as alive and starts sending/receiving packets from them. By
> > default it needs to receive 2 pongs and a ping is sent 4 times per
> > token timeout, so with a 30 sec token timeout it takes 15 sec until
> > a node is considered up.
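
The arithmetic described above can be sketched roughly as follows. This is
only an illustration, not corosync code: the helper name
knet_link_up_estimate is made up, and the formulas (ping interval =
token / (pong_count * 2), ping timeout = token / pong_count, pong count
defaulting to 2) are an assumption based on the defaults documented for
knet_ping_interval, knet_ping_timeout and knet_pong_count in
corosync.conf(5), so please check them against the man page of the
installed version. It also assumes that no ping or pong is lost.

```python
def knet_link_up_estimate(token_ms: int, pong_count: int = 2) -> dict:
    """Estimate knet link timings derived from the token timeout.

    Hypothetical helper for illustration; the formulas are assumed
    defaults, not values read from a running corosync.
    """
    # Assumed default: pings are sent (pong_count * 2) times per token timeout.
    ping_interval_ms = token_ms // (pong_count * 2)
    # Assumed default: a ping without a pong within this time counts as lost.
    ping_timeout_ms = token_ms // pong_count
    # A link is marked up after pong_count pongs, i.e. pong_count intervals
    # in the best case where every ping is answered.
    time_to_up_ms = ping_interval_ms * pong_count
    return {
        "ping_interval_ms": ping_interval_ms,
        "ping_timeout_ms": ping_timeout_ms,
        "time_to_up_ms": time_to_up_ms,
    }


if __name__ == "__main__":
    # Compare the two token values discussed in this thread.
    for token in (29000, 30000):
        print(token, knet_link_up_estimate(token))
```

With token 30000 this gives a 7500 ms ping interval and 15000 ms until a
node is considered up, which matches the 15 sec figure above. Token 29000
gives 14500 ms, so by this estimate alone the two values differ by only
half a second.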
> >
> >> What is the maximum token value with knet? On SLES12 (I think
> >> it was corosync 1), I used to set the token/consensus to far
> >> greater values on some of our clusters.
> >
> > I'm really not aware of any arbitrary limits.
> >
> >
> >> Best Regards,
> >> Strahil Nikolov
> >>
> >
> > Regards,
> >
> > Honza
> >
> >>
> >>
> >> _______________________________________________
> >> Manage your subscription:
> >> https://lists.clusterlabs.org/mailman/listinfo/users
> >>
> >> ClusterLabs home: https://www.clusterlabs.org/
> >
> >>
> >
>
>