[ClusterLabs] corosync 2.4 CPG config change callback

Wed Mar 14 21:03:15 UTC 2018

On Fri, 2018-03-09 at 17:26 +0100, Jan Friesse wrote:
> Thomas,
> 
> > Hi,
> > 
> > On 3/7/18 1:41 PM, Jan Friesse wrote:
> > > Thomas,
> > > 
> > > > First thanks for your answer!
> > > > 
> > > > On 3/7/18 11:16 AM, Jan Friesse wrote:
> 
> ...
> 
> > TotemConfchgCallback: ringid (1.1436)
> > active processors 3: 1 2 3
> > EXIT
> > Finalize  result is 1 (should be 1)
> > 
> > 
> > I hope I ran both tests correctly. Since it reproduces repeatedly,
> > both with testcpg and with the cpg usage in our filesystem, this
> > seems to be a valid finding and not just a single occurrence.
> 
> I've tested it too and yes, you are 100% right. The bug is there and
> it's pretty easy to reproduce when the node with the lowest nodeid is
> paused. It's slightly harder when a node with a higher nodeid is
> paused.
> 
> Most clusters use power fencing, so they simply never see this
> problem. That may also be the reason why it wasn't reported long ago
> (this bug has existed, practically speaking, at least since OpenAIS
> Whitetank). So really nice work finding this bug.
> 
> What I'm not entirely sure about is the best way to solve this
> problem. What I am sure of is that it's going to be "fun" :(
> 
> Let's start with a very high-level list of possible solutions:
> - "Ignore the problem". CPG behaves more or less correctly. The
>   "current" membership really didn't change, so it doesn't make much
>   sense to inform about a change. It's possible to use
>   cpg_totem_confchg_fn_t to find out when the ringid changes. I'm
>   adding this solution just for completeness, because I don't prefer
>   it at all.
> - cpg_confchg_fn_t adds all nodes that left and joined back to the
>   left/join lists.
> - cpg sends an extra cpg_confchg_fn_t call about the left and joined
>   nodes. I would prefer this solution simply because it makes cpg
>   behavior the same in all situations.
> 
> Which of the options would you prefer? Same question also for @Ken (-

Pacemaker should react essentially the same whichever of the last two
options is used. There could be differences due to timing (the second
solution might allow some work to be done between when the left and
join messages are received), but I think it should behave reasonably
with either approach.
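
To illustrate why the two options should end up equivalent for a
consumer, here is a minimal sketch in plain C. It does not use the
corosync API or any Pacemaker code; struct peer, struct membership and
apply_confchg() are hypothetical names invented for this example. The
assumption is that the consumer always processes the left list before
the joined list of a change.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_NODES 32

/* Hypothetical application-side view of one peer (not a corosync type). */
struct peer {
    uint32_t nodeid;
    bool     online;
    unsigned resyncs;   /* how many times this peer was (re)synchronized */
};

struct membership {
    struct peer peers[MAX_NODES];
    size_t      count;
};

/* Look up a peer by nodeid, adding it if unknown. NULL if the table is full. */
static struct peer *find_or_add(struct membership *m, uint32_t nodeid)
{
    for (size_t i = 0; i < m->count; i++) {
        if (m->peers[i].nodeid == nodeid)
            return &m->peers[i];
    }
    if (m->count == MAX_NODES)
        return NULL;
    struct peer *p = &m->peers[m->count++];
    p->nodeid = nodeid;
    p->online = false;
    p->resyncs = 0;
    return p;
}

/*
 * Apply one configuration change. The left list is processed first, so a
 * node that appears in both lists of a single change (merged option) ends
 * in the same state as after two separate changes (extra-callback option).
 */
static void apply_confchg(struct membership *m,
                          const uint32_t *left, size_t n_left,
                          const uint32_t *joined, size_t n_joined)
{
    for (size_t i = 0; i < n_left; i++) {
        struct peer *p = find_or_add(m, left[i]);
        if (p)
            p->online = false;
    }
    for (size_t i = 0; i < n_joined; i++) {
        struct peer *p = find_or_add(m, joined[i]);
        if (p) {
            p->online = true;
            p->resyncs++;
        }
    }
}
```

With the merged option, node 1 would show up in both lists of one
apply_confchg() call. With the extra-callback option there would be two
calls, and the window between them is where the timing difference
mentioned above could show up.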

Interestingly, there is some old code in Pacemaker for handling when a
node left and rejoined but "the cluster layer didn't notice", that may
have been a workaround for this case.
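
For an application that wants to notice such a case on its own, the
first option quoted above (watching for ringid changes) could be
sketched roughly as follows. This is not the Pacemaker workaround and
not corosync code: struct ring_id is a local stand-in for the
(nodeid, seq) pair printed as "ringid (1.1436)" in the test output, and
struct ring_tracker and ring_tracker_update() are hypothetical names.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local stand-in for a ring id: representative nodeid plus sequence. */
struct ring_id {
    uint32_t nodeid;
    uint64_t seq;
};

/* Hypothetical per-group state kept by the application. */
struct ring_tracker {
    bool           have_ring;   /* false until the first ring id is seen */
    struct ring_id last;        /* most recently reported ring id */
    unsigned       changes;     /* number of ring id changes observed */
};

/*
 * Record the ring id reported by a totem configuration change.
 * Returns true if it differs from the previously recorded one, i.e. the
 * ring was re-formed and the application may want to resynchronize even
 * if no membership change was reported. The first report returns false.
 */
static bool ring_tracker_update(struct ring_tracker *t, struct ring_id ring)
{
    bool changed = t->have_ring &&
                   (t->last.nodeid != ring.nodeid || t->last.seq != ring.seq);

    t->have_ring = true;
    t->last = ring;
    if (changed)
        t->changes++;
    return changed;
}
```

The idea would be to call ring_tracker_update() from whatever function
is registered as the cpg_totem_confchg_fn_t callback and to treat a
true return as "membership may have changed behind my back".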

> > 
> what would you prefer for PCMK) and @Chrissie.
> 
> Regards,
>    Honza
> 
> 
> > 
> > cheers,
> > Thomas
> > 
> > > > 
> > > > > Now it's really the cpg application's problem to synchronize
> > > > > its data. Many applications (usually filesystems) use quorum
> > > > > together with fencing to find out which cluster partition is
> > > > > quorate and to clean up the inquorate one.
> > > > > 
> > > > > Hopefully my explanation helps you, and feel free to ask more
> > > > > questions!
> > > > > 
> > > > 
> > > > It helps, but I'm still a bit unsure why the callback could
> > > > not happen here. I may need to dive a bit deeper into corosync :)
> > > > 
> > > > > Regards,
> > > > >    Honza
> > > > > 
> > > > > > 
> > > > > > > Any help would be appreciated, many thanks!
> > > > > > 
> > > > > > cheers,
> > > > > > Thomas
> > > > > > 
> > > > > > > [1]: https://git.proxmox.com/?p=pve-cluster.git;a=tree;f=data/src;h=e5493468b456ba9fe3f681f387b4cd5b86e7ca08;hb=HEAD
> > > > > > > [2]: https://git.proxmox.com/?p=pve-cluster.git;a=blob;f=data/src/dfsm.c;h=cdf473e8226ab9706d693a457ae70c0809afa0fa;hb=HEAD#l1096
> > > > > > 
-- 
Ken Gaillot <kgaillot at redhat.com>