[ClusterLabs] corosync 2.4 CPG config change callback
Christine Caulfield
ccaulfie at redhat.com
Tue Mar 13 06:01:55 EDT 2018
On 09/03/18 16:26, Jan Friesse wrote:
> Thomas,
>
>> Hi,
>>
>> On 3/7/18 1:41 PM, Jan Friesse wrote:
>>> Thomas,
>>>
>>>> First thanks for your answer!
>>>>
>>>> On 3/7/18 11:16 AM, Jan Friesse wrote:
>
> ...
>
>> TotemConfchgCallback: ringid (1.1436)
>> active processors 3: 1 2 3
>> EXIT
>> Finalize result is 1 (should be 1)
>>
>>
>> Hope I did both tests right, but as it reproduces multiple times
>> with testcpg and with our cpg usage in our filesystem, this seems to
>> be a valid result, not just a single occurrence.
>
> I've tested it too and yes, you are 100% right. The bug is there and
> it's pretty easy to reproduce when the node with the lowest nodeid is
> paused. It's slightly harder when a node with a higher nodeid is paused.
>
> Most clusters use power fencing, so they simply never see this
> problem. That may also be the reason why it wasn't reported a long
> time ago (this bug has existed virtually since at least OpenAIS
> Whitetank). So really nice work finding this bug.
>
> What I'm not entirely sure about is the best way to solve this
> problem. What I am sure of is that it's going to be "fun" :(
>
> Let's start with a very high-level list of possible solutions:
> - "Ignore the problem". CPG behaves more or less correctly. The
> "current" membership really didn't change, so it doesn't make much
> sense to inform about a change. It's possible to use
> cpg_totem_confchg_fn_t to find out when the ringid changes. I'm adding
> this solution just for completeness, because I don't prefer it at all.
> - cpg_confchg_fn_t reports all nodes that left and joined back in its
> left/joined lists
> - cpg sends an extra cpg_confchg_fn_t call about the left and joined
> nodes. I would prefer this solution simply because it makes cpg
> behavior equal in all situations.
>
> Which of the options would you prefer? Same question also for @Ken (->
> what would you prefer for PCMK) and @Chrissie.
>
The last option makes most sense to me too - it's more consistent and
'what you would expect' I think.
Chrissie
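
For anyone who needs to cope with the current behaviour in the meantime,
the first option quoted above (watching for ringid changes through
cpg_totem_confchg_fn_t) could be sketched roughly as below. This is only
an illustration, not corosync code: struct ring_id is a local stand-in
assumed to carry a nodeid and a sequence number like the "ringid
(1.1436)" in the log, and ring_tracker, ring_event and
ring_tracker_update are hypothetical names. In a real client the
arguments would come from the totem confchg callback.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for the ring id delivered by the totem confchg callback
 * (assumption: a node id plus a sequence number). */
struct ring_id {
    uint32_t nodeid;
    uint64_t seq;
};

#define MAX_MEMBERS 32

/* Hypothetical per-client state: the last ring and member list seen. */
struct ring_tracker {
    bool have_ring;
    struct ring_id last_ring;
    uint32_t members[MAX_MEMBERS];
    uint32_t member_count;
};

enum ring_event {
    RING_INITIAL,                 /* first notification ever seen        */
    RING_UNCHANGED,               /* same ring id as before              */
    RING_CHANGED_MEMBERS_CHANGED, /* new ring, different member list     */
    RING_CHANGED_SAME_MEMBERS     /* new ring, identical member list:    */
                                  /* a node may have left and come back  */
};

/* Order-independent comparison; assumes no duplicate node ids. */
static bool same_members(const struct ring_tracker *t,
                         const uint32_t *members, uint32_t count)
{
    if (t->member_count != count)
        return false;
    for (uint32_t i = 0; i < count; i++) {
        bool found = false;
        for (uint32_t j = 0; j < t->member_count; j++) {
            if (t->members[j] == members[i]) {
                found = true;
                break;
            }
        }
        if (!found)
            return false;
    }
    return true;
}

/* Classify a notification against the stored state, then store it. */
enum ring_event ring_tracker_update(struct ring_tracker *t,
                                    struct ring_id ring,
                                    const uint32_t *members,
                                    uint32_t count)
{
    assert(t != NULL);
    assert(count <= MAX_MEMBERS);

    enum ring_event ev;
    if (!t->have_ring)
        ev = RING_INITIAL;
    else if (ring.nodeid == t->last_ring.nodeid &&
             ring.seq == t->last_ring.seq)
        ev = RING_UNCHANGED;
    else if (same_members(t, members, count))
        ev = RING_CHANGED_SAME_MEMBERS;
    else
        ev = RING_CHANGED_MEMBERS_CHANGED;

    t->have_ring = true;
    t->last_ring = ring;
    t->member_count = count;
    if (count > 0)
        memcpy(t->members, members, count * sizeof(members[0]));
    return ev;
}
```

An application would treat RING_CHANGED_SAME_MEMBERS as a hint that its
peers may have lost state even though cpg_confchg_fn_t reported nothing,
and resynchronise accordingly. It is a workaround on the client side
only and does not replace fixing cpg itself.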