[ClusterLabs] corosync 2.4 CPG config change callback

Thomas Lamprecht t.lamprecht at proxmox.com
Wed Apr 25 03:08:01 EDT 2018


Honza,

On 4/24/18 6:38 PM, Jan Friesse wrote:
>> On 4/6/18 10:59 AM, Jan Friesse wrote:
>>> Thomas Lamprecht wrote:
>>>> On 03/09/2018 at 05:26 PM, Jan Friesse wrote:
>>>>> I've tested it too and yes, you are 100% right. The bug is there and
>>>>> it's pretty easy to reproduce when the node with the lowest nodeid is
>>>>> paused. It's slightly harder when a node with a higher nodeid is paused.
>>>>>
>>>>
>>>> Were you able to make some progress on this issue?
>>>
>>> Ya, kind of. Sadly I had to work on a different problem, but I'm expecting to send a patch next week.
>>>
>>
>> I guess the different problems were the ones related to the issued CVEs :)
> 
> Yep.
> 
> Also, I've spent quite a lot of time thinking about the best possible solution. CPG is quite old, it was full of weird bugs, and the risk of breakage is very high.
> 
> Anyway, I've decided not to try to hack what is apparently broken and to just go for the risky but proper solution (= needs a LOT more testing, but so far looks good).
> 

I did not look deeply into how your revert plays out with the
mentioned commits of the heuristics approach, but would this fix
mean bringing corosync back to a state it already had, and
which was thus already battle tested?

The patch and approach seem good to me, with my limited knowledge,
when looking at the various "bandaid" fix commits you mentioned.

> Patch is in PR (needle): https://github.com/corosync/corosync/pull/347
> 

Many thanks! First tests work well here.
With the patch applied, I could not yet reproduce the problem in either
testcpg or our cluster configuration file system.

I'll let it run 

cheers,
Thomas



