[ClusterLabs] libqb 0.17.1 - segfault at 1b8

Ken Gaillot kgaillot at redhat.com
Mon May 2 17:05:22 EDT 2016


On 05/02/2016 03:45 PM, Jan Pokorný wrote:
> Hello Radoslaw,
> 
> On 02/05/16 11:47 -0500, Radoslaw Garbacz wrote:
>> When testing pacemaker I encountered a start error, which seems to be
>> related to a reported libqb segmentation fault.
>> - cluster started and acquired quorum
>> - some nodes failed to connect to CIB, and lost membership as a result
>> - restart solved the problem
>>
>> The segmentation fault is reported in libqb version 0.17.1, the standard
>> package provided for CentOS 6.
> 
> Chances are that you are running into this nasty bug:
> https://bugzilla.redhat.com/show_bug.cgi?id=1114852
> 
>> Please let me know if the problem is known, and if there is a remedy (e.g.
>> using the latest libqb).
> 
> Try libqb >= 0.17.2.
> 
> [...]
> 
>> Logs from /var/log/messages:
>>
>> Apr 22 15:46:41 (...) pacemakerd[111190]:   notice: Additional logging
>> available in /var/log/pacemaker.log
>> Apr 22 15:46:41 (...) pacemakerd[111190]:   notice: Configured corosync to
>> accept connections from group 498: Library error (2)
> 
> IIRC, that last line ^ was one of the symptoms.

Yes, that does look like the culprit. The root cause is libqb being
unable to handle 6-digit PIDs, which we can see in the above logs --
"[111190]".

As a workaround, if you don't want to install a newer libqb before
CentOS 6.8 (which will have the fix) is released, you can lower
/proc/sys/kernel/pid_max (aka the kernel.pid_max sysctl variable) to
100000 or less, so that PIDs stay at five digits.
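
To check whether a node is exposed, something like the following rough
sketch may help. It is only an illustration, not part of libqb or
pacemaker: the helper names (allows_six_digit_pids, read_pid_max) are
made up here, and the only thing taken from the system is the
/proc/sys/kernel/pid_max file itself. It relies on the fact that the
kernel wraps PIDs at pid_max, so the largest PID handed out is
pid_max - 1.

```python
from pathlib import Path

# Real procfs location of the setting mentioned above.
PID_MAX_PATH = Path("/proc/sys/kernel/pid_max")

# Largest PID that still fits in five digits.
MAX_FIVE_DIGIT_PID = 99999


def allows_six_digit_pids(pid_max: int) -> bool:
    """Return True if this pid_max lets the kernel hand out 6-digit PIDs.

    The kernel wraps PIDs at pid_max, so the largest PID it assigns is
    pid_max - 1.
    """
    return pid_max - 1 > MAX_FIVE_DIGIT_PID


def read_pid_max(path: Path = PID_MAX_PATH) -> int:
    """Read the current limit from procfs (Linux only)."""
    return int(path.read_text().strip())


if __name__ == "__main__":
    if PID_MAX_PATH.exists():
        current = read_pid_max()
        if allows_six_digit_pids(current):
            print(f"pid_max={current}: 6-digit PIDs possible, affected libqb may crash")
        else:
            print(f"pid_max={current}: PIDs stay at five digits or fewer")
    else:
        print("no /proc/sys/kernel/pid_max here (not Linux?)")
```

If it reports that 6-digit PIDs are possible, the value can be lowered
as root with "sysctl -w kernel.pid_max=<value>" and made persistent in
/etc/sysctl.conf. Processes that are already running keep their
current PIDs, so the cluster daemons would still need a restart to
pick up a lower one.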



