[ClusterLabs] libqb 0.17.1 - segfault at 1b8
kgaillot at redhat.com
Mon May 2 17:05:22 EDT 2016
On 05/02/2016 03:45 PM, Jan Pokorný wrote:
> Hello Radoslaw,
> On 02/05/16 11:47 -0500, Radoslaw Garbacz wrote:
>> When testing pacemaker I encountered a start error, which seems to be
>> related to a reported libqb segmentation fault.
>> - cluster started and acquired quorum
>> - some nodes failed to connect to CIB, and lost membership as a result
>> - restart solved the problem
>> The segmentation fault report names the libqb library, version 0.17.1, a
>> standard package provided for CentOS 6.
> Chances are that you are running into this nasty bug:
>> Please let me know if the problem is known, and if there is a remedy (e.g.
>> using the latest libqb).
> Try libqb >= 0.17.2.
>> Logs from /var/log/messages:
>> Apr 22 15:46:41 (...) pacemakerd: notice: Additional logging
>> available in /var/log/pacemaker.log
>> Apr 22 15:46:41 (...) pacemakerd: notice: Configured corosync to
>> accept connections from group 498: Library error (2)
> IIRC, that last line ^ was one of the symptoms.
Yes, that does look like the culprit. The root cause is libqb being
unable to handle 6-digit PIDs, which we can see in the above logs --
As a workaround, if you don't want to install a newer libqb before
CentOS 6.8 (which will have the fix) is released, you can lower
/proc/sys/kernel/pid_max (aka the kernel.pid_max sysctl variable) so
that PIDs stay below six digits.
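
To check whether a node is exposed, here is a minimal sketch in Python that reads the current pid_max and reports whether 6-digit PIDs are possible. The function names and the 100000 threshold are my own illustration (not part of libqb or pacemaker); the threshold simply assumes "6-digit" means a PID of 100000 or higher, and relies on pid_max being one greater than the largest PID the kernel will assign.

```python
"""Report whether kernel.pid_max permits 6-digit PIDs (illustrative helper)."""

PID_MAX_PATH = "/proc/sys/kernel/pid_max"  # file named in the workaround above
SMALLEST_SIX_DIGIT_PID = 100000            # assumption: "6-digit" means >= 100000


def allows_six_digit_pids(pid_max: int) -> bool:
    """Return True if the kernel may assign a PID of six or more digits.

    pid_max is the value at which PIDs wrap around, so the largest PID
    that can actually be assigned is pid_max - 1.
    """
    return pid_max - 1 >= SMALLEST_SIX_DIGIT_PID


def read_pid_max(path: str = PID_MAX_PATH) -> int:
    """Read the current pid_max setting from procfs."""
    with open(path) as f:
        return int(f.read().strip())


if __name__ == "__main__":
    try:
        current = read_pid_max()
    except (OSError, ValueError) as exc:
        print(f"could not read {PID_MAX_PATH}: {exc}")
    else:
        if allows_six_digit_pids(current):
            print(f"pid_max={current}: 6-digit PIDs are possible")
        else:
            print(f"pid_max={current}: PIDs stay below six digits")
```

If it reports that 6-digit PIDs are possible, lowering the value as root (for example with `sysctl -w kernel.pid_max=<value>`) should keep newly started processes below six digits; processes that are already running keep their existing PIDs.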