<div dir="ltr"><div><div><div><div><div><div><div>Hi,<br><br></div>First, thank you for such a great tool.<br><br></div>When testing Pacemaker I encountered a startup error, which seems to be related to a segmentation fault reported in libqb:<br></div>- the cluster started and acquired quorum<br></div>- some nodes failed to connect to the CIB and lost membership as a result<br></div>- a restart solved the problem<br><br></div>The kernel reports the segmentation fault in libqb version 0.17.1, the standard package provided for CentOS 6.<br><br></div><div>Please let me know whether this problem is known, and whether there is a remedy (e.g. using the latest libqb).<br></div><div>Logs are below.<br><br><br></div><div>Thank you in advance,<br></div><div><br><br><br></div><div><br>Logs from /var/log/messages:<br><br>Apr 22 15:46:41 (...) pacemakerd[111190]: notice: Additional logging available in /var/log/pacemaker.log<br>Apr 22 15:46:41 (...) pacemakerd[111190]: notice: Configured corosync to accept connections from group 498: Library error (2)<br>Apr 22 15:46:41 (...) pacemakerd[111190]: notice: Starting Pacemaker 1.1.13-1.el6 (Build: 577898d): generated-manpages agent-manpages ncurses libqb-logging libqb-ipc upstart nagios corosync-native atomic-attrd acls<br>Apr 22 15:46:41 (...) pacemakerd[111190]: notice: Quorum acquired<br>Apr 22 15:46:41 (...) pacemakerd[111190]: notice: pcmk_quorum_notification: Node (...)[3] - state is now member (was (null))<br>Apr 22 15:46:41 (...) pacemakerd[111190]: notice: pcmk_quorum_notification: Node (...)[4] - state is now member (was (null))<br>Apr 22 15:46:41 (...) pacemakerd[111190]: notice: pcmk_quorum_notification: Node (...)[2] - state is now member (was (null))<br>Apr 22 15:46:41 (...) pacemakerd[111190]: notice: pcmk_quorum_notification: Node (...)[1] - state is now member (was (null))<br>Apr 22 15:46:41 (...) lrmd[111194]: notice: Additional logging available in /var/log/pacemaker.log<br>Apr 22 15:46:41 (...) 
stonith-ng[111193]: notice: Additional logging available in /var/log/pacemaker.log<br>Apr 22 15:46:41 (...) cib[111192]: notice: Additional logging available in /var/log/pacemaker.log<br>Apr 22 15:46:41 (...) attrd[111195]: notice: Additional logging available in /var/log/pacemaker.log<br>Apr 22 15:46:41 (...) stonith-ng[111193]: notice: Connecting to cluster infrastructure: corosync<br>Apr 22 15:46:41 (...) pengine[111196]: notice: Additional logging available in /var/log/pacemaker.log<br>Apr 22 15:46:41 (...) attrd[111195]: notice: Connecting to cluster infrastructure: corosync<br>Apr 22 15:46:41 (...) crmd[111197]: notice: Additional logging available in /var/log/pacemaker.log<br>Apr 22 15:46:41 (...) crmd[111197]: notice: CRM Git Version: 1.1.13-1.el6 (577898d)<br>Apr 22 15:46:41 (...) attrd[111195]: error: Could not connect to the Cluster Process Group API: 11<br>Apr 22 15:46:41 (...) attrd[111195]: error: Cluster connection failed<br>Apr 22 15:46:41 (...) attrd[111195]: notice: Cleaning up before exit<br>Apr 22 15:46:41 (...) stonith-ng[111193]: notice: crm_update_peer_proc: Node (...)[3] - state is now member (was (null))<br>Apr 22 15:46:41 (...) pacemakerd[111190]: error: Managed process 111195 (attrd) dumped core<br>Apr 22 15:46:41 (...) pacemakerd[111190]: error: The attrd process (111195) terminated with signal 11 (core=1)<br>Apr 22 15:46:41 (...) pacemakerd[111190]: notice: Respawning failed child process: attrd<br>Apr 22 15:46:41 (...) cib[111192]: notice: Connecting to cluster infrastructure: corosync<br>Apr 22 15:46:41 (...) cib[111192]: error: Could not connect to the Cluster Process Group API: 11<br>Apr 22 15:46:41 (...) cib[111192]: crit: Cannot sign in to the cluster... terminating<br>Apr 22 15:46:41 (...) kernel: [17169.112132] attrd[111195]: segfault at 1b8 ip 00007f6fc9dc3181 sp 00007ffd7cf668f0 error 4 in libqb.so.0.17.1[7f6fc9db4000+21000]<br>Apr 22 15:46:41 (...) 
pacemakerd[111190]: warning: The cib process (111192) can no longer be respawned, shutting the cluster down.<br>Apr 22 15:46:41 (...) pacemakerd[111190]: notice: Shutting down Pacemaker<br>Apr 22 15:46:41 (...) pacemakerd[111190]: notice: Stopping crmd: Sent -15 to process 111197<br>Apr 22 15:46:41 (...) attrd[111198]: notice: Additional logging available in /var/log/pacemaker.log<br>Apr 22 15:46:41 (...) crmd[111197]: warning: Couldn't complete CIB registration 1 times... pause and retry<br>Apr 22 15:46:41 (...) crmd[111197]: notice: Invoking handler for signal 15: Terminated<br>Apr 22 15:46:41 (...) crmd[111197]: notice: Requesting shutdown, upper limit is 1200000ms<br>Apr 22 15:46:41 (...) crmd[111197]: warning: FSA: Input I_SHUTDOWN from crm_shutdown() received in state S_STARTING<br>Apr 22 15:46:41 (...) crmd[111197]: notice: State transition S_STARTING -> S_STOPPING [ input=I_SHUTDOWN cause=C_SHUTDOWN origin=crm_shutdown ]<br>Apr 22 15:46:41 (...) crmd[111197]: notice: Disconnecting from Corosync<br>Apr 22 15:46:41 (...) attrd[111198]: notice: Connecting to cluster infrastructure: corosync<br>Apr 22 15:46:41 (...) attrd[111198]: error: Could not connect to the Cluster Process Group API: 11<br>Apr 22 15:46:41 (...) attrd[111198]: error: Cluster connection failed<br>Apr 22 15:46:41 (...) attrd[111198]: notice: Cleaning up before exit<br>Apr 22 15:46:41 (...) pacemakerd[111190]: notice: Stopping pengine: Sent -15 to process 111196<br>Apr 22 15:46:41 (...) pengine[111196]: notice: Invoking handler for signal 15: Terminated<br>Apr 22 15:46:41 (...) pacemakerd[111190]: notice: Stopping attrd: Sent -15 to process 111198<br>Apr 22 15:46:41 (...) pacemakerd[111190]: error: Managed process 111198 (attrd) dumped core<br>Apr 22 15:46:41 (...) pacemakerd[111190]: error: The attrd process (111198) terminated with signal 11 (core=1)<br>Apr 22 15:46:41 (...) pacemakerd[111190]: notice: Stopping lrmd: Sent -15 to process 111194<br>Apr 22 15:46:41 (...) 
lrmd[111194]: notice: Invoking handler for signal 15: Terminated<br>Apr 22 15:46:41 (...) pacemakerd[111190]: notice: Stopping stonith-ng: Sent -15 to process 111193<br>Apr 22 15:46:41 (...) kernel: [17169.121628] attrd[111198]: segfault at 1b8 ip 00007f3a98f66181 sp 00007ffe33407380 error 4 in libqb.so.0.17.1[7f3a98f57000+21000]<br>Apr 22 15:46:50 (...) stonith-ng[111193]: error: Could not connect to the CIB service: Transport endpoint is not connected (-107)<br>Apr 22 15:46:50 (...) stonith-ng[111193]: notice: Invoking handler for signal 15: Terminated<br>Apr 22 15:46:50 (...) pacemakerd[111190]: notice: Shutdown complete<br>Apr 22 15:46:50 (...) pacemakerd[111190]: notice: Attempting to inhibit respawning after fatal error<br><br><br><br></div><br>Logs from corosync log:<br><br>Apr 22 15:46:22 [93582] (...) corosync notice [MAIN ] Corosync Cluster Engine exiting normally<br>Apr 22 15:46:40 [111147] (...) corosync notice [MAIN ] Corosync Cluster Engine ('2.3.5.12-a71e'): started and ready to provide service.<br>Apr 22 15:46:40 [111147] (...) corosync info [MAIN ] Corosync built-in features: dbus pie relro bindnow<br>Apr 22 15:46:40 [111147] (...) corosync notice [TOTEM ] Initializing transport (UDP/IP Unicast).<br>Apr 22 15:46:40 [111147] (...) corosync notice [TOTEM ] Initializing transmit/receive security (NSS) crypto: none hash: none<br>Apr 22 15:46:40 [111147] (...) corosync notice [TOTEM ] The network interface [(...)] is now up.<br>Apr 22 15:46:40 [111147] (...) corosync notice [SERV ] Service engine loaded: corosync configuration map access [0]<br>Apr 22 15:46:40 [111147] (...) corosync info [QB ] server name: cmap<br>Apr 22 15:46:40 [111147] (...) corosync notice [SERV ] Service engine loaded: corosync configuration service [1]<br>Apr 22 15:46:40 [111147] (...) corosync info [QB ] server name: cfg<br>Apr 22 15:46:40 [111147] (...) 
corosync notice [SERV ] Service engine loaded: corosync cluster closed process group service v1.01 [2]<br>Apr 22 15:46:40 [111147] (...) corosync info [QB ] server name: cpg<br>Apr 22 15:46:40 [111147] (...) corosync notice [SERV ] Service engine loaded: corosync profile loading service [4]<br>Apr 22 15:46:40 [111147] (...) corosync notice [QUORUM] Using quorum provider corosync_votequorum<br>Apr 22 15:46:40 [111147] (...) corosync notice [SERV ] Service engine loaded: corosync vote quorum service v1.0 [5]<br>Apr 22 15:46:40 [111147] (...) corosync info [QB ] server name: votequorum<br>Apr 22 15:46:40 [111147] (...) corosync notice [SERV ] Service engine loaded: corosync cluster quorum service v0.1 [3]<br>Apr 22 15:46:40 [111147] (...) corosync info [QB ] server name: quorum<br>Apr 22 15:46:40 [111147] (...) corosync notice [TOTEM ] adding new UDPU member {(...)}<br>Apr 22 15:46:40 [111147] (...) corosync notice [TOTEM ] adding new UDPU member {(...)}<br>Apr 22 15:46:40 [111147] (...) corosync notice [TOTEM ] adding new UDPU member {(...)}<br>Apr 22 15:46:40 [111147] (...) corosync notice [TOTEM ] adding new UDPU member {(...)}<br>Apr 22 15:46:40 [111147] (...) corosync notice [TOTEM ] A new membership ((...):660) was formed. Members joined: 3<br>Apr 22 15:46:40 [111147] (...) corosync notice [QUORUM] Members[1]: 3<br>Apr 22 15:46:40 [111147] (...) corosync notice [MAIN ] Completed service synchronization, ready to provide service.<br>Apr 22 15:46:40 [111147] (...) corosync notice [TOTEM ] A new membership ((...):664) was formed. Members joined: 4 2 1<br>Apr 22 15:46:40 [111147] (...) corosync notice [QUORUM] This node is within the primary component and will provide service.<br>Apr 22 15:46:40 [111147] (...) corosync notice [QUORUM] Members[4]: 3 4 2 1<br>Apr 22 15:46:40 [111147] (...) corosync notice [MAIN ] Completed service synchronization, ready to provide service.<br>Apr 22 15:46:41 [111147] (...) 
corosync error [MAIN ] Denied connection attempt from 498:498<br>Apr 22 15:46:41 [111147] (...) corosync error [QB ] Invalid IPC credentials (111148-111195-2).<br>Apr 22 15:46:41 [111147] (...) corosync error [MAIN ] Denied connection attempt from 498:498<br>Apr 22 15:46:41 [111147] (...) corosync error [QB ] Invalid IPC credentials (111148-111192-2).<br>Apr 22 15:46:41 [111147] (...) corosync error [MAIN ] Denied connection attempt from 498:498<br>Apr 22 15:46:41 [111147] (...) corosync error [QB ] Invalid IPC credentials (111148-111198-2).<br><br><div><br clear="all"><div><div><div><div><div><div><div><div><br>-- <br><div class="gmail_signature"><div dir="ltr"><div>Best Regards,<br><br>Radoslaw Garbacz<br></div>XtremeData Incorporation<br></div></div>
</div></div></div></div></div></div></div></div></div></div>