[Pacemaker] qb_ipcs_disconnect message in corosync cluster

Bharathiraja P raja at where2getit.com
Wed Jan 7 23:27:25 EST 2015


Thanks Andrew.

I upgraded corosync and pacemaker and the cluster works fine now.

On Thu, Jan 8, 2015 at 8:26 AM, Andrew Beekhof <andrew at beekhof.net> wrote:

>
> > On 15 Dec 2014, at 4:29 pm, Bharathiraja P <raja at where2getit.com> wrote:
> >
> > Hi Andrew,
> >
> > Frequently one node gets disconnected from the CIB and stops the cluster
> > resources. I'm not able to start resources or clean up failed actions for
> > any of them. For example, if nodeA gets disconnected from the CIB, I can't
> > run actions such as cleanup/stop/restart on a resource, because they hang
> > forever.
> >
> > In the corosync log I see a message like this:
> >
> > cib:    debug: qb_ipcs_disconnect:       qb_ipcs_disconnect(3760-5529-13) state:2
> >
> > The only thing that helped was to force-kill the cib process on both
> > nodes, multiple times.
> >
> > Let me know if you need any other info to nail down this issue.
>
> For starters, we'd need to know what process 5529 was and what the rest of
> the processes in the cluster were doing.
> It's impossible to say anything from so few non-error logs.
>
> >
> > --
> > Bharathiraja
> >
> > On Mon, Dec 15, 2014 at 9:19 AM, Andrew Beekhof <andrew at beekhof.net> wrote:
> >
> > > On 12 Dec 2014, at 9:57 pm, Bharathiraja P <raja at where2getit.com> wrote:
> > >
> > > Hi,
> > >
> > > We run pacemaker+corosync cluster on OpenSuSE 13.1 QEMU guests.
> > >
> > > Frequently, one node gets disconnected from the cib. These are the
> > > messages seen in the corosync logs:
> > >
> > > Nov 25 08:36:07 [3760] sysmon-secondary        cib:    debug: qb_ipcs_dispatch_connection_request:      HUP conn (3760-5529-13)
> > > Nov 25 08:36:07 [3760] sysmon-secondary        cib:    debug: qb_ipcs_disconnect:       qb_ipcs_disconnect(3760-5529-13) state:2
> > > Nov 25 08:36:07 [3760] sysmon-secondary        cib:     info: crm_client_destroy:       Destroying 0 events
> > > Nov 25 08:36:07 [3760] sysmon-secondary        cib:    debug: qb_rb_close:      Free'ing ringbuffer: /dev/shm/qb-cib_ro-response-3760-5529-13-header
> > > Nov 25 08:36:07 [3760] sysmon-secondary        cib:    debug: qb_rb_close:      Free'ing ringbuffer: /dev/shm/qb-cib_ro-event-3760-5529-13-header
> > > Nov 25 08:36:07 [3760] sysmon-secondary        cib:    debug: qb_rb_close:      Free'ing ringbuffer: /dev/shm/qb-cib_ro-request-3760-5529-13-header
> > >
> > >
> > > Can you please help fix the issue?
> >
> > What issue?
> >
> >
> >
>
>
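
For anyone chasing a similar disconnect: the connection ID in the
qb_ipcs_disconnect line seems to be laid out as server PID, client PID and a
third number. That layout is an assumption read off the log above (3760 is the
cib PID shown in brackets, and Andrew refers to 5529 as a process); the third
field is assumed here to be a descriptor. Under those assumptions, a minimal
sketch for pulling the client PID out of the log so it can be identified (the
function name parse_disconnects is made up for illustration):

```python
import re

# Assumed layout of the connection ID: <server pid>-<client pid>-<descriptor>.
CONN_RE = re.compile(r"qb_ipcs_disconnect\((\d+)-(\d+)-(\d+)\)")


def parse_disconnects(log_text):
    """Return one (server_pid, client_pid, descriptor) tuple per disconnect."""
    return [tuple(int(g) for g in m.groups()) for m in CONN_RE.finditer(log_text)]


# Log line taken from the report quoted above.
sample = (
    "Nov 25 08:36:07 [3760] sysmon-secondary        cib:    debug: "
    "qb_ipcs_disconnect:       qb_ipcs_disconnect(3760-5529-13) state:2"
)

for server_pid, client_pid, descriptor in parse_disconnects(sample):
    print(f"server pid {server_pid}, client pid {client_pid}, descriptor {descriptor}")
```

If the client process is still running, something like
"ps -p 5529 -o pid,comm,args" should show what it was.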