[Pacemaker] After reboot, node does not automatically rejoin

Tom Tux tomtux80 at gmail.com
Thu Jul 19 09:11:36 UTC 2012


When I reboot one of the boxes in our two-node cluster (SLES 11 SP1,
fully patched, HAE installed), the node does not rejoin the cluster
on its own. I get the following errors:

corosync[5377]:  [pcmk  ] WARN: route_ais_message: Sending message to
local.cib failed: ipc delivery failed (rc=-2)
corosync[5377]:  [pcmk  ] WARN: route_ais_message: Sending message to
local.cib failed: ipc delivery failed (rc=-2)
corosync[5377]:  [pcmk  ] WARN: route_ais_message: Sending message to
local.cib failed: ipc delivery failed (rc=-2)
corosync[5377]:  [pcmk  ] WARN: route_ais_message: Sending message to
local.cib failed: ipc delivery failed (rc=-2)
corosync[5377]:  [pcmk  ] WARN: route_ais_message: Sending message to
local.cib failed: ipc delivery failed (rc=-2)
corosync[5377]:  [pcmk  ] WARN: route_ais_message: Sending message to
local.cib failed: ipc delivery failed (rc=-2)
corosync[5377]:  [pcmk  ] WARN: route_ais_message: Sending message to
local.cib failed: ipc delivery failed (rc=-2)
corosync[5377]:  [pcmk  ] WARN: route_ais_message: Sending message to
local.attrd failed: ipc delivery failed (rc=-2)
corosync[5377]:  [pcmk  ] WARN: route_ais_message: Sending message to
local.cib failed: ipc delivery failed (rc=-2)
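
To see which local daemons the warnings refer to, they can be tallied
per target with a small shell helper. This is only a sketch:
count_ipc_failures is a hypothetical helper name, and it assumes each
warning sits on a single line in the actual log (the line breaks above
come from mail wrapping).

```shell
# Tally "ipc delivery failed" warnings per target daemon (cib, attrd, ...).
# Hypothetical helper; reads the named log files, or stdin if none are given.
# Assumes one warning per line, as written by syslog.
count_ipc_failures() {
    grep 'ipc delivery failed' "$@" |
        # Keep only the daemon name that follows "local."
        sed -n 's/.*Sending message to local\.\([a-z_]*\) failed.*/\1/p' |
        # Count occurrences per daemon, most frequent first
        sort | uniq -c | sort -rn
}
```

Usage, assuming the messages end up in /var/log/messages (adjust the
path to wherever syslog writes on your system):

count_ipc_failures /var/log/messages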

The corosync-objctl tool shows both members as joined:
$ corosync-objctl | grep member
runtime.totem.pg.mrp.srp.members.1.ip=r(0) ip(
runtime.totem.pg.mrp.srp.members.2.ip=r(0) ip(

Running 'crm status' gives the following output:
$ crm status
Connection to cluster failed: connection failed

After a manual restart (/etc/init.d/openais restart), the node rejoins
successfully. Any ideas or hints as to why the node does not rejoin
during the normal init procedure at boot?

Many thanks.
