[Pacemaker] Nodes appear UNCLEAN (offline) during Pacemaker upgrade to 1.1.7

Parshvi parshvi.17 at gmail.com
Wed Nov 28 01:50:35 EST 2012


Parshvi <parshvi.17 at ...> writes:

> 
> Thanks Andrew for your input.
> Andrew Beekhof <andrew at ...> writes:
> 
> > 
> > On Fri, Nov 23, 2012 at 11:47 PM, Parshvi <parshvi.17 at ...> wrote:
> > > Hi,
> > > We are upgrading to Pacemaker 1.1.7 and Corosync 1.4.3.
> > > The previous version was:
> > > Pacemaker: 1.0.12
> > > Corosync : 1.2.7
> > > The issues faced in the older version are:
> > > 1) Numerous Policy Engine and crmd crashes, which stop failed cluster resources
> > > from recovering.
> > 
> > Did you report any of these?
> > I can't fix bugs I don't know about.
> I have raised the issue on the mailing list, but have not yet opened a bug in
> Bugzilla. I will file one now.
A bug has now been filed for this: Bug Id:5124
> > 
> > > 2) Pacemaker logs show the FSM in a pending state; the service comes back in sync
> > > only after a restart.
> > 
> > As above.
> Raised the issue on forum. Will file a bug now.
> > 
> > >
> > > Environment:
> > > 1) OS: OEL 5.8
> > > RPMs (packages) for Pacemaker 1.1.7, Corosync 1.4.3 and other dependent pkgs are
> > > not available for OEL 5.8. Hence, we have built all pkgs from source (github).
> > 
> > Did you try the ones at: http://clusterlabs.org/rpm-next/

RPMs of pacemaker 1.1.8 & corosync 1.4.1 from http://clusterlabs.org/rpm-next/rhel-5
installed correctly.
crmsh (crm shell): no RPMs are available for OEL-5 (or RHEL-5).
I tried building crmsh from source. It prints the following warnings and fails at
configure:
/bin/sh: crmd: command not found
WARNING: list index out of range
WARNING: could not get the pacemaker version, bad installation?
As per the crmsh forum, it requires the pkg-config (.pc) files, which in turn requires
pacemaker and corosync to be built.
Building pacemaker 1.1.8 on OEL-5.8 requires compatible versions of glibc and libqb
(the build fails).
Can you suggest anything that would help get crm working?
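
The first warning above suggests that the crmsh configure step runs crmd through
the shell and does not find it on PATH. A minimal sketch of a check for that
follows. The helper name find_daemon and the candidate directories are
assumptions for illustration, not taken from the crmsh or pacemaker packaging;
adjust them to wherever your build or RPMs placed the daemons.

```shell
#!/bin/sh
# Hypothetical helper: print the location of a command, looking first on PATH
# and then in a list of candidate directories. Returns non-zero if not found.
find_daemon() {
    name=$1
    shift
    if command -v "$name" >/dev/null 2>&1; then
        command -v "$name"
        return 0
    fi
    for dir in "$@"; do
        if [ -x "$dir/$name" ]; then
            echo "$dir/$name"
            return 0
        fi
    done
    return 1
}

# The directories below are assumed install locations, not confirmed ones.
if crmd_path=$(find_daemon crmd /usr/libexec/pacemaker /usr/lib64/heartbeat /usr/lib/heartbeat); then
    echo "crmd found at: $crmd_path"
else
    echo "crmd not found on PATH or in the candidate directories"
fi
```

If crmd turns up in one of the candidate directories rather than on PATH,
prepending that directory to PATH before running configure may be worth trying.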
> 
> > 
> > >
> > > We have a two node cluster. We have installed the built binaries on both cluster
> > > nodes. crm_mon shows both nodes as online. All processes of corosync and
> > > pacemaker appear started and running.
> > >
> > > Issues faced:
> > > We have another setup, consisting of two nodes in the cluster (same as above).
> > > Pkg binaries have been installed on both the nodes.
> > > One of the nodes appears UNCLEAN (offline) and the other node appears (offline).
> > > The crmd process continuously respawns until its max respawn count is reached.
> > > DC appears as NONE in crm_mon.
> > >
> > > I have an hb_report of the nodes. I can share it if needed.
> > 
> > Yes please. Not much we can do without it.  Or at least without some
> > sort of description beyond "the crmd respawns".
> Will share the hb_report.
You can find the hb_report as an attachment to Bug Id:5124
Please provide some insight.
> > 

> > > I would request the owners to please respond with some input. The old version is
> > > a concern at our production.






