Okey dokey, I've done some further troubleshooting and started again from scratch on a new node running CentOS 5.5. Here's an excerpt from my messages file, taken right after running "yum -y install pacemaker corosync":<div>
<br></div><div><div>Apr 8 11:50:19 cvt-db-003 yum: Updated: bzip2-libs-1.0.3-6.el5_5.x86_64</div><div>....many package lines omitted......</div><div>Apr 8 11:50:34 cvt-db-003 yum: Installed: corosync-1.2.7-1.1.el5.i386</div>
<div>Apr 8 11:50:34 cvt-db-003 yum: Installed: corosynclib-1.2.7-1.1.el5.x86_64</div><div>Apr 8 11:50:34 cvt-db-003 yum: Installed: pacemaker-libs-1.0.10-1.4.el5.x86_64</div><div>Apr 8 11:50:34 cvt-db-003 yum: Installed: corosync-1.2.7-1.1.el5.x86_64</div>
<div>Apr 8 11:50:35 cvt-db-003 yum: Installed: heartbeat-stonith-2.1.4-11.el5.x86_64</div><div>Apr 8 11:50:35 cvt-db-003 yum: Installed: pacemaker-1.0.10-1.4.el5.i386</div><div>Apr 8 11:50:35 cvt-db-003 yum: Updated: rpm-libs-4.4.2.3-20.el5_5.1.x86_64</div>
<div>Apr 8 11:50:35 cvt-db-003 yum: Updated: rpm-4.4.2.3-20.el5_5.1.x86_64</div><div>Apr 8 11:50:35 cvt-db-003 yum: Updated: rpm-python-4.4.2.3-20.el5_5.1.x86_64</div><div>Apr 8 11:50:36 cvt-db-003 yum: Installed: pacemaker-1.0.10-1.4.el5.x86_64</div>
<div>Apr 8 11:50:39 cvt-db-003 cl_status: [18858]: ERROR: Cannot signon with heartbeat</div><div>Apr 8 11:50:39 cvt-db-003 cl_status: [18858]: ERROR: REASON: hb_api_signon: Can't initiate connection to heartbeat</div>
<div>Apr 8 11:50:39 cvt-db-003 cl_status: [18859]: ERROR: Cannot signon with heartbeat</div><div>Apr 8 11:50:39 cvt-db-003 cl_status: [18859]: ERROR: REASON: hb_api_signon: Can't initiate connection to heartbeat</div>
<div>Apr 8 11:51:39 cvt-db-003 cl_status: [18971]: ERROR: Cannot signon with heartbeat</div><div>...many more follow....</div><div><br></div><div><br>What's weird to me is that I hadn't started ANY services or run any commands by this point, so I'm thinking something in the RPM is kicking off that cl_status command.</div>
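<div><br></div><div>To check whether those errors really arrive on a fixed schedule (which would hint at some kind of periodic caller rather than a one-off install scriptlet), the cl_status lines can be grouped by timestamp. This is only a rough Python sketch: the function names are made up for illustration, the year is an assumption because syslog doesn't record it, and the sample lines are the ones from the excerpt above.</div>
<pre>
```python
import re
from datetime import datetime

# Matches syslog lines such as:
#   Apr 8 11:50:39 cvt-db-003 cl_status: [18858]: ERROR: Cannot signon with heartbeat
# Group 1 is the timestamp, group 2 is the PID of the cl_status process.
LINE_RE = re.compile(
    r"^(\w{3}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2})\s+\S+\s+cl_status:\s+\[(\d+)\]:\s+ERROR:"
)


def cl_status_bursts(lines, year=2011):
    """Return [(timestamp, sorted PIDs)], one entry per distinct timestamp."""
    bursts = {}
    for line in lines:
        match = LINE_RE.match(line)
        if not match:
            continue  # not a cl_status error line
        # syslog omits the year, so the caller supplies it (assumption).
        stamp = datetime.strptime(f"{year} {match.group(1)}", "%Y %b %d %H:%M:%S")
        bursts.setdefault(stamp, set()).add(int(match.group(2)))
    return [(stamp, sorted(pids)) for stamp, pids in sorted(bursts.items())]


def burst_gaps(bursts):
    """Return the number of seconds between consecutive bursts."""
    times = [stamp for stamp, _ in bursts]
    return [int((b - a).total_seconds()) for a, b in zip(times, times[1:])]


if __name__ == "__main__":
    sample = [
        "Apr 8 11:50:39 cvt-db-003 cl_status: [18858]: ERROR: Cannot signon with heartbeat",
        "Apr 8 11:50:39 cvt-db-003 cl_status: [18859]: ERROR: Cannot signon with heartbeat",
        "Apr 8 11:51:39 cvt-db-003 cl_status: [18971]: ERROR: Cannot signon with heartbeat",
    ]
    print(burst_gaps(cl_status_bursts(sample)))  # prints [60]
```
</pre>
<div>A steady gap between bursts would suggest a timer of some kind; it doesn't by itself say which package or script is responsible.</div>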
<div><br></div><div>I believe I've determined that the errors start as soon as the heartbeat-3.0.3-2.3.el5.x86_64 package is installed. It seems to be a required dependency of the latest pacemaker RPM on <a href="http://www.clusterlabs.org/rpm/epel-5/">http://www.clusterlabs.org/rpm/epel-5/</a>. I removed the pacemaker and heartbeat packages using yum and then reinstalled them directly from the RPMs, but found that pacemaker needs the heartbeat-libs package; without it, tools such as crm_verify fail. After reinstalling heartbeat-libs, pacemaker, and pacemaker-libs with --nodeps, the spurious error messages stopped. I can trigger and clear the issue at will by installing and removing the heartbeat-3.0.3-2.3.el5.x86_64 package.</div>
<div><br></div><div>c</div><div><br></div><br><div class="gmail_quote">On Fri, Apr 8, 2011 at 9:48 AM, Lars Ellenberg <span dir="ltr">&lt;<a href="mailto:lars.ellenberg@linbit.com">lars.ellenberg@linbit.com</a>&gt;</span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;"><div class="im">On Fri, Apr 08, 2011 at 09:13:45AM +0200, Andrew Beekhof wrote:<br>
> On Thu, Apr 7, 2011 at 11:48 PM, Colin Hines &lt;<a href="mailto:colinhines@gmail.com">colinhines@gmail.com</a>&gt; wrote:<br>
> > I've recently followed the clusters from scratch v2 document for RHEL and<br>
> > although my cluster works and fails over correctly using corosync, I have<br>
> > the following error message coming up in my logs consistently, twice a<br>
> > minute:<br>
> > Apr 7 17:44:41 cvt-db-005 cl_status: [5901]: ERROR: Cannot signon with<br>
> > heartbeat<br>
> > Apr 7 17:44:41 cvt-db-005 cl_status: [5901]: ERROR: REASON: hb_api_signon:<br>
> > Can't initiate connection to heartbeat<br>
><br>
> Someone/something is running cl_status.<br>
> Find out who/what and stop them - it has no place in a corosync based cluster.<br>
<br>
</div>That could be the status action of the SBD stonith plugin,<br>
between commits<br>
<a href="http://hg.linux-ha.org/glue/rev/faada7f3d069" target="_blank">http://hg.linux-ha.org/glue/rev/faada7f3d069</a> (Apr 2010)<br>
<a href="http://hg.linux-ha.org/glue/rev/1448deafdf79" target="_blank">http://hg.linux-ha.org/glue/rev/1448deafdf79</a> (May 2010)<br>
<br>
if so, upgrade your "cluster glue".<br>
<div class="im"><br>
> > I can send my configs, but they're pretty vanilla, has anyone seen anything<br>
> > like this before. I did have a heartbeat installation on this host before<br>
> > I followed the CFSv2 document, but heartbeat is stopped and I've verified<br>
> > that cl_status doesn't output those errors if I stop corosync.<br>
> > c<br>
<br>
</div><font color="#888888">--<br>
: Lars Ellenberg<br>
: LINBIT | Your Way to High Availability<br>
: DRBD/HA support and consulting <a href="http://www.linbit.com" target="_blank">http://www.linbit.com</a><br>
<br>
DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.<br>
</font><div><div></div><div class="h5"><br>
_______________________________________________<br>
Pacemaker mailing list: <a href="mailto:Pacemaker@oss.clusterlabs.org">Pacemaker@oss.clusterlabs.org</a><br>
<a href="http://oss.clusterlabs.org/mailman/listinfo/pacemaker" target="_blank">http://oss.clusterlabs.org/mailman/listinfo/pacemaker</a><br>
<br>
Project Home: <a href="http://www.clusterlabs.org" target="_blank">http://www.clusterlabs.org</a><br>
Getting started: <a href="http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf" target="_blank">http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf</a><br>
Bugs: <a href="http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker" target="_blank">http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker</a><br>
</div></div></blockquote></div><br></div>