<div class="gmail_quote">On Thu, May 27, 2010 at 5:50 PM, Steven Dake <span dir="ltr"><<a href="mailto:sdake@redhat.com">sdake@redhat.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">
<div class="im">On 05/27/2010 08:40 AM, Diego Remolina wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
Is there any workaround for this? Perhaps a slightly older version of<br>
the rpms? If so where do I find those?<br>
<br>
</blockquote>
<br></div>
Corosync 1.2.1 apparently doesn't have this issue. With corosync 1.2.1, please don't use the "debug: on" keyword in your config options. I am not sure where Andrew has corosync 1.2.1 rpms available.<br>
<br>
The corosync project itself doesn't release rpms. See our policy on this topic:<br>
<br>
<a href="http://www.corosync.org/doku.php?id=faq:release_binaries" target="_blank">http://www.corosync.org/doku.php?id=faq:release_binaries</a><br>
<br>
Regards<br>
-steve<div><div></div><div class="h5"><br>
<br></div></div></blockquote><div><br></div><div>In my case, using pacemaker/corosync from the clusterlabs repo on RHEL 5.5 (32-bit), I did the following:</div><div>- both nodes ha1 and ha2 started with </div><div><div>[root@ha1 ~]# rpm -qa corosync\* pacemaker\*</div>
<div>pacemaker-1.0.8-6.el5</div><div>corosynclib-1.2.1-1.el5</div><div>corosync-1.2.1-1.el5</div><div>pacemaker-libs-1.0.8-6.el5</div></div><div><br></div><div>- stop of corosync on node ha1</div><div>- update of the packages (the clusterlabs repo also proposed pacemaker packages with the same version number; I don't know whether they contain the same bits)</div>
<div>This takes corosync to 1.2.2</div><div>- start of corosync on ha1 and successful join with the node still running corosync 1.2.1</div><div>May 27 18:59:23 ha1 corosync[5136]: [MAIN ] Corosync Cluster Engine exiting with status -1 at main.c:160.</div>
<div>May 27 19:06:19 ha1 yum: Updated: corosynclib-1.2.2-1.1.el5.i386</div><div>May 27 19:06:19 ha1 yum: Updated: pacemaker-libs-1.0.8-6.1.el5.i386</div><div>May 27 19:06:19 ha1 yum: Updated: corosync-1.2.2-1.1.el5.i386</div>
<div>May 27 19:06:20 ha1 yum: Updated: pacemaker-1.0.8-6.1.el5.i386</div><div>May 27 19:06:20 ha1 yum: Updated: corosynclib-devel-1.2.2-1.1.el5.i386</div><div>May 27 19:06:22 ha1 yum: Updated: pacemaker-libs-devel-1.0.8-6.1.el5.i386</div>
<div>May 27 19:06:59 ha1 corosync[7442]: [MAIN ] Corosync Cluster Engine ('1.2.2'): started and ready to provide service.</div><div>May 27 19:06:59 ha1 corosync[7442]: [MAIN ] Corosync built-in features: nss rdma</div>
<div>May 27 19:06:59 ha1 corosync[7442]: [MAIN ] Successfully read main configuration file '/etc/corosync/corosync.conf'.</div><div>May 27 19:06:59 ha1 corosync[7442]: [TOTEM ] Initializing transport (UDP/IP).</div>
<div>May 27 19:06:59 ha1 corosync[7442]: [TOTEM ] Initializing transmit/receive security: libtomcrypt SOBER128/SHA1HMAC (mode 0).</div><div><br></div><div>this also implies the start of its resources (nfsclient and apache in my case)</div>
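<div><br></div><div>The per-node part of the procedure above can be sketched roughly as follows (a sketch of what I did, not a verified transcript; it assumes the RHEL 5 init scripts and that the clusterlabs repo is already configured in yum):</div>

```shell
# Rolling upgrade of one cluster node. Run on the node being upgraded;
# resources fail over to the peer while corosync is down.

# 1. Stop the cluster stack on this node.
service corosync stop

# 2. Update corosync and pacemaker from the configured repo.
yum update corosync corosynclib pacemaker pacemaker-libs

# 3. Restart the stack; the node rejoins the cluster with the new version.
service corosync start

# 4. Check membership and resource state.
crm_mon -1
```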
<div><br></div><div>- move of resources (nfs-group in my case) from ha2 to the updated node ha1, followed by an unmove so that ha2 can take them back later</div><div><div> Resource Group: nfs-group</div><div> lv_drbd0 (ocf::heartbeat:LVM): Started ha1</div>
<div> ClusterIP (ocf::heartbeat:IPaddr2): Started ha1</div><div> NfsFS (ocf::heartbeat:Filesystem): Started ha1</div><div> nfssrv (ocf::heartbeat:nfsserver): Started ha1</div></div><div>
<br></div><div>- stop of corosync 1.2.1 on ha2</div><div>- update of pacemaker and corosync on ha2</div><div>- start of corosync on ha2 and successful rejoin of the cluster, with its resources started (nfsclient and apache in my case)</div>
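<div><br></div><div>For reference, the move/unmove step looks like this with the crm shell (using the nfs-group name shown above):</div>

```shell
# Move the resource group off ha2 onto the already-upgraded node ha1.
# "move" works by inserting a location constraint into the CIB.
crm resource move nfs-group ha1

# After ha2 is upgraded and back in the cluster, clear that constraint
# so the group is again allowed to run on ha2.
crm resource unmove nfs-group
```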
<div><div>May 27 19:14:42 ha2 corosync[30954]: [pcmk ] notice: pcmk_shutdown: cib confirmed stopped</div><div>May 27 19:14:42 ha2 corosync[30954]: [pcmk ] notice: stop_child: Sent -15 to stonithd: [30961]</div><div>
May 27 19:14:42 ha2 stonithd: [30961]: notice: /usr/lib/heartbeat/stonithd normally quit.</div><div>May 27 19:14:42 ha2 corosync[30954]: [pcmk ] info: pcmk_ipc_exit: Client stonithd (conn=0x82aee48, async-conn=0x82aee48) left</div>
<div>May 27 19:14:43 ha2 corosync[30954]: [pcmk ] notice: pcmk_shutdown: stonithd confirmed stopped</div><div>May 27 19:14:43 ha2 corosync[30954]: [pcmk ] info: update_member: Node ha2 now has process list: 00000000000000000000000000000002 (2)</div>
<div>May 27 19:14:43 ha2 corosync[30954]: [pcmk ] notice: pcmk_shutdown: Shutdown complete</div><div>May 27 19:14:43 ha2 corosync[30954]: [SERV ] Service engine unloaded: Pacemaker Cluster Manager 1.0.8</div><div>May 27 19:14:43 ha2 corosync[30954]: [SERV ] Service engine unloaded: corosync extended virtual synchrony service</div>
<div>May 27 19:14:43 ha2 corosync[30954]: [SERV ] Service engine unloaded: corosync configuration service</div><div>May 27 19:14:43 ha2 corosync[30954]: [SERV ] Service engine unloaded: corosync cluster closed process group service v1.01</div>
<div>May 27 19:14:43 ha2 corosync[30954]: [SERV ] Service engine unloaded: corosync cluster config database access v1.01</div><div>May 27 19:14:43 ha2 corosync[30954]: [SERV ] Service engine unloaded: corosync profile loading service</div>
<div>May 27 19:14:43 ha2 corosync[30954]: [SERV ] Service engine unloaded: corosync cluster quorum service v0.1</div><div>May 27 19:14:43 ha2 corosync[30954]: [MAIN ] Corosync Cluster Engine exiting with status -1 at main.c:160.</div>
<div>May 27 19:15:51 ha2 yum: Updated: corosynclib-1.2.2-1.1.el5.i386</div><div>May 27 19:15:51 ha2 yum: Updated: pacemaker-libs-1.0.8-6.1.el5.i386</div><div>May 27 19:15:52 ha2 yum: Updated: corosync-1.2.2-1.1.el5.i386</div>
<div>May 27 19:15:52 ha2 yum: Updated: pacemaker-1.0.8-6.1.el5.i386</div><div>May 27 19:17:00 ha2 corosync[3430]: [MAIN ] Corosync Cluster Engine ('1.2.2'): started and ready to provide service.</div><div>May 27 19:17:00 ha2 corosync[3430]: [MAIN ] Corosync built-in features: nss rdma</div>
<div>May 27 19:17:00 ha2 corosync[3430]: [MAIN ] Successfully read main configuration file '/etc/corosync/corosync.conf'.</div><div>May 27 19:17:00 ha2 corosync[3430]: [TOTEM ] Initializing transport (UDP/IP).</div>
<div>May 27 19:17:00 ha2 corosync[3430]: [TOTEM ] Initializing transmit/receive security: libtomcrypt SOBER128/SHA1HMAC (mode 0).</div></div><div><br></div><div>So in my case the software upgrade was successful, with no downtime.</div>
<div><br></div><div>Gianluca</div><div><br></div><div><br></div></div>