<div class="gmail_quote">On Thu, May 27, 2010 at 5:50 PM, Steven Dake <span dir="ltr"><<a href="mailto:sdake@redhat.com">sdake@redhat.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">

<div class="im">On 05/27/2010 08:40 AM, Diego Remolina wrote:<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

Is there any workaround for this? Perhaps a slightly older version of<br>

the rpms? If so where do I find those?<br>

<br>

</blockquote>

<br></div>

Corosync 1.2.1 doesn't have this issue apparently.  With corosync 1.2.1, please don't use "debug: on" keyword in your config options.  I am not sure where Andrew has corosync 1.2.1 rpms available.<br>

<br>

The corosync project itself doesn't release rpms.  See our policy on this topic:<br>

<br>

<a href="http://www.corosync.org/doku.php?id=faq:release_binaries" target="_blank">http://www.corosync.org/doku.php?id=faq:release_binaries</a><br>

<br>

Regards<br>

-steve<div><div></div><div class="h5"><br>

<br></div></div></blockquote><div><br></div><div>In my case, using pacemaker/corosync from the clusterlabs repo on RHEL 5.5 32-bit, I had:</div><div>- both nodes ha1 and ha2 with </div><div><div>[root@ha1 ~]# rpm -qa corosync\* pacemaker\*</div>

<div>pacemaker-1.0.8-6.el5</div><div>corosynclib-1.2.1-1.el5</div><div>corosync-1.2.1-1.el5</div><div>pacemaker-libs-1.0.8-6.el5</div></div><div><br></div><div>- stop of corosync on node ha1</div><div>- update (the clusterlabs repo also proposed and applied pacemaker packages with the same version; I don't know whether they are the same bits)</div>

<div>This takes corosync to 1.2.2</div><div>- start of corosync on ha1 and successful join with the other node, which is still on corosync 1.2.1</div><div> May 27 18:59:23 ha1 corosync[5136]:   [MAIN  ] Corosync Cluster Engine exiting with status -1 at main.c:160.</div>

<div>May 27 19:06:19 ha1 yum: Updated: corosynclib-1.2.2-1.1.el5.i386</div><div>May 27 19:06:19 ha1 yum: Updated: pacemaker-libs-1.0.8-6.1.el5.i386</div><div>May 27 19:06:19 ha1 yum: Updated: corosync-1.2.2-1.1.el5.i386</div>

<div>May 27 19:06:20 ha1 yum: Updated: pacemaker-1.0.8-6.1.el5.i386</div><div>May 27 19:06:20 ha1 yum: Updated: corosynclib-devel-1.2.2-1.1.el5.i386</div><div>May 27 19:06:22 ha1 yum: Updated: pacemaker-libs-devel-1.0.8-6.1.el5.i386</div>

<div>May 27 19:06:59 ha1 corosync[7442]:   [MAIN  ] Corosync Cluster Engine ('1.2.2'): started and ready to provide service.</div><div>May 27 19:06:59 ha1 corosync[7442]:   [MAIN  ] Corosync built-in features: nss rdma</div>

<div>May 27 19:06:59 ha1 corosync[7442]:   [MAIN  ] Successfully read main configuration file '/etc/corosync/corosync.conf'.</div><div>May 27 19:06:59 ha1 corosync[7442]:   [TOTEM ] Initializing transport (UDP/IP).</div>

<div>May 27 19:06:59 ha1 corosync[7442]:   [TOTEM ] Initializing transmit/receive security: libtomcrypt SOBER128/SHA1HMAC (mode 0).</div><div><br></div><div>this also implies the start of its resources (nfsclient and apache in my case)</div>

<div><br></div><div>- move of resources from ha2 to the updated node ha1 (nfs-group in my case), followed by unmove so that they can be taken over again later</div><div><div> Resource Group: nfs-group</div><div>     lv_drbd0   (ocf::heartbeat:LVM):   Started ha1</div>

<div>     ClusterIP  (ocf::heartbeat:IPaddr2):       Started ha1</div><div>     NfsFS      (ocf::heartbeat:Filesystem):    Started ha1</div><div>     nfssrv     (ocf::heartbeat:nfsserver):     Started ha1</div></div><div>

<br></div><div>- stop of corosync 1.2.1 on ha2</div><div>- update of pacemaker and corosync on ha2</div><div>- startup of corosync on ha2 and correct join to cluster with start of its resources (nfsclient and apache in my case)</div>

<div><div>May 27 19:14:42 ha2 corosync[30954]:   [pcmk  ] notice: pcmk_shutdown: cib confirmed stopped</div><div>May 27 19:14:42 ha2 corosync[30954]:   [pcmk  ] notice: stop_child: Sent -15 to stonithd: [30961]</div><div>

May 27 19:14:42 ha2 stonithd: [30961]: notice: /usr/lib/heartbeat/stonithd normally quit.</div><div>May 27 19:14:42 ha2 corosync[30954]:   [pcmk  ] info: pcmk_ipc_exit: Client stonithd (conn=0x82aee48, async-conn=0x82aee48) left</div>

<div>May 27 19:14:43 ha2 corosync[30954]:   [pcmk  ] notice: pcmk_shutdown: stonithd confirmed stopped</div><div>May 27 19:14:43 ha2 corosync[30954]:   [pcmk  ] info: update_member: Node ha2 now has process list: 00000000000000000000000000000002 (2)</div>

<div>May 27 19:14:43 ha2 corosync[30954]:   [pcmk  ] notice: pcmk_shutdown: Shutdown complete</div><div>May 27 19:14:43 ha2 corosync[30954]:   [SERV  ] Service engine unloaded: Pacemaker Cluster Manager 1.0.8</div><div>May 27 19:14:43 ha2 corosync[30954]:   [SERV  ] Service engine unloaded: corosync extended virtual synchrony service</div>

<div>May 27 19:14:43 ha2 corosync[30954]:   [SERV  ] Service engine unloaded: corosync configuration service</div><div>May 27 19:14:43 ha2 corosync[30954]:   [SERV  ] Service engine unloaded: corosync cluster closed process group service v1.01</div>

<div>May 27 19:14:43 ha2 corosync[30954]:   [SERV  ] Service engine unloaded: corosync cluster config database access v1.01</div><div>May 27 19:14:43 ha2 corosync[30954]:   [SERV  ] Service engine unloaded: corosync profile loading service</div>

<div>May 27 19:14:43 ha2 corosync[30954]:   [SERV  ] Service engine unloaded: corosync cluster quorum service v0.1</div><div>May 27 19:14:43 ha2 corosync[30954]:   [MAIN  ] Corosync Cluster Engine exiting with status -1 at main.c:160.</div>

<div>May 27 19:15:51 ha2 yum: Updated: corosynclib-1.2.2-1.1.el5.i386</div><div>May 27 19:15:51 ha2 yum: Updated: pacemaker-libs-1.0.8-6.1.el5.i386</div><div>May 27 19:15:52 ha2 yum: Updated: corosync-1.2.2-1.1.el5.i386</div>

<div>May 27 19:15:52 ha2 yum: Updated: pacemaker-1.0.8-6.1.el5.i386</div><div>May 27 19:17:00 ha2 corosync[3430]:   [MAIN  ] Corosync Cluster Engine ('1.2.2'): started and ready to provide service.</div><div>May 27 19:17:00 ha2 corosync[3430]:   [MAIN  ] Corosync built-in features: nss rdma</div>

<div>May 27 19:17:00 ha2 corosync[3430]:   [MAIN  ] Successfully read main configuration file '/etc/corosync/corosync.conf'.</div><div>May 27 19:17:00 ha2 corosync[3430]:   [TOTEM ] Initializing transport (UDP/IP).</div>

<div>May 27 19:17:00 ha2 corosync[3430]:   [TOTEM ] Initializing transmit/receive security: libtomcrypt SOBER128/SHA1HMAC (mode 0).</div></div><div><br></div><div>So in my case the software upgrade was successful, with no downtime.</div>
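<div><br></div><div>For reference, the per-node sequence above could be scripted roughly as follows. This is only a dry-run sketch: it prints the commands instead of running them unless DRY_RUN=0 is set. The function names and the DRY_RUN variable are made up for illustration, the service, yum and crm commands are assumed to match what is installed from the clusterlabs repo, and the script does not wait for the node to rejoin the cluster before moving resources.</div>
<pre>
```shell
#!/bin/sh
# Dry-run sketch of the rolling upgrade described in this mail.
# ha1 and nfs-group are the names used in this thread; run, upgrade_node,
# take_group and DRY_RUN are hypothetical helpers, not part of any package.

DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Stop corosync on the local node, update the packages, start it again.
upgrade_node() {
    run service corosync stop
    run yum -y update 'corosync*' 'pacemaker*'
    run service corosync start
}

# Pull a resource group onto the given node, then drop the constraint
# created by the move. In a real run, check crm_mon between the two steps
# and only unmove once the group is started on the target node.
take_group() {
    group="$1"
    node="$2"
    run crm resource move "$group" "$node"
    run crm resource unmove "$group"
}

upgrade_node
take_group nfs-group ha1
```
</pre>
<div>Run it as-is on the node being updated to see the list of commands; the same two steps are then repeated on the other node once the resources are running on the updated one.</div>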

<div><br></div><div>Gianluca</div><div><br></div><div><br></div></div>