<html><body><p>Ulrich, <br><br>Thank you very much for your feedback. <br><br>You wrote, "<tt>Could it be you forgot "allow-migrate=true" at the resource level or some migration IP address at the node level?<br>I only have SLES11 here...</tt>"<br><br>I know for sure that the pacemaker remote node (zs95kjg110102) mentioned below is configured correctly for pacemaker Live Guest Migration (LGM). <br>I can demonstrate this with the 'pcs resource move' command: <br><br>I will migrate this remote-node guest (zs95kjg110102) and its resource "zs95kjg110102_res", currently running on zs93kjpcs1 (10.20.93.11), <br>to another cluster node, zs95kjpcs1 (10.20.93.12), using the 'pcs1' hostnames / IPs: <br><br>[root@zs95kj ~]# pcs resource show |grep zs95kjg110102_res<br> zs95kjg110102_res (ocf::heartbeat:VirtualDomain): Started zs93kjpcs1<br><br>[root@zs93kj ~]# pcs resource move zs95kjg110102_res zs95kjpcs1<br><br>[root@zs93kj ~]# pcs resource show |grep zs95kjg110102_res<br> zs95kjg110102_res (ocf::heartbeat:VirtualDomain): Started zs95kjpcs1<br><br>## On zs95kjpcs1, you can see that the guest is actually running there...<br><br>[root@zs95kj ~]# virsh list |grep zs95kjg110102<br> 63 zs95kjg110102 running<br><br>[root@zs95kj ~]# ping 10.20.110.102<br>PING 10.20.110.102 (10.20.110.102) 56(84) bytes of data.<br>64 bytes from 10.20.110.102: icmp_seq=1 ttl=63 time=0.775 ms<br><br>So everything seems to be set up correctly for live guest migration of this VirtualDomain resource. <br><br>What I am really looking for is a way to ensure 100% availability of a "live guest migratable" pacemaker remote-node guest<br>in a situation where the ring0_addr interface (in this case vlan1293) goes down. I thought that configuring<br>Redundant Ring Protocol (RRP) for corosync might provide this, but from what I've seen so far it doesn't<br>look that way. If the ring0_addr interface is lost in an RRP configuration while the remote guest is connected<br>to the host over that ring0_addr, the guest gets rebooted and re-establishes the remote-node-to-host connection over the ring1_addr, <br>which is great as long as you don't care that the guest gets rebooted. Corosync does its job of preventing the<br>cluster node from being fenced by failing its heartbeat messaging over to ring1; however, the remote-node guests take<br>a short-term hit due to the remote-node-to-host reconnect. <br><br>In the event of a ring0_addr failure, I don't see any attempt by pacemaker to migrate the remote node to another cluster node. <br>Maybe this is by design, since there is no alternate path for the guest to use for LGM (i.e. ring0 is a single point of failure). <br>If the guest could be migrated over an alternate route, the guest outage would be avoided.<br><br>Maybe my question is... is there any way to provide an alternate Live Guest Migration path in the event of a ring0_addr failure? <br>This might apply to a single-ring configuration as well. <br><br>
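For reference, here is a minimal, untested sketch of the kind of thing I have in mind: using the VirtualDomain agent's migration_network_suffix parameter so that the libvirt migration traffic is carried over the ring1 (vlan1294 / 10.20.94.x) network instead of ring0. The "-mig" hostnames are made up for illustration, and the 10.20.94.211 address for zs93kj is an assumption (only zs95kj's ring1 address, 10.20.94.212, is shown in the corosync output below): <br><br>## Untested sketch. On every cluster node, resolve "NODENAME-mig" to that node's ring1 (vlan1294) address.<br>## 10.20.94.211 is a placeholder for zs93kj's ring1 address.<br>[root@zs95kj ~]# echo "10.20.94.212   zs95kjpcs1-mig" &gt;&gt; /etc/hosts<br>[root@zs95kj ~]# echo "10.20.94.211   zs93kjpcs1-mig" &gt;&gt; /etc/hosts<br><br>## Have the VirtualDomain agent append "-mig" to the target node name when it builds the<br>## migration URI, so the qemu+ssh migration would use the ring1 network.<br>[root@zs95kj ~]# pcs resource update zs95kjg110102_res migration_network_suffix=-mig<br><br>(The -mig names would resolve to the same addresses as the existing 'pcs2' names, but the suffix has to be appended to the pacemaker node name, which is the 'pcs1' name.) That only addresses the migration transport path, of course; whether pacemaker would actually choose a migrate_to/migrate_from recovery instead of a stop/start after the remote-node connection drops is still the open question above. <br><br>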
Thanks, <br><br>Scott Greenlese ... KVM on System Z - Solutions Test, IBM Poughkeepsie, N.Y.<br> INTERNET: swgreenl@us.ibm.com <br><br><font size="2" color="#5F5F5F">From: </font><font size="2">"Ulrich Windl" <Ulrich.Windl@rz.uni-regensburg.de></font><br><font size="2" color="#5F5F5F">To: </font><font size="2"><users@clusterlabs.org></font><br><font size="2" color="#5F5F5F">Date: </font><font size="2">03/02/2017 02:39 AM</font><br><font size="2" color="#5F5F5F">Subject: </font><font size="2">[ClusterLabs] Antw: Expected recovery behavior of remote-node guest when corosync ring0 is lost in a passive mode RRP config?</font><br><hr width="100%" size="2" align="left" noshade style="color:#8091A5; "><br><br><br><tt>>>> "Scott Greenlese" <swgreenl@us.ibm.com> wrote on 01.03.2017 at 22:07 in<br>message<br><OFFC50C6DC.1138528D-ON002580D6.006F49AA-852580D6.00740858@notes.na.collabserv.c <br>m>:<br><br>> Hi..<br>> <br>> I am running a few corosync "passive mode" Redundant Ring Protocol (RRP)<br>> failure scenarios, where<br>> my cluster has several remote-node VirtualDomain resources running on each<br>> node in the cluster,<br>> which have been configured to allow Live Guest Migration (LGM) operations.<br>> <br>> While both corosync rings are active, if I drop ring0 on a given node where<br>> I have remote node (guests) running,<br>> I noticed that the guest will be shutdown / re-started on the same host,<br>> after which the connection is re-established<br>> and the guest proceeds to run on that same cluster node.<br><br>Could it be you forgot "allow-migrate=true" at the resource level or some migration IP address at the node level?<br>I only have SLES11 here...<br><br>> <br>> I am wondering why pacemaker doesn't try to "live" migrate the remote node<br>> (guest) to a different node, instead<br>> of rebooting the guest? Is there some way to configure the remote nodes<br>> such that the recovery action is<br>> LGM instead of reboot when the host-to-remote_node connect is lost in an<br>> RRP situation? 
I guess the<br>> next question is, is it even possible to LGM a remote node guest if the<br>> corosync ring fails over from ring0 to ring1<br>> (or vise-versa)?<br>> <br>> # For example, here's a remote node's VirtualDomain resource definition.<br>> <br>> [root@zs95kj]# pcs resource show zs95kjg110102_res<br>> Resource: zs95kjg110102_res (class=ocf provider=heartbeat<br>> type=VirtualDomain)<br>> Attributes: config=/guestxml/nfs1/zs95kjg110102.xml<br>> hypervisor=qemu:///system migration_transport=ssh<br>> Meta Attrs: allow-migrate=true remote-node=zs95kjg110102<br>> remote-addr=10.20.110.102<br>> Operations: start interval=0s timeout=480<br>> (zs95kjg110102_res-start-interval-0s)<br>> stop interval=0s timeout=120<br>> (zs95kjg110102_res-stop-interval-0s)<br>> monitor interval=30s (zs95kjg110102_res-monitor-interval-30s)<br>> migrate-from interval=0s timeout=1200<br>> (zs95kjg110102_res-migrate-from-interval-0s)<br>> migrate-to interval=0s timeout=1200<br>> (zs95kjg110102_res-migrate-to-interval-0s)<br>> [root@zs95kj VD]#<br>> <br>> <br>> <br>> <br>> # My RRP rings are active, and configured "rrp_mode="passive"<br>> <br>> [root@zs95kj ~]# corosync-cfgtool -s<br>> Printing ring status.<br>> Local node ID 2<br>> RING ID 0<br>> id = 10.20.93.12<br>> status = ring 0 active with no faults<br>> RING ID 1<br>> id = 10.20.94.212<br>> status = ring 1 active with no faults<br>> <br>> <br>> <br>> # Here's the corosync.conf ..<br>> <br>> [root@zs95kj ~]# cat /etc/corosync/corosync.conf<br>> totem {<br>> version: 2<br>> secauth: off<br>> cluster_name: test_cluster_2<br>> transport: udpu<br>> rrp_mode: passive<br>> }<br>> <br>> nodelist {<br>> node {<br>> ring0_addr: zs95kjpcs1<br>> ring1_addr: zs95kjpcs2<br>> nodeid: 2<br>> }<br>> <br>> node {<br>> ring0_addr: zs95KLpcs1<br>> ring1_addr: zs95KLpcs2<br>> nodeid: 3<br>> }<br>> <br>> node {<br>> ring0_addr: zs90kppcs1<br>> ring1_addr: zs90kppcs2<br>> nodeid: 4<br>> }<br>> <br>> node {<br>> ring0_addr: zs93KLpcs1<br>> ring1_addr: zs93KLpcs2<br>> nodeid: 5<br>> }<br>> <br>> node {<br>> ring0_addr: zs93kjpcs1<br>> ring1_addr: zs93kjpcs2<br>> nodeid: 1<br>> }<br>> }<br>> <br>> quorum {<br>> provider: corosync_votequorum<br>> }<br>> <br>> logging {<br>> to_logfile: yes<br>> logfile: /var/log/corosync/corosync.log<br>> timestamp: on<br>> syslog_facility: daemon<br>> to_syslog: yes<br>> debug: on<br>> <br>> logger_subsys {<br>> debug: off<br>> subsys: QUORUM<br>> }<br>> }<br>> <br>> <br>> <br>> <br>> # Here's the vlan / route situation on cluster node zs95kj:<br>> <br>> ring0 is on vlan1293<br>> ring1 is on vlan1294<br>> <br>> [root@zs95kj ~]# route -n<br>> Kernel IP routing table<br>> Destination Gateway Genmask Flags Metric Ref Use<br>> Iface<br>> 0.0.0.0 10.20.93.254 0.0.0.0 UG 400 0 0<br>> vlan1293 << default route to guests from ring0<br>> 9.0.0.0 9.12.23.1 255.0.0.0 UG 400 0 0<br>> vlan508<br>> 9.12.23.0 0.0.0.0 255.255.255.0 U 400 0 0<br>> vlan508<br>> 10.20.92.0 0.0.0.0 255.255.255.0 U 400 0 0<br>> vlan1292<br>> 10.20.93.0 0.0.0.0 255.255.255.0 U 0 0 0<br>> vlan1293 << ring0 IPs<br>> 10.20.93.0 0.0.0.0 255.255.255.0 U 400 0 0<br>> vlan1293<br>> 10.20.94.0 0.0.0.0 255.255.255.0 U 0 0 0<br>> vlan1294 << ring1 IPs<br>> 10.20.94.0 0.0.0.0 255.255.255.0 U 400 0 0<br>> vlan1294<br>> 10.20.101.0 0.0.0.0 255.255.255.0 U 400 0 0<br>> vlan1298<br>> 10.20.109.0 10.20.94.254 255.255.255.0 UG 400 0 0<br>> vlan1294 << Route to guests on 10.20.109 from ring1<br>> 10.20.110.0 10.20.94.254 255.255.255.0 UG 400 0 0<br>> vlan1294 << Route to guests on 10.20.110 from 
ring1<br>> 169.254.0.0 0.0.0.0 255.255.0.0 U 1007 0 0<br>> enccw0.0.02e0<br>> 169.254.0.0 0.0.0.0 255.255.0.0 U 1016 0 0<br>> ovsbridge1<br>> 192.168.122.0 0.0.0.0 255.255.255.0 U 0 0 0<br>> virbr0<br>> <br>> <br>> <br>> # On remote node, you can see we have a connection back to the host.<br>> <br>> Feb 28 14:30:22 [928] zs95kjg110102 pacemaker_remoted: info:<br>> crm_log_init: Changed active directory to /var/lib/heartbeat/cores/root<br>> Feb 28 14:30:22 [928] zs95kjg110102 pacemaker_remoted: info:<br>> qb_ipcs_us_publish: server name: lrmd<br>> Feb 28 14:30:22 [928] zs95kjg110102 pacemaker_remoted: notice:<br>> lrmd_init_remote_tls_server: Starting a tls listener on port 3121.<br>> Feb 28 14:30:22 [928] zs95kjg110102 pacemaker_remoted: notice:<br>> bind_and_listen: Listening on address ::<br>> Feb 28 14:30:22 [928] zs95kjg110102 pacemaker_remoted: info:<br>> qb_ipcs_us_publish: server name: cib_ro<br>> Feb 28 14:30:22 [928] zs95kjg110102 pacemaker_remoted: info:<br>> qb_ipcs_us_publish: server name: cib_rw<br>> Feb 28 14:30:22 [928] zs95kjg110102 pacemaker_remoted: info:<br>> qb_ipcs_us_publish: server name: cib_shm<br>> Feb 28 14:30:22 [928] zs95kjg110102 pacemaker_remoted: info:<br>> qb_ipcs_us_publish: server name: attrd<br>> Feb 28 14:30:22 [928] zs95kjg110102 pacemaker_remoted: info:<br>> qb_ipcs_us_publish: server name: stonith-ng<br>> Feb 28 14:30:22 [928] zs95kjg110102 pacemaker_remoted: info:<br>> qb_ipcs_us_publish: server name: crmd<br>> Feb 28 14:30:22 [928] zs95kjg110102 pacemaker_remoted: info: main:<br>> Starting<br>> Feb 28 14:30:27 [928] zs95kjg110102 pacemaker_remoted: notice:<br>> lrmd_remote_listen: LRMD client connection established. 0x9ec18b50 id:<br>> 93e25ef0-4ff8-45ac-a6ed-f13b64588326<br>> <br>> zs95kjg110102:~ # netstat -anp<br>> Active Internet connections (servers and established)<br>> Proto Recv-Q Send-Q Local Address Foreign Address State<br>> PID/Program name<br>> tcp 0 0 0.0.0.0:22 0.0.0.0:* LISTEN<br>> 946/sshd<br>> tcp 0 0 127.0.0.1:25 0.0.0.0:* LISTEN<br>> 1022/master<br>> tcp 0 0 0.0.0.0:5666 0.0.0.0:* LISTEN<br>> 931/xinetd<br>> tcp 0 0 0.0.0.0:5801 0.0.0.0:* LISTEN<br>> 931/xinetd<br>> tcp 0 0 0.0.0.0:5901 0.0.0.0:* LISTEN<br>> 931/xinetd<br>> tcp 0 0 :::21 :::* LISTEN<br>> 926/vsftpd<br>> tcp 0 0 :::22 :::* LISTEN<br>> 946/sshd<br>> tcp 0 0 ::1:25 :::* LISTEN<br>> 1022/master<br>> tcp 0 0 :::44931 :::* LISTEN<br>> 1068/xdm<br>> tcp 0 0 :::80 :::* LISTEN<br>> 929/httpd-prefork<br>> tcp 0 0 :::3121 :::* LISTEN<br>> 928/pacemaker_remot<br>> tcp 0 0 10.20.110.102:3121 10.20.93.12:46425<br>> ESTABLISHED 928/pacemaker_remot<br>> udp 0 0 :::177 :::*<br>> 1068/xdm<br>> <br>> <br>> <br>> <br>> ## Drop the ring0 (vlan1293) interface on cluster node zs95kj, causing fail<br>> over to ring1 (vlan1294)<br>> <br>> [root@zs95kj]# date;ifdown vlan1293<br>> Tue Feb 28 15:54:11 EST 2017<br>> Device 'vlan1293' successfully disconnected.<br>> <br>> <br>> <br>> ## Confirm that ring0 is now offline (a.k.a. 
"FAULTY")<br>> <br>> [root@zs95kj]# date;corosync-cfgtool -s<br>> Tue Feb 28 15:54:49 EST 2017<br>> Printing ring status.<br>> Local node ID 2<br>> RING ID 0<br>> id = 10.20.93.12<br>> status = Marking ringid 0 interface 10.20.93.12 FAULTY<br>> RING ID 1<br>> id = 10.20.94.212<br>> status = ring 1 active with no faults<br>> [root@zs95kj VD]#<br>> <br>> <br>> <br>> <br>> # See that the resource stayed local to cluster node zs95kj.<br>> <br>> [root@zs95kj]# date;pcs resource show |grep zs95kjg110102<br>> Tue Feb 28 15:55:32 EST 2017<br>> zs95kjg110102_res (ocf::heartbeat:VirtualDomain): Started zs95kjpcs1<br>> You have new mail in /var/spool/mail/root<br>> <br>> <br>> <br>> # On the remote node, show new entries in pacemaker.log showing connection<br>> re-established.<br>> <br>> Feb 28 15:55:17 [928] zs95kjg110102 pacemaker_remoted: notice:<br>> crm_signal_dispatch: Invoking handler for signal 15: Terminated<br>> Feb 28 15:55:17 [928] zs95kjg110102 pacemaker_remoted: info:<br>> lrmd_shutdown: Terminating with 1 clients<br>> Feb 28 15:55:17 [928] zs95kjg110102 pacemaker_remoted: info:<br>> qb_ipcs_us_withdraw: withdrawing server sockets<br>> Feb 28 15:55:17 [928] zs95kjg110102 pacemaker_remoted: info:<br>> crm_xml_cleanup: Cleaning up memory from libxml2<br>> Feb 28 15:55:37 [942] zs95kjg110102 pacemaker_remoted: info:<br>> crm_log_init: Changed active directory to /var/lib/heartbeat/cores/root<br>> Feb 28 15:55:37 [942] zs95kjg110102 pacemaker_remoted: info:<br>> qb_ipcs_us_publish: server name: lrmd<br>> Feb 28 15:55:37 [942] zs95kjg110102 pacemaker_remoted: notice:<br>> lrmd_init_remote_tls_server: Starting a tls listener on port 3121.<br>> Feb 28 15:55:37 [942] zs95kjg110102 pacemaker_remoted: notice:<br>> bind_and_listen: Listening on address ::<br>> Feb 28 15:55:37 [942] zs95kjg110102 pacemaker_remoted: info:<br>> qb_ipcs_us_publish: server name: cib_ro<br>> Feb 28 15:55:37 [942] zs95kjg110102 pacemaker_remoted: info:<br>> qb_ipcs_us_publish: server name: cib_rw<br>> Feb 28 15:55:37 [942] zs95kjg110102 pacemaker_remoted: info:<br>> qb_ipcs_us_publish: server name: cib_shm<br>> Feb 28 15:55:37 [942] zs95kjg110102 pacemaker_remoted: info:<br>> qb_ipcs_us_publish: server name: attrd<br>> Feb 28 15:55:37 [942] zs95kjg110102 pacemaker_remoted: info:<br>> qb_ipcs_us_publish: server name: stonith-ng<br>> Feb 28 15:55:37 [942] zs95kjg110102 pacemaker_remoted: info:<br>> qb_ipcs_us_publish: server name: crmd<br>> Feb 28 15:55:37 [942] zs95kjg110102 pacemaker_remoted: info: main:<br>> Starting<br>> Feb 28 15:55:38 [942] zs95kjg110102 pacemaker_remoted: notice:<br>> lrmd_remote_listen: LRMD client connection established. 
0xbed1ab50 id:<br>> b19ed532-6f61-4d9c-9439-ffb836eea34f<br>> zs95kjg110102:~ #<br>> <br>> <br>> <br>> zs95kjg110102:~ # netstat -anp |less<br>> Active Internet connections (servers and established)<br>> Proto Recv-Q Send-Q Local Address Foreign Address State<br>> PID/Program name<br>> tcp 0 0 0.0.0.0:22 0.0.0.0:* LISTEN<br>> 961/sshd<br>> tcp 0 0 127.0.0.1:25 0.0.0.0:* LISTEN<br>> 1065/master<br>> tcp 0 0 0.0.0.0:5666 0.0.0.0:* LISTEN<br>> 946/xinetd<br>> tcp 0 0 0.0.0.0:5801 0.0.0.0:* LISTEN<br>> 946/xinetd<br>> tcp 0 0 0.0.0.0:5901 0.0.0.0:* LISTEN<br>> 946/xinetd<br>> tcp 0 0 10.20.110.102:22 10.20.94.32:57749<br>> ESTABLISHED 1134/0<br>> tcp 0 0 :::21 :::* LISTEN<br>> 941/vsftpd<br>> tcp 0 0 :::22 :::* LISTEN<br>> 961/sshd<br>> tcp 0 0 ::1:25 :::* LISTEN<br>> 1065/master<br>> tcp 0 0 :::80 :::* LISTEN<br>> 944/httpd-prefork<br>> tcp 0 0 :::3121 :::* LISTEN<br>> 942/pacemaker_remot<br>> tcp 0 0 :::34836 :::* LISTEN<br>> 1070/xdm<br>> tcp 0 0 10.20.110.102:3121 10.20.94.212:49666<br>> ESTABLISHED 942/pacemaker_remot<br>> udp 0 0 :::177 :::*<br>> 1070/xdm<br>> <br>> <br>> <br>> ## On host node, zs95kj show system messages indicating remote node (guest)<br>> shutdown / start ... (but no attempt to LGM).<br>> <br>> [root@zs95kj ~]# grep "Feb 28" /var/log/messages |grep zs95kjg110102<br>> <br>> Feb 28 15:55:07 zs95kj crmd[121380]: error: Operation<br>> zs95kjg110102_monitor_30000: Timed Out (node=zs95kjpcs1, call=2,<br>> timeout=30000ms)<br>> Feb 28 15:55:07 zs95kj crmd[121380]: error: Unexpected disconnect on<br>> remote-node zs95kjg110102<br>> Feb 28 15:55:17 zs95kj crmd[121380]: notice: Operation<br>> zs95kjg110102_stop_0: ok (node=zs95kjpcs1, call=38, rc=0, cib-update=370,<br>> confirmed=true)<br>> Feb 28 15:55:17 zs95kj attrd[121378]: notice: Removing all zs95kjg110102<br>> attributes for zs95kjpcs1<br>> Feb 28 15:55:17 zs95kj VirtualDomain(zs95kjg110102_res)[173127]: INFO:<br>> Issuing graceful shutdown request for domain zs95kjg110102.<br>> Feb 28 15:55:23 zs95kj systemd-machined: Machine qemu-38-zs95kjg110102<br>> terminated.<br>> Feb 28 15:55:23 zs95kj crmd[121380]: notice: Operation<br>> zs95kjg110102_res_stop_0: ok (node=zs95kjpcs1, call=858, rc=0,<br>> cib-update=378, confirmed=true)<br>> Feb 28 15:55:24 zs95kj systemd-machined: New machine qemu-64-zs95kjg110102.<br>> Feb 28 15:55:24 zs95kj systemd: Started Virtual Machine<br>> qemu-64-zs95kjg110102.<br>> Feb 28 15:55:24 zs95kj systemd: Starting Virtual Machine<br>> qemu-64-zs95kjg110102.<br>> Feb 28 15:55:25 zs95kj crmd[121380]: notice: Operation<br>> zs95kjg110102_res_start_0: ok (node=zs95kjpcs1, call=859, rc=0,<br>> cib-update=385, confirmed=true)<br>> Feb 28 15:55:38 zs95kj crmd[121380]: notice: Operation<br>> zs95kjg110102_start_0: ok (node=zs95kjpcs1, call=44, rc=0, cib-update=387,<br>> confirmed=true)<br>> [root@zs95kj ~]#<br>> <br>> <br>> Once the remote node established re-connection, there was no further remote<br>> node / resource instability.<br>> <br>> Anyway, just wondering why there was no attempt to migrate this remote node<br>> guest as opposed to a reboot? Is it necessary to reboot the guest in<br>> order to be managed<br>> by pacemaker and corosync over the ring1 interface if ring0 goes down?<br>> Is live guest migration even possible if ring0 goes away and ring1 takes<br>> over?<br>> <br>> Thanks in advance..<br>> <br>> Scott Greenlese ... 
KVM on System Z - Solutions Test, IBM Poughkeepsie,<br>> N.Y.<br>> INTERNET: swgreenl@us.ibm.com <br><br><br><br><br>_______________________________________________<br>Users mailing list: Users@clusterlabs.org<br></tt><tt><a href="http://lists.clusterlabs.org/mailman/listinfo/users">http://lists.clusterlabs.org/mailman/listinfo/users</a></tt><tt><br><br>Project Home: </tt><tt><a href="http://www.clusterlabs.org">http://www.clusterlabs.org</a></tt><tt><br>Getting started: </tt><tt><a href="http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf">http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf</a></tt><tt><br>Bugs: </tt><tt><a href="http://bugs.clusterlabs.org">http://bugs.clusterlabs.org</a></tt><tt><br><br></tt><br><br><BR>
</body></html>