[ClusterLabs] Expected recovery behavior of remote-node guest when corosync ring0 is lost in a passive mode RRP config?

Scott Greenlese swgreenl at us.ibm.com
Wed Mar 1 16:07:23 EST 2017


Hi..

I am running a few corosync "passive mode" Redundant Ring Protocol (RRP)
failure scenarios. My cluster has several remote-node VirtualDomain
resources running on each node, configured to allow Live Guest Migration
(LGM) operations.

While both corosync rings are active, if I drop ring0 on a given node where
I have remote nodes (guests) running, the guest is shut down and restarted
on the same host, after which the connection is re-established and the
guest continues running on that same cluster node.

I am wondering why pacemaker doesn't try to "live" migrate the remote node
(guest) to a different node instead of rebooting it. Is there some way to
configure the remote nodes so that the recovery action is LGM instead of
reboot when the host-to-remote-node connection is lost in an RRP
situation? The next question is: is it even possible to LGM a remote node
guest when the corosync ring fails over from ring0 to ring1 (or vice
versa)?
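
(For context: a planned move of this resource does live-migrate, because
allow-migrate=true makes Pacemaker use the migrate_to/migrate_from
operations instead of stop/start. A minimal sketch of the manual path,
using one of the other cluster nodes as a hypothetical target:

# Live-migrate the guest to another node; with allow-migrate=true this
# runs as migrate_to/migrate_from rather than a full stop/start.
pcs resource move zs95kjg110102_res zs93kjpcs1

# "move" works by adding a location constraint; clear it afterwards.
pcs resource clear zs95kjg110102_res

What I'm asking about is the recovery path after the connection failure,
which evidently does not take this route.)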

# For example, here's a remote node's VirtualDomain resource definition.

[root at zs95kj]# pcs resource show zs95kjg110102_res
 Resource: zs95kjg110102_res (class=ocf provider=heartbeat type=VirtualDomain)
  Attributes: config=/guestxml/nfs1/zs95kjg110102.xml hypervisor=qemu:///system migration_transport=ssh
  Meta Attrs: allow-migrate=true remote-node=zs95kjg110102 remote-addr=10.20.110.102
  Operations: start interval=0s timeout=480 (zs95kjg110102_res-start-interval-0s)
              stop interval=0s timeout=120 (zs95kjg110102_res-stop-interval-0s)
              monitor interval=30s (zs95kjg110102_res-monitor-interval-30s)
              migrate-from interval=0s timeout=1200 (zs95kjg110102_res-migrate-from-interval-0s)
              migrate-to interval=0s timeout=1200 (zs95kjg110102_res-migrate-to-interval-0s)
[root at zs95kj VD]#
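
(For completeness, the resource was defined along these lines; this is a
sketch reconstructed from the definition above, not the exact command used:

pcs resource create zs95kjg110102_res VirtualDomain \
    config=/guestxml/nfs1/zs95kjg110102.xml \
    hypervisor=qemu:///system migration_transport=ssh \
    op start timeout=480 op stop timeout=120 \
    op monitor interval=30s \
    op migrate-to timeout=1200 op migrate-from timeout=1200 \
    meta allow-migrate=true remote-node=zs95kjg110102 remote-addr=10.20.110.102

The remote-node / remote-addr meta attributes are what make Pacemaker treat
the VM as a pacemaker_remote guest node.)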




# My RRP rings are active and configured with rrp_mode: passive

[root at zs95kj ~]# corosync-cfgtool -s
Printing ring status.
Local node ID 2
RING ID 0
        id      = 10.20.93.12
        status  = ring 0 active with no faults
RING ID 1
        id      = 10.20.94.212
        status  = ring 1 active with no faults



# Here's the corosync.conf ..

[root at zs95kj ~]# cat /etc/corosync/corosync.conf
totem {
    version: 2
    secauth: off
    cluster_name: test_cluster_2
    transport: udpu
    rrp_mode: passive
}

nodelist {
    node {
        ring0_addr: zs95kjpcs1
        ring1_addr: zs95kjpcs2
        nodeid: 2
    }

    node {
        ring0_addr: zs95KLpcs1
        ring1_addr: zs95KLpcs2
        nodeid: 3
    }

    node {
        ring0_addr: zs90kppcs1
        ring1_addr: zs90kppcs2
        nodeid: 4
    }

    node {
        ring0_addr: zs93KLpcs1
        ring1_addr: zs93KLpcs2
        nodeid: 5
    }

    node {
        ring0_addr: zs93kjpcs1
        ring1_addr: zs93kjpcs2
        nodeid: 1
    }
}

quorum {
    provider: corosync_votequorum
}

logging {
    to_logfile: yes
    logfile: /var/log/corosync/corosync.log
    timestamp: on
    syslog_facility: daemon
    to_syslog: yes
    debug: on

    logger_subsys {
        debug: off
        subsys: QUORUM
    }
}
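
(The values corosync actually loaded can be checked in the cmap database;
something like this should show the rrp mode and both totem interfaces,
assuming corosync 2.x:

corosync-cmapctl | grep -e totem.rrp_mode -e totem.interface

)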




# Here's the vlan / route situation on cluster node zs95kj:

ring0 is on vlan1293
ring1 is on vlan1294

[root at zs95kj ~]# route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         10.20.93.254    0.0.0.0         UG    400    0        0 vlan1293   << default route to guests from ring0
9.0.0.0         9.12.23.1       255.0.0.0       UG    400    0        0 vlan508
9.12.23.0       0.0.0.0         255.255.255.0   U     400    0        0 vlan508
10.20.92.0      0.0.0.0         255.255.255.0   U     400    0        0 vlan1292
10.20.93.0      0.0.0.0         255.255.255.0   U     0      0        0 vlan1293   << ring0 IPs
10.20.93.0      0.0.0.0         255.255.255.0   U     400    0        0 vlan1293
10.20.94.0      0.0.0.0         255.255.255.0   U     0      0        0 vlan1294   << ring1 IPs
10.20.94.0      0.0.0.0         255.255.255.0   U     400    0        0 vlan1294
10.20.101.0     0.0.0.0         255.255.255.0   U     400    0        0 vlan1298
10.20.109.0     10.20.94.254    255.255.255.0   UG    400    0        0 vlan1294   << route to guests on 10.20.109 from ring1
10.20.110.0     10.20.94.254    255.255.255.0   UG    400    0        0 vlan1294   << route to guests on 10.20.110 from ring1
169.254.0.0     0.0.0.0         255.255.0.0     U     1007   0        0 enccw0.0.02e0
169.254.0.0     0.0.0.0         255.255.0.0     U     1016   0        0 ovsbridge1
192.168.122.0   0.0.0.0         255.255.255.0   U     0      0        0 virbr0
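
(Given these routes, the guest's remote-addr on 10.20.110.0/24 is reached
via the ring1 gateway, so the host should still reach pacemaker_remote with
vlan1293 down. A quick sanity check of which path is actually used:

# Confirm which route/interface is selected for the guest's address:
ip route get 10.20.110.102

# And that the guest answers over the ring1 vlan:
ping -c 3 -I vlan1294 10.20.110.102

)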



# On the remote node, you can see we have a connection back to the host.

Feb 28 14:30:22 [928] zs95kjg110102 pacemaker_remoted:     info: crm_log_init:  Changed active directory to /var/lib/heartbeat/cores/root
Feb 28 14:30:22 [928] zs95kjg110102 pacemaker_remoted:     info: qb_ipcs_us_publish:    server name: lrmd
Feb 28 14:30:22 [928] zs95kjg110102 pacemaker_remoted:   notice: lrmd_init_remote_tls_server:   Starting a tls listener on port 3121.
Feb 28 14:30:22 [928] zs95kjg110102 pacemaker_remoted:   notice: bind_and_listen:       Listening on address ::
Feb 28 14:30:22 [928] zs95kjg110102 pacemaker_remoted:     info: qb_ipcs_us_publish:    server name: cib_ro
Feb 28 14:30:22 [928] zs95kjg110102 pacemaker_remoted:     info: qb_ipcs_us_publish:    server name: cib_rw
Feb 28 14:30:22 [928] zs95kjg110102 pacemaker_remoted:     info: qb_ipcs_us_publish:    server name: cib_shm
Feb 28 14:30:22 [928] zs95kjg110102 pacemaker_remoted:     info: qb_ipcs_us_publish:    server name: attrd
Feb 28 14:30:22 [928] zs95kjg110102 pacemaker_remoted:     info: qb_ipcs_us_publish:    server name: stonith-ng
Feb 28 14:30:22 [928] zs95kjg110102 pacemaker_remoted:     info: qb_ipcs_us_publish:    server name: crmd
Feb 28 14:30:22 [928] zs95kjg110102 pacemaker_remoted:     info: main:  Starting
Feb 28 14:30:27 [928] zs95kjg110102 pacemaker_remoted:   notice: lrmd_remote_listen:    LRMD client connection established. 0x9ec18b50 id: 93e25ef0-4ff8-45ac-a6ed-f13b64588326
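
(From the cluster node, reachability of that TLS listener can be checked
directly; a sketch, assuming ncat/nc is installed on the host:

nc -zv 10.20.110.102 3121

)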

zs95kjg110102:~ # netstat -anp
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN      946/sshd
tcp        0      0 127.0.0.1:25            0.0.0.0:*               LISTEN      1022/master
tcp        0      0 0.0.0.0:5666            0.0.0.0:*               LISTEN      931/xinetd
tcp        0      0 0.0.0.0:5801            0.0.0.0:*               LISTEN      931/xinetd
tcp        0      0 0.0.0.0:5901            0.0.0.0:*               LISTEN      931/xinetd
tcp        0      0 :::21                   :::*                    LISTEN      926/vsftpd
tcp        0      0 :::22                   :::*                    LISTEN      946/sshd
tcp        0      0 ::1:25                  :::*                    LISTEN      1022/master
tcp        0      0 :::44931                :::*                    LISTEN      1068/xdm
tcp        0      0 :::80                   :::*                    LISTEN      929/httpd-prefork
tcp        0      0 :::3121                 :::*                    LISTEN      928/pacemaker_remot
tcp        0      0 10.20.110.102:3121      10.20.93.12:46425       ESTABLISHED 928/pacemaker_remot
udp        0      0 :::177                  :::*                                1068/xdm
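
Note the foreign address on the established port-3121 connection:
10.20.93.12 is the host's ring0 IP. The same view from ss, if preferred:

# On the guest: show connections on the pacemaker_remote port.
ss -tnp sport = :3121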




## Drop the ring0 (vlan1293) interface on cluster node zs95kj, forcing
failover to ring1 (vlan1294)

[root at zs95kj]# date;ifdown vlan1293
Tue Feb 28 15:54:11 EST 2017
Device 'vlan1293' successfully disconnected.



## Confirm that ring0 is now offline (a.k.a. "FAULTY")

[root at zs95kj]# date;corosync-cfgtool -s
Tue Feb 28 15:54:49 EST 2017
Printing ring status.
Local node ID 2
RING ID 0
        id      = 10.20.93.12
        status  = Marking ringid 0 interface 10.20.93.12 FAULTY
RING ID 1
        id      = 10.20.94.212
        status  = ring 1 active with no faults
[root at zs95kj VD]#
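
(If the ring stays FAULTY after the interface is repaired, redundant ring
operation can be re-enabled cluster-wide by hand:

corosync-cfgtool -r

)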




# See that the resource stayed local to cluster node zs95kj.

[root at zs95kj]# date;pcs resource show |grep zs95kjg110102
Tue Feb 28 15:55:32 EST 2017
 zs95kjg110102_res      (ocf::heartbeat:VirtualDomain): Started zs95kjpcs1



# On the remote node, new pacemaker.log entries show the connection being
re-established.

Feb 28 15:55:17 [928] zs95kjg110102 pacemaker_remoted:   notice: crm_signal_dispatch:   Invoking handler for signal 15: Terminated
Feb 28 15:55:17 [928] zs95kjg110102 pacemaker_remoted:     info: lrmd_shutdown: Terminating with  1 clients
Feb 28 15:55:17 [928] zs95kjg110102 pacemaker_remoted:     info: qb_ipcs_us_withdraw:   withdrawing server sockets
Feb 28 15:55:17 [928] zs95kjg110102 pacemaker_remoted:     info: crm_xml_cleanup:       Cleaning up memory from libxml2
Feb 28 15:55:37 [942] zs95kjg110102 pacemaker_remoted:     info: crm_log_init:  Changed active directory to /var/lib/heartbeat/cores/root
Feb 28 15:55:37 [942] zs95kjg110102 pacemaker_remoted:     info: qb_ipcs_us_publish:    server name: lrmd
Feb 28 15:55:37 [942] zs95kjg110102 pacemaker_remoted:   notice: lrmd_init_remote_tls_server:   Starting a tls listener on port 3121.
Feb 28 15:55:37 [942] zs95kjg110102 pacemaker_remoted:   notice: bind_and_listen:       Listening on address ::
Feb 28 15:55:37 [942] zs95kjg110102 pacemaker_remoted:     info: qb_ipcs_us_publish:    server name: cib_ro
Feb 28 15:55:37 [942] zs95kjg110102 pacemaker_remoted:     info: qb_ipcs_us_publish:    server name: cib_rw
Feb 28 15:55:37 [942] zs95kjg110102 pacemaker_remoted:     info: qb_ipcs_us_publish:    server name: cib_shm
Feb 28 15:55:37 [942] zs95kjg110102 pacemaker_remoted:     info: qb_ipcs_us_publish:    server name: attrd
Feb 28 15:55:37 [942] zs95kjg110102 pacemaker_remoted:     info: qb_ipcs_us_publish:    server name: stonith-ng
Feb 28 15:55:37 [942] zs95kjg110102 pacemaker_remoted:     info: qb_ipcs_us_publish:    server name: crmd
Feb 28 15:55:37 [942] zs95kjg110102 pacemaker_remoted:     info: main:  Starting
Feb 28 15:55:38 [942] zs95kjg110102 pacemaker_remoted:   notice: lrmd_remote_listen:    LRMD client connection established. 0xbed1ab50 id: b19ed532-6f61-4d9c-9439-ffb836eea34f
zs95kjg110102:~ #



# netstat on the remote node now shows the pacemaker_remote connection
established from the host's ring1 address (10.20.94.212):

zs95kjg110102:~ # netstat -anp |less
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN      961/sshd
tcp        0      0 127.0.0.1:25            0.0.0.0:*               LISTEN      1065/master
tcp        0      0 0.0.0.0:5666            0.0.0.0:*               LISTEN      946/xinetd
tcp        0      0 0.0.0.0:5801            0.0.0.0:*               LISTEN      946/xinetd
tcp        0      0 0.0.0.0:5901            0.0.0.0:*               LISTEN      946/xinetd
tcp        0      0 10.20.110.102:22        10.20.94.32:57749       ESTABLISHED 1134/0
tcp        0      0 :::21                   :::*                    LISTEN      941/vsftpd
tcp        0      0 :::22                   :::*                    LISTEN      961/sshd
tcp        0      0 ::1:25                  :::*                    LISTEN      1065/master
tcp        0      0 :::80                   :::*                    LISTEN      944/httpd-prefork
tcp        0      0 :::3121                 :::*                    LISTEN      942/pacemaker_remot
tcp        0      0 :::34836                :::*                    LISTEN      1070/xdm
tcp        0      0 10.20.110.102:3121      10.20.94.212:49666      ESTABLISHED 942/pacemaker_remot
udp        0      0 :::177                  :::*                                1070/xdm



## On host node zs95kj, system messages show the remote node (guest) being
shut down and started (but no attempt to LGM).

[root at zs95kj ~]# grep "Feb 28" /var/log/messages |grep zs95kjg110102

Feb 28 15:55:07 zs95kj crmd[121380]:   error: Operation zs95kjg110102_monitor_30000: Timed Out (node=zs95kjpcs1, call=2, timeout=30000ms)
Feb 28 15:55:07 zs95kj crmd[121380]:   error: Unexpected disconnect on remote-node zs95kjg110102
Feb 28 15:55:17 zs95kj crmd[121380]:  notice: Operation zs95kjg110102_stop_0: ok (node=zs95kjpcs1, call=38, rc=0, cib-update=370, confirmed=true)
Feb 28 15:55:17 zs95kj attrd[121378]:  notice: Removing all zs95kjg110102 attributes for zs95kjpcs1
Feb 28 15:55:17 zs95kj VirtualDomain(zs95kjg110102_res)[173127]: INFO: Issuing graceful shutdown request for domain zs95kjg110102.
Feb 28 15:55:23 zs95kj systemd-machined: Machine qemu-38-zs95kjg110102 terminated.
Feb 28 15:55:23 zs95kj crmd[121380]:  notice: Operation zs95kjg110102_res_stop_0: ok (node=zs95kjpcs1, call=858, rc=0, cib-update=378, confirmed=true)
Feb 28 15:55:24 zs95kj systemd-machined: New machine qemu-64-zs95kjg110102.
Feb 28 15:55:24 zs95kj systemd: Started Virtual Machine qemu-64-zs95kjg110102.
Feb 28 15:55:24 zs95kj systemd: Starting Virtual Machine qemu-64-zs95kjg110102.
Feb 28 15:55:25 zs95kj crmd[121380]:  notice: Operation zs95kjg110102_res_start_0: ok (node=zs95kjpcs1, call=859, rc=0, cib-update=385, confirmed=true)
Feb 28 15:55:38 zs95kj crmd[121380]:  notice: Operation zs95kjg110102_start_0: ok (node=zs95kjpcs1, call=44, rc=0, cib-update=387, confirmed=true)
[root at zs95kj ~]#
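
(To see why the policy engine chose stop/start over migrate, the transition
can be replayed; a sketch against the live CIB:

# Show allocation scores and planned actions for the current cluster state:
crm_simulate -sL

The pe-input files saved under /var/lib/pacemaker/pengine/ can also be
replayed with crm_simulate -x <file> -S to inspect the exact transition
that recovered the guest.)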


Once the remote node re-established its connection, there was no further
remote node or resource instability.

Anyway, just wondering why there was no attempt to migrate this remote node
guest, as opposed to rebooting it? Is it necessary to reboot the guest in
order for it to be managed by pacemaker and corosync over the ring1
interface if ring0 goes down? Is live guest migration even possible if
ring0 goes away and ring1 takes over?

Thanks in advance..

Scott Greenlese ... KVM on System Z - Solutions Test, IBM Poughkeepsie,
N.Y.
  INTERNET:  swgreenl at us.ibm.com
