[ClusterLabs] Error at testing live migration
Wilson Acero
rasalax at hotmail.com
Mon Mar 30 15:37:17 UTC 2015
Thanks a lot. I will download and compile pacemaker 1.1.13 and let you know how it goes. Thanks again.
From: rasalax at hotmail.com
To: users at clusterlabs.org
Subject: RE: Error at testing live migration
Date: Fri, 27 Mar 2015 16:40:18 -0500
Hi Ken, thanks for your answer. Before running the live migration tests, I tested how Pacemaker handles the virtual machine shutdown. With the command "pcs cluster standby nodoX" there were no errors, but when rebooting or shutting down the node, the virtual machine resource and the gfs2wa and iscsi resources failed and the node became UNCLEAN. After a lot of tests I modified my /usr/lib/systemd/system/corosync.service file and added these entries:
After=iscsid.service
After=remote-fs.target
After=libvirtd.service
That solved the shutdown/reboot error: Pacemaker now has enough time to shut down the virtual machine, restart it on another node, and then let the node continue rebooting. But when I test the live migration, it fails.
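For reference, the same three ordering entries can also be kept in a systemd drop-in file instead of editing the packaged unit under /usr/lib/systemd/system, so that a package update does not overwrite them. This is only a sketch: the file name ordering.conf and the DROPIN_DIR variable are my own choices, and the script writes to a scratch directory unless DROPIN_DIR is set. On a real node the directory would be /etc/systemd/system/corosync.service.d.

```shell
# Sketch: write the After= entries as a drop-in for corosync.service.
# DROPIN_DIR and ordering.conf are hypothetical names chosen for illustration.
# Default is a scratch directory so nothing on the system is changed;
# on a cluster node use: DROPIN_DIR=/etc/systemd/system/corosync.service.d
DROPIN_DIR="${DROPIN_DIR:-$(mktemp -d)}"
mkdir -p "$DROPIN_DIR"

# After= orders startup; systemd stops units in the reverse order,
# so corosync is stopped before the units listed here.
cat > "$DROPIN_DIR/ordering.conf" <<'EOF'
[Unit]
After=iscsid.service
After=remote-fs.target
After=libvirtd.service
EOF

# Show what was written.
cat "$DROPIN_DIR/ordering.conf"

# On a real node, reload systemd afterwards:
#   systemctl daemon-reload
```

Running "systemctl cat corosync.service" after the reload should show the packaged unit followed by the drop-in, which is a quick way to confirm the entries were picked up.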
I added your modification to /usr/lib/systemd/system/pacemaker.service, but it did not work.
Searching for this error, I found out that systemd now includes "systemd-machined.service", a service to monitor, start or shut down virtual machines through the machinectl command. I tried to disable it, but libvirt needs it to run a virtual machine.
[root at nodo3 system]# machinectl
MACHINE CONTAINER SERVICE
qemu-centos2 vm libvirt-qemu
1 machines listed.
[root at nodo3 system]#
[root at nodo3 system]# systemctl status systemd-machined.service
systemd-machined.service - Virtual Machine and Container Registration Service
Loaded: loaded (/usr/lib/systemd/system/systemd-machined.service; static)
Active: active (running) since Fri 2015-03-27 16:13:20 ECT; 22min ago
Docs: man:systemd-machined.service(8)
http://www.freedesktop.org/wiki/Software/systemd/machined
Main PID: 2982 (systemd-machine)
CGroup: /system.slice/systemd-machined.service
└─2982 /usr/lib/systemd/systemd-machined
Mar 27 16:13:20 nodo3.redwa.local systemd[1]: Starting Virtual Machine and Container Registration Service...
Mar 27 16:13:20 nodo3.redwa.local systemd[1]: Started Virtual Machine and Container Registration Service.
Mar 27 16:13:20 nodo3.redwa.local systemd-machined[2982]: New machine qemu-centos2.
I suspect that service is the culprit, but I don't know how to deal with it.
Thanks a lot.
From: rasalax at hotmail.com
To: users at clusterlabs.org
Subject: Error at testing live migration
Date: Fri, 27 Mar 2015 12:46:47 -0500
Hi everybody,
I have a pacemaker + corosync cluster that manages a virtual machine (KVM). The virtual machine drives are stored on shared storage (gfs2 + lvm + iscsi LUN). The resource agent is VirtualDomain.
When I test the live migration with the command 'pcs resource move vmcentos2 nodo2' or by putting the node on standby, the migration works with no problem.
But when I test the live migration by rebooting or shutting down the node that runs the virtual machine, the migration fails. Is this expected behaviour or a bug?
My cluster configuration is:
OS=Centos 7
Pacemaker 1.1.10-32.el7_0.1
Corosync Cluster Engine, version '2.3.3'
[root at nodo2 ~]# pcs status
Cluster name: clusterwa
Last updated: Fri Mar 27 12:20:04 2015
Last change: Thu Mar 26 16:11:11 2015 via crm_resource on nodo2
Stack: corosync
Current DC: nodo2 (2) - partition with quorum
Version: 1.1.10-32.el7_0.1-368c726
5 Nodes configured
29 Resources configured
Online: [ nodo2 nodo3 nodo4 ]
Containers: [ centos1.7:vmcentos3 ]
Full list of resources:
 wti_wa (stonith:fence_wti): Started nodo3
 Clone Set: dlmwa-clone [dlmwa]
     Started: [ nodo2 nodo3 nodo4 ]
     Stopped: [ centos1.7 centosSC3 ]
 Clone Set: clvmwa-clone [clvmwa]
     Started: [ nodo2 nodo3 nodo4 ]
     Stopped: [ centos1.7 centosSC3 ]
 Clone Set: gfs2wa-clone [gfs2wa]
     Started: [ nodo2 nodo3 nodo4 ]
     Stopped: [ centos1.7 centosSC3 ]
 vmcentos2 (ocf::heartbeat:VirtualDomain): Started nodo2
 Clone Set: iscsiwa-clone [iscsiwa]
     Started: [ nodo2 nodo3 nodo4 ]
     Stopped: [ centos1.7 centosSC3 ]
PCSD Status:
  nodo2: Online
  nodo3: Online
  nodo4: Online
Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled
Many thanks.