[ClusterLabs] Error at testing live migration
Ken Gaillot
kgaillot at redhat.com
Fri Mar 27 18:13:37 UTC 2015
On 03/27/2015 01:46 PM, Wilson Acero wrote:
> Hi everybody,
> I have a pacemaker + corosync cluster that manages a virtual machine (KVM). The virtual machine's drives are stored on shared storage (gfs2 + lvm + iSCSI LUN). The resource agent is VirtualDomain.
> When I test the live migration with the command 'pcs resource move vmcentos2 nodo2' or by putting the node on standby, the migration works with no problem.
> But when I test the live migration by rebooting or shutting down the node that runs the virtual machine, the migration fails. Is this expected behaviour or a bug?
I'm not sure it's either :-) but one thing that came up recently is that
systemd shuts down services in parallel unless they have "After"
ordering in the unit file. So if a resource depends on a service
initiated by systemd (not the cluster), you can get an unclean
shutdown/reboot.
In the upcoming 1.1.13 release, pacemaker's unit file will be ordered
"After" dbus, which covers certain common resource dependencies. You can
see whether that fixes your case by editing
/usr/lib/systemd/system/pacemaker.service and adding this line under [Unit]:
After=dbus.service
If your live migration requires any other systemd-initiated services to
be up, you'd have to add "After" lines for them as well. Pacemaker can't
know the dependencies of every possible resource, so I think people will
always have to modify that themselves if they have affected resources
and want to reboot or shut down without stopping the cluster first.
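
As a rough sketch of the above: instead of editing the packaged unit file
directly, the same "After" lines can go in a systemd drop-in, which is not
overwritten by package updates. The helper name write_dropin, the file name
10-ordering.conf, and the choice of libvirtd.service as an extra dependency
are illustrative assumptions, not something confirmed for this cluster.

```shell
#!/bin/sh
# Sketch only: write "After=" ordering for pacemaker into a systemd drop-in.
# DROPIN_DIR defaults to the standard override directory for the unit;
# the file name 10-ordering.conf is an arbitrary (hypothetical) choice.
DROPIN_DIR="${DROPIN_DIR:-/etc/systemd/system/pacemaker.service.d}"

# write_dropin UNIT...
# Writes a [Unit] section with one After= line per unit given.
write_dropin() {
    mkdir -p "$DROPIN_DIR" || return 1
    {
        printf '[Unit]\n'
        for unit in "$@"; do
            printf 'After=%s\n' "$unit"
        done
    } > "$DROPIN_DIR/10-ordering.conf"
}

# Example (run as root; libvirtd.service is an assumed dependency):
#   write_dropin dbus.service libvirtd.service
#   systemctl daemon-reload
#   systemctl show -p After pacemaker.service
```

Because systemd stops units in the reverse of their start order, listing a
service under "After" means pacemaker (and the resources it manages) is
stopped before that service at shutdown.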
> My cluster configuration is:
> OS = CentOS 7
> Pacemaker 1.1.10-32.el7_0.1
> Corosync Cluster Engine, version '2.3.3'
>
> [root at nodo2 ~]# pcs status
> Cluster name: clusterwa
> Last updated: Fri Mar 27 12:20:04 2015
> Last change: Thu Mar 26 16:11:11 2015 via crm_resource on nodo2
> Stack: corosync
> Current DC: nodo2 (2) - partition with quorum
> Version: 1.1.10-32.el7_0.1-368c726
> 5 Nodes configured
> 29 Resources configured
>
> Online: [ nodo2 nodo3 nodo4 ]
> Containers: [ centos1.7:vmcentos3 ]
>
> Full list of resources:
>
>  wti_wa (stonith:fence_wti): Started nodo3
>  Clone Set: dlmwa-clone [dlmwa]
>      Started: [ nodo2 nodo3 nodo4 ]
>      Stopped: [ centos1.7 centosSC3 ]
>  Clone Set: clvmwa-clone [clvmwa]
>      Started: [ nodo2 nodo3 nodo4 ]
>      Stopped: [ centos1.7 centosSC3 ]
>  Clone Set: gfs2wa-clone [gfs2wa]
>      Started: [ nodo2 nodo3 nodo4 ]
>      Stopped: [ centos1.7 centosSC3 ]
>  vmcentos2 (ocf::heartbeat:VirtualDomain): Started nodo2
>  Clone Set: iscsiwa-clone [iscsiwa]
>      Started: [ nodo2 nodo3 nodo4 ]
>      Stopped: [ centos1.7 centosSC3 ]
>
> PCSD Status:
>   nodo2: Online
>   nodo3: Online
>   nodo4: Online
>
> Daemon Status:
>   corosync: active/enabled
>   pacemaker: active/enabled
>   pcsd: active/enabled
>
> Many thanks.