[ClusterLabs] Error at testing live migration

Ken Gaillot kgaillot at redhat.com
Fri Mar 27 18:13:37 UTC 2015


On 03/27/2015 01:46 PM, Wilson Acero wrote:
> Hi everybody,
> I have a pacemaker + corosync cluster that manages a virtual machine (KVM). The virtual machine's drives are stored on shared storage (gfs2 + lvm + iSCSI LUN). The resource agent is VirtualDomain.
> When I test live migration with the command 'pcs resource move vmcentos2 nodo2', or by putting the node in standby, the migration works with no problem.
> But when I test live migration by rebooting or shutting down the node that runs the virtual machine, the migration fails. Is this expected behaviour or a bug?

I'm not sure it's either :-) but one thing that came up recently is that
systemd shuts down services in parallel unless they have "After"
ordering in the unit file (units are stopped in the reverse of their
start order). So if a resource depends on a service initiated by systemd
(not the cluster), that service can be stopped while the resource is
still running or migrating, and you can get an unclean shutdown/reboot.

In the upcoming 1.1.13 release, pacemaker will be ordered "After" dbus,
which handles certain common resource dependencies. You can see whether
that fixes your case by editing
/usr/lib/systemd/system/pacemaker.service and adding this line under [Unit]:

   After=dbus.service
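
A drop-in file is another way to add that line without touching the
packaged unit file, which a package update may overwrite. Here is a
minimal sketch, assuming your systemd reads drop-ins from
/etc/systemd/system/pacemaker.service.d/. The function name and the file
name 10-after.conf are made up for illustration:

```shell
# Write a drop-in that orders a unit after the given units.
# $1 = drop-in directory, remaining arguments = unit names.
# (write_after_dropin and 10-after.conf are illustrative names only.)
write_after_dropin() {
    dropin_dir=$1
    shift
    mkdir -p "$dropin_dir" || return 1
    # "$*" joins the unit names with spaces, which After= accepts.
    printf '[Unit]\nAfter=%s\n' "$*" > "$dropin_dir/10-after.conf"
}

# Typical use, as root (not run here):
#   write_after_dropin /etc/systemd/system/pacemaker.service.d dbus.service
#   systemctl daemon-reload
```

After the daemon-reload, "systemctl show -p After pacemaker.service"
should list dbus.service if the drop-in was picked up.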

If your live migration requires any other systemd-initiated services to
be up, you'd have to add "After" lines for them as well. Pacemaker can't
know the dependencies of every possible resource, so I think people will
always have to modify that themselves if they have affected resources
and want to reboot or shut down without stopping the cluster first.
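
To see which orderings are already in place, "systemctl show -p After
pacemaker.service" prints a single After= line with space-separated unit
names. The sketch below checks such a line for a given unit. has_after is
a hypothetical helper, and libvirtd.service is only an example of a
service a VirtualDomain resource might need, not something confirmed for
this cluster:

```shell
# Return 0 if the unit named in $2 appears in the After= line given in $1.
# (has_after is an illustrative helper, not part of systemd or pacemaker.)
has_after() {
    # Strip the "After=" prefix and pad with spaces so that only whole
    # unit names match.
    case " ${1#After=} " in
        *" $2 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Typical use (not run here):
#   line=$(systemctl show -p After pacemaker.service)
#   has_after "$line" libvirtd.service || echo "no ordering after libvirtd"
```

If a needed service is missing from that list, it is a candidate for its
own "After" line.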

> My cluster configuration is:
> OS=Centos 7
> Pacemaker 1.1.10-32.el7_0.1
> Corosync Cluster Engine, version '2.3.3'
>
> [root@nodo2 ~]# pcs status
> Cluster name: clusterwa
> Last updated: Fri Mar 27 12:20:04 2015
> Last change: Thu Mar 26 16:11:11 2015 via crm_resource on nodo2
> Stack: corosync
> Current DC: nodo2 (2) - partition with quorum
> Version: 1.1.10-32.el7_0.1-368c726
> 5 Nodes configured
> 29 Resources configured
>
> Online: [ nodo2 nodo3 nodo4 ]
> Containers: [ centos1.7:vmcentos3 ]
>
> Full list of resources:
>
>  wti_wa (stonith:fence_wti):    Started nodo3
>  Clone Set: dlmwa-clone [dlmwa]
>      Started: [ nodo2 nodo3 nodo4 ]
>      Stopped: [ centos1.7 centosSC3 ]
>  Clone Set: clvmwa-clone [clvmwa]
>      Started: [ nodo2 nodo3 nodo4 ]
>      Stopped: [ centos1.7 centosSC3 ]
>  Clone Set: gfs2wa-clone [gfs2wa]
>      Started: [ nodo2 nodo3 nodo4 ]
>      Stopped: [ centos1.7 centosSC3 ]
>  vmcentos2      (ocf::heartbeat:VirtualDomain): Started nodo2
>  Clone Set: iscsiwa-clone [iscsiwa]
>      Started: [ nodo2 nodo3 nodo4 ]
>      Stopped: [ centos1.7 centosSC3 ]
>
> PCSD Status:
>   nodo2: Online
>   nodo3: Online
>   nodo4: Online
>
> Daemon Status:
>   corosync: active/enabled
>   pacemaker: active/enabled
>   pcsd: active/enabled
> Many thanks.




