[Pacemaker] problem with VM in pacemaker cluster

Yuriy Demchenko demchenko.ya at gmail.com
Thu Apr 11 09:56:21 UTC 2013

Solved my problem

The first error was in the constraint: I had defined it on the bare
"cxml" resource rather than on its clone "cxml-clone", which is why
"cxml" was stopped first on the "standby" command. After redefining the
constraint as "cxml-clone then testVM", putting the active node in
standby went smoothly: the VM migrated correctly, with no errors.
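
For reference, the corrected ordering can be sketched with pcs as below.
This is only an illustration, assuming the "pcs constraint order A then B"
syntax; "order_after" is a hypothetical wrapper name, not a pcs command,
and the old constraint on "cxml" still has to be removed separately.

```shell
# Sketch only: order_after is a hypothetical helper, not part of pcs.
# It makes the second resource start after the first. Ordering is
# symmetrical by default, so the stop order is the reverse.
order_after() {
    first="$1"
    second="$2"
    pcs constraint order "$first" then "$second"
}

# On the cluster described here (run as root on a cluster node):
# order_after cxml-clone testVM
```

The point is that the constraint names the clone ("cxml-clone"), not the
primitive inside it ("cxml").
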
The second problem was caused by the "libvirt-guests" service, which
suspends my VMs when the host is rebooted. The "chkconfig libvirt-guests
off" command isn't enough, as it leaves "K01libvirt-guests" symlinks in
/etc/rc.d/rcX.d. Removing those symlinks from rc3.d and rc6.d solved the
problem: the reboot process now starts with the pacemaker shutdown, and
resources move correctly to the other nodes.
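
The symlink cleanup can be sketched roughly as follows. It is a minimal
example, assuming the links are named exactly "K01libvirt-guests" and live
under /etc/rc.d; "remove_guest_kill_links" and its optional directory
argument are hypothetical names used only for illustration.

```shell
# Sketch only: remove the leftover K01libvirt-guests symlinks from the
# rc3.d and rc6.d directories. The optional first argument is the rc root
# directory (assumed default: /etc/rc.d), so the function can be tried
# against a scratch directory first.
remove_guest_kill_links() {
    rc_root="${1:-/etc/rc.d}"
    for level in 3 6; do
        link="$rc_root/rc${level}.d/K01libvirt-guests"
        # Only touch the entry if it really is a symlink.
        if [ -L "$link" ]; then
            rm -f -- "$link"
            echo "removed $link"
        fi
    done
}

# On a real host (as root), after "chkconfig libvirt-guests off":
# remove_guest_kill_links
```

Links in the other runlevel directories are left untouched, matching what
was done here.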

Yuriy Demchenko

On 04/10/2013 02:59 PM, Yuriy Demchenko wrote:
> Hi,
> I've set up a 3-node cluster (2 active nodes + 1 standby for quorum),
> cman+pacemaker.
> Resources: "cxml-clone", a gfs2 filesystem (cloned, runs on both nodes),
> and "testVM" via heartbeat:VirtualDomain (domain xml located on the gfs2
> fs, cLVM disk backend). I set up a constraint: "cxml-clone" starts
> first, then "testVM" (symmetrical, so according to the description they
> will be stopped in reverse order).
> Manual migration of the VM runs fine (pcs resource move testVM
> node-2/node-1): successful live migration, VM runs uninterrupted. But
> when I try to reboot the node running the VM, or put it in standby,
> everything crashes: the migration fails and the node is fenced.
> From the logs I can see that resource "cxml" is stopped first (or
> simultaneously, at least not waiting for the VM migration to complete),
> then the migration fails because the xml is not available.
>> Apr 10 14:03:20 node-2 lrmd[2679]: notice: operation_finished: 
>> cxml_stop_0:3282 [ 2013/04/10_14:03:20 INFO: Running stop for 
>> /dev/cstore/cxml on /mnt ]
>> Apr 10 14:03:20 node-2 lrmd[2679]:   notice: operation_finished: 
>> cxml_stop_0:3282 [ 2013/04/10_14:03:20 INFO: Trying to unmount /mnt ]
>> Apr 10 14:03:20 node-2 lrmd[2679]:   notice: operation_finished: 
>> cxml_stop_0:3282 [ 2013/04/10_14:03:20 INFO: unmounted /mnt 
>> successfully ]
>> Apr 10 14:03:20 node-2 crmd[2682]:   notice: process_lrm_event: LRM 
>> operation cxml_stop_0 (call=77, rc=0, cib-update=37, confirmed=true) ok
>> Apr 10 14:03:21 node-2 lrmd[2679]:   notice: operation_finished: 
>> testVM_migrate_to_0:3281 [ 2013/04/10_14:03:20 INFO: testvm: Starting 
>> live migration to node-1 (using remote hypervisor URI 
>> qemu+ssh://node-1/system ). ]
>> Apr 10 14:03:21 node-2 lrmd[2679]:   notice: operation_finished: 
>> testVM_migrate_to_0:3281 [ error: Requested operation is not valid: 
>> domain is already active as 'testvm' ]
>> Apr 10 14:03:21 node-2 lrmd[2679]:   notice: operation_finished: 
>> testVM_migrate_to_0:3281 [ 2013/04/10_14:03:21 ERROR: testvm: live 
>> migration to qemu+ssh://node-1/system  failed: 1 ]
>> Apr 10 14:03:21 node-2 crmd[2682]:   notice: process_lrm_event: LRM 
>> operation testVM_migrate_to_0 (call=75, rc=1, cib-update=38, 
>> confirmed=true) unknown error
>> Apr 10 14:03:21 node-2 lrmd[2679]:   notice: operation_finished: 
>> testVM_stop_0:3392 [ 2013/04/10_14:03:21 ERROR: Configuration file 
>> /mnt/testvm.xml does not exist or is not readable. ]
> But wtf?! I've set up the constraint, so "testVM" should be
> stopped/moved first, not "cxml".
> What is wrong with my configuration? Am I missing something?
> Logs and CIB are attached.
