[Pacemaker] Live migration with order constraints

Fri Nov 11 01:23:55 CET 2011

> ManageVE has migration support using chkpt/restore since resource-agents
> version 1.0.4. .... but if I understand the OpenVZ migration concept
> correct ... please someone correct me if I'm wrong! ... there is no need
> for a shared storage.
> 
> The vzmigrate script rsyncs complete data, config and state between
> nodes .... no shared storage needed.
> 
> Of course you would need twice the diskspace, but this is also true for
> DRBD replication. Extending ManageVE to use vzmigrate for live migration
> looks quite straight forward to me.

My apologies - I did not notice that ManageVE has migrate actions, as its usage
help does not list them (somebody forgot to add them). However, it is not as
easy as it seems. ManageVE makes a checkpoint and restores the machine (it does
not use vzmigrate, as I will explain further on), but it also needs shared or
migratable storage to place the dumpfile on. So ManageVE has exactly the same
issue with migration as I mentioned. You can look at the source - it has a
comment which says exactly that.

The vzmigrate script, on the other hand, works a bit differently: it transfers
the whole virtual machine over the network. This approach has three obvious
drawbacks. First, it has to send a huge amount of data, so it will be very
slow (I've seen such a migration take hours when the virtual machine is very
large, say a terabyte) and, moreover, it will slow down the whole disk
subsystem of the currently active node. Second, it will not be live at all,
since it needs to suspend the machine and synchronize whatever was left
unsynchronized by the first run (all the modifications that took place during
the first rsync). It will also need to recalculate quota, which takes a lot of
time as well (for a terabyte virtual machine I would estimate the quota
calculation to take up to an hour, depending on the disk subsystem). And third,
most importantly, there will be zero fault tolerance, as the copy on the second
node is not kept synchronized with the current primary. Now, I do not intend to
say that vzmigrate is evil or incorrect: it has its purposes, and I've used it
many times to migrate virtual machines to new disks (where the disks cannot be
shared), and I was very happy with how it works... but it is just not suitable
for this particular purpose.
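
To put rough numbers on the first drawback, here is a back-of-the-envelope
sketch in Python. The throughput figures are assumptions for illustration, not
measurements, and the function name is made up for this example; real rsync
runs are usually slower because of small files and competing disk I/O.

```python
def transfer_hours(size_bytes: float, throughput_bytes_per_s: float) -> float:
    """Hours needed to copy size_bytes at a sustained throughput."""
    if throughput_bytes_per_s <= 0:
        raise ValueError("throughput must be positive")
    return size_bytes / throughput_bytes_per_s / 3600.0


TB = 10**12  # one terabyte, decimal

# Assumed sustained rates (hypothetical, pick your own hardware numbers).
for label, rate in (("100 MB/s", 100e6), ("50 MB/s", 50e6)):
    print(f"1 TB at {label}: {transfer_hours(TB, rate):.1f} h")
```

Even at an optimistic sustained 100 MB/s a terabyte takes close to three hours,
and that is before the second rsync pass and the quota recalculation.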

A filesystem on an active-passive DRBD, on the other hand, provides full online
synchronization, so not only could the second node take over if the primary
failed, but live migration would also be just a matter of dumping the memory
file, unmounting the filesystem, remounting it on the other node and reading
the memory file back - fast, clean and simple.
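
The sequence above could be sketched roughly as follows. This only builds and
prints the list of commands; it does not run anything. The container ID,
DRBD resource name, device, mount point, node names and dumpfile location are
all hypothetical, and the explicit drbdadm role switch is my assumption about
what "remounting on the other node" involves in an active-passive setup. In a
real cluster Pacemaker and the resource agents would drive these steps.

```python
from typing import List, Tuple

Step = Tuple[str, List[str]]  # (node the command runs on, argv)


def migration_plan(ctid: int, resource: str, device: str, mountpoint: str,
                   source: str, target: str) -> List[Step]:
    """Ordered commands for a checkpoint/restore migration over DRBD."""
    # Assumption: the dumpfile lives on the DRBD-backed filesystem, so it is
    # visible on the target node once the filesystem is mounted there.
    dumpfile = f"{mountpoint}/dump/Dump.{ctid}"
    return [
        # Dump the container's memory state on the source node.
        (source, ["vzctl", "chkpnt", str(ctid), "--dumpfile", dumpfile]),
        # Release the filesystem and demote DRBD on the source node.
        (source, ["umount", mountpoint]),
        (source, ["drbdadm", "secondary", resource]),
        # Promote DRBD and mount the filesystem on the target node.
        (target, ["drbdadm", "primary", resource]),
        (target, ["mount", device, mountpoint]),
        # Read the memory file back and resume the container.
        (target, ["vzctl", "restore", str(ctid), "--dumpfile", dumpfile]),
    ]


# Hypothetical values, for illustration only.
for node, argv in migration_plan(101, "r0", "/dev/drbd0", "/vz",
                                 "node1", "node2"):
    print(f"[{node}] {' '.join(argv)}")
```

No bulk data is copied at any point: only the memory dump is written, and it
travels to the other node through the normal DRBD replication.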

Thanks,
Dmitry