[Pacemaker] migration fix for ocf:heartbeat:Xen

Tue Aug 23 11:54:35 EDT 2011

> Message: 7
> Date: Thu, 11 Aug 2011 21:07:00 +0000
> From: "Daugherity, Andrew W" <adaugherity at tamu.edu>
> To: "pacemaker at oss.clusterlabs.org" <pacemaker at oss.clusterlabs.org>
> Subject: [Pacemaker] migration fix for ocf:heartbeat:Xen
> Message-ID: <93B5E618-AD19-4993-8066-CB4F8E4EF322 at tamu.edu>
> Content-Type: text/plain; charset="us-ascii"
> 
> I have discovered that sometimes when migrating a VM, the migration itself succeeds, but the migrate_from call on the target node fails, apparently because the VM's status has not settled down yet.  This is more likely to happen when stopping pacemaker on a node, which causes all its VMs to migrate away.  When the status call in migrate_from fails, the VM is unnecessarily stopped and started even though the migration succeeded.  Note that it is NOT a timeout problem, as the migrate_from operation (which only checks status) takes less than a second.
> 
> I noticed that the VirtualDomain RA checks the status in a loop, whereas the Xen RA checks it only once, so I patched a similar loop into the Xen RA, and that solved my problem.
(patch/logs snipped)
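
Since the patch is snipped here, the idea can be sketched roughly as follows. This is only an illustration in Python, not the patch itself (the real RA is a shell script). The names wait_until_running, check_status, retries and delay are hypothetical, and the retry count and delay are assumed values, not the ones in the patch.

```python
import time


def wait_until_running(check_status, retries=10, delay=1.0, sleep=time.sleep):
    """Poll check_status() until it returns True or the retries run out.

    check_status is a hypothetical callable standing in for the RA's
    status check; it returns True once the VM is seen as running.
    """
    for attempt in range(retries):
        if check_status():
            return True
        # Do not sleep after the final failed attempt.
        if attempt < retries - 1:
            sleep(delay)
    return False


if __name__ == "__main__":
    # Simulated status check: the VM only shows up on the third poll,
    # as if its status had not settled right after the migration.
    answers = iter([False, False, True])
    ok = wait_until_running(lambda: next(answers), retries=5, delay=0.0)
    print("migrate_from would report:", "success" if ok else "failure")
```

A single status check corresponds to retries=1. With a few retries, a VM whose status appears shortly after the migration completes is reported as running rather than being stopped and restarted.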

No comments?  What does it take to get this patch accepted?  I'd much rather use the mainline version than have to reapply my patch after every HAE update.  I guess I could open an SR with Novell, but this is ultimately an upstream issue.

Andrew Daugherity
Systems Analyst
Division of Research, Texas A&M University
adaugherity at tamu.edu