[Pacemaker] Managing Virtual Machine resources

Lars Marowsky-Bree lmb at suse.de
Fri May 16 14:15:08 EDT 2008

On 2008-05-16T12:15:03, Lon Hohberger <lhh at redhat.com> wrote:

> rgmanager:
>  * parent/child relationships for implicit start-after/stop-before
>    * attribute inheritance (we have talked about this in the past;
>      it isn't hard, and may be beneficial)
>    * specification of child resource type ordering to prevent major
>      "gotchas" when defining resource groups (e.g. putting a
>      script on a file system but putting them in the wrong order,
>      causing errors)
>  * 'primary' attribute specification (not OCF compliant) is used to
> identify resource instances

That's all just meta-data, right?

>  * use of LSB 'status' to implement OCF 'monitor' function (status isn't
> specified in the RA API, but the monitor function as specified appears
> to map to the LSB status function... so most of our agents do
> monitor->status, though depth is still supported - maybe yours are the
> same; haven't fully investigated)

monitor is _not_ 1:1 the LSB status. That's exactly why we're not using
status.  ;-)

In particular, 3 vs. 7 is a crucial difference (LSB's status uses exit
code 3 for "program is not running", while OCF reserves 3 for "action
not implemented" and returns 7 for "not running"), and we didn't want
to have to special-case the exit codes depending on the action being
run.
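The exit-code clash can be sketched as follows; this is an illustrative stand-in, not an actual agent, and the pidfile-based `lsb_status` is a mock:

```shell
#!/bin/sh
# Why LSB 'status' cannot be reused verbatim as OCF 'monitor':
# LSB says exit 3 = "program is not running", but OCF reserves 3
# (OCF_ERR_UNIMPLEMENTED) and uses 7 (OCF_NOT_RUNNING) instead.

OCF_SUCCESS=0
OCF_ERR_UNIMPLEMENTED=3
OCF_NOT_RUNNING=7

# mock stand-in for a real LSB init script's status action
lsb_status() {
    [ -f "$1" ] && return 0    # 0: running
    return 3                   # 3: not running (LSB meaning)
}

# wrapping it for OCF requires translating the exit code
ocf_monitor() {
    lsb_status "$1"
    case $? in
        0) return $OCF_SUCCESS ;;
        3) return $OCF_NOT_RUNNING ;;  # untranslated, 3 would mean "unimplemented"
        *) return 1 ;;
    esac
}

ocf_monitor /tmp/no-such-pidfile; echo "monitor exit: $?"
# -> monitor exit: 7
```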

>  * multiple references to the same resource instance - reference counts
> are used to prevent starting the same resource on the same node multiple
> times

We use explicit dependencies and thus can reference the same
primitive/clone/group in as-many places as needed.
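As an illustration of such explicit dependencies, a CIB constraints section might look like this (the syntax shown is current Pacemaker; the resource ids are invented), with the same primitive referenced from several independent constraints:

```xml
<!-- one file system primitive, referenced by two order constraints
     and one colocation constraint -->
<constraints>
  <rsc_order id="web-after-fs" first="fs-shared" then="webserver"/>
  <rsc_order id="db-after-fs"  first="fs-shared" then="database"/>
  <rsc_colocation id="web-with-fs" rsc="webserver"
                  with-rsc="fs-shared" score="INFINITY"/>
</constraints>
```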

>  * rgmanager allows reconfiguration of resource parameters without
> restarting the resource; maybe pacemaker does too; haven't checked; uses
> <parameter name="xxx" reconfig="1" .../> in the meta-data to enable it.

Our instance_attributes support a "reload" setting.

> pacemaker:
>  * promote / demote resource operations
>  * UUIDs used to identify resource instances (I like this better than
> what we do with type:primary_attr in rgmanager)

Yeah, well, the UUIDs are not the grandest idea we ever had - nowadays
at least the GUI tries to generate a shorter unique id w/o the full
cumbersomeness of UUIDs.

>  * clone resources and operations used to start (more or less) the same
> resource on multiple nodes

> General:
>  * resource migrate is likely done differently; not sure though (maybe
> you can tell me?):
>     <resource-agent> migrate <target_host_name>

Our model is both push and pull compatible. On the source, we execute a
"migrate_to" command (the target_host is passed via the environment),
and on the target, a "migrate_from". (That makes sense if you consider
this as _commands_ given to the nodes, otherwise it seems kind of the
wrong way around ;-)
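A hedged sketch of that push/pull split in an agent; the environment variable name follows current Pacemaker conventions (OCF_RESKEY_CRM_meta_migrate_target), which may differ from the details at the time:

```shell
#!/bin/sh
# Skeleton of the migrate_to/migrate_from pair described above.
# The echo commands stand in for real hypervisor calls.

vm_agent() {
    case "$1" in
    migrate_to)
        # executes on the SOURCE node; the target host arrives via env
        echo "pushing VM to $OCF_RESKEY_CRM_meta_migrate_target"
        ;;
    migrate_from)
        # executes on the TARGET node; doubles as the success check
        echo "verifying VM arrived locally"
        ;;
    esac
}

OCF_RESKEY_CRM_meta_migrate_target=node2 vm_agent migrate_to
# -> pushing VM to node2
```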

The migrate_from also is our way of checking whether the migration
succeeded; I guess in your case you then run a monitor/status on the
target.

> There will be more that I will come across, no doubt.  Those are just
> the ones on the surface.  I do not believe any of them are hard to deal
> with.

Right. I was in particular interested in understanding those differences
which affect the RA API, as that could possibly affect the usability of
RAs written for RHCS vs those written for ours. I think it's probably a
good idea to find some time to sit down and chat how to resolve these.

I've got a presentation from last year's BrainShare on what our scripts
do, that should be a usable starting point. Not much has changed since.

A further matter might be the shell scripts calling out to various
scripts which assume things in the environment - i.e., we supply
ocf-shellfuncs (a shell source file) which defines ocf_log() and a few
other helper functions.
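A minimal stand-in for that helper, defined inline so the sketch runs anywhere (real agents source ocf-shellfuncs from the installed resource-agents package, e.g. under /usr/lib/ocf/lib/heartbeat/; the exact path varies by distribution):

```shell
#!/bin/sh
# Illustrative ocf_log() lookalike: the real helper routes messages
# to the cluster log; this one simply prints them.

ocf_log() {
    severity=$1; shift
    echo "$severity: $*"
}

ocf_log info "Xen domain started"
# -> info: Xen domain started
```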

> I think we both diverged in a compatible way here:
>  * <parameter ... required="1" .../> means this parameter must be
> specified for a given resource instance.

A compatible divergence can't possibly be a divergence ;-)
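For reference, that shared convention might look like this in an agent's meta-data; the parameter name and descriptions here are invented, and attribute spellings beyond 'required' and 'unique' vary between projects:

```xml
<parameter name="config_file" unique="1" required="1">
  <longdesc lang="en">
    Path to the virtual machine's configuration file.
  </longdesc>
  <shortdesc lang="en">Config file</shortdesc>
  <content type="string"/>
</parameter>
```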

> I believe the idea was to use virtual machines resources, with those
> virtual machines in a cluster of their own.

Ah, OK.

> To clarify the requirements as stated: they were in the context of an
> existing implementation.
> Generally, with clustered virtual machines that can run on more than one
> physical node, at a bare minimum, you need to know only a few things on
> the physical hosts in order to implement fencing:
>  * where a particular vm is and its current state, or
>  * where that vm "was", and
>    * the state of the host running the vm, and
>    * if "bad" or "Dead", whether fencing has completed
> Certainly, pacemaker knows all of the above!

Right, of course. The external/xen STONITH script which we already have
could likely use crm_resource to find out and/or control the state of
the resource representing the DomU in the Dom0 cluster.

Now I see what you're saying.
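A sketch of what such a fence helper might run; 'vm-web01' is a hypothetical resource id, the options are crm_resource's current long options (not necessarily the ones available at the time), and these commands only do anything against a live cluster:

```shell
# locate the node currently running the DomU's resource
crm_resource --resource vm-web01 --locate

# stop it via the cluster rather than behind pacemaker's back,
# e.g. by setting its target-role meta attribute
crm_resource --resource vm-web01 --meta \
             --set-parameter target-role --parameter-value Stopped
```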

> I doubt it would be difficult to make the existing fence agent/host
> preferentially use pacemaker to locate & kill VMs when possible (as
> opposed to simply talking to libvirt + AIS Checkpoint APIs as it does
> now).

I think at least some interaction here would be needed, because
otherwise, pacemaker/LRM would eventually run the monitor action, find
out that the VM is gone, and restart it, which might not be what is
desired.


Teamlead Kernel, SuSE Labs, Research and Development
SUSE LINUX Products GmbH, GF: Markus Rex, HRB 16746 (AG Nürnberg)
"Experience is the name everyone gives to their mistakes." -- Oscar Wilde
