[Pacemaker] dopd on openais

Sat Jun 6 10:33:48 EDT 2009

On 2009-06-06T10:59:44, Lars Ellenberg <lars.ellenberg at linbit.com> wrote:

> > On join of a drbd_<UUID> group, you could see who else is there and
> > connect to them, and also figure out if people try to start on more than
> > 2 nodes etc.
> now since when do you want a dopd bypassing the crm?
> to ensure that would be the crm's job, no?

I don't think of this as a bypass. OCFS2/DLM use similar mechanisms to
ensure their internal integrity as well.

With drbd supporting active/active or active/passive, for example, the
CRM/RA can't reliably tell whether the number of activated nodes is
correct (and this will get worse if >2 nodes are ever supported)
without resorting to parsing drbd's configuration file, which is icky
(and relies on the config file being identical on all nodes).
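
A minimal sketch of such a membership-based check, assuming a daemon that
sees every member of a drbd_<UUID> group together with the role each one
reports. All names here (Member, may_join, may_promote, max_primaries)
are hypothetical and do not correspond to any existing drbd or OpenAIS
API:

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Member:
    """One node currently joined to a drbd_<UUID> group (hypothetical)."""
    node: str
    primary: bool


def may_join(members: Iterable[Member], max_nodes: int = 2) -> bool:
    """Allow a join only while the group has fewer than max_nodes members."""
    return len(list(members)) < max_nodes


def may_promote(members: Iterable[Member], candidate: str,
                max_primaries: int) -> bool:
    """Allow a promotion only if it keeps the primary count within limits.

    max_primaries would be 1 for active/passive and 2 for active/active.
    """
    primaries = {m.node for m in members if m.primary}
    if candidate in primaries:
        return True  # already primary, the count does not change
    return len(primaries) + 1 <= max_primaries


if __name__ == "__main__":
    group = [Member("node1", True), Member("node2", False)]
    print(may_promote(group, "node2", max_primaries=1))
    print(may_promote(group, "node2", max_primaries=2))
    print(may_join(group))
```

The point of the sketch is only that the decision is taken from live
group membership rather than from a parsed drbd.conf.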

This would also reduce the amount of configuration necessary, e.g. if
the IP addresses were inherited from the OpenAIS configuration. (By
default; of course this could be overridden.)

Actually, how about storing the configuration of each drbd instance in
the instance's meta-data?

With internal meta-data, for example, one could then simply say: "start
device XXX".

If the meta-data were then distributed using OpenAIS (say, in a
checkpoint, quite easy to do I'm told ;-), on the second node, the
initial setup would be reduced to "drbdadm clone <drbd-id>
<local-device>".

There could be a start-or-clone command too (maybe even the default?)
which would do the right thing (either resync if a copy already existed
or do a full clone), easing recovery of failed nodes.
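
The start-or-clone decision could be sketched roughly as follows. This
assumes the local meta-data, if present, carries the id of the drbd
instance it belongs to. The function name, the returned action strings
and the refusal on a mismatching id are assumptions made for
illustration, not existing drbdadm behaviour:

```python
from typing import Optional


def start_or_clone(drbd_id: str, local_meta_id: Optional[str]) -> str:
    """Pick the action for bringing up drbd_id on this node (hypothetical).

    local_meta_id is the instance id found in the local device's
    meta-data, or None if the device carries no drbd meta-data yet.
    """
    if local_meta_id is None:
        # No copy exists locally: a full clone is needed.
        return "full-clone"
    if local_meta_id == drbd_id:
        # A copy of this instance already exists: resync is enough.
        return "resync"
    # Assumption: never overwrite a device that belongs to another instance.
    raise ValueError(
        f"local device belongs to {local_meta_id!r}, not {drbd_id!r}")


if __name__ == "__main__":
    print(start_or_clone("r0", None))
    print(start_or_clone("r0", "r0"))
```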

And if the configuration is distributed using OpenAIS, doing a "drbdadm
configure change syncer-speed 10M" would immediately affect all nodes
w/o needing to manually modify drbd.conf everywhere.

Eventually, such a distributed user-space daemon could also allow you to
shift the meta-data handling and processing from the kernel. Might
appeal to some. ;-)

> what we actually are doing right now is placing location constraints on
> the master role into the cib from the "fence-peer" handler, and removing
> them again from the "after-sync-target" handler.  sort of works.

Here I think we need a more extensive discussion. Why does your handler
add additional constraints instead of modifying the master score?

Is this because the RA also messes with the master score and would
overwrite your choices?

If there were a drbdd handling all such issues, the RA could be
significantly simplified (ie, just real
start/stop/promote/demote/monitor commands, with all the ugly hacks
going away and instead being handled properly inside drbd). I've never
been particularly happy with that aspect of it.

OCFS2/DLM actually also hook into the fencing subsystem, and trigger
such ops and wait for their completion, which I think should be
applicable to your scenario too.

There are a lot of random thoughts in the above, hence my hope that
someone (i.e., you ;-) would come up with a coherent design for what the
_optimal_ integration between drbd/OpenAIS/Pacemaker could look like ...
;-)

Regards,
    Lars

-- 
SuSE Labs, OPS Engineering, Novell, Inc.
SUSE LINUX Products GmbH, GF: Markus Rex, HRB 16746 (AG Nürnberg)
"Experience is the name everyone gives to their mistakes." -- Oscar Wilde