[Pacemaker] Trouble with ordering

Lars Ellenberg lars.ellenberg at linbit.com
Fri Sep 30 04:40:11 EDT 2011

On Fri, Sep 30, 2011 at 10:06:51AM +0200, Gerald Vogt wrote:
> Hi!
> I am running a cluster with 3 nodes. These nodes provide dns service.
> The purpose of the cluster is to have our two dns service ip addresses
> online at all times. I use IPaddr2 and that part works.
> Now I am trying to extend our setup to check the dns service itself. So far,
> if a dns server on any node stops or hangs, the cluster won't notice.
> Thus, I wrote a custom ocf script to check whether the dns service on
> a node is operational (i.e. whether the dns server is listening on the ip
> address and whether it responds to a dns request).
> All cluster nodes are slave dns servers, therefore the dns server
> process is running at all times to get zone transfers from the dns
> master.
> Obviously, the dns service resource must be colocated with the IP
> address resource. However, as the dns server is running at all times,
> the dns service resource must be started or stopped after the ip
> address. This leads me to something like this:
> primitive ns1-ip ocf:heartbeat:IPaddr2 ...
> primitive ns1-dns ocf:custom:dns op monitor interval="30s"
> colocation dns-ip1 inf: ns1-dns ns1-ip
> order ns1-ip-dns inf: ns1-ip ns1-dns symmetrical=false

Maybe, if this is what you mean, add an explicit stop ordering under its own id:
order ns1-ip-dns-stop inf: ns1-ip:stop ns1-dns:stop symmetrical=false
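
Put together with the constraints from your mail, a minimal sketch of the
result could look like the following. The id ns1-ip-dns-stop is a made-up
name, used only because each constraint needs its own id; the two primitives
are assumed to be defined exactly as in your configuration.

```
# start ordering: bring up ns1-ip first, then start ns1-dns
order ns1-ip-dns inf: ns1-ip ns1-dns symmetrical=false
# stop ordering: stop ns1-ip first, then stop ns1-dns
order ns1-ip-dns-stop inf: ns1-ip:stop ns1-dns:stop symmetrical=false
# keep the dns check on the node that holds the ip address
colocation dns-ip1 inf: ns1-dns ns1-ip
```

Because both order constraints carry symmetrical=false, neither one implies
its inverse, so the stop sequence is only what the second line states.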

> Problem 1: it seems as if the order constraint does not wait for an
> operation on the first resource to finish before it starts the
> operation on the second. When I migrate an IP address to another node
> the stop operation on ns1-dns will fail because the ip address is
> still active on the network interface. I have worked around this by
> checking for the IP address on the interface in the stop part of my
> dns script and sleeping 5 seconds if it is still there before checking
> again and continuing.
> Shouldn't the stop on ns1-ip first finish before the node initiates
> the stop on ns1-dns?
> Problem 2: if the dns service fails, e.g. hangs, the monitor operation
> fails. Thus, the cluster wants to migrate the ip address and service
> to another node. However, it first initiates a stop on ns1-dns and
> then on ns1-ip.
> What I need is ns1-ip to stop before ns1-dns. But this seems
> impossible to configure. The order constraint only says what operation
> is executed on ns1-dns depending on the status of ns1-ip. It says what
> happens after something. It cannot say what happens before something.
> Is that correct? Or am I missing a configuration option?
> Thanks,
> Gerald

: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com
