[Pacemaker] Trouble with ordering

Fri Sep 30 11:20:34 EDT 2011

On 30.09.11 15:03, Serge Dubrouski wrote:
> Maybe you didn't look carefully, but that script does exactly that: it
> monitors the process and the service. Also, if you want the cluster to
> control your service, it has to be able to start and stop it. You can
> configure your service as a clone and it'll be up on several nodes.
> But if you don't want to use it, you don't have to.

You are right. I did not look at the monitor function. I only checked
the status function and assumed the service check would be there if the
script performed one.

Technically, I don't want the cluster to control the service in the
sense of starting and stopping it. The cluster controls the IP addresses
and moves them between nodes. The dns service resource is only supposed
to check that the dns service is working on the node and, if the service
becomes unresponsive, to migrate the service and, most importantly, the
IP address.

I haven't looked at the concept of clones yet. Maybe I took a completely
wrong approach to what I am trying to do.

Until recently the cluster only managed the two DNS service IP
addresses, ns1-ip and ns2-ip, for our LAN. Three nodes are used to
provide redundancy in case one node fails. This way our two DNS server
IPs are active at all times.

Bind is running on all three nodes. Bind is configured to scan for
interface changes every 60s. The three nodes are configured as slave
servers, getting notified of zone updates by the master server.

This works for node failures and the like. If a node crashes, its IP
address is moved to another node.

The problem arises when the node is still up but the named process hangs
and becomes unresponsive. The cluster would not notice this.
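
To illustrate the kind of check that would catch a hung named, here is a
minimal Python sketch. It is not the script discussed above, only an
assumption-laden example: the function names build_query and
named_responds, the 2 second default timeout and the queried name are
all hypothetical. It sends a single DNS query over UDP and treats a
missing or mismatched reply as a failure, so a process that is alive but
not answering is reported as down.

```python
import socket
import struct


def build_query(name: str, query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS query (A record, class IN) for `name`."""
    # Header: id, flags (recursion desired), 1 question, no other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    question = qname + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN
    return header + question


def named_responds(server: str, name: str,
                   timeout: float = 2.0, port: int = 53) -> bool:
    """Return True if `server` answers a DNS query within `timeout` seconds."""
    query = build_query(name)
    try:
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
            sock.settimeout(timeout)
            sock.sendto(query, (server, port))
            reply, _ = sock.recvfrom(512)
    except OSError:
        # Covers timeouts and "connection refused" alike.
        return False
    # A valid reply echoes the query id and has the QR (response) bit set.
    return len(reply) >= 12 and reply[:2] == query[:2] and bool(reply[2] & 0x80)
```

A monitor action could, for example, call
named_responds("127.0.0.1", "example.com") and report a failure whenever
it returns False. The name to query is a placeholder and should be one
the local server is expected to answer.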

If I understand your script correctly, it starts and stops the named
process. If I do this, a node on which named is stopped won't receive
zone updates, i.e. when named is started there it has outdated zone
files.

Now, if the master server is running and reachable when named starts,
the dns server gets updated quickly. The trouble is that if the master
is down too, the dns server will serve outdated dns information until
the master is running again.

That seems to me to be the problem with starting and stopping the bind
process on the nodes, and it is what I was trying to avoid. IMHO the
named process can be running all the time, thus receiving zone notifies
in the usual manner.

But maybe I am misunderstanding clones. So far the guides have not made
clear to me what exactly they do with respect to what I am trying to
achieve.

Maybe you can give me a hint on how to achieve this with a clone:
running named on all nodes at all times and moving the service IP
addresses between nodes in case a node or dns server fails or hangs.

Thanks!

Gerald