[Pacemaker] Problem with failover/failback under Ubuntu 10.04 for Active/Passive OpenNMS

Mon Jul 5 13:34:11 EDT 2010

On Mon, 05 Jul 2010 18:34:24 +0300, Dan Frincu <dfrincu at streamwide.ro>
wrote:

> The errors from the log file are DRBD specific, they occur when you're
> trying to mount a resource in a Secondary state.
> Increase the "op start interval" for both the DRBD and Filesystem
> primitives to ~15 seconds. Having configured a start
> interval of 0 (zero) seconds, the change of DRBD resource from Primary to
> Secondary on node2 and then promotion to
> Primary on node1 is not instantaneous, therefore Pacemaker attempts to
> mount the filesystem without having the DRBD
> resource in a Primary state, it goes into that huuuge 300 second timeout,
> but as it waits for one resource (DRBD) to
> timeout, it executes the next one, which is the mount, which fails, with
> the given errors, for the aforementioned reasons.
> 
> I'd also suggest adding an "op monitor" for each resource, with a
> reasonable interval and timeout, and also a mail alert.
> 
> Regards,
> Dan
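
As an illustration of the monitor operations suggested above, here is a
minimal sketch in crm shell syntax. The resource names (p_drbd_r0,
p_fs_opennms), the DRBD resource r0, the device, mount point, filesystem
type and all interval/timeout values are assumptions for illustration,
not taken from the configuration under discussion.

```
# Hypothetical names and values; adjust to the actual cluster.
primitive p_drbd_r0 ocf:linbit:drbd \
    params drbd_resource="r0" \
    op monitor interval="29s" role="Master" timeout="30s" \
    op monitor interval="31s" role="Slave" timeout="30s"
primitive p_fs_opennms ocf:heartbeat:Filesystem \
    params device="/dev/drbd0" directory="/mnt/opennms" fstype="ext3" \
    op monitor interval="20s" timeout="40s"
```

The two DRBD monitor operations use different intervals on purpose:
Pacemaker tells recurring operations on a resource apart by name and
interval, so the Master and Slave monitors must not share one.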

Ok, that almost solved the problem.
But now the Filesystem primitives run in an endless loop.
They get unmounted and mounted again.

> therefore Pacemaker attempts to
> mount the filesystem without having the DRBD 
> resource in a Primary state

Hm, until now I thought this was handled by
the 3 "order" constraints.
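
For comparison, the pattern usually shown for a DRBD-backed filesystem
orders the mount after the promote action (not just after the start of
the DRBD resource) and colocates it with the Master role. A minimal
sketch in crm shell syntax follows; all resource and constraint names
are hypothetical and not taken from the configuration discussed here.

```
# Hypothetical names: p_drbd_r0 is the DRBD primitive, ms_drbd_r0 its
# master/slave wrapper, p_fs_opennms the Filesystem primitive.
ms ms_drbd_r0 p_drbd_r0 \
    meta master-max="1" master-node-max="1" \
         clone-max="2" clone-node-max="1" notify="true"
# Mount only on the node where DRBD is Primary (Master role).
colocation col_fs_on_drbd inf: p_fs_opennms ms_drbd_r0:Master
# Mount only after the promotion has completed.
order ord_drbd_before_fs inf: ms_drbd_r0:promote p_fs_opennms:start
```

An order constraint that names only the two resources, without the
promote action, orders the mount after the start of the DRBD resource,
which may still be Secondary at that point.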

I see I have to find out which intervals and timeouts I need to adjust.
Thanks for pointing me in the right direction so quickly.

If you have any other ideas to improve the config, just let me know.

Cheers, Sven