[Pacemaker] Pacemaker failover problem

Tue Mar 9 02:13:41 EST 2010

On Tue, Mar 9, 2010 at 12:27 AM, Erich Weiler <weiler at soe.ucsc.edu> wrote:
> I think I may have found an answer.  I had this in my config:
>
> order LDAP-after-IP inf: LDAP-IP LDAP-clone
>
> And, according to the logs, it *looks* like what happens is that when
> genome-ldap1 goes down, the IP goes over to genome-ldap2, AND THEN pacemaker
> tries to start LDAP
> there, even though LDAP is already started there because it is an anonymous
> clone.  LDAP cannot start (because it is already started) and throws an
> error exit code, and presumably pacemaker freaks out because of that and
> shuts down LDAP on all nodes.  Then the floating IP disappears because of
> the line:
>
> colocation LDAP-with-IP inf: LDAP-IP LDAP-clone
>
> which is expected at that point.  It seems that when I tested this with
> older versions of pacemaker, this didn't happen.  Should 'order' statements
> be avoided entirely when dealing with anonymous clones?  Is that behavior
> expected?

The ordering constraint should have caused the cluster to stop LDAP first.
Have you checked that both scripts are fully LSB compliant?
   http://www.clusterlabs.org/doc/en-US/Pacemaker/1.0/html/Pacemaker_Explained/ap-lsb.html
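
For reference, the checks on that page boil down to running each action and
comparing the exit code. Below is a rough sketch of that sequence in Python,
not an official tool: the names check_lsb, run_init_script and LSB_CHECKS are
made up for illustration, and the script path in the usage note is only a
placeholder. The third check is the one relevant here: "start" on a service
that is already running must still return 0.

```python
import subprocess

# (action, expected exit code), run in this order against a stopped service.
LSB_CHECKS = [
    ("start", 0),   # starting a stopped service succeeds
    ("status", 0),  # a running service reports 0
    ("start", 0),   # starting an already-running service must still return 0
    ("stop", 0),    # stopping a running service succeeds
    ("status", 3),  # a stopped service reports 3
    ("stop", 0),    # stopping an already-stopped service must still return 0
]


def run_init_script(script, action):
    """Run `script action` and return its exit code."""
    return subprocess.call([script, action])


def check_lsb(script, runner=run_init_script):
    """Return a list of (action, expected, actual) for every failed check.

    `runner` is replaceable so the logic can be exercised without
    touching a real service.
    """
    failures = []
    for action, expected in LSB_CHECKS:
        actual = runner(script, action)
        if actual != expected:
            failures.append((action, expected, actual))
    return failures
```

Something like check_lsb("/etc/init.d/ldap") (substitute your own script path)
would return an empty list for a compliant script. It really does start and
stop the service, so only run it on a node where the cluster is not managing
that resource at the time.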