[Pacemaker] why does pacemaker execute fence action immediately when the target node becomes UNCLEAN?

Jake Smith jsmith at argotec.com
Wed Jan 2 14:32:37 EST 2013


----- Original Message -----
> From: "Digimer" <lists at alteeve.ca>
> To: "Jake Smith" <jsmith at argotec.com>, "The Pacemaker cluster resource manager" <pacemaker at oss.clusterlabs.org>
> Sent: Wednesday, January 2, 2013 2:19:09 PM
> Subject: Re: [Pacemaker] why does pacemaker execute fence action immediately when the target node becomes UNCLEAN?
> 
> I suspect that, if you tested it, you would see that corosync fails over
> to the second ring when the first ring's bond breaks/recovers, and the
> same in reverse. So you're protected against bond mode != 1 by that
> second layer of redundancy.
> 
> In my testing, only mode=1 was able to fail and recover without failing
> the totem ring.

Ahh, that makes sense. I'm sure I wasn't looking for that specific issue when I was testing the rings together.

Thanks!
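
For anyone reading this in the archive: a rough sketch of what "increase the membership timeout" from the quoted advice below could look like in corosync.conf. The 5000 ms value is my assumption for illustration, not a setting anyone in this thread posted, so check corosync.conf(5) for your version before copying anything.

```
totem {
    version: 2

    # Assumed value for illustration: how long (in ms) to wait for the
    # token before declaring it lost. A larger value tolerates longer
    # network blips, at the cost of slower detection of real failures.
    token: 5000

    # interface { ... } blocks for your ring(s) are omitted here;
    # keep whatever your existing configuration already has.
}
```

The trade-off is that a node that has really died is also declared lost later, so fencing and resource recovery start later too.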

> 
> On 01/02/2013 02:06 PM, Jake Smith wrote:
> > 
> > ----- Original Message -----
> >> From: "Digimer" <lists at alteeve.ca>
> >> To: "The Pacemaker cluster resource manager"
> >> <pacemaker at oss.clusterlabs.org>
> >> Cc: "Lars Marowsky-Bree" <lmb at suse.com>
> >> Sent: Wednesday, January 2, 2013 11:49:13 AM
> >> Subject: Re: [Pacemaker] why does pacemaker execute fence action
> >> immediately when the target node becomes UNCLEAN?
> >>
> >> On 01/02/2013 07:17 AM, Lars Marowsky-Bree wrote:
> >>> On 2012-12-20T15:36:45, bin chen <free2coder at gmail.com> wrote:
> >>>
> >>>> I have defined a fence resource and cloned it. But when a node
> >>>> becomes UNCLEAN (I disconnected its network), the fence action is
> >>>> executed immediately. Is there a way to avoid this (for example, a
> >>>> tolerance time for brief network interruptions)? If the network is
> >>>> not stable, I don't want cluster nodes to be fenced again and
> >>>> again. :)
> >>>
> >>> Increase the membership timeout of the underlying layer.
> >>>
> >>> And make the network more stable. (Bonding, etc.)
> >>>
> >>>
> >>>
> >>> Regards,
> >>>     Lars
> >>
> >> To build on Lars' comments:
> >>
> >> "underlying layer" == corosync, which is tuned in the corosync.conf
> >> file. As for bonding, only use mode=1. The other modes don't
> >> fail/recover fast enough.
> > 
> > Don't want to derail the main topic but aren't bonding modes 0 and
> > 4 acceptable also?
> > 
> > Those are what my first/second rings use and I've had no problems
> > (knock on wood)
> > 
> > Jake
> > 
> >>
> >> cheers
> >>
> >> --
> >> Digimer
> >> Papers and Projects: https://alteeve.ca/w/
> >> What if the cure for cancer is trapped in the mind of a person without
> >> access to education?
> >>
> >> _______________________________________________
> >> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
> >> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
> >>
> >> Project Home: http://www.clusterlabs.org
> >> Getting started:
> >> http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> >> Bugs: http://bugs.clusterlabs.org
> >>
> >>
> > 