[Pacemaker] sbd fencing race

Dejan Muhamedagic dejanmm at fastmail.fm
Wed Nov 26 09:11:16 EST 2014


On Wed, Nov 26, 2014 at 11:13:41AM +0100, emmanuel segura wrote:
> But I would like to know whether pacemaker needs to start sbd on the
> node where the sbd resource isn't running in order to fence the other
> nodes, because I don't see any start action on the second node:

That's strange. I'd expect that a stonith resource needs to be
started (enabled) first. Perhaps that changed, as it seems to be
the case judging by the logs below. I cannot offer any more
advice here, but would still like to know the circumstances and
how it happened that the nodes shot each other.
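
One way to reconstruct who shot whom is to scan the collected logs
for sbd's "Writing reset to node slot" messages. A minimal Python
sketch, assuming the wrapped log lines have been rejoined first; the
name find_resets and the regular expression are illustrative only and
not part of sbd:

```python
import re

# Matches the sbd delivery message as it appears in the quoted logs.
# search() is used so that a leading "file.txt:" prefix is ignored.
RESET_RE = re.compile(
    r"(?P<stamp>[A-Z][a-z]{2}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2})\s+"
    r"(?P<shooter>\S+)\s+sbd:\s+\[\d+\]:\s+info:\s+"
    r"Writing reset to node slot\s+(?P<target>\S+)"
)


def find_resets(lines):
    """Return (timestamp, shooter, target) for each sbd reset message."""
    events = []
    for line in lines:
        match = RESET_RE.search(line)
        if match:
            events.append((match["stamp"], match["shooter"], match["target"]))
    return events


# Two lines taken from the logs below, unwrapped by hand.
SAMPLE = [
    "message_2ch.txt:Nov 23 11:43:28 node02 sbd: [97637]: WARN: "
    "Pacemaker health check: UNHEALTHY",
    "message_2ch.txt:Nov 23 11:43:28 node02 sbd: [157717]: info: "
    "Writing reset to node slot node01",
]

for stamp, shooter, target in find_resets(SAMPLE):
    print(f"{stamp}: {shooter} wrote a reset for {target}")
```

Running the same scan over the logs of both nodes and sorting by
timestamp should show whether each side wrote a reset for the other.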

Thanks,

Dejan


> :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
> 
> message_2cd.txt:Nov 23 11:43:28 node01 sbd: [69794]: WARN: CIB: We do
> NOT have quorum!
> message_2cd.txt:Nov 23 11:43:28 node01 sbd: [69791]: WARN: Pacemaker
> health check: UNHEALTHY
> message_2cd.txt:Nov 23 11:43:28 node01 pengine: [69823]: notice:
> LogActions: Leave   stonith-sbd    (Started node01)
> message_2ch.txt:Nov 23 11:43:28 s02srv002ch sbd: [97640]: WARN: CIB:
> We do NOT have quorum!
> 
> :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
> 
> message_2ch.txt:Nov 23 11:43:28 node02 sbd: [97640]: WARN: CIB: We do
> NOT have quorum!
> message_2ch.txt:Nov 23 11:43:28 node02 sbd: [97637]: WARN: Pacemaker
> health check: UNHEALTHY
> message_2ch.txt:Nov 23 11:43:28 node02 pengine: [97679]: WARN:
> custom_action: Action stonith-sbd_stop_0 on node01 is unrunnable
> (offline)
> message_2ch.txt:Nov 23 11:43:28 node02 sbd: [157717]: info: Delivery
> process handling /dev/mapper/SBD01B0298700230
> message_2ch.txt:Nov 23 11:43:28 node02 sbd: [157717]: info: Writing
> reset to node slot node01
> message_2ch.txt:Nov 23 11:43:28 node02 sbd: [157717]: info: Messaging delay: 40
> 
> ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
> 
> Thanks
> 
> 2014-11-26 10:26 GMT+01:00 Dejan Muhamedagic <dejanmm at fastmail.fm>:
> > Hi,
> >
> > On Tue, Nov 25, 2014 at 04:20:32PM +0100, emmanuel segura wrote:
> >> Hi list,
> >>
> >> Last night I had a cluster in a fencing race using sbd as the stonith
> >
> > Can you give a bit more detail?
> >
> >> device. I would like to know what effect using start-delay has in
> >> my stonith resource, configured this way:
> >>
> >> primitive stonith-sbd stonith:external/sbd \
> >>         params sbd_device="/dev/mapper/SBD" \
> >>         op start interval="0" start-delay="5"
> >
> > Yes, that could help with a stonith deathmatch. Normally, you
> > have a stonith resource running on one node. On split brain, the
> > other node also starts the resource in order to shoot the first
> > node. That's where start-delay comes into play.
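
The timing argument in the quoted paragraph can be sketched as a toy
model in Python. It assumes a node can only shoot once its stonith
resource has started, so the node already running the resource has a
delay of 0 and the other node waits for start-delay. The function
name first_shooter is hypothetical, and the model ignores the sbd
messaging delay and everything else a real cluster does:

```python
def first_shooter(start_delays):
    """Return the node able to fence first, or None on a tie.

    start_delays maps a node name to the seconds that pass before its
    stonith resource is started after the split. A tie means both
    nodes could shoot at the same moment (the deathmatch case).
    """
    if not start_delays:
        return None
    earliest = min(start_delays.values())
    winners = [node for node, delay in start_delays.items() if delay == earliest]
    return winners[0] if len(winners) == 1 else None


# node01 already runs stonith-sbd; node02 must start it with start-delay=5.
print(first_shooter({"node01": 0, "node02": 5}))

# Without any delay both sides are ready at once, so nobody wins cleanly.
print(first_shooter({"node01": 0, "node02": 0}))
```

In this model the delay only helps if a start really is required
before fencing, which is the assumption the paragraph relies on.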
> >
> > Ultimate resource for the issue: http://ourobengr.com/ha/
> >
> > Cheers,
> >
> > Dejan
> >
> >> Thanks
> >>
> >> _______________________________________________
> >> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
> >> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
> >>
> >> Project Home: http://www.clusterlabs.org
> >> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> >> Bugs: http://bugs.clusterlabs.org
> >
> 
> 
> 
> -- 
> this is my life and I live it for as long as God wills