[ClusterLabs] why is node fenced ?
bernd.lentes at helmholtz-muenchen.de
Thu Aug 15 17:36:54 EDT 2019
----- On 14 Aug 2019, at 19:07, kgaillot kgaillot at redhat.com wrote:
>> That's my setting:
>> expected_votes: 2
>> two_node: 1
>> wait_for_all: 0
>> I did that because I want to be able to start the cluster even if one
>> node has, e.g., a hardware problem.
>> Is that OK?
> Well that's why you're seeing what you're seeing, which is also why
> wait_for_all was created :)
> You definitely don't need no-quorum-policy=ignore in any case. With
> two_node, corosync will continue to provide quorum to pacemaker when
> one node goes away, so from pacemaker's view no-quorum-policy never
> kicks in.
> With wait_for_all enabled, the newly joining node wouldn't get quorum
> initially, so it wouldn't fence the other node. So that's the trade-
> off, preventing this situation vs being able to start one node alone
> intentionally. Personally, I'd leave wait_for_all on normally, and
> manually change it to 0 whenever I was intentionally taking one node
> down for an extended time.
That sounds like a good idea; I will think about it.
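
For reference, the suggestion above (leave wait_for_all on normally) would correspond to a quorum section roughly like the following. This is only a sketch: it assumes the votequorum provider and that the settings live in the quorum section of corosync.conf (typically /etc/corosync/corosync.conf), neither of which is stated in this thread.

```
quorum {
    # Assumption: votequorum is the quorum provider in use
    provider: corosync_votequorum
    expected_votes: 2
    two_node: 1
    # With two_node: 1, wait_for_all defaults to 1 unless it is
    # explicitly overridden. Set it to 0 only while one node is
    # intentionally down for an extended time, then set it back.
    wait_for_all: 1
}
```

With this in place, a node that boots alone does not get quorum until it has seen the other node once, so it would not fence the peer at startup.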
> Of course all of that is just recovery, and doesn't explain why the
> nodes can't see each other to begin with.
Yes, I have no idea. The bonds for the corosync rings, each made up of two Ethernet interfaces, are connected directly from host to host,
with no switch in between, just a wire. Two wires breaking at the same moment ... I can't believe that.
The bonds are also monitored via SNMP, so I'm informed immediately when they have trouble.
I didn't get any e-mail.
Maybe there was heavy load at that time? I have atop running, logging every second; I will have a look
at the respective logs.
Helmholtz Zentrum Muenchen