[ClusterLabs] Antw: Re: vip is not removed after node lost connection with the other two nodes

Wed Jul 5 04:57:36 EDT 2017

>>> Ken Gaillot <kgaillot at redhat.com> wrote on 23.06.2017 at 19:12 in
message
<ad8f47c0-5c82-97ac-2aec-33dd57f59b0e at redhat.com>:
> On 06/23/2017 11:52 AM, Dimitri Maziuk wrote:
>> On 06/23/2017 11:24 AM, Jan Pokorný wrote:
>> 
>>> People using ifdown or the iproute-based equivalent seem far
>>> too prevalent, even if for long time bystanders the idea looks
>>> continually disproved ad nauseam.
>> 
>> Has anyone had a network card fail recently and what does that look like
>> on modern kernels? -- That's an honest question, I have not seen that in
>> forever (fingers crossed knock on wood).

The last NIC failure we had caused an error interrupt on the PCIe bus
(effectively a kernel reboot) ;-)

>> 
>> I.e. is the expectation that real life failure will be "nice" to
>> corosync actually warranted?
> 
> I don't think there is such an expectation. If I understand correctly,
> the issue with using ifdown as a test is two-fold: it's not a good
> simulation of a typical network outage, and corosync is unable to
> recover from an interface that goes down and later comes back up, so you
> can only test the "down" part. Implementing some sort of recovery
> mechanism in that situation is a goal for corosync 3, I believe.
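
For testing the "down" part without ifdown, one alternative is to drop
corosync's traffic with firewall rules and leave the interface up. The
sketch below only builds the iptables commands, it does not run them. It
assumes corosync uses its default UDP ports (mcastport 5405 and
mcastport - 1); the helper name build_rules is hypothetical, and the ports
should be adjusted to match the local corosync.conf.

```python
# Assumption: corosync's default UDP ports (mcastport 5405 and mcastport - 1).
COROSYNC_PORTS = (5404, 5405)


def build_rules(action, ports=COROSYNC_PORTS):
    """Return iptables argument lists for corosync traffic.

    action is "-I" to insert the DROP rules (simulate the outage)
    or "-D" to delete them again (end the simulated outage).
    """
    if action not in ("-I", "-D"):
        raise ValueError("action must be '-I' or '-D'")
    rules = []
    # Drop both incoming and outgoing UDP packets addressed to the ports.
    for chain in ("INPUT", "OUTPUT"):
        for port in ports:
            rules.append(
                ["iptables", action, chain,
                 "-p", "udp", "--dport", str(port), "-j", "DROP"]
            )
    return rules


if __name__ == "__main__":
    # Print the commands so they can be reviewed before being run as root.
    for rule in build_rules("-I"):
        print(" ".join(rule))
```

Running the same function with "-D" yields the commands that remove the
rules again, so the interface itself never goes down.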
> 