[ClusterLabs] Antw: Re: EL6, cman, rrp, unicast and iptables

Ulrich Windl Ulrich.Windl at rz.uni-regensburg.de
Mon Sep 14 06:56:08 UTC 2015


>>> Noel Kuntze <noel at familie-kuntze.de> wrote on 12.09.2015 at 20:15 in
message <55F46BC3.7000100 at familie-kuntze.de>:

> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA256
> 
> Hello Digimer,
> 
>> I am a strong believer in "keep it as simple as possible". In a case
>> like this, it's hard to argue that any option is simple, but given that
>> RRP is baked into the HA stack, I decided to trust it over QoS. I am
>> perfectly open to contrary ideas though. Can you help me understand why
>> you think tc's additional complexity is worth it? I'm willing to fully
>> believe it is, but I want to understand the pros and cons first.
> 
> Whether you want it or not, the kernel always has a queuing policy on any network 
> device.
> On CentOS 7, it's a pfifo_fast[1][2] queue, which is a classless fifo queue, 
> but with 3 bands (0,1,2).
> Bands with lower numbers have priority over higher numbers. As long as 
> packets are in band 0, band 1 won't be worked upon and so on.

Then that's not FIFO, but priority scheduling. Everybody knows the starvation problem of priority scheduling.
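
To make the starvation point concrete, here is a minimal toy model of strict-priority dequeuing over three bands. It is only a sketch of the behaviour described in the quoted text, not the kernel's pfifo_fast code; the function names, the packet labels, and the "one packet sent per tick" link model are all assumptions made for illustration.

```python
from collections import deque


def dequeue_strict_priority(bands):
    """Return the next packet from the lowest-numbered non-empty band, or None."""
    for band in bands:
        if band:
            return band.popleft()
    return None


def simulate(ticks, band0_arrivals_per_tick):
    """Toy link that sends one packet per tick from three strict-priority bands.

    One packet ("bulk-0") waits in band 1 from the start, while band 0
    receives `band0_arrivals_per_tick` new packets at every tick.
    """
    bands = [deque(), deque(), deque()]
    bands[1].append("bulk-0")
    sent = []
    for t in range(ticks):
        for i in range(band0_arrivals_per_tick):
            bands[0].append(f"prio-{t}-{i}")
        pkt = dequeue_strict_priority(bands)
        if pkt is not None:
            sent.append(pkt)
    return sent


if __name__ == "__main__":
    # Band 0 is never empty, so the band 1 packet is never sent.
    print("bulk-0" in simulate(1000, 1))
    # With no band 0 traffic, the band 1 packet goes out immediately.
    print(simulate(3, 0))
```

As long as band 0 receives at least one packet per tick in this model, the packet waiting in band 1 is never transmitted, no matter how long the simulation runs.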

>  The band a packet gets put into depends on the TOS/DSCP mark on the packet 
> (TOS and DSCP use the same field in an IP packet.
> They're just different standards for the values in it).
> The TOS/DSCP field is left at its default value for applications that don't 
> set a specific one on the network socket they use (SSH can set it using the 
> IPQoS[3] option).
> 
> So obviously, by influencing the TOS/DSCP field value with iptables, we can 
> influence which packets get sent out when.
> The goal here is to prioritize Corosync totem traffic (UDP ports 5404 and 
> 5405) with the correct
> TOS/DSCP value, so it ends up in band 0 and all the other stuff in band 3. 
> This leaves you the option to
> put dlm traffic into band 2.

Imagine some cluster filesystem has its own timeout / fencing mechanism. Then TOTEM, when going wild, can cause starvation of other services. It's the wrong design IMHO.
I wonder whether this can explain the mysterious cLVM retransmit list growing under some loads.
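
For reference, the per-socket marking mentioned in the quoted text (an application setting its own TOS value, as SSH does with IPQoS) can be sketched as follows. This is not the iptables approach proposed above and not anything Corosync itself is known to do; the helper name and the choice of 0x10 (the classic "minimize delay" TOS value) are assumptions for illustration, and socket.IP_TOS is assumed to be available, as it is on Linux.

```python
import socket

# Classic "minimize delay" TOS value; an illustrative choice only.
IPTOS_LOWDELAY = 0x10


def make_marked_udp_socket(tos=IPTOS_LOWDELAY):
    """Create an IPv4 UDP socket whose outgoing packets carry the given TOS byte."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, tos)
    return sock


if __name__ == "__main__":
    s = make_marked_udp_socket()
    # Read the value back to confirm the kernel accepted it.
    print(hex(s.getsockopt(socket.IPPROTO_IP, socket.IP_TOS)))
    s.close()
```

Any service marked this way competes for the high-priority band with everything else using the same mark, which is exactly where the starvation concern applies.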

[...]

Regards,
Ulrich





