[ClusterLabs] EL6, cman, rrp, unicast and iptables

Noel Kuntze noel at familie-kuntze.de
Sat Sep 12 14:15:31 EDT 2015


Hello Digimer,

> I am a strong believer in "keep it as simple as possible". In a case
> like this, it's hard to argue that any option is simple, but given that
> RRP is baked into the HA stack, I decided to trust it over QoS. I am
> perfectly open to contrary ideas though. Can you help me understand why
> you think tc's additional complexity is worth it? I'm willing to fully
> believe it is, but I want to understand the pros and cons first.

Whether you want it or not, the kernel always has a queuing discipline on every network device.
On CentOS 7, it's pfifo_fast[1][2], a classless FIFO queue with 3 bands (0, 1, 2).
Bands with lower numbers have priority over bands with higher numbers: as long as there are packets in band 0, band 1 won't be serviced, and so on.
The band a packet is put into depends on the TOS/DSCP mark on the packet (TOS and DSCP use the same field in the IP header;
they're just different standards for the values in it).
Applications that don't set a specific value for that field on the network socket they use send with the default of 0 (SSH, for example, can set one using the IPQoS[3] option).

So by influencing the TOS/DSCP field value with iptables, we can influence which packets get sent out when.
The goal here is to mark Corosync totem traffic (UDP ports 5404 and 5405) with the correct
TOS/DSCP value, so it ends up in band 0 and all the other stuff in band 2. This leaves you the option to
put dlm traffic into band 1.

The priority would thus be: totem > dlm > migration

As pfifo_fast maps priorities to bands based on the priomap,
we need to look at it to figure out which TOS/DSCP value must be set.
The second table on the linked LART article[2] gives you the priorities and what bands they're
mapped to. It tells us that TOS values from 0x10 to 0x16 are mapped to band 0. So we need to set Corosync
traffic to that TOS value. DLM must be set to a value that maps to band 1 and so on.
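
As a rough sketch of that lookup (my own illustration, not something taken from the kernel sources): the TOS2PRIO list below is my transcription of the "Linux priority" column of the LARTC table[2], PRIOMAP is copied from the `tc qdisc` output further down in this mail, and the function names nth and tos_to_band are made up for the example.

```shell
#!/bin/sh
# Sketch: which pfifo_fast band does a given TOS value end up in?

# Assumption: Linux priority per TOS index ((tos & 0x1e) >> 1),
# transcribed from the LARTC table [2].
TOS2PRIO="0 1 0 0 2 2 2 2 6 6 6 6 4 4 4 4"
# priomap as printed by `tc qdisc` on the CentOS 7 box in this mail.
PRIOMAP="1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1"

# nth LIST INDEX: print the INDEX-th (0-based) word of LIST.
nth() {
    list=$1
    idx=$2
    set -- $list          # split the list into positional parameters
    shift "$idx"
    printf '%s\n' "$1"
}

# tos_to_band TOS: print the band (0, 1 or 2) for a TOS value such as 0x10.
tos_to_band() {
    idx=$(( ($1 & 0x1e) >> 1 ))      # the four TOS bits select the table row
    prio=$(nth "$TOS2PRIO" "$idx")   # TOS -> Linux priority
    nth "$PRIOMAP" "$prio"           # priority -> band via the priomap
}

for tos in 0x10 0x18 0x8 0x0; do
    echo "TOS $tos -> band $(tos_to_band "$tos")"
done
```

If the priomap on your devices differs from the one shown here, only the PRIOMAP line needs to change.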

AFAIK, the TOS target in iptables is a non-terminating target, so packets that matched the rule continue
to traverse through the chain. We don't want that here, because it will screw with our prioritization order.
So we ACCEPT traffic that we TOS'ed.

Example iptables rule:

iptables -t mangle -A OUTPUT -p udp -m multiport --dports 5404,5405 -j TOS --set-tos 0x10
iptables -t mangle -A OUTPUT -p udp -m multiport --dports 5404,5405 -j ACCEPT
iptables -t mangle -A OUTPUT -p sctp -j TOS --set-tos 0x18
iptables -t mangle -A OUTPUT -p sctp -j ACCEPT
iptables -t mangle -A OUTPUT -j TOS --set-tos 0x8
(The default policy of *mangle OUTPUT is ACCEPT, so there's no need for an additional rule at the end to accept the rest)
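
Since every TOS rule needs an ACCEPT with exactly the same match, it can help to generate each pair from one place. The following is only a dry-run sketch under that assumption: mark_rules is a hypothetical helper name, and it prints the commands instead of running them, so it needs no root and changes nothing.

```shell
#!/bin/sh
# Dry-run sketch: print (do not execute) the mangle/OUTPUT rules for one
# traffic class. mark_rules is a hypothetical helper, not an iptables feature.
# Usage: mark_rules PROTO TOS [PORTS]
mark_rules() {
    proto=$1
    tos=$2
    ports=${3:-}
    match="-p $proto"
    if [ -n "$ports" ]; then
        match="$match -m multiport --dports $ports"
    fi
    # TOS is non-terminating, so an ACCEPT with the same match follows it.
    echo "iptables -t mangle -A OUTPUT $match -j TOS --set-tos $tos"
    echo "iptables -t mangle -A OUTPUT $match -j ACCEPT"
}

mark_rules udp 0x10 5404,5405                              # Corosync totem -> band 0
mark_rules sctp 0x18                                       # DLM -> band 1
echo "iptables -t mangle -A OUTPUT -j TOS --set-tos 0x8"   # everything else -> band 2
```

Review the printed commands first; piping the output to `sh` as root would then apply them.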


[root@c7-arch-mirror-1 ~]# cat /etc/redhat-release
CentOS Linux release 7.1.1503 (Core)
[root@c7-arch-mirror-1 ~]# tc qdisc
qdisc pfifo_fast 0: dev eth0 root refcnt 2 bands 3 priomap  1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
qdisc pfifo_fast 0: dev eth1 root refcnt 2 bands 3 priomap  1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
qdisc pfifo_fast 0: dev eth2 root refcnt 2 bands 3 priomap  1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1

[2] http://lartc.org/howto/lartc.qdisc.classless.html#AEN658
[3] `man ssh_config`, search for IPQoS

-- 

Mit freundlichen Grüßen/Kind Regards,
Noel Kuntze

GPG Key ID: 0x63EC6658
Fingerprint: 23CA BB60 2146 05E7 7278 6592 3839 298F 63EC 6658
