[ClusterLabs] IPaddr2 works for 12 seconds then stops
Daniel Ragle
daniel at Biblestuph.com
Thu Oct 11 13:25:52 EDT 2018
I'm adding a VIP to my active/active two-node cluster using IPaddr2.
These are on updated CentOS 7.5 machines.
When I bring up the IP, I'm able to ping it from an external machine for
about 12 seconds and then I get no further responses. This happens each
time I restart the VIP clone.
I can bring up the IP as a static alias IP on either of the two servers
and it works fine; i.e., I can then ping it from my external server
continuously. It's only when I try to cluster the IP that I have the issue.
During the 12-second window in which it *does* work, it appears to work
on only one of the two servers (and always the same one). Pings run
continuously for those 12 seconds and then stop, while attempts to hit
the Web server are hit or miss depending on my source port (I'm using
sourceip-sourceport); i.e., it is as if anything that would be handled
by the other server isn't making it through. After the 12 seconds,
neither server responds to requests against the VIP (though both
respond fine on their own static IPs at all times).
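To illustrate why the result would depend on the source port: with sourceip-sourceport, each node hashes the source address and port and only answers when the result matches its own node number. A rough Python sketch of that idea follows. It assumes a stand-in hash (CRC32, not the hash the kernel module actually uses), and the helper name `responsible_node` and the client address 203.0.113.10 are made up for illustration.

```python
import zlib

def responsible_node(src_ip, src_port, total_nodes=2, hash_init=0):
    """Pick the node (1-based) that would answer a given source ip/port.

    CRC32 is only a stand-in here; the real CLUSTERIP hash is different,
    so the actual node chosen for a given port will not match this.
    """
    key = f"{src_ip}:{src_port}".encode()
    return zlib.crc32(key, hash_init) % total_nodes + 1

# Same client, different source ports: some land on node 1, some on node 2.
for port in range(40000, 40008):
    print(port, "-> node", responsible_node("203.0.113.10", port))
```

If only one node is actually answering, connections whose source port hashes to the other node would go unanswered, which would match the hit-or-miss behavior described above.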
During the 12 seconds that it works I get these in the logs of the
server that *is* responding:
Oct 11 12:17:43 node2 kernel: ipt_CLUSTERIP: unknown protocol 1
Oct 11 12:17:44 node2 kernel: ipt_CLUSTERIP: unknown protocol 1
Oct 11 12:17:45 node2 kernel: ipt_CLUSTERIP: unknown protocol 1
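For context, the number in these messages is an IP protocol number, and protocol 1 is ICMP, i.e. the pings themselves. A minimal Python sketch of that lookup, using only the `IPPROTO_*` constants from the standard `socket` module (the helper name `proto_name` is made up for illustration):

```python
import socket

def proto_name(number):
    """Return the socket-module constant name for an IP protocol number."""
    for name in ("IPPROTO_ICMP", "IPPROTO_TCP", "IPPROTO_UDP"):
        if getattr(socket, name) == number:
            return name
    return "unknown"

print(proto_name(1))
```

ICMP packets carry no source port, which would fit with a port-based hash mode complaining about them, though that is an assumption and not something confirmed from the kernel source here.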
Looking at the CLUSTERIP rules created, they *seem* to be ok to me (I
also tried shutting down the cluster and setting up the rules/adding the
IP "by hand" with the same results):
[root at node2 ~]# iptables -L -n | grep -i cluster
CLUSTERIP all -- 0.0.0.0/0 192.168.120.101 CLUSTERIP hashmode=sourceip-sourceport clustermac=01:00:5E:5C:4B:8A total_nodes=2 local_node=2 hash_init=0
[root at colovs2 ~]# cat /proc/net/ipt_CLUSTERIP/192.168.120.101
2
[root at node1 ~]# iptables -L -n | grep -i cluster
CLUSTERIP all -- 0.0.0.0/0 192.168.120.101 CLUSTERIP hashmode=sourceip-sourceport clustermac=01:00:5E:5C:4B:8A total_nodes=2 local_node=1 hash_init=0
[root at node1 ~]# cat /proc/net/ipt_CLUSTERIP/192.168.120.101
1
The above was with a MAC address that I forced into my VIP setup (see
below), but I also tried with no MAC address provided (using the IPaddr2
default), with the same result.
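As a sanity check on the forced MAC: a link-layer multicast address has the least significant bit of its first octet set. A small Python sketch of that check (the helper name `is_multicast_mac` is hypothetical, and it assumes colon-separated notation):

```python
def is_multicast_mac(mac):
    """True if the MAC's first octet has its lowest bit set (group address)."""
    first_octet = int(mac.split(":")[0], 16)
    return bool(first_octet & 1)

print(is_multicast_mac("01:00:5E:5C:4B:8A"))  # the clustermac from the rules
print(is_multicast_mac("14:18:77:32:e3:d4"))  # bond0's own hardware address
```

By this test the clustermac is a multicast address and the bond0 hardware address is an ordinary unicast one.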
The logs just seem to note the initialization with (apparently) nothing
else interesting:
[root at node1 corosync]# cat /var/log/messages | grep -i ipaddr2
Oct 11 12:44:31 node1 IPaddr2(VIP:0)[105006]: INFO: Adding inet address 192.168.120.101/24 with broadcast address 192.168.120.255 to device bond0
Oct 11 12:44:31 node1 IPaddr2(VIP:0)[105006]: INFO: Bringing device bond0 up
Oct 11 12:44:31 node1 IPaddr2(VIP:0)[105006]: INFO: /usr/libexec/heartbeat/send_arp -i 200 -c 5 -p /var/run/resource-agents/send_arp-192.168.120.101 -I bond0 -m 01005e5c4b8a 192.168.120.101
And then:
[root at node1 corosync]# ip addr show bond0
10: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 14:18:77:32:e3:d4 brd ff:ff:ff:ff:ff:ff
    inet 192.168.120.80/24 brd 192.168.120.255 scope global bond0
       valid_lft forever preferred_lft forever
    inet 192.168.120.101/24 brd 192.168.120.255 scope global secondary bond0
       valid_lft forever preferred_lft forever
    inet6 fe80::1618:77ff:fe32:e3d4/64 scope link
       valid_lft forever preferred_lft forever
Finally:
[root at node1 corosync]# pcs resource show VIP-clone
Clone: VIP-clone
  Meta Attrs: clone-max=2 clone-node-max=2 globally-unique=true interleave=true
  Resource: VIP (class=ocf provider=heartbeat type=IPaddr2)
    Attributes: cidr_netmask=24 ip=192.168.120.101 nic=bond0 clusterip_hash=sourceip-sourceport mac=01:00:5e:5c:4b:8a
    Meta Attrs: resource-stickiness=0
    Utilization: weight=100
    Operations: monitor interval=10 timeout=20 (VIP-monitor-interval-10)
                start interval=0s timeout=20s (VIP-start-interval-0s)
                stop interval=0s timeout=20s (VIP-stop-interval-0s)
Don't know where to look next. Any ideas?