[ClusterLabs] What's wrong with IPsrcaddr?

Reid Wahl nwahl at redhat.com
Fri Mar 18 05:07:30 EDT 2022


On Thu, Mar 17, 2022 at 9:20 AM ZZ Wave <zzwave at gmail.com> wrote:
>
> Thank you for the idea about a bug in the resource script.
>
> ...
> NETWORK=`$IP2UTIL route list dev $INTERFACE scope link $PROTO match $ipaddress|grep -m 1 -o '^[^ ]*'`
> ...
>
> $NETWORK was surprisingly empty when the bug occurred; something was wrong with the $PROTO variable. The command above returns the correct route without it, so I've removed it. Now it works like a charm. Maybe it's something Debian 10-specific.
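>
> To illustrate, a stripped-down version of that lookup (the interface
> name and the $PROTO value here are just examples from my setup):
>
> IP2UTIL=ip
> INTERFACE=ens18                # example interface name
> ipaddress=192.168.80.23
> PROTO="proto kernel"           # example value; with this set, no match
> NETWORK=`$IP2UTIL route list dev $INTERFACE scope link $PROTO match $ipaddress|grep -m 1 -o '^[^ ]*'`
> # with the proto filter dropped, the connected network is found:
> NETWORK=`$IP2UTIL route list dev $INTERFACE scope link match $ipaddress|grep -m 1 -o '^[^ ]*'`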

There have been some recent fixes upstream.
  - https://github.com/ClusterLabs/resource-agents/commit/50a596bf

>
> On Thu, 17 Mar 2022 at 17:46, Andrei Borzenkov <arvidjaar at gmail.com> wrote:
>>
>> On 17.03.2022 14:14, ZZ Wave wrote:
>> >> Define "network connectivity to node2".
>> >
>> > pacemaker instances can reach each other, I think.
>>
>> This is called split brain; the only way to resolve it is fencing.
>>
>> > In case of connectivity
>> > loss (turning off a network interface manually, disconnecting the eth
>> > cable, etc.), it should turn off virtsrc and then virtip on the active
>> > node, then turn on virtip and then virtsrc on the second node, and vice
>> > versa. IPaddr2 alone works fine this way out of the box, but IPsrcaddr
>> > doesn't :(
>> >
>>
>> According to the scarce logs you provided, the stop request for the
>> IPsrcaddr resource failed, which is fatal. You do not use fencing, so
>> pacemaker blocks any further change of the resource's state.
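>>
>> Once the underlying problem is fixed, you also have to clear the
>> failed stop before pacemaker will manage the resource again;
>> something like (resource name from your setup):
>>
>> pcs resource cleanup virtsrc
>>
>> And in the long run, configure a working stonith device and set
>> stonith-enabled=true instead of disabling fencing.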
>>
>> I cannot say whether this is a resource agent bug or whether the agent
>> legitimately cannot perform the stop action. Personally, I would argue
>> that if the corresponding routing entry is not present, the resource is
>> stopped, so failing the stop request because no route entry was found
>> sounds like a bug.
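>>
>> In other words, I would expect the agent's stop action to begin with
>> a guard roughly like this (hypothetical sketch, not the actual agent
>> code):
>>
>> # if no route carries our source address, we are already stopped
>> if ! ip route show | grep -q "src $OCF_RESKEY_ipaddress"; then
>>     exit $OCF_SUCCESS
>> fi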
>>
>> > Is my setup correct for this anyway?
>>
>> You need to define "this". Your definition of "network connectivity"
>> ("pacemaker instances can reach each other") does not match what you
>> describe later. Most likely you want failover when the current node
>> loses some *external* connectivity, as sketched below.
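>>
>> The usual way to express that is an ocf:pacemaker:ping clone that
>> monitors some external address (your gateway, say) plus a location
>> rule; roughly like this (exact pcs syntax varies between versions):
>>
>> pcs resource create ping ocf:pacemaker:ping host_list=192.168.80.1 \
>>     op monitor interval=10s clone
>> pcs constraint location virtip rule score=-INFINITY pingd lt 1 or not_defined pingd
>>
>> With that, virtip (and whatever is colocated with it) moves away from
>> a node that cannot reach the gateway.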
>>
>> > Howtos and Google give me only "just
>> > add both resources to a group or to colocation+order and that's all", but it
>> > definitely doesn't work the way I expect.
>> >
>>
>> So your expectations are wrong. You need to define more precisely what
>> network connectivity means in your case and how you check for it.
>>
>> >> What are static IPs?
>> >
>> > node1 192.168.80.21/24
>> > node2 192.168.80.22/24
>> > floating 192.168.80.23/24
>> > gw 192.168.80.1
>> >
>>
>> I did not ask for IP addresses. I asked you to explain what
>> "static IP" means to you and how it differs from "floating IP".
>>
>> >> I do not see anything wrong here.
>> >
>> > Let me explain. After the initial setup, virtip and virtsrc apply
>> > successfully on node1. Both the .23 alias and the default route's src
>> > are present. After a network failure, there is NO default route at all
>> > on either node, and IPsrcaddr fails, as it requires a default route.
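>> > (The missing route is the ordinary default one, i.e. what
>> > "ip route add default via 192.168.80.1" would put back.)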
>> >
>>
>> I already explained above why IPsrcaddr was not migrated.
>>
>> >
>> > On Wed, 16 Mar 2022 at 19:23, Andrei Borzenkov <arvidjaar at gmail.com> wrote:
>> >
>> >> On 16.03.2022 12:24, ZZ Wave wrote:
>> >>> Hello. I'm trying to implement a floating IP with pacemaker, but I
>> >>> can't get IPsrcaddr to work correctly. I want the following: the
>> >>> floating IP and its route src are started on node1. If node1 loses
>> >>> network connectivity to node2, node1 should instantly remove the
>> >>> floating IP and restore the default route,
>> >>
>> >> Define "network connectivity to node2".
>> >>
>> >>> and node2 brings these things up, and vice versa when node1 returns.
>> >>> Static IPs should remain intact in any case.
>> >>>
>> >>
>> >> What are static IPs?
>> >>
>> >>> What I've done:
>> >>>
>> >>> pcs host auth node1 node2
>> >>> pcs cluster setup my_cluster node1 node2 --force
>> >>> pcs cluster enable node1 node2
>> >>> pcs cluster start node1 node2
>> >>> pcs property set stonith-enabled=false
>> >>> pcs property set no-quorum-policy=ignore
>> >>> pcs resource create virtip ocf:heartbeat:IPaddr2 ip=192.168.80.23
>> >>> cidr_netmask=24 op monitor interval=30s
>> >>> pcs resource create virtsrc ocf:heartbeat:IPsrcaddr
>> >>> ipaddress=192.168.80.23 cidr_netmask=24 op monitor interval=30
>> >>> pcs constraint colocation add virtip with virtsrc
>> >>> pcs constraint order virtip then virtsrc
>> >>>
>> >>> It sets the IP and src correctly on node1 once after this setup, but
>> >>> in case of failover to node2, havoc occurs -

Your colocation constraint should be "virtsrc with virtip", not
"virtip with virtsrc". virtsrc depends on virtip, not vice-versa.

It would be easier to put the resources in a group (with virtip first
and virtsrc second) instead of using constraints.
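
For example (the group name here is arbitrary; remove the existing
colocation and order constraints first, since the group implies both):

pcs resource group add virtgroup virtip virtsrc

Members of a group are implicitly colocated and started in the listed
order, so virtip starts before virtsrc and they always run together.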

>> >>
>> >> Havoc is not a useful technical description. Explain what is wrong.
>> >>
>> >>> https://pastebin.com/GZMtG480
>> >>>
>> >>> What's wrong?
>> >>
>> >> You tell us. I do not see anything wrong here.
>> >>
>> >>> Help me please :)



-- 
Regards,

Reid Wahl (He/Him), RHCA
Senior Software Maintenance Engineer, Red Hat
CEE - Platform Support Delivery - ClusterHA


