[ClusterLabs] Two-Node Failover IP-Address and Gateway

brainheadz brainheadz@gmail.com
Mon Jan 22 14:09:09 EST 2018


Hello Andrei,

Yes, this fixes the issue. But is there a way to automate this process
without manual intervention?

Node1 fails.

Node2 takes over vip_bad and src_address (IPsrcaddr).

Node1 is back online.

vip_bad and src_address are moved back to Node1.

Node2 sets the correct default_gw and its own source address again
(configured via vip_bad_2 and vip_bad_2_location).
^- this only happens after I execute the cleanup manually

# crm resource cleanup default_gw_clone
Cleaning up default_gw:0 on fw-managed-01, removing fail-count-default_gw
Cleaning up default_gw:0 on fw-managed-02, removing fail-count-default_gw
Waiting for 2 replies from the CRMd.. OK

# crm status
Last updated: Mon Jan 22 19:43:22 2018          Last change: Mon Jan 22
19:43:17 2018 by hacluster via crmd on fw-managed-01
Stack: corosync
Current DC: fw-managed-01 (version 1.1.14-70404b0) - partition with quorum
2 nodes and 6 resources configured

Online: [ fw-managed-01 fw-managed-02 ]

Full list of resources:

 vip_managed    (ocf::heartbeat:IPaddr2):       Started fw-managed-01
 vip_bad        (ocf::heartbeat:IPaddr2):       Started fw-managed-01
 Clone Set: default_gw_clone [default_gw]
     Started: [ fw-managed-01 fw-managed-02 ]
 src_address    (ocf::heartbeat:IPsrcaddr):     Started fw-managed-01
 vip_bad_2      (ocf::heartbeat:IPaddr2):       Started fw-managed-02

Failed Actions:
* src_address_monitor_0 on fw-managed-02 'unknown error' (1): call=18,
status=complete, exitreason='[/usr/lib/heartbeat/findif -C] failed',
    last-rc-change='Fri Jan 19 17:10:43 2018', queued=0ms, exec=75ms
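
The leftover src_address probe failure above looks like a stale entry from
the very first probe on fw-managed-02 back on Jan 19 (presumably before
anything was configured on the bad interface, so findif had no matching
route to inspect); I assume a "crm resource cleanup src_address" would
clear it. I am also wondering whether an explicit colocation, so that
src_address can only ever run where vip_bad is, would be cleaner - the
constraint name below is just made up:

# crm configure colocation src_address_with_vip inf: src_address vip_bad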

root@fw-managed-02:~# ip r
default via 100.200.123.161 dev bad
100.200.123.160/29 dev bad  proto kernel  scope link  src 100.200.123.165
172.18.0.0/16 dev tun0  proto kernel  scope link  src 172.18.0.1
172.30.40.0/24 dev managed  proto kernel  scope link  src 172.30.40.252
root@fw-managed-02:~# ping 8.8.8.8
PING 8.8.8.8 (8.8.8.8) 56(84) bytes of data.
64 bytes from 8.8.8.8: icmp_seq=1 ttl=60 time=3.57 ms
^C
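
The only thing I have found so far that might really automate the cleanup
step - completely untested here, and the 60s / 2min values are just
placeholders - would be to let the fail count expire on its own via a
failure-timeout plus a periodic recheck, something like:

# crm resource meta default_gw set failure-timeout 60s
# crm configure property cluster-recheck-interval=2min

Would that be a sane way to avoid the manual intervention, or is there a
better approach?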

On Mon, Jan 22, 2018 at 7:29 PM, Andrei Borzenkov <arvidjaar@gmail.com>
wrote:

> On 22.01.2018 20:54, brainheadz wrote:
> > Hello,
> >
> > I've got 2 public IP's and 2 Hosts.
> >
> > Each IP is assigned to one host. The interfaces are not configured by the
> > system; I am using Pacemaker to do this.
> >
> > fw-managed-01: 100.200.123.166/29
> > fw-managed-02: 100.200.123.165/29
> >
> > gateway: 100.200.123.161
> >
> > I am trying to get some form of active/passive cluster. fw-managed-01 is
> > the active node. If it fails, fw-managed-02 has to take over the VIP and
> > change its IPsrcaddr. This works so far. But if fw-managed-01 comes back
> > online, the default gateway isn't set again on fw-managed-02.
> >
> > I'm quite new to this topic. The cluster would work that way, but the
> > passive node can never reach the internet because of the missing default
> > gateway.
> >
> > Can anyone explain what I am missing or doing wrong here?
> >
> > Output
> >
> > # crm configure show
> > node 1: fw-managed-01
> > node 2: fw-managed-02
> > primitive default_gw Route \
> >         op monitor interval=10s \
> >         params destination=default device=bad gateway=100.200.123.161
> > primitive src_address IPsrcaddr \
> >         op monitor interval=10s \
> >         params ipaddress=100.200.123.166
> > primitive vip_bad IPaddr2 \
> >         op monitor interval=10s \
> >         params nic=bad ip=100.200.123.166 cidr_netmask=29
> > primitive vip_bad_2 IPaddr2 \
> >         op monitor interval=10s \
> >         params nic=bad ip=100.200.123.165 cidr_netmask=29
> > primitive vip_managed IPaddr2 \
> >         op monitor interval=10s \
> >         params ip=172.30.40.254 cidr_netmask=24
> > clone default_gw_clone default_gw \
> >         meta clone-max=2 target-role=Started
> > location cli-prefer-default_gw default_gw_clone role=Started inf:
> > fw-managed-01
>
> As far as I can tell, this restricts the clone to one node only. Since it
> starts with cli-, it was created by something like "crm resource move" or
> similar. Try
>
> crm resource clear default_gw_clone
>
> > location src_address_location src_address inf: fw-managed-01
> > location vip_bad_2_location vip_bad_2 inf: fw-managed-02
> > location vip_bad_location vip_bad inf: fw-managed-01
> > order vip_before_default_gw inf: vip_bad:start src_address:start
> > symmetrical=true
> > location vip_managed_location vip_managed inf: fw-managed-01
> > property cib-bootstrap-options: \
> >         have-watchdog=false \
> >         dc-version=1.1.14-70404b0 \
> >         cluster-infrastructure=corosync \
> >         cluster-name=debian \
> >         stonith-enabled=false \
> >         no-quorum-policy=ignore \
> >         last-lrm-refresh=1516362207 \
> >         start-failure-is-fatal=false
> >
> > # crm status
> > Last updated: Mon Jan 22 18:47:12 2018          Last change: Fri Jan 19
> > 17:04:12 2018 by root via cibadmin on fw-managed-01
> > Stack: corosync
> > Current DC: fw-managed-01 (version 1.1.14-70404b0) - partition with
> quorum
> > 2 nodes and 6 resources configured
> >
> > Online: [ fw-managed-01 fw-managed-02 ]
> >
> > Full list of resources:
> >
> >  vip_managed    (ocf::heartbeat:IPaddr2):       Started fw-managed-01
> >  vip_bad        (ocf::heartbeat:IPaddr2):       Started fw-managed-01
> >  Clone Set: default_gw_clone [default_gw]
> >      default_gw (ocf::heartbeat:Route): FAILED fw-managed-02 (unmanaged)
> >      Started: [ fw-managed-01 ]
> >  src_address    (ocf::heartbeat:IPsrcaddr):     Started fw-managed-01
> >  vip_bad_2      (ocf::heartbeat:IPaddr2):       Started fw-managed-02
> >
> > Failed Actions:
> > * default_gw_stop_0 on fw-managed-02 'not installed' (5): call=26,
> > status=complete, exitreason='Gateway address 100.200.123.161 is
> > unreachable.',
> >     last-rc-change='Fri Jan 19 17:10:43 2018', queued=0ms, exec=31ms
> > * src_address_monitor_0 on fw-managed-02 'unknown error' (1): call=18,
> > status=complete, exitreason='[/usr/lib/heartbeat/findif -C] failed',
> >     last-rc-change='Fri Jan 19 17:10:43 2018', queued=0ms, exec=75ms
> >
> >
> > best regards,
> > Tobias
> >
>
>
> _______________________________________________
> Users mailing list: Users at clusterlabs.org
> http://lists.clusterlabs.org/mailman/listinfo/users
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org
>