<div dir="ltr"><div><div><div><div><div><div><div>Hello Andrei,<br><br></div>yes this fixes the issue. But is there a way to automate this process without a manual intervention?<br><br></div>Node1 fails.<br><br></div>Node2 takes over the vip_bad and ipsrcaddr.<br><br></div>Node1 is back online.<br><br></div>vip_bad and ipsrcaddr are moved back to Node1.<br><br></div>Node2 sets the correct default_gw and it's own source address again (configured via ip_bad_2 and vip_bad_2_location).<br></div>^- this happens if i execute the cleanup manually<br><div><div><div><div><div><div><div><div><div><div><br># crm resource cleanup default_gw_clone<br>Cleaning up default_gw:0 on fw-managed-01, removing fail-count-default_gw<br>Cleaning up default_gw:0 on fw-managed-02, removing fail-count-default_gw<br>Waiting for 2 replies from the CRMd.. OK<br><br># crm status<br>Last updated: Mon Jan 22 19:43:22 2018 Last change: Mon Jan 22 19:43:17 2018 by hacluster via crmd on fw-managed-01<br>Stack: corosync<br>Current DC: fw-managed-01 (version 1.1.14-70404b0) - partition with quorum<br>2 nodes and 6 resources configured<br><br>Online: [ fw-managed-01 fw-managed-02 ]<br><br>Full list of resources:<br><br> vip_managed (ocf::heartbeat:IPaddr2): Started fw-managed-01<br> vip_bad (ocf::heartbeat:IPaddr2): Started fw-managed-01<br> Clone Set: default_gw_clone [default_gw]<br> Started: [ fw-managed-01 fw-managed-02 ]<br> src_address (ocf::heartbeat:IPsrcaddr): Started fw-managed-01<br> vip_bad_2 (ocf::heartbeat:IPaddr2): Started fw-managed-02<br><br>Failed Actions:<br>* src_address_monitor_0 on fw-managed-02 'unknown error' (1): call=18, status=complete, exitreason='[/usr/lib/heartbeat/findif -C] failed',<br> last-rc-change='Fri Jan 19 17:10:43 2018', queued=0ms, exec=75ms<br><br>root@fw-managed-02:~# ip r<br>default via 100.200.123.161 dev bad<br><a href="http://100.200.123.160/29">100.200.123.160/29</a> dev bad proto kernel scope link src 100.200.123.165<br><a href="http://172.18.0.0/16">172.18.0.0/16</a> dev tun0 proto kernel scope link src 172.18.0.1<br><a href="http://172.30.40.0/24">172.30.40.0/24</a> dev managed proto kernel scope link src 172.30.40.252<br>root@fw-managed-02:~# ping 8.8.8.8<br>PING 8.8.8.8 (8.8.8.8) 56(84) bytes of data.<br>64 bytes from <a href="http://8.8.8.8">8.8.8.8</a>: icmp_seq=1 ttl=60 time=3.57 ms<br>^C<br></div></div></div></div></div></div></div></div></div></div></div><div class="gmail_extra"><br><div class="gmail_quote">On Mon, Jan 22, 2018 at 7:29 PM, Andrei Borzenkov <span dir="ltr"><<a href="mailto:arvidjaar@gmail.com" target="_blank">arvidjaar@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">22.01.2018 20:54, brainheadz пишет:<br>
<div><div class="h5">> Hello,<br>
><br>
> I've got 2 public IP's and 2 Hosts.<br>
><br>
> Each IP is assigned to one host. The interfaces are not configured by the<br>
> system, I am using pacemaker to do this.<br>
><br>
> fw-managed-01: <a href="http://100.200.123.166/29" rel="noreferrer" target="_blank">100.200.123.166/29</a><br>
> fw-managed-02: <a href="http://100.200.123.165/29" rel="noreferrer" target="_blank">100.200.123.165/29</a><br>
><br>
> gateway: 100.200.123.161<br>
><br>
> I am trying to get some form of active/passive cluster. fw-managed-01 is<br>
> the active node. If it fails, fw-managed-02 has to take over the VIP and<br>
> change it's IPsrcaddr. This works so far. But if fw-managed-01 comes back<br>
> online, the default Gateway isn't set again on the node fw-managed-02.<br>
><br>
> I'm quite new to this topic. The Cluster would work that way, but the<br>
> passive Node can never reach the internet cause of the missing default<br>
> gateway.<br>
><br>
> Can anyone explain to what I am missing or doing wrong here?<br>
>
> Output
>
> # crm configure show
> node 1: fw-managed-01
> node 2: fw-managed-02
> primitive default_gw Route \
>         op monitor interval=10s \
>         params destination=default device=bad gateway=100.200.123.161
> primitive src_address IPsrcaddr \
>         op monitor interval=10s \
>         params ipaddress=100.200.123.166
> primitive vip_bad IPaddr2 \
>         op monitor interval=10s \
>         params nic=bad ip=100.200.123.166 cidr_netmask=29
> primitive vip_bad_2 IPaddr2 \
>         op monitor interval=10s \
>         params nic=bad ip=100.200.123.165 cidr_netmask=29
> primitive vip_managed IPaddr2 \
>         op monitor interval=10s \
>         params ip=172.30.40.254 cidr_netmask=24
> clone default_gw_clone default_gw \
>         meta clone-max=2 target-role=Started
> location cli-prefer-default_gw default_gw_clone role=Started inf: fw-managed-01

As far as I can tell this restricts the clone to one node only. Since its
id starts with cli-, it was created by something like "crm resource move"
or similar. Try

crm resource clear default_gw_clone
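
If your crmsh does not have "clear", the following should be equivalent
(a sketch, untested here; older crmsh releases spell the command "unmove",
and the constraint id is the one shown in your configuration above):

# crm resource unmove default_gw_clone

or delete the leftover constraint by its id directly:

# crm configure delete cli-prefer-default_gw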
<div><div class="h5"><br>
> location src_address_location src_address inf: fw-managed-01<br>
> location vip_bad_2_location vip_bad_2 inf: fw-managed-02<br>
> location vip_bad_location vip_bad inf: fw-managed-01<br>
> order vip_before_default_gw inf: vip_bad:start src_address:start<br>
> symmetrical=true<br>
> location vip_managed_location vip_managed inf: fw-managed-01<br>
> property cib-bootstrap-options: \<br>
> have-watchdog=false \<br>
> dc-version=1.1.14-70404b0 \<br>
> cluster-infrastructure=<wbr>corosync \<br>
> cluster-name=debian \<br>
> stonith-enabled=false \<br>
> no-quorum-policy=ignore \<br>
> last-lrm-refresh=1516362207 \<br>
> start-failure-is-fatal=false<br>
><br>
> # crm status<br>
> Last updated: Mon Jan 22 18:47:12 2018 Last change: Fri Jan 19<br>
> 17:04:12 2018 by root via cibadmin on fw-managed-01<br>
> Stack: corosync<br>
> Current DC: fw-managed-01 (version 1.1.14-70404b0) - partition with quorum<br>
> 2 nodes and 6 resources configured<br>
><br>
> Online: [ fw-managed-01 fw-managed-02 ]<br>
><br>
> Full list of resources:<br>
><br>
> vip_managed (ocf::heartbeat:IPaddr2): Started fw-managed-01<br>
> vip_bad (ocf::heartbeat:IPaddr2): Started fw-managed-01<br>
> Clone Set: default_gw_clone [default_gw]<br>
> default_gw (ocf::heartbeat:Route): FAILED fw-managed-02 (unmanaged)<br>
> Started: [ fw-managed-01 ]<br>
> src_address (ocf::heartbeat:IPsrcaddr): Started fw-managed-01<br>
> vip_bad_2 (ocf::heartbeat:IPaddr2): Started fw-managed-02<br>
><br>
> Failed Actions:<br>
> * default_gw_stop_0 on fw-managed-02 'not installed' (5): call=26,<br>
> status=complete, exitreason='Gateway address 100.200.123.161 is<br>
> unreachable.',<br>
> last-rc-change='Fri Jan 19 17:10:43 2018', queued=0ms, exec=31ms<br>
> * src_address_monitor_0 on fw-managed-02 'unknown error' (1): call=18,<br>
> status=complete, exitreason='[/usr/lib/<wbr>heartbeat/findif -C] failed',<br>
> last-rc-change='Fri Jan 19 17:10:43 2018', queued=0ms, exec=75ms<br>
><br>
><br>
> best regards,<br>
> Tobias<br>
><br>
><br>
><br>
</div></div>> ______________________________<wbr>_________________<br>
> Users mailing list: <a href="mailto:Users@clusterlabs.org">Users@clusterlabs.org</a><br>
> <a href="http://lists.clusterlabs.org/mailman/listinfo/users" rel="noreferrer" target="_blank">http://lists.clusterlabs.org/<wbr>mailman/listinfo/users</a><br>
><br>
> Project Home: <a href="http://www.clusterlabs.org" rel="noreferrer" target="_blank">http://www.clusterlabs.org</a><br>
> Getting started: <a href="http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf" rel="noreferrer" target="_blank">http://www.clusterlabs.org/<wbr>doc/Cluster_from_Scratch.pdf</a><br>
> Bugs: <a href="http://bugs.clusterlabs.org" rel="noreferrer" target="_blank">http://bugs.clusterlabs.org</a><br>
><br>
<br>
<br>
_______________________________________________
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
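
P.S. What I have in mind for the automation, roughly (just a sketch, not
tested yet, and I am not sure these are the right knobs): let the
fail-count on default_gw expire on its own via the failure-timeout meta
attribute, so the cluster retries the Route resource on the passive node
after the failback without a manual "crm resource cleanup". Something like
this in the configuration:

primitive default_gw Route \
        op monitor interval=10s \
        params destination=default device=bad gateway=100.200.123.161 \
        meta failure-timeout=60s

together with

# crm configure property cluster-recheck-interval=60s

so that expired failures are actually re-evaluated. Would something along
these lines be the intended way to do this?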