[ClusterLabs] Virtual ip resource restarted on node with down network device
Lars Ellenberg
lars.ellenberg at linbit.com
Mon Sep 19 10:20:05 EDT 2016
On Mon, Sep 19, 2016 at 02:57:57PM +0200, Jan Pokorný wrote:
> On 19/09/16 09:15 +0000, Auer, Jens wrote:
> > After the restart ifconfig still shows the device bond0 to be not RUNNING:
> > MDA1PFP-S01 09:07:54 2127 0 ~ # ifconfig
> > bond0: flags=5123<UP,BROADCAST,MASTER,MULTICAST> mtu 1500
> > inet 192.168.120.20 netmask 255.255.255.255 broadcast 0.0.0.0
> > ether a6:17:2c:2a:72:fc txqueuelen 30000 (Ethernet)
> > RX packets 2034 bytes 286728 (280.0 KiB)
> > RX errors 0 dropped 29 overruns 0 frame 0
> > TX packets 2284 bytes 355975 (347.6 KiB)
> > TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
>
> This seems to suggest bond0 interface is up and address-assigned
> (well, the netmask is strange). So there would be nothing
> contradictory to what I said on the address of IPaddr2.
>
> Anyway, you should rather use the "ip" command from the iproute suite
> than the various if* tools, which fall short in some cases:
> http://inai.de/2008/02/19
> This would also be consistent with what IPaddr2 uses under the hood.
The resource agent only controls and checks
the presence of a certain IP on a certain NIC
(and some parameters).
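To make that scope concrete: a check of this kind boils down to asking whether the address is listed on the interface, and nothing more. The following is a rough sketch, not the agent's actual code. The helper name `has_ip` is made up for illustration, and it assumes the one-line output format of `ip -o -4 addr show dev <nic>`.

```shell
#!/bin/sh
# Hypothetical helper, not taken from IPaddr2.
# Reads "ip -o -4 addr show dev NIC" output on stdin and succeeds
# if the given IPv4 address is listed there.
has_ip() {
    awk -v want="$1" '
        # field 3 is the address family, field 4 is ADDRESS/PREFIX
        $3 == "inet" { split($4, a, "/"); if (a[1] == want) found = 1 }
        END { exit(found ? 0 : 1) }
    '
}

# Usage on a live system:
#   ip -o -4 addr show dev bond0 | has_ip 192.168.120.20 && echo present
```

Note that a check like this succeeds as soon as the address is configured, whether or not the interface can actually pass traffic.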
What you likely ended up with after the "restart"
is an "empty" bonding device with that IP assigned,
but without any "slave" devices, or at least
with the slave devices still set to link down.
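One way to tell such an "empty" bond apart from a healthy one is to look at the bonding entries in sysfs. This is only a sketch: the function name `bond_state` and its three result strings are invented here, and the optional second argument exists only so the function can be pointed at a different directory tree.

```shell
#!/bin/sh
# Hypothetical helper: classify a bonding device by its sysfs entries.
# Usage: bond_state IFACE [SYSFS_ROOT]
bond_state() {
    iface=$1
    root=${2:-/sys/class/net}

    # Space-separated list of enslaved devices; empty for an "empty" bond.
    slaves=$(cat "$root/$iface/bonding/slaves" 2>/dev/null)
    # 1 if the device has carrier; reading it may fail while the device is down.
    carrier=$(cat "$root/$iface/carrier" 2>/dev/null)

    if [ -z "$slaves" ]; then
        echo "no-slaves"
    elif [ "$carrier" = "1" ]; then
        echo "carrier-up"
    else
        echo "carrier-down"
    fi
}

# Usage on a live system:
#   bond_state bond0
# More detail per slave is available in /proc/net/bonding/bond0.
```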
If you really wanted the RA to also know about the slaves,
and be able to properly and fully configure a bond,
you'd have to enhance that resource agent.
If you want the IP to move to some other node
when its current node has connectivity problems, use a "ping" and/or
"ethmonitor" resource in addition to the IP.
If you wanted to test-drive cluster response against a
failing network device, your test was wrong.
If you wanted to test-drive cluster response against
a "fat fingered" (or even evil) operator or admin:
give up right there...
You'll never be able to cover it all :-)
--
: Lars Ellenberg
: LINBIT | Keeping the Digital World Running
: DRBD -- Heartbeat -- Corosync -- Pacemaker
: R&D, Integration, Ops, Consulting, Support
DRBD® and LINBIT® are registered trademarks of LINBIT