[Pacemaker] IPv6addr failure loopback interface
Lars Ellenberg
lars.ellenberg at linbit.com
Thu Nov 17 18:56:12 UTC 2011
On Mon, Oct 24, 2011 at 02:57:24PM +0200, Arturo Borrero Gonzalez wrote:
> Hi there,
>
> I'm working on deploying an Active/Active openldap cluster.
>
> At first, I have 2 nodes.
>
> I'm having some troubles with IPv6addr when trying to assing an IPv6 to the
> loopback interface.
>
> The error is not very explicit:
>
> IPv6addr: [1563]: ERROR: no valid mecahnisms //(yes, malformed word
> included)
And grepping for that malformed word in IPv6addr.c
would have been very easy.
Following the call path leads quickly to
scan_if(),
which has a comment that says:
/* Consider link-local addresses (scope == 0x20) only when
* the inerface name is provided, and global addresses
* (scope == 0). Skip everything else.
*/
where that 0x20 is the value that shows up in the fourth column of
/proc/net/if_inet6, not one of the scope values IPv6 itself defines.
Apparently IPv6addr will only ever try to manage an address that shares
the scope and prefix of some existing one.
I suspect that IPv6addr would even work,
as soon as you have manually assigned one fc00::/7 address to lo.
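To make the scope column concrete: each line of /proc/net/if_inet6 holds the address, interface index, prefix length, scope, flags and device name, with everything except the name in hex. The following is only a minimal sketch. The helper name list_scopes is made up for this example, and the sample line is what ::1 on lo typically looks like.

```shell
# Print "device scope" for every line in /proc/net/if_inet6 format on stdin.
# Columns: address, ifindex, prefix length, scope, flags, device name.
list_scopes() {
    awk '{ print $6, $4 }'
}

# Sample line (assumption: typical entry for ::1 on lo).
sample='00000000000000000000000000000001 01 80 10 80       lo'
printf '%s\n' "$sample" | list_scopes
```

On a live system this would be run as `list_scopes < /proc/net/if_inet6`. In that column, 00 means global, 10 host (loopback) and 20 link-local.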
The main reason for implementing it in C was to be able to
/* Send an unsolicited advertisement packet
* Please refer to rfc4861 / rfc3542
*/
Which, well, does not appear to be useful on lo, anyways.
So if your shell thingy works for you, why not.
Anyways, if you change scan_if() (or whatever else is necessary) to e.g. just
use the provided if name, and not do sanity checks on scope and prefix, that
should be enough, and that patch should be fairly small.
Though I fail to understand the use case:
why would I want to assign globally routable IPv6 addresses to lo?
> Adding an IPv6 addr to the loopback interface is possible with ifconfig, so
> maybe I should write a new IPv6addr RA that manage IPv6addr on loopback with
> ifconfig.
> Something like this:
>
> ifconfig lo add fc00::10/7
> ifconfig lo del fc00::10/7
>
> To monitor:
>
> ifconfig lo | grep fc00::10
> if [ $? -ne 0 ] then;
> don't have that ip on loopback
> else
> we have that ip on loopback
> fi
>
> What do you think?
On Fri, Oct 28, 2011 at 10:55:46PM +0200, Arturo Borrero Gonzalez wrote:
> In a previous mail, I reported some errors with IPv6addr assigning IPv6 to
> the loopback interface.
>
> I've developed a RA that is able to manage an IPv6 in the main loopback
> interface of most linux systems: "lo".
>
>
> I put here the code, but you can also found it here:
>
> http://pastebin.com/rsqz83V3
> http://ral-arturo.blogspot.com/2011/10/ipv6addrlo-asignando-ipv6-interfaz-de.html
>
> #!/bin/bash
> #
> # OCF Resource Agent compliant resource script.
> # Arturo Borrero <aborrero at cica.es> || October 2011
> #
> # Based on the anything RA.
> #
> # GPLv3 Licensed. You can read the license in
> # http://www.gnu.org/licenses/gpl-3.0.html
> #
> # Initialization:
>
> : ${OCF_FUNCTIONS_DIR=${OCF_ROOT}/resource.d/heartbeat}
> . ${OCF_FUNCTIONS_DIR}/.ocf-shellfuncs
>
> # Custom vars:
> IFCONFIG_BIN="/sbin/ifconfig"
> GREP_BIN="grep"
> IFACE="lo"
> process=$OCF_RESOURCE_INSTANCE
> ipv6addr=$OCF_RESKEY_ipv6addr
> cidr_netmask=$OCF_RESKEY_cidr_netmask
> pidfile=$OCF_RESKEY_pidfile ; [ -z "$pidfile" ] &&
> pidfile=${HA_VARRUN}IPv6addrLO_${process}.pid
> logfile=$OCF_RESKEY_logfile ; [ -z "$logfile" ] && logfile="/var/log/syslog"
> errlogfile=$OCF_RESKEY_errlogfile ; [ -z "$errlogfile" ] &&
> errlogfile="/var/log/syslog"
>
>
> validate_ipv6(){
> ocf_log debug "Validating IPv6 addr: [\"$1\"]."
>
> echo "$1" | $GREP_BIN -E
> "^\s*((([0-9A-Fa-f]{1,4}:){7}([0-9A-Fa-f]{1,4}|:))|(([0-9A-Fa-f]{1,4}:){6}(:[0-9A-Fa-f]{1,4}|((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9]?[0-9])(.(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9]?[0-9])){3})|:))|(([0-9A-Fa-f]{1,4}:){5}(((:[0-9A-Fa-f]{1,4}){1,2})|:((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9]?[0-9])(.(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9]?[0-9])){3})|:))|(([0-9A-Fa-f]{1,4}:){4}(((:[0-9A-Fa-f]{1,4}){1,3})|((:[0-9A-Fa-f]{1,4})?:((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9]?[0-9])(.(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9]?[0-9])){3}))|:))|(([0-9A-Fa-f]{1,4}:){3}(((:[0-9A-Fa-f]{1,4}){1,4})|((:[0-9A-Fa-f]{1,4}){0,2}:((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9]?[0-9])(.(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9]?[0-9])){3}))|:))|(([0-9A-Fa-f]{1,4}:){2}(((:[0-9A-Fa-f]{1,4}){1,5})|((:[0-9A-Fa-f]{1,4}){0,3}:((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9]?[0-9])(.(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9]?[0-9])){3}))|:))|(([0-9A-Fa-f]{1,4}:){1}(((:[0-9A-Fa-f]{1,4}){1,6})|((:[0-9A-Fa-f]{1,4}){0,4}:((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9]?[0-9])(.(25[0-5]
> |2[0-4][0-9]|1[0-9][0-9]|[1-9]?[0-9])){3}))|:))|(:(((:[0-9A-Fa-f]{1,4}){1,7})|((:[0-9A-Fa-f]{1,4}){0,5}:((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9]?[0-9])(.(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9]?[0-9])){3}))|:)))(%.+)?\s*$"
> > /dev/null
Are you serious ;-)
> if [ $? -eq 0 ]
> then
indentation seems a bit unusual?
maybe that's a mail client issue though.
> # the ipv6 is valid
> ocf_log debug "IPv6 addr: [\"$1\"] is valid."
> return 1
Logic inversion.
return != 0 is "false" in shell,
return 0 is "true" in shell.
Think exit codes.
I won't mention this on every occasion,
you have that throughout the script.
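A minimal sketch of the same check with the shell convention applied, so that "valid" returns 0 and can be used directly in an `if`. This is an illustration only, not the RA's code; the extra digits-only guard is an addition that the original function does not have.

```shell
# Return 0 (true) if $1 is an integer between 1 and 128, non-zero otherwise.
validate_cidr() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;   # empty or not purely digits
    esac
    [ "$1" -ge 1 ] && [ "$1" -le 128 ]
}

# The caller then reads naturally, without comparing $? to 1:
if validate_cidr 64; then
    echo "cidr 64 is valid"
fi
```

With this convention the function's exit status is the answer, and `if ! validate_cidr "$cidr_netmask"; then ...` covers the error path.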
> fi
> # the ipv6 is invalid
> ocf_log err "IPv6 addr: [\"$1\"] is not valid."
> return 0
> }
> validate_cidr(){
> ocf_log debug "Validating cidr: \"$1\"."
>
> if [ $1 -lt 129 ]
> then
> if [ $1 -gt 0 ]
> then
> # the cidr is valid
> ocf_log debug "Cidr: \"$1\" is valid."
> return 1
> fi
> fi
> ocf_log err "Cidr: \"$1\" is not valid."
> return 0
> }
>
> iface_has_ipv6()
> {
> ocf_log debug "Checking if iface \"$IFACE\" has the ipv6 [\"$ipv6addr\"]."
> if [ ! -z $1 ]
> then
> $IFCONFIG_BIN $IFACE | $GREP_BIN $1 2> /dev/null > /dev/null
Decide what you want to use here: $1 or $ipv6addr ...
I think you always want to use $ipv6addr.
That "store parameter in tempfile" thing is broken, IMO.
I'm not sure the grep can be done reliably at all.
At least anchor the pattern, or you may get false matches (in theory):
maybe " $ipv6addr/", or even add the mask to the pattern as well.
And you have to be pretty sure that the parameter is really specified
in the "canonical" form ifconfig prints, or you are out of luck.
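A sketch of what an anchored match could look like. It assumes the classic net-tools output format ("inet6 addr: ADDR/MASK Scope:..."); other ifconfig versions print the address differently, which is exactly the reliability problem. The helper name has_addr and the sample line are made up for this example.

```shell
# Return 0 if the listing on stdin contains exactly " ADDR/MASK ".
# $1 = address in the form ifconfig prints it, $2 = prefix length.
has_addr() {
    # -F: fixed string, so ':' and '.' are not treated as regex syntax.
    grep -qF -- " $1/$2 "
}

# Sample line (assumption: old-style ifconfig output).
sample='          inet6 addr: fc00::10/7 Scope:Global'

if printf '%s\n' "$sample" | has_addr fc00::10 7; then
    echo "fc00::10/7 is present"
fi
```

An unanchored `grep fc00::1` would also match this line, although only fc00::10 is configured; the surrounding blanks and the "/MASK" suffix rule that out.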
> if [ $? -eq 0 ]
> then
> # the iface has the IPv6
> ocf_log info "The iface \"$IFACE\" has the ipv6 [\"$ipv6addr\"]."
> return 1
> fi
> ocf_log info "The iface \"$IFACE\" does not have the ipv6 [\"$ipv6addr\"]."
> fi
> return 0
> }
>
> IPv6addrLO_status() {
> # Will check that the system has the ipv6 saved in the pidfile
What for?
> if [ -r $pidfile ]
> then
> ocf_log debug "STATUS: The pidfile \"$pidfile\" exists."
>
> validate_ipv6 `cat $pidfile | awk -F'/' '{print $1}'`
> if [ $? -eq 1 ]
> then
> # the ipv6 stored in pidfile is valid, then check if the system has that ip
> iface_has_ipv6 `cat $pidfile`
> if [ $? -eq 1 ]
Argh.
If you'd get the "true/false" logic right for shell,
you would write
if iface_has_ipv6 ; then
# log
return $OCF_SUCCESS
else
# log
return $OCF_ERR_GENERIC
fi
No parameter to iface_has_ipv6, because, well, you are always interested in
$OCF_RESKEY_ipv6addr and nothing else, and that is in the environment anyways.
> then
> ocf_log info "The iface \"$IFACE\" has the IPv6 \"[`cat $pidfile`]\" stored
> in \"$pidfile\"."
> return $OCF_RUNNING
By accident, this would expand to nothing, and a plain return will return $?,
which in this case would always be 0, which is $OCF_SUCCESS anyways.
But that's the point: there is no $OCF_RUNNING, it's called $OCF_SUCCESS.
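A minimal sketch of a status function built on the defined OCF exit codes. The numeric fallbacks (0, 1, 7) are the standard OCF values and only apply when ocf-shellfuncs has not been sourced. The iface_has_ipv6 here is a hypothetical stand-in driven by a FAKE_HAS_ADDR variable so the sketch runs anywhere. Returning $OCF_NOT_RUNNING for a missing address is a choice made in this sketch.

```shell
# Fallbacks for running outside a cluster; ocf-shellfuncs normally sets these.
: "${OCF_SUCCESS:=0}" "${OCF_ERR_GENERIC:=1}" "${OCF_NOT_RUNNING:=7}"

# Hypothetical stand-in for the real interface check.
iface_has_ipv6() {
    [ "$FAKE_HAS_ADDR" = yes ]
}

IPv6addrLO_status() {
    if iface_has_ipv6; then
        return "$OCF_SUCCESS"
    else
        return "$OCF_NOT_RUNNING"
    fi
}

FAKE_HAS_ADDR=yes
IPv6addrLO_status && echo "address is configured"
```

Quoting the variable in `return "$OCF_SUCCESS"` also means a misspelled, unset name produces an error instead of silently turning into a bare `return`.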
> else
> ocf_log err "When checking status, the iface \"$IFACE\" has nor the IPv6 of
> the \"$pidfile\" nor \"$ipv6addr\"."
> return $OCF_ERR_GENERIC
> fi
> else
> ocf_log err "The ipv6addr in \"$pidfile\" is not valid: [\"`cat
> $pidfile`\"]."
You do not want to do that.
I see NO reason to place an address in a $pidfile (or any other tempfile).
What should that be good for?
Just always do iface_has_ipv6 $ipv6addr, and nothing else.
> return $OCF_ERR_GENERIC
> fi
> fi
> ocf_log debug "The pidfile \"$pidfile\" don't exists."
> return $OCF_NOT_RUNNING
> }
>
> IPv6addrLO_start() {
> if ! IPv6addrLO_status
> then
> # First, validate the input parameteres, ipv6addr and cidr_netmaks
> validate_ipv6 $ipv6addr
> if [ $? -ne 1 ]
> then
> ocf_log err "$process: The ipv6 addr: \"$ipv6addr\" is not a valid one."
> return $OCF_ERR_GENERIC
> fi
> validate_cidr $cidr_netmask
> if [ $? -ne 1 ]
> then
> ocf_log err "$process: The cidr netmask \"$cidr_netmask\" is not valid."
> return $OCF_ERR_GENERIC
> fi
>
>
> # Before assign the ip, check if we already have that ip
> # because maybe we had a sudden reboot and the ipv6 is still on lo.
> iface_has_ipv6 $ipv6addr
> if [ $? -eq 1 ]
> then
> # we have the IPv6addr on loopback
> ocf_log info "The iface \"$IFACE\" had the IPv6 addr
> [\"$ipv6addr/$cidr_netmask\"], don't assigning again."
> touch $pidfile
No, really. $pidfile is nonsense here.
> if [ $? -ne 0 ]
> then
> ocf_log war "Could not create the pidfile \"$pidfile\"."
> fi
> echo "$ipv6addr/$cidr_netmask" > $pidfile
> if [ $? -ne 0 ]
> then
> ocf_log err "Failed to manage the new pidfile for
> \"$ipv6addr/\$cidr_netmask\"."
> fi
> else
> # we don't have the IPv6addr on loopback
> ocf_log info "Starting $process"
>
> # Doing different depending on what logfile we have.
> if [ -n "$logfile" -a -n "$errlogfile" ]
> then
> # We have logfile and errlogfile, so redirect STDOUT und STDERR to different files
> $IFCONFIG_BIN $IFACE add $ipv6addr/$cidr_netmask >> $logfile 2>> $errlogfile
> else
> if [ -n "$logfile" ]
> then
> # We only have logfile so redirect STDOUT and STDERR to the same file
> $IFCONFIG_BIN $IFACE add $ipv6addr/$cidr_netmask >> $logfile 2>&1
There is ocf_run for this: it will run the command, and if it fails, the
output will show up where all the other cluster logs show up.
Please do not introduce a resource agent specific log file,
or at least clearly document in the meta data that this is
to assist in debugging the RA only.
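A sketch of the add step going through ocf_run instead of hand-made redirections. ocf_run comes from ocf-shellfuncs; the one-line fallback below is only a stand-in so the sketch runs outside a cluster, and it does none of the logging the real helper does. The function name add_addr is made up for this example.

```shell
# Stand-in for ocf_run when ocf-shellfuncs is not sourced: just run the
# command and pass its exit code through (no logging).
command -v ocf_run >/dev/null 2>&1 || ocf_run() { "$@"; }

# Add $ipv6addr/$cidr_netmask to the interface; returns the command's status.
add_addr() {
    ocf_run "${IFCONFIG_BIN:-/sbin/ifconfig}" "${IFACE:-lo}" \
        add "$ipv6addr/$cidr_netmask"
}
```

The start action can then be reduced to something like `add_addr || return $OCF_ERR_GENERIC`, with no logfile or errlogfile parameters at all.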
> else
> # We have neither logfile nor errlogfile, so we're not going to redirect anything
> $IFCONFIG_BIN $IFACE add $ipv6addr/$cidr_netmask
> fi
> fi
> echo "$ipv6addr/$cidr_netmask" > $pidfile
> fi
>
> # Check what happened here.
> if IPv6addrLO_status
Have you ever seen an ifconfig add "exit 0",
and then ifconfig | grep not see the address?
I think this extra check is not necessary.
> then
> ocf_log info "$process: Started successfully."
> return $OCF_SUCCESS
> else
> ocf_log err "$process: Could not be started: ipv6addr[\"$ipv6addr\"]
> cidr_netmask[\"$cidr_netmask\"]."
> return $OCF_ERR_GENERIC
> fi
> else
> # If already running, consider start successful
> ocf_log debug "$process: is already running"
> return $OCF_SUCCESS
> fi
> }
>
> IPv6addrLO_stop() {
>
> ocf_log debug "$process: Running STOP function."
>
> if [ -n "$OCF_RESKEY_stop_timeout" ]
> then
> stop_timeout=$OCF_RESKEY_stop_timeout
> elif [ -n "$OCF_RESKEY_CRM_meta_timeout" ]; then
> # Allow 2/3 of the action timeout for the orderly shutdown
> # (The origin unit is ms, hence the conversion)
> stop_timeout=$((OCF_RESKEY_CRM_meta_timeout/1500))
> else
> stop_timeout=10
> fi
and suddenly, completely different (and much more readable) indentation.
thanks.
Still, I think this is not necessary.
Or at least, I don't understand what you are trying to protect against:
Why would ifconfig del fail, and a few seconds later succeed?
If you really want to retry, this whole function should become
while iface_has_ipv6 && ! ifconfig del ; do sleep 1; done
return $OCF_SUCCESS
and the crmd/lrmd will enforce the timeout on you.
No need to go fancy and simulate a "shutdown escalation" like an IP address was
a database or something.
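The loop above, spelled out as a runnable sketch. Both helpers are hypothetical stand-ins (a variable plays the role of the interface state, and del_addr takes the place of "ifconfig lo del") so that the control flow can be tried without touching a real interface.

```shell
: "${OCF_SUCCESS:=0}"   # fallback; ocf-shellfuncs normally sets this

ATTEMPTS=0
# Stand-ins: ADDR_PRESENT simulates whether lo carries the address.
iface_has_ipv6() { [ "$ADDR_PRESENT" = yes ]; }
del_addr() { ATTEMPTS=$((ATTEMPTS + 1)); ADDR_PRESENT=no; }

IPv6addrLO_stop() {
    # Retry only while the address is still there and removal failed.
    while iface_has_ipv6 && ! del_addr; do
        sleep 1
    done
    return "$OCF_SUCCESS"
}

ADDR_PRESENT=yes
IPv6addrLO_stop && echo "stopped after $ATTEMPTS attempt(s)"
```

There is no timeout handling in the function itself; if removal keeps failing, the loop keeps running until the cluster's own action timeout cuts it off. Calling stop again when the address is already gone returns success immediately.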
> if IPv6addrLO_status
> then
> $IFCONFIG_BIN $IFACE del `cat $pidfile`
> i=0
> while [ $i -lt $stop_timeout ]
> do
> if ! IPv6addrLO_status
> then
> rm -f $pidfile
> return $OCF_SUCCESS
> fi
> sleep 1
> i=`expr $i + 1`
> done
> ocf_log warn "Stop failed. Trying again."
> $IFCONFIG_BIN $IFACE del `cat $pidfile`
> rm -f $pidfile
> if ! IPv6addrLO_status
> then
> ocf_log warn "Stop success."
> return $OCF_SUCCESS
> else
> ocf_log err "Failed to stop."
> return $OCF_ERR_GENERIC
> fi
> else
> # was not running, so stop can be considered successful
> $ICONFIG_BIN $IFACE del `cat $pidfile`
> rm -f $pidfile
> return $OCF_SUCCESS
> fi
> }
>
> IPv6addrLO_monitor() {
> IPv6addrLO_status
> ret=$?
> if [ $ret -eq $OCF_SUCCESS ]
> then
> if [ -n "$OCF_RESKEY_monitor_hook" ]; then
> eval "$OCF_RESKEY_monitor_hook"
> if [ $? -ne $OCF_SUCCESS ]; then
> return ${OCF_ERR_GENERIC}
> fi
> return $OCF_SUCCESS
> else
> true
> fi
> else
> return $ret
> fi
> }
>
>
> IPv6addrLO_validate() {
>
> ocf_log debug "IPv6addrLO validating: args:[\"$*\"]"
>
> if [ -x $IFCONFIG_BIN ]
> then
> ocf_log debug "Binary \"$IFCONFIG_BIN\" exist and is executable."
> return $OCF_SUCCESS
> else
> ocf_log err "Binary \"$IFCONFIG_BIN\" does not exist or isn't executable."
> return $OCF_ERR_INSTALLED
> fi
> ocf_log err "Error while validating."
> return $OCF_ERR_GENERIC
> }
>
> IPv6addrLO_meta(){
> cat <<END
> <?xml version="1.0"?>
> <!DOCTYPE resource-agent SYSTEM "ra-api-1.dtd">
> <resource-agent name="IPv6addrLO">
> <version>0.1</version>
> <longdesc lang="en">
> OCF RA to manage IPv6addr on loopback interface Linux
> </longdesc>
> <shortdesc lang="en">IPv6 addr on loopback linux</shortdesc>
>
> <parameters>
> <parameter name="ipv6addr" required="1">
> <longdesc lang="en">
> The ipv6 addr to asign to the loopback interface.
> </longdesc>
> <shortdesc lang="en">Ipv6 addr to the loopback interface.</shortdesc>
> <content type="string" default=""/>
> </parameter>
> <parameter name="cidr_netmask" required="1">
> <longdesc lang="en">
> The cidr netmask of the ipv6 addr.
> </longdesc>
> <shortdesc lang="en">netmask of the ipv6 addr.</shortdesc>
> <content type="string" default="128"/>
> </parameter>
> <parameter name="logfile" required="0">
> <longdesc lang="en">
> File to write STDOUT to
> </longdesc>
> <shortdesc lang="en">File to write STDOUT to</shortdesc>
> <content type="string" />
> </parameter>
> <parameter name="errlogfile" required="0">
> <longdesc lang="en">
> File to write STDERR to
> </longdesc>
> <shortdesc lang="en">File to write STDERR to</shortdesc>
> <content type="string" />
> </parameter>
> </parameters>
> <actions>
> <action name="start" timeout="20s" />
> <action name="stop" timeout="20s" />
> <action name="monitor" depth="0" timeout="20s" interval="10" />
> <action name="meta-data" timeout="5" />
> <action name="validate-all" timeout="5" />
> </actions>
> </resource-agent>
> END
> exit 0
> }
>
> case "$1" in
> meta-data|metadata|meta_data|meta)
> IPv6addrLO_meta
> ;;
> start)
> IPv6addrLO_start
> ;;
> stop)
> IPv6addrLO_stop
> ;;
> monitor)
> IPv6addrLO_monitor
> ;;
> validate-all)
> IPv6addrLO_validate
> ;;
> *)
> ocf_log err "$0 was called with unsupported arguments:"
> exit $OCF_ERR_UNIMPLEMENTED
> ;;
> esac
Cheers,
--
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com