[Pacemaker] ifstatus OCF RA

Lars Ellenberg lars.ellenberg at linbit.com
Tue Feb 22 15:46:04 EST 2011


On Tue, Feb 22, 2011 at 07:24:41PM +0200, Vladislav Bogdanov wrote:
> #!/bin/bash
> #
> # OCF resource agent which monitors state of network interface and records it as a value in CIB
> # based on summ of speeds of its active underlying interfaces.
> #
> # Copyright (c) 2011 Vladislav Bogdanov <bubble at hoster-ok.com>
> # Partially based on 'ping' RA by Andrew Beekhof
> #
> # OCF instance parameters:
> #    OCF_RESKEY_name:         name of attribute to set in CIB
> #    OCF_RESKEY_iface:        network interface to monitor
> #    OCF_RESKEY_bridge_ports: if not null and OCF_RESKEY_iface is a bridge, list of bridge ports to consider.
> #                             Default is all ports which have designated_bridge=root_id 
> #    OCF_RESKEY_weight_base:  weight of each 10Mbps in interface speed (1Gbps = 100 * 10Mbps) in CIB score points 
> #
> # Initialization:
> 
> : ${OCF_FUNCTIONS_DIR=${OCF_ROOT}/resource.d/heartbeat}
> . ${OCF_FUNCTIONS_DIR}/.ocf-shellfuncs

I think we moved to lib?
Dejan?

> # Defaults
> OCF_RESKEY_name_default="ifspeed"
> OCF_RESKEY_bridge_ports_default="detect"
> OCF_RESKEY_weight_base_default=10
> OCF_RESKEY_dampen_default=5
> 
> : ${OCF_RESKEY_name=${OCF_RESKEY_name_default}}
> : ${OCF_RESKEY_bridge_ports=${OCF_RESKEY_bridge_ports_default}}
> : ${OCF_RESKEY_weight_base=${OCF_RESKEY_weight_base_default}}
> : ${OCF_RESKEY_dampen=${OCF_RESKEY_dampen_default}}
> 
> meta_data() {
>         cat <<END
> <?xml version="1.0"?>
> <!DOCTYPE resource-agent SYSTEM "ra-api-1.dtd">
> <resource-agent name="ifspeed">
> <version>1.0</version>
> 
> <longdesc lang="en">
> Every time the monitor action is run, this resource agent records (in the CIB) speed of active network interfaces from a list.


Where "active" is ...
 what, exactly?
 Add some hint on the intended use case and purpose,
 maybe add an example or two. This is the long description, after all.

 Also note that this is
  - linux specific
  - requires kernel >= 2.6.33,
  afaict no /sys/class/net/*/speed before that

> </longdesc>
> <shortdesc lang="en">Network interface status</shortdesc>

again, "interface status" may be a bit misleading.

> Weight of each 10Mbps in interface speed (1Gbps = 100 * 10Mbps).

The way you implemented it, you get the speed, then do some math with
it, just to arrive at the same value you just read...
Why not just give one point per Mbps by default?

> With default value 1Gbps interface will be counted as 1000.
> </longdesc>
> <shortdesc lang="en">Weight of 10Mbps interface</shortdesc>

can someone come up with better wording here?

> <parameter name="dampen" unique="0">
> <longdesc lang="en">
> The time to wait (dampening) further changes occur

This english apparently needs fixing
already wherever it was copied from ;-)


> </longdesc>
> <shortdesc lang="en">Dampening interval</shortdesc>
> <content type="integer" default="${OCF_RESKEY_dampen_default}"/>
> </parameter>
> 
> <parameter name="debug" unique="0">
> <longdesc lang="en">
> Enables to use default attrd_updater verbose logging on every call.

 "If enabled, ..."

> </longdesc>
> <shortdesc lang="en">Verbose logging</shortdesc>
> <content type="string" default="false"/>
> </parameter>
> 
> </parameters>
> 
> <actions>
> <action name="start"   timeout="30" />
> <action name="stop"    timeout="30" />
> <action name="reload"  timeout="30" />
> <action name="monitor" depth="0"  timeout="30" interval="10"/>
> <action name="meta-data"  timeout="5" />
> <action name="validate-all"  timeout="30" />
> </actions>
> </resource-agent>
> END
> }
> 
> usage() {
>     cat <<END
> usage: $0 {start|stop|monitor|migrate_to|migrate_from|validate-all|meta-data}
> 
> Expects to have a fully populated OCF RA-compliant environment set.
> END

BTW:
cat does an extra fork/exec, and "here documents" go via tmpfiles,
which can break things, or slow things down quite a bit.
most of the time, I prefer the typically built in echo "
...
.... "

But that has nothing to do with this RA, even less with its usage().
Actually it comes down to a matter of taste entirely,
so just ignore this statement ;-)

> }
> 
> start() {
>     monitor
>     if [ $? -eq $OCF_SUCCESS ]; then
>         return $OCF_SUCCESS
>     fi

	monitor && return

>     ha_pseudo_resource ${ha_pseudo_resource_name} start
>     update

	the update has been done in monitor already?

> }
> 
> stop() {
>     ha_pseudo_resource ${ha_pseudo_resource_name} stop
>     attrd_updater -D -n ${OCF_RESKEY_name} -d ${OCF_RESKEY_dampen} ${attrd_options}
>     return $OCF_SUCCESS

do we want to consider the exit code of attrd_updater, or not?
if not, what do we do about non-zero exit code?
does it have a useful reliable exit code, at all?
you don't ignore it in update.
Though you ignore the return value of update ...

Yeah, right, ping does it the same way probably (I did not check; does
it?) -- good to have a reason to review that, too ;-)

> }
> 
> monitor() {
> 	local ret
>     ha_pseudo_resource ${ha_pseudo_resource_name} monitor
>     ret=$?
>     if [ ${ret} -eq $OCF_SUCCESS ] ; then
>         update
>     fi
>     return ${ret}

	ha_pseudo_resource $ha_pseudo_resource_name monitor || return
	update

	and, if that's really necessary, ignore the return code of update:
	return $OCF_SUCCESS

> }
> 
> validate() {

check for linux,
and kernel >= 2.6.33,
and sysfs present,
and if not, exit with not installed or something?

Ah. just noticed you do that explicitly early on, anyways.
>     # Check the check interval
>     if ocf_is_decimal "${OCF_RESKEY_CRM_meta_interval}" && [ ${OCF_RESKEY_CRM_meta_interval} -gt 0 ]; then
>         :
>     else
>         ocf_log err "Invalid check interval ${OCF_RESKEY_interval}. It should be positive integer!"
>         exit $OCF_ERR_CONFIGURED
>     fi
> 
>     # Check the intarfaces list

s/tar/ter

>     if [ "x" = "x${OCF_RESKEY_iface}" ]; then 

You declared yourself as bash.
no need for this "x" = "x$VAR" nonsense in bash...
in fact, even in sh, I find test -z preferable.

>         ocf_log err "Empty iface parameter.  Please specify some network interface to check"

also check for strange characters in iface,
it will be used as regex/sed/awk expression later.

>         exit $OCF_ERR_CONFIGURED
>     fi
> 
>     return $OCF_SUCCESS
> }
> 
> iface_get_speed() {
>     local iface=$1
>     local operstate
>     local carrier
>     local speed
> 
>     if [ ! -e "/sys/class/net/${iface}" ] ; then
>         echo 0
>     elif iface_is_bridge ${iface} ; then
>         bridge_get_speed ${iface}
>     elif iface_is_bond ${iface} ; then
>         bond_get_speed ${iface}
>     elif iface_is_vlan ${iface} ; then
>         iface_get_speed $( vlan_get_phy ${iface} )
>     else
>         read operstate < "/sys/class/net/${iface}/operstate"
>         read carrier < "/sys/class/net/${iface}/carrier"
>         if [ "${operstate}" != "up" ] || [ "${carrier}" != "1" ] ; then
>             speed="0"
>         else
>             read speed < "/sys/class/net/${iface}/speed"
>         fi
>         echo ${speed}

I'd prefer to have speed non-local,
and be a return value.

then you can do away with the $(iface_get_speed) later as well.

>     fi
> }
> 
> iface_is_vlan() {
>     local iface=$1
>     [ -e "/proc/net/vlan/${iface}" ] && return 0 || return 1
> }
> 
> iface_is_bridge() {
>     local iface=$1
>     [ -e "/sys/class/net/${iface}/bridge" ] && return 0 || return 1
> }
> 
> iface_is_bond() {
>     local iface=$1
>     [ -e "/sys/class/net/${iface}/bonding" ] && return 0 || return 1
> }

All these functions:
iface_is_vlan()		{ [ -e "/proc/net/vlan/$1" ]; }
iface_is_bridge()	{ [ -e "/sys/class/net/$1/bridge" ]; }
iface_is_bond()		{ [ -e "/sys/class/net/$1/bonding" ]; }

> vlan_get_phy() {
>     local iface=$1
>     grep "^${iface} " "/proc/net/vlan/config" | sed -r 's/.*\| +(.*)/\1/'

probably safe to assume that on a box running linux kernel >= 2.6.33
the sed supports -r, but you never know ;-)
also, if you use sed aleady, have it do the pattern matching as well?

vlan_get_phy() { sed -ne "s/^$1 .*| *//p" < /proc/net/vlan/config; }

> }
> 
> bridge_is_stp_enabled() {
>     local iface=$1
>     local stp
>     read stp < "/sys/class/net/${iface}/bridge/stp_state"
>     [ "${stp}" = "1" ] && return 0 || return 1

how about
bridge_is_stp_enabled() { grep 1 < /sys/class/net/$1/bridge/stp_state; }

maybe add 2>/dev/null ?

or, as we are bash:
bridge_is_stp_enabled()	{ [[ $(</sys/class/net/$1/bridge/stp_state) = 1 ]]; }

but some may prefer the more verbose thing.
if so, at least do away with the && return || return.
shell function return value is the "exit status" of the last command
executed, so [[ $stp = 1 ]] is sufficient.

> }
> 
> bridge_get_root_ports() {
>     local bridge=$1
>     local root_id
>     local root_ports=""
>     local bridge_id
> 
>     read root_id < "/sys/class/net/${bridge}/bridge/root_id"
> 
>     for port in /sys/class/net/${bridge}/brif/* ; do
>         read bridge_id < "${port}/designated_bridge"
>         if [ "${bridge_id}" = "${root_id}" ] ; then
>             root_ports="${root_ports} ${port##*/}"
>         fi
>     done

Yes, I think here the more verbose style is appropriate ;-)

>     echo "${root_ports# }"

Though this is not necessary: echo $root_ports would do this just fine.
and, again, I'd prefer a non-local variable to pass the value
over some $(subshell) thingy.

> }
> 
> # From /inlude/linux/if_bridge.h:
> #define BR_STATE_DISABLED 0
> #define BR_STATE_LISTENING 1
> #define BR_STATE_LEARNING 2
> #define BR_STATE_FORWARDING 3
> #define BR_STATE_BLOCKING 4
> 
> bridge_get_active_ports() {
>     local bridge=$1
>     shift 1
>     local ports="$*"
>     local active_ports=""
>     local port_state
>     local stp_state=bridge_is_stp_enabled ${bridge}

some double quote missing there?

>     local warn=0
> 
>     if [ -z "${ports}" ] || [ "${ports}" = "detect" ] ; then
>         ports=$( bridge_get_root_ports ${bridge} )

	if you did non-local root_ports, this would become
	bridge_get_root_ports $bridge
	# root ports now in $root_ports,
	# so you can assign port=$root_ports,
	# or just use $root_ports as is.

>     fi
> 
>     for port in $ports ; do

there. no need to trim white space anyways, as I thought.

>         if [ ! -e "/sys/class/net/${bridge}/brif/${port}" ] ; then
>             ocf_log warning "Port ${port} doesn't belong to bridge ${bridge}"
>             continue
>         fi
>         read port_state < "/sys/class/net/${bridge}/brif/${port}/state"
>         if [ "${port_state}" = "3" ] ; then
>             if [ -n "${active_ports}" ] && ${stp_state} ; then

no need to re-check stp state for each iteration.
do that once, and reference the result here.

>                 warn=1
>             fi
>             active_ports="${active_ports} ${port}"
>         fi
>     done
>     if [ ${warn} -eq 1 ] ; then
>         ocf_log warning "More then one upstream port in bridge '${bridge}' is in forwarding state while STP is enabled: ${active_ports}" 
>     fi
>     echo "${active_ports# }"

again, no need to trim white space, and I prefer non-local documented
variables over $(subshell echo) style assignments.

> }
> 
> bridge_get_speed() {
>     local $iface=$1
> 
>     if ! iface_is_bridge ${iface} ; then
>         echo 0
>         return
>     fi
> 
>     local ports=$( bridge_get_active_ports ${iface} ${OCF_RESKEY_bridge_ports} )
>     for port in ${ports} ; do
>         : $(( aggregate_speed += $( iface_get_speed ${port} ) ))
>     done
>     echo ${aggregate_speed}

This can be rewritten without subshells.

It could even be done without bashisms,
(there is only one bashism in there, afaics)
but who cares, we are bash anyways ;-)

> }
> 
> bond_get_slaves() {
>     local iface=$1
>     local slaves
>     read slaves < "/sys/class/net/${iface}/bonding/slaves"
>     echo ${slaves}

if that is what you wanted, why not just say cat?
or, drop this function completely, as it only adds noise,
and later do
	slaves=$(< /sys/class/net/${iface}/bonding/slaves)
avoiding any subshells?

> }
> 
> bond_get_active_iface() {
>     local iface=$1
>     local active
>     read active < "/sys/class/net/${iface}/bonding/active_slave"
>     echo ${active}
> }

similar.

> 
> bond_is_balancing() {
>     local iface=$1
>     read mode mode_index < "/sys/class/net/${iface}/bonding/mode"


You declare all variables in all your tiny to-be-used-as-subshell
functions as local (needlessly, as they are visible in that subshell
only anyways... but keep them local, we want to get rid of the subshell
echo style assignments), but then here omit the local? how come?

not that it matters, really, they are used only here, anyways.

>     case ${mode} in
>         "balance-rr"|"balance-xor"|"802.3ad"|"balance-tlb"|"balance-alb")
>             return 0
>             ;;
>         *)
>             return 1
>             ;;
>     esac
> }
> 
> bond_get_speed() {
>     local iface=$1
>     local aggregate_speed=0
> 
>     if ! iface_is_bond ${iface} ; then
>         echo 0
>         return
>     fi
> 
>     local slaves=$( bond_get_slaves ${iface} )
>     if bond_is_balancing ${iface} ; then
>         for slave in ${slaves} ; do
>             : $(( aggregate_speed += $( iface_get_speed ${slave} ) ))
>         done
>         # Bonding is unable to get speed*n
>         : $(( aggregate_speed = aggregate_speed*8/10 ))
>     else
>         : $(( aggregate_speed = $( iface_get_speed $( bond_get_active_iface ${iface} ) ) ))
>     fi
>     echo ${aggregate_speed} 

Again, I'd prefer this to be handled without subshells.

> }
> 
> update() {
>     local speed=$( iface_get_speed ${OCF_RESKEY_iface} )
> 
>     : $(( score = speed * ${OCF_RESKEY_weight_base} / 10 ))

hmmm...
score = speed * 10 / 10
but I complained about that earlier already ;-)

>     attrd_updater -n ${OCF_RESKEY_name} -v ${score} -d ${OCF_RESKEY_dampen} ${attrd_options}
>     rc=$?
>     case ${rc} in
>         0)
>             ocf_is_true ${OCF_RESKEY_debug} && ocf_log debug "Updated ${OCF_RESKEY_name} = ${score}"
>             ;;
>         *)
>             ocf_log warn "Could not update ${OCF_RESKEY_name} = ${score}: rc=${rc}"
>             ;;
>     esac
>     return ${rc}

all that fun just to ignore the return value of this function anyways?


> }
> 
> if [ `uname` != "Linux" ] ; then
>     ocf_log err "This RA works only on linux."
>     exit $OCF_ERR_INSTALLED
> fi
> 
> if ! ocf_is_true ${OCF_RESKEY_CRM_meta_globally_unique} ; then
>     : ${ha_pseudo_resource_name:="ifspeed-${OCF_RESKEY_name}"}
> else
>     : ${ha_pseudo_resource_name:="ifspeed-${OCF_RESOURCE_INSTANCE}"}
> fi

At least one of those clone possibilities is probably nonsense,
but wellm, that's how this ha_pseudo_resource stuff works.

> attrd_options='-q'
> if ocf_is_true ${OCF_RESKEY_debug} ; then
>     attrd_options=''
> fi
> 
> case $__OCF_ACTION in
>     meta-data)
>         meta_data
>         exit $OCF_SUCCESS
>         ;;
>     start)
>         start
>         ;;
>     stop)
>         stop
>         ;;
>     monitor)
>         monitor
>         ;;
>     reload)
>         start
>         ;;
>     validate-all)
>         validate
>         ;;
>     usage|help)
>         usage
>         exit $OCF_SUCCESS
>         ;;
>     *)
>         usage
>         exit $OCF_ERR_UNIMPLEMENTED
>         ;;
> esac
> exit $?

The $? is not necessary there.



Good job!


-- 
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com

DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.




More information about the Pacemaker mailing list