[Pacemaker] ifstatus OCF RA

Vladislav Bogdanov bubble at hoster-ok.com
Wed Feb 23 09:53:04 UTC 2011


Hi Lars,

thank you for your time and for so detailed review.

Just to dot half of i's (where it is about coding style):
1. I strongly prefer to cleanly separate data access from main logic by API.
2. I prefer to have non-void functions to return result explicitly
("main" too). This will prevent "correct" return code from being lost on
subsequent code modifications.
3. I insist on all variables inside functions to be local. This helps to
avoid some hardly-debugable logic errors.
4. I prefer not to touch global variables inside of lower-layer
functions. To be honest, I prefer to pass everything needed by function
in its parameters. This sometimes is hard to do, so sometimes I prefer
to use global variables relatively high at stack rather then pass them
down through several more functions.
5) I prefer to use sub-shells for recursive function calls.

Please look at one more attached revision, and also in comments inline.


...
>> <longdesc lang="en">
>> Every time the monitor action is run, this resource agent records (in the CIB) speed of active network interfaces from a list.
> 
> 
> Where "active" is ...
>  what, exactly?
>  Add some hint on the intended use case and purpose,
>  maybe add an example or two. This is the long description, after all.

Done.

> 
>  Also note that this is
>   - linux specific
>   - requires kernel >= 2.6.33,
>   afaict no /sys/class/net/*/speed before that

Ahm, was not aware about that. I need to look again at this because I
need this to run on RHEL6 too. Anybody knows, does it has this sysfs?
Let's delay with this for a bit.

> 
>> </longdesc>
>> <shortdesc lang="en">Network interface status</shortdesc>
> 
> again, "interface status" may be a bit misleading.

Fixed.

> 
>> Weight of each 10Mbps in interface speed (1Gbps = 100 * 10Mbps).
> 
> The way you implemented it, you get the speed, then do some math with
> it, just to arrive at the same value you just read...
> Why not just give one point per Mbps by default?

Rewrote.
> 
>> With default value 1Gbps interface will be counted as 1000.
>> </longdesc>
>> <shortdesc lang="en">Weight of 10Mbps interface</shortdesc>
> 
> can someone come up with better wording here?

I tried.

> 
>> <parameter name="dampen" unique="0">
>> <longdesc lang="en">
>> The time to wait (dampening) further changes occur
> 
> This english apparently needs fixing
> already wherever it was copied from ;-)

Hopefully done (although english is not my native lang).

> 
> 
>> </longdesc>
>> <shortdesc lang="en">Dampening interval</shortdesc>
>> <content type="integer" default="${OCF_RESKEY_dampen_default}"/>
>> </parameter>
>>
>> <parameter name="debug" unique="0">
>> <longdesc lang="en">
>> Enables to use default attrd_updater verbose logging on every call.
> 
>  "If enabled, ..."

Re-worded.
> 
>> </longdesc>
>> <shortdesc lang="en">Verbose logging</shortdesc>
>> <content type="string" default="false"/>
>> </parameter>
>>
>> </parameters>
>>
>> <actions>
>> <action name="start"   timeout="30" />
>> <action name="stop"    timeout="30" />
>> <action name="reload"  timeout="30" />
>> <action name="monitor" depth="0"  timeout="30" interval="10"/>
>> <action name="meta-data"  timeout="5" />
>> <action name="validate-all"  timeout="30" />
>> </actions>
>> </resource-agent>
>> END
>> }
>>
>> usage() {
>>     cat <<END
>> usage: $0 {start|stop|monitor|migrate_to|migrate_from|validate-all|meta-data}
>>
>> Expects to have a fully populated OCF RA-compliant environment set.
>> END
> 
> BTW:
> cat does an extra fork/exec, and "here documents" go via tmpfiles,
> which can break things, or slow things down quite a bit.
> most of the time, I prefer the typically built in echo "
> ...
> .... "
> 
> But that has nothing to do with this RA, even less with its usage().
> Actually it comes down to a matter of taste entirely,
> so just ignore this statement ;-)

Let's leave it as-is.

> 
>> }
>>
>> start() {
>>     monitor
>>     if [ $? -eq $OCF_SUCCESS ]; then
>>         return $OCF_SUCCESS
>>     fi
> 
> 	monitor && return

Ahm, I'd leave it as-is. It is a bit more readable and probably does not
cost any extra CPU cycles.

> 
>>     ha_pseudo_resource ${ha_pseudo_resource_name} start
>>     update
> 
> 	the update has been done in monitor already?

Nope. ha_pseudo_resource ${ha_pseudo_resource_name} monitor returns 7
there so update is not called.

> 
>> }
>>
>> stop() {
>>     ha_pseudo_resource ${ha_pseudo_resource_name} stop
>>     attrd_updater -D -n ${OCF_RESKEY_name} -d ${OCF_RESKEY_dampen} ${attrd_options}
>>     return $OCF_SUCCESS
> 
> do we want to consider the exit code of attrd_updater, or not?
> if not, what do we do about non-zero exit code?
> does it have a useful reliable exit code, at all?
> you don't ignore it in update.
> Though you ignore the return value of update ...
> 
> Yeah, right, ping does it the same way probably (I did not check; does
> it?) -- good to have a reason to review that, too ;-)

We did our best, and failure of attr_updater means much more serious
problems than this RA han handle.

> 
>> }
>>
>> monitor() {
>> 	local ret
>>     ha_pseudo_resource ${ha_pseudo_resource_name} monitor
>>     ret=$?
>>     if [ ${ret} -eq $OCF_SUCCESS ] ; then
>>         update
>>     fi
>>     return ${ret}
> 
> 	ha_pseudo_resource $ha_pseudo_resource_name monitor || return
> 	update
> 
> 	and, if that's really necessary, ignore the return code of update:
> 	return $OCF_SUCCESS

No, we are interested in NOT_RUNNING too.
Let's leave as-is.

> 
>> }
>>
>> validate() {
> 
> check for linux,
> and kernel >= 2.6.33,
> and sysfs present,
> and if not, exit with not installed or something?

Again, let's delay it for a bit.

> 
> Ah. just noticed you do that explicitly early on, anyways.
>>     # Check the check interval
>>     if ocf_is_decimal "${OCF_RESKEY_CRM_meta_interval}" && [ ${OCF_RESKEY_CRM_meta_interval} -gt 0 ]; then
>>         :
>>     else
>>         ocf_log err "Invalid check interval ${OCF_RESKEY_interval}. It should be positive integer!"
>>         exit $OCF_ERR_CONFIGURED
>>     fi
>>
>>     # Check the intarfaces list
> 
> s/tar/ter

Thanks.

> 
>>     if [ "x" = "x${OCF_RESKEY_iface}" ]; then 
> 
> You declared yourself as bash.
> no need for this "x" = "x$VAR" nonsense in bash...
> in fact, even in sh, I find test -z preferable.

Blindly copied it. Done.

> 
>>         ocf_log err "Empty iface parameter.  Please specify some network interface to check"
> 
> also check for strange characters in iface,
> it will be used as regex/sed/awk expression later.

Let it be TODO.

> 
>>         exit $OCF_ERR_CONFIGURED
>>     fi
>>
>>     return $OCF_SUCCESS
>> }
>>
>> iface_get_speed() {
>>     local iface=$1
>>     local operstate
>>     local carrier
>>     local speed
>>
>>     if [ ! -e "/sys/class/net/${iface}" ] ; then
>>         echo 0
>>     elif iface_is_bridge ${iface} ; then
>>         bridge_get_speed ${iface}
>>     elif iface_is_bond ${iface} ; then
>>         bond_get_speed ${iface}
>>     elif iface_is_vlan ${iface} ; then
>>         iface_get_speed $( vlan_get_phy ${iface} )
>>     else
>>         read operstate < "/sys/class/net/${iface}/operstate"
>>         read carrier < "/sys/class/net/${iface}/carrier"
>>         if [ "${operstate}" != "up" ] || [ "${carrier}" != "1" ] ; then
>>             speed="0"
>>         else
>>             read speed < "/sys/class/net/${iface}/speed"
>>         fi
>>         echo ${speed}
> 
> I'd prefer to have speed non-local,

I'd not. Let's be safe with recursions.

> and be a return value.

Does bash already support 10000 to be function return code? ;)

> 
> then you can do away with the $(iface_get_speed) later as well.
> 
>>     fi
>> }
>>
>> iface_is_vlan() {
>>     local iface=$1
>>     [ -e "/proc/net/vlan/${iface}" ] && return 0 || return 1
>> }
>>
>> iface_is_bridge() {
>>     local iface=$1
>>     [ -e "/sys/class/net/${iface}/bridge" ] && return 0 || return 1
>> }
>>
>> iface_is_bond() {
>>     local iface=$1
>>     [ -e "/sys/class/net/${iface}/bonding" ] && return 0 || return 1
>> }
> 
> All these functions:
> iface_is_vlan()		{ [ -e "/proc/net/vlan/$1" ]; }
> iface_is_bridge()	{ [ -e "/sys/class/net/$1/bridge" ]; }
> iface_is_bond()		{ [ -e "/sys/class/net/$1/bonding" ]; }

Please see above about implicit return codes. This should not take any
CPU cycles.

> 
>> vlan_get_phy() {
>>     local iface=$1
>>     grep "^${iface} " "/proc/net/vlan/config" | sed -r 's/.*\| +(.*)/\1/'
> 
> probably safe to assume that on a box running linux kernel >= 2.6.33
> the sed supports -r, but you never know ;-)
> also, if you use sed aleady, have it do the pattern matching as well?
> 
> vlan_get_phy() { sed -ne "s/^$1 .*| *//p" < /proc/net/vlan/config; }

Ahm, good point, thanks.

> 
>> }
>>
>> bridge_is_stp_enabled() {
>>     local iface=$1
>>     local stp
>>     read stp < "/sys/class/net/${iface}/bridge/stp_state"
>>     [ "${stp}" = "1" ] && return 0 || return 1
> 
> how about
> bridge_is_stp_enabled() { grep 1 < /sys/class/net/$1/bridge/stp_state; }

This will be slower.

> 
> maybe add 2>/dev/null ?
> 
> or, as we are bash:
> bridge_is_stp_enabled()	{ [[ $(</sys/class/net/$1/bridge/stp_state) = 1 ]]; }

I'd say this is unreadable for me. I'd avoid non-trivial constructs with
no performance gain. And again, implicit vs explicit.

> 
> but some may prefer the more verbose thing.
> if so, at least do away with the && return || return.
> shell function return value is the "exit status" of the last command
> executed, so [[ $stp = 1 ]] is sufficient.

But more error-prone on subsequent code modifications.

> 
>> }
>>
>> bridge_get_root_ports() {
>>     local bridge=$1
>>     local root_id
>>     local root_ports=""
>>     local bridge_id
>>
>>     read root_id < "/sys/class/net/${bridge}/bridge/root_id"
>>
>>     for port in /sys/class/net/${bridge}/brif/* ; do
>>         read bridge_id < "${port}/designated_bridge"
>>         if [ "${bridge_id}" = "${root_id}" ] ; then
>>             root_ports="${root_ports} ${port##*/}"
>>         fi
>>     done
> 
> Yes, I think here the more verbose style is appropriate ;-)
> 
>>     echo "${root_ports# }"

I'd leave this as-is. And, I remade this chunk of code, so this
construct is (probably) really needed now. Anyways, I hate implicit
assumptions.

> 
> Though this is not necessary: echo $root_ports would do this just fine.
> and, again, I'd prefer a non-local variable to pass the value
> over some $(subshell) thingy.

Did it another way.

> 
>> }
>>
>> # From /inlude/linux/if_bridge.h:
>> #define BR_STATE_DISABLED 0
>> #define BR_STATE_LISTENING 1
>> #define BR_STATE_LEARNING 2
>> #define BR_STATE_FORWARDING 3
>> #define BR_STATE_BLOCKING 4
>>
>> bridge_get_active_ports() {
>>     local bridge=$1
>>     shift 1
>>     local ports="$*"
>>     local active_ports=""
>>     local port_state
>>     local stp_state=bridge_is_stp_enabled ${bridge}
> 
> some double quote missing there?

I should be temporarily insane while writing that ;)
Fixed.

> 
>>     local warn=0
>>
>>     if [ -z "${ports}" ] || [ "${ports}" = "detect" ] ; then
>>         ports=$( bridge_get_root_ports ${bridge} )
> 
> 	if you did non-local root_ports, this would become
> 	bridge_get_root_ports $bridge
> 	# root ports now in $root_ports,
> 	# so you can assign port=$root_ports,
> 	# or just use $root_ports as is.

Rewrote this.

> 
>>     fi
>>
>>     for port in $ports ; do
> 
> there. no need to trim white space anyways, as I thought.

I'd leave it.

> 
>>         if [ ! -e "/sys/class/net/${bridge}/brif/${port}" ] ; then
>>             ocf_log warning "Port ${port} doesn't belong to bridge ${bridge}"
>>             continue
>>         fi
>>         read port_state < "/sys/class/net/${bridge}/brif/${port}/state"
>>         if [ "${port_state}" = "3" ] ; then
>>             if [ -n "${active_ports}" ] && ${stp_state} ; then
> 
> no need to re-check stp state for each iteration.
> do that once, and reference the result here.

This nonsense is fixed.

> 
>>                 warn=1
>>             fi
>>             active_ports="${active_ports} ${port}"
>>         fi
>>     done
>>     if [ ${warn} -eq 1 ] ; then
>>         ocf_log warning "More then one upstream port in bridge '${bridge}' is in forwarding state while STP is enabled: ${active_ports}" 
>>     fi
>>     echo "${active_ports# }"
> 
> again, no need to trim white space, and I prefer non-local documented
> variables over $(subshell echo) style assignments.
> 
>> }
>>
>> bridge_get_speed() {
>>     local $iface=$1
>>
>>     if ! iface_is_bridge ${iface} ; then
>>         echo 0
>>         return
>>     fi
>>
>>     local ports=$( bridge_get_active_ports ${iface} ${OCF_RESKEY_bridge_ports} )

Still leaving this as subshell, will rethink later.

>>     for port in ${ports} ; do
>>         : $(( aggregate_speed += $( iface_get_speed ${port} ) ))
>>     done
>>     echo ${aggregate_speed}
> 
> This can be rewritten without subshells.

Error-prone because of recursions.
I'd leave it as-is.

> 
> It could even be done without bashisms,
> (there is only one bashism in there, afaics)
> but who cares, we are bash anyways ;-)
> 
>> }
>>
>> bond_get_slaves() {
>>     local iface=$1
>>     local slaves
>>     read slaves < "/sys/class/net/${iface}/bonding/slaves"
>>     echo ${slaves}
> 
> if that is what you wanted, why not just say cat?

It is slower.

> or, drop this function completely, as it only adds noise,
> and later do
> 	slaves=$(< /sys/class/net/${iface}/bonding/slaves)
> avoiding any subshells?

Data access separation.

> 
>> }
>>
>> bond_get_active_iface() {
>>     local iface=$1
>>     local active
>>     read active < "/sys/class/net/${iface}/bonding/active_slave"
>>     echo ${active}
>> }
> 
> similar.
> 
>>
>> bond_is_balancing() {
>>     local iface=$1
>>     read mode mode_index < "/sys/class/net/${iface}/bonding/mode"
> 
> 
> You declare all variables in all your tiny to-be-used-as-subshell
> functions as local (needlessly, as they are visible in that subshell
> only anyways... but keep them local, we want to get rid of the subshell
> echo style assignments), but then here omit the local? how come?

Having function variables local is simply a good habit I think. Like
having curly braces around one-statement blocks in C.
I rewrote some functions so they can be used without subshell at a cost
of additional variable.

> 
> not that it matters, really, they are used only here, anyways.
> 
>>     case ${mode} in
>>         "balance-rr"|"balance-xor"|"802.3ad"|"balance-tlb"|"balance-alb")
>>             return 0
>>             ;;
>>         *)
>>             return 1
>>             ;;
>>     esac
>> }
>>
>> bond_get_speed() {
>>     local iface=$1
>>     local aggregate_speed=0
>>
>>     if ! iface_is_bond ${iface} ; then
>>         echo 0
>>         return
>>     fi
>>
>>     local slaves=$( bond_get_slaves ${iface} )
>>     if bond_is_balancing ${iface} ; then
>>         for slave in ${slaves} ; do
>>             : $(( aggregate_speed += $( iface_get_speed ${slave} ) ))
>>         done
>>         # Bonding is unable to get speed*n
>>         : $(( aggregate_speed = aggregate_speed*8/10 ))
>>     else
>>         : $(( aggregate_speed = $( iface_get_speed $( bond_get_active_iface ${iface} ) ) ))
>>     fi
>>     echo ${aggregate_speed} 
> 
> Again, I'd prefer this to be handled without subshells.

Recursion.

> 
>> }
>>
>> update() {
>>     local speed=$( iface_get_speed ${OCF_RESKEY_iface} )
>>
>>     : $(( score = speed * ${OCF_RESKEY_weight_base} / 10 ))
> 
> hmmm...
> score = speed * 10 / 10
> but I complained about that earlier already ;-)
> 
>>     attrd_updater -n ${OCF_RESKEY_name} -v ${score} -d ${OCF_RESKEY_dampen} ${attrd_options}
>>     rc=$?
>>     case ${rc} in
>>         0)
>>             ocf_is_true ${OCF_RESKEY_debug} && ocf_log debug "Updated ${OCF_RESKEY_name} = ${score}"
>>             ;;
>>         *)
>>             ocf_log warn "Could not update ${OCF_RESKEY_name} = ${score}: rc=${rc}"
>>             ;;
>>     esac
>>     return ${rc}
> 
> all that fun just to ignore the return value of this function anyways?

No. It was not ignored.
But I fixed implicit to explicit return in start().

> 
> 
>> }
>>
>> if [ `uname` != "Linux" ] ; then
>>     ocf_log err "This RA works only on linux."
>>     exit $OCF_ERR_INSTALLED
>> fi
>>
>> if ! ocf_is_true ${OCF_RESKEY_CRM_meta_globally_unique} ; then
>>     : ${ha_pseudo_resource_name:="ifspeed-${OCF_RESKEY_name}"}
>> else
>>     : ${ha_pseudo_resource_name:="ifspeed-${OCF_RESOURCE_INSTANCE}"}
>> fi
> 
> At least one of those clone possibilities is probably nonsense,
> but wellm, that's how this ha_pseudo_resource stuff works.

Dunno, just blindly copied this.

> 
>> attrd_options='-q'
>> if ocf_is_true ${OCF_RESKEY_debug} ; then
>>     attrd_options=''
>> fi
>>
>> case $__OCF_ACTION in
>>     meta-data)
>>         meta_data
>>         exit $OCF_SUCCESS
>>         ;;
>>     start)
>>         start
>>         ;;
>>     stop)
>>         stop
>>         ;;
>>     monitor)
>>         monitor
>>         ;;
>>     reload)
>>         start
>>         ;;
>>     validate-all)
>>         validate
>>         ;;
>>     usage|help)
>>         usage
>>         exit $OCF_SUCCESS
>>         ;;
>>     *)
>>         usage
>>         exit $OCF_ERR_UNIMPLEMENTED
>>         ;;
>> esac
>> exit $?
> 
> The $? is not necessary there.

I prefer to have it explicitly.

> 
> 
> 
> Good job!
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: ifspeed
URL: <https://lists.clusterlabs.org/pipermail/pacemaker/attachments/20110223/4079da61/attachment-0004.ksh>


More information about the Pacemaker mailing list