[Pacemaker] Fixed! - Re: Problem with dual-PDU fencing node with redundant PSUs

Fri Jun 28 12:42:59 EDT 2013

On 06/28/2013 11:34 AM, Lars Marowsky-Bree wrote:
> On 2013-06-28T11:29:35, Digimer <lists at alteeve.ca> wrote:
> 
>> In rhcs, you can control the fence device's action using 'action="..."'
>> attribute in the <device ...> element. So for us rhcs migrants, we
>> expect that action="..." in the fence primitive will have the same
>> effect. As of now, as you know, this is ignored in favour of the global
>> action.
> 
> Ah, OK, you were talking about honoring action="" on the primitive. Yes,
> that makes more sense. I was thinking of something different.
> 
> 
>>>> I've debated writing a "fence_apc_multi" that takes "reboot" and two or
>>>> more PDU addresses/ports and does the break out for you.
>>> That is so ugly, please, no :-(
>> It's also "ugly" to have four fence primitives per node's PDU. At least
>> this way I can abstract the ugliness away from the users and make the
>> pcmk config more readable.
> 
> Yes, that is horribly ugly. And why it needs to be fixed in the fence
> code proper.
> 
>> In fact, I've been thinking of a general purpose wrapper that takes the
>> desired fence agent as an attribute. I can call it simply 'fence_multi'.
> 
> Please. No. :-( That'll still be horrible to configure. Just think about
> how user interfaces would have to handle this!
> 
> If you're going to fix it, please, I beg you, fix it properly in the
> stonith / fence topology code.
> 
> Of course, who writes the code wins, but this is headed down such a
> bandaid path ...

Agreed, but as I mentioned in my other reply, I can't code in C so I
can't fix stonith. If beekhof (or someone else) has the cycles to do so,
it will obviously be the preferred solution.

As for the syntax (and yes, it will be a little kludgy), I was thinking of:

fence_multi -f fence_apc_snmp -a a:pdu1,b:pdu2 -p a:1,b:1 -o reboot

or

fence_agent=fence_apc_snmp
ipaddr=a:pdu1,b:pdu2
port=a:1,b:1
action=reboot
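
To make the labelled-list idea concrete, here is a rough Python sketch of
how the 'ipaddr' and 'port' values might be paired up by label. Nothing
here exists yet; the function names 'parse_labelled' and 'pair_outlets'
are made up for illustration, and I'm assuming each label (a, b, ...)
appears exactly once in both lists.

```python
def parse_labelled(value):
    """Split 'a:pdu1,b:pdu2' into {'a': 'pdu1', 'b': 'pdu2'}.

    Hypothetical helper: only the first ':' in each item separates the
    label from the value, so the value itself may contain colons.
    """
    result = {}
    for item in value.split(","):
        label, sep, target = item.strip().partition(":")
        if not sep or not label or not target:
            raise ValueError("expected label:value, got %r" % item)
        if label in result:
            raise ValueError("duplicate label %r" % label)
        result[label] = target
    return result


def pair_outlets(ipaddr, port):
    """Match each labelled PDU address with the port carrying the same label."""
    addrs = parse_labelled(ipaddr)
    ports = parse_labelled(port)
    if addrs.keys() != ports.keys():
        raise ValueError("ipaddr and port must use the same set of labels")
    # Sort by label so the order of the outlets is predictable.
    return [(addrs[label], ports[label]) for label in sorted(addrs)]
```

With the example above, pair_outlets("a:pdu1,b:pdu2", "a:1,b:1") gives one
(address, port) pair per PDU, which is what the wrapper would then loop
over.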

Any other switches/STDIN values would be passed as-is to the requested
fence_agent. 'fence_multi' would break "reboot" up into one "off" call
per device, verify that every outlet is off, then call "on" without
worrying about whether the "on" action succeeds or fails.
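
Roughly, the reboot handling could look like the Python sketch below. This
is only an outline of the ordering, not a working agent: 'set_power' and
'is_off' are hypothetical callables standing in for however the wrapper
ends up invoking the real fence agent (e.g. fence_apc_snmp) with "off",
"on" and "status", and 'outlets' is assumed to be a list of
(address, port) pairs.

```python
def fence_multi_reboot(outlets, set_power, is_off):
    """Reboot a node fed by several outlets; return True if it was fenced.

    outlets   -- list of (ipaddr, port) tuples, one per PDU outlet
    set_power -- hypothetical callable (ipaddr, port, state) -> bool
    is_off    -- hypothetical callable (ipaddr, port) -> bool
    """
    # Step 1: cut power on every outlet. If any "off" fails, the node may
    # still be powered through that PSU, so the fence as a whole fails.
    for ipaddr, port in outlets:
        if not set_power(ipaddr, port, "off"):
            return False

    # Step 2: verify that every outlet really is off before going on.
    for ipaddr, port in outlets:
        if not is_off(ipaddr, port):
            return False

    # Step 3: restore power. The node is already fenced at this point, so
    # a failed "on" is ignored rather than reported as a fence failure.
    for ipaddr, port in outlets:
        try:
            set_power(ipaddr, port, "on")
        except Exception:
            pass

    return True
```

The point of doing all the "off" calls before any "on" is that the node
must lose both feeds at the same time; cycling the PDUs one after the
other would never actually power it down.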

Given the options, I think this is fairly clean and scalable.

Thoughts (other than "don't do it")?

-- 
Digimer
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without
access to education?