[Pacemaker] stonith_admin does not work as expected

Andrew Beekhof andrew at beekhof.net
Wed Nov 13 22:16:10 UTC 2013


On 13 Nov 2013, at 11:33 pm, andreas graeper <agraeper at googlemail.com> wrote:

> hi,
> pacemaker version is 1.1.7

Quite a bit of work has gone into fencing since then. Any chance you could try something newer?

> 
> the fence-agent (i thought it was one of the standard ones) calls
> snmpget -a <ipaddr>:<udpport> -c <community> oid
> snmpset -a <ipaddr>:<udpport> -c <community> oid i 0|1
> 
> therefore it needs/uses commandline arguments
> -o action
> -n port (slot-index)
> -a ipaddr
> -c community
> (udpport is not necessary, because it is fixed at 161)
> 
> or (as the logs tell me) the fence-agent gets its parameters from stdin
> fence_ifmib <<EOF
>  action=
>  port=
>  ipaddr=
>  community=
> EOF
> an additional, invalid 'nodename=xyz' is also passed.
> the fence-agent was written for another device, and because our device
> does not support a function (OID_PORT, used to get the port-index from
> the port-name) we have to use port-numbers. but apart from other tiny
> limitations it works great
> 
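
For anyone following along, here is a minimal sketch of the stdin convention described above: one name=value pair per line, written to the agent's standard input. The helper names build_agent_stdin and run_agent are hypothetical and not part of Pacemaker or the fence-agents package, and the option values are simply the ones from the configuration in this thread. Nothing is executed unless run_agent is called explicitly.

```python
import subprocess


def build_agent_stdin(params):
    """Render options as 'name=value' lines, one per line (hypothetical helper)."""
    lines = []
    for name, value in params.items():
        text = str(value)
        if "\n" in text:
            # A newline would start a new option line, so refuse it.
            raise ValueError(f"value for {name!r} must not contain a newline")
        lines.append(f"{name}={text}")
    return "\n".join(lines) + "\n"


def run_agent(agent, params, timeout=20):
    """Feed the options to the agent on stdin and return its exit code.

    Assumption: the agent reads its options from stdin when it is started
    without command-line arguments, as the logs quoted above suggest.
    """
    proc = subprocess.run(
        [agent],
        input=build_agent_stdin(params),
        text=True,
        capture_output=True,
        timeout=timeout,
    )
    return proc.returncode


if __name__ == "__main__":
    # Values taken from the fence_1 configuration in this thread.
    options = {
        "action": "off",
        "port": "1",
        "ipaddr": "172.27.51.33",
        "community": "xxx",
    }
    # Only print the payload; calling run_agent() here would really fence.
    print(build_agent_stdin(options), end="")
```

Printing the payload first makes it easy to compare against what the logs show being handed to the agent, including any extra option such as nodename.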
> 
>     <primitive class="stonith" id="fence_1" type="fence_ifmib_epc8212">
>        <instance_attributes id="fence_1-instance_attributes">
>          <nvpair id="fence_1-instance_attributes-ipaddr" name="ipaddr" value="172.27.51.33"/>
>          <nvpair id="fence_1-instance_attributes-community" name="community" value="xxx"/>
>          <nvpair id="fence_1-instance_attributes-port" name="port" value="1"/>
>          <nvpair id="fence_1-instance_attributes-action" name="action" value="off"/>
>          <nvpair id="fence_1-instance_attributes-pcmk_poweroff_action" name="pcmk_poweroff_action" value="off"/>
>          <nvpair id="fence_1-instance_attributes-pcmk_host_list" name="pcmk_host_list" value="lisel1"/>
>          <nvpair id="fence_1-instance_attributes-pcmk_host_check" name="pcmk_host_check" value="static-list"/>
>          <nvpair id="fence_1-instance_attributes-verbose" name="verbose" value="true"/>
>        </instance_attributes>
>     </primitive>
> 
>     <primitive class="stonith" id="fence_2" type="fence_ifmib_epc8212">
>        <instance_attributes id="fence_2-instance_attributes">
>          <nvpair id="fence_2-instance_attributes-ipaddr" name="ipaddr" value="172.27.51.33"/>
>          <nvpair id="fence_2-instance_attributes-community" name="community" value="xxx"/>
>          <nvpair id="fence_2-instance_attributes-port" name="port" value="2"/>
>          <nvpair id="fence_2-instance_attributes-action" name="action" value="off"/>
>          <nvpair id="fence_2-instance_attributes-pcmk_poweroff_action" name="pcmk_poweroff_action" value="off"/>
>          <nvpair id="fence_2-instance_attributes-pcmk_host_list" name="pcmk_host_list" value="lisel2"/>
>          <nvpair id="fence_2-instance_attributes-pcmk_host_check" name="pcmk_host_check" value="static-list"/>
>          <nvpair id="fence_2-instance_attributes-verbose" name="verbose" value="true"/>
>        </instance_attributes>
>     </primitive>
> 
>     <rsc_location id="location-fence_1-lisel1--INFINITY" node="lisel1" rsc="fence_1" score="-INFINITY"/>
>     <rsc_location id="location-fence_2-lisel2--INFINITY" node="lisel2" rsc="fence_2" score="-INFINITY"/>
> 
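
To double-check which device is meant to fence which node, a small sketch like the following can summarise the stonith primitives from a saved CIB fragment. It assumes the fragment is wrapped in a single root element so that it parses as XML, and that pcmk_host_list holds node names separated by whitespace or commas. The function names stonith_devices and targets are made up for illustration, and the fragment is abridged from the configuration quoted above.

```python
import xml.etree.ElementTree as ET

# Abridged from the configuration in this thread, wrapped in one root element.
CIB_FRAGMENT = """
<resources>
  <primitive class="stonith" id="fence_1" type="fence_ifmib_epc8212">
    <instance_attributes id="fence_1-instance_attributes">
      <nvpair id="fence_1-instance_attributes-port" name="port" value="1"/>
      <nvpair id="fence_1-instance_attributes-pcmk_host_list" name="pcmk_host_list" value="lisel1"/>
      <nvpair id="fence_1-instance_attributes-pcmk_host_check" name="pcmk_host_check" value="static-list"/>
    </instance_attributes>
  </primitive>
  <primitive class="stonith" id="fence_2" type="fence_ifmib_epc8212">
    <instance_attributes id="fence_2-instance_attributes">
      <nvpair id="fence_2-instance_attributes-port" name="port" value="2"/>
      <nvpair id="fence_2-instance_attributes-pcmk_host_list" name="pcmk_host_list" value="lisel2"/>
      <nvpair id="fence_2-instance_attributes-pcmk_host_check" name="pcmk_host_check" value="static-list"/>
    </instance_attributes>
  </primitive>
</resources>
"""


def stonith_devices(xml_text):
    """Map each stonith primitive id to a dict of its nvpair name/value pairs."""
    root = ET.fromstring(xml_text)
    devices = {}
    for prim in root.iter("primitive"):
        if prim.get("class") != "stonith":
            continue  # skip non-fencing resources
        devices[prim.get("id")] = {
            nv.get("name"): nv.get("value") for nv in prim.iter("nvpair")
        }
    return devices


def targets(attrs):
    """Node names from pcmk_host_list (assumed whitespace- or comma-separated)."""
    return attrs.get("pcmk_host_list", "").replace(",", " ").split()


if __name__ == "__main__":
    for dev_id, attrs in stonith_devices(CIB_FRAGMENT).items():
        print(f"{dev_id}: port {attrs.get('port')} fences {targets(attrs)}")
```

With the configuration above this pairs fence_1 with lisel1 and fence_2 with lisel2, which matches the location constraints that keep each device off the node it is supposed to fence.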
> 
> old master is back now as slave.
> now on (new) master stonith_admin does not see the device/fence-agent.
> (see last message)
> 
> how can i repair this ?
> 
> thanks
> andreas
> 
> 2013/11/11, Andrew Beekhof <andrew at beekhof.net>:
>> Impossible to comment without knowing the pacemaker version, full config,
>> and how fence_ifmib works (I assume its a custom agent?)
>> 
>> On 12 Nov 2013, at 1:21 am, andreas graeper <agraeper at googlemail.com>
>> wrote:
>> 
>>> hi,
>>> two nodes.
>>> n1 (slave) fence_2:stonith:fence_ifmib
>>> n2 (master) fence_1:stonith:fence_ifmib
>>> 
>>> n1 was fenced because it was suddenly not reachable. (reason still unknown)
>>> 
>>> n2 > stonith_admin -L -> 'fence_1'
>>> n2 > stonith_admin -U fence_1       timed out
>>> n2 > stonith_admin -L -> 'no devices found'
>>> 
>>> crm_mon shows fence_1 is running
>>> 
>>> after manually unfencing n1 with snmpset the slave n1 is up again, but
>>> stonith_admin -L still reports 'no devices found' on n2
>>> the same command on n1 reports: 'fence_2 \n 1 devices found'
>>> 
>>> what went wrong with stonith_admin ?
>>> 
>>> when calling crm_mon -rA1 at the end 'Node Attributes' are listed :
>>> 
>>> * Node lisel1:
>>>   + master-p_drbd_r0:0              	: 5
>>> * Node lisel2:
>>>   + master-p_drbd_r0:0              	: 5
>>>   + master-p_drbd_r0:1              	: 5
>>> 
>>> looks strange ? resources are
>>> ms_drbd_r0 on primary
>>> p_drbd_r0 on secondary
>>> ?! or how is this to be interpreted ?
>>> 
>>> thanks in advance
>>> andreas
>>> 
>>> _______________________________________________
>>> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
>>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>>> 
>>> Project Home: http://www.clusterlabs.org
>>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>>> Bugs: http://bugs.clusterlabs.org
