[Pacemaker] stonith

Andrei Borzenkov arvidjaar at gmail.com
Sun Apr 19 09:37:11 EDT 2015


On Sun, 19 Apr 2015 14:23:27 +0200
Andreas Kurz <andreas.kurz at gmail.com> wrote:

> On 2015-04-17 12:36, Thomas Manninger wrote:
> > Hi list,
> >  
> > I have a Pacemaker/Corosync 2 setup with 4 nodes, with stonith configured
> > over the IPMI interface.
> >  
> > My problem is that sometimes the wrong node is stonithed.
> > For example:
> > I have 4 servers: node1, node2, node3, node4
> >  
> > I trigger a hardware reset on node1, but both node1 and node3 get
> > stonithed.
> 
> You have to tell Pacemaker exactly which stonith resource can fence which
> node if the stonith agent you are using does not support the "list" action.
> 

Pacemaker is expected to get this information dynamically from the
stonith agent.

> Do this by adding "pcmk_host_check=static-list" and "pcmk_host_list" to
> every stonith-resource like:
> 

The default for pcmk_host_check is "dynamic-list"; why does it not work in
this case? I use external/ipmi myself and I do not remember ever fiddling
with a static list.

> primitive p_stonith_node3 stonith:external/ipmi \
>   op monitor interval=3s timeout=20s \
>   params hostname=node3 ipaddr=10.100.0.6 passwd_method=file \
>   passwd="/etc/stonith_ipmi_passwd" userid=stonith interface=lanplus \
>   priv=OPERATOR \
>   pcmk_host_check="static-list" pcmk_host_list="node3"
> 
> ... see "man stonithd".
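
For reference, here is roughly what that suggestion looks like when applied
to all four devices from Thomas's configuration. This is only a sketch: the
two pcmk_* parameters are the only additions, every other value is copied
from the original post, and it assumes each IPMI device can fence only the
node named in its hostname parameter.

```
# Sketch only: pcmk_host_check/pcmk_host_list added to each device,
# assuming one IPMI device per node; all other values are from the
# original post.
primitive p_stonith_node1 stonith:external/ipmi \
        params hostname=node1 ipaddr=10.100.0.2 passwd_method=file \
        passwd="/etc/stonith_ipmi_passwd" userid=stonith interface=lanplus \
        priv=OPERATOR \
        pcmk_host_check="static-list" pcmk_host_list="node1" \
        op monitor interval=3s timeout=20s \
        meta target-role=Started failure-timeout=30s
primitive p_stonith_node2 stonith:external/ipmi \
        params hostname=node2 ipaddr=10.100.0.4 passwd_method=file \
        passwd="/etc/stonith_ipmi_passwd" userid=stonith interface=lanplus \
        priv=OPERATOR \
        pcmk_host_check="static-list" pcmk_host_list="node2" \
        op monitor interval=3s timeout=20s \
        meta target-role=Started failure-timeout=30s
primitive p_stonith_node3 stonith:external/ipmi \
        params hostname=node3 ipaddr=10.100.0.6 passwd_method=file \
        passwd="/etc/stonith_ipmi_passwd" userid=stonith interface=lanplus \
        priv=OPERATOR \
        pcmk_host_check="static-list" pcmk_host_list="node3" \
        op monitor interval=3s timeout=20s \
        meta target-role=Started failure-timeout=30s
primitive p_stonith_node4 stonith:external/ipmi \
        params hostname=node4 ipaddr=10.100.0.8 passwd_method=file \
        passwd="/etc/stonith_ipmi_passwd" userid=stonith interface=lanplus \
        priv=OPERATOR \
        pcmk_host_check="static-list" pcmk_host_list="node4" \
        op monitor interval=3s timeout=20s \
        meta target-role=Started failure-timeout=30s
```

With a static list, each device is only considered for the node named in its
pcmk_host_list, so a reboot of node1 should no longer be routed through
p_stonith_node3.
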
> 
> Best regards,
> Andreas
> 
> >  
> > In cluster.log, I found the following entries:
> > Apr 17 11:02:41 [20473] node2   stonithd:    debug: stonith_action_create:       Initiating action reboot for agent fence_legacy (target=node1)
> > Apr 17 11:02:41 [20473] node2   stonithd:    debug: make_args:  Performing reboot action for node 'node1' as 'port=node1'
> > Apr 17 11:02:41 [20473] node2   stonithd:    debug: internal_stonith_action_execute:     forking
> > Apr 17 11:02:41 [20473] node2   stonithd:    debug: internal_stonith_action_execute:     sending args
> > Apr 17 11:02:41 [20473] node2   stonithd:    debug: stonith_device_execute:      Operation reboot for node node1 on p_stonith_node3 now running with pid=113092, timeout=60s
> >  
> > So node1 is reset using the stonith primitive of node3? Why?
> >  
> > My stonith config:
> > primitive p_stonith_node1 stonith:external/ipmi \
> >         params hostname=node1 ipaddr=10.100.0.2 passwd_method=file \
> >         passwd="/etc/stonith_ipmi_passwd" userid=stonith interface=lanplus \
> >         priv=OPERATOR \
> >         op monitor interval=3s timeout=20s \
> >         meta target-role=Started failure-timeout=30s
> > primitive p_stonith_node2 stonith:external/ipmi \
> >         op monitor interval=3s timeout=20s \
> >         params hostname=node2 ipaddr=10.100.0.4 passwd_method=file \
> >         passwd="/etc/stonith_ipmi_passwd" userid=stonith interface=lanplus \
> >         priv=OPERATOR \
> >         meta target-role=Started failure-timeout=30s
> > primitive p_stonith_node3 stonith:external/ipmi \
> >         op monitor interval=3s timeout=20s \
> >         params hostname=node3 ipaddr=10.100.0.6 passwd_method=file \
> >         passwd="/etc/stonith_ipmi_passwd" userid=stonith interface=lanplus \
> >         priv=OPERATOR \
> >         meta target-role=Started failure-timeout=30s
> > primitive p_stonith_node4 stonith:external/ipmi \
> >         op monitor interval=3s timeout=20s \
> >         params hostname=node4 ipaddr=10.100.0.8 passwd_method=file \
> >         passwd="/etc/stonith_ipmi_passwd" userid=stonith interface=lanplus \
> >         priv=OPERATOR \
> >         meta target-role=Started failure-timeout=30s
> >  
> > Can somebody help me?
> > Thanks!
> >  
> > Regards,
> > Thomas
