[ClusterLabs] Pacemaker not always selecting the right stonith device

Martin Schlegel martin at nuboreto.org
Wed Jul 20 13:02:46 EDT 2016


Thank you Andrei, Ken & Klaus - much appreciated!

I am now including pcmk_host_list and pcmk_host_check=static-list on each of
the stonith primitives.

The command stonith_admin -l <node_name> now shows the right stonith device
- the one matching the requested <node_name>, e.g. stonith_admin -l pg1 shows
only the registered device p_ston_pg1.

However, could you please have another look? I'd like to understand what I am
seeing:

1) Why does pg3 have stonith devices registered even though none of the stonith
resources (p_ston_pg1, p_ston_pg2 or p_ston_pg3) were started on pg3 according
to the crm_mon output?
2) Why does pg2 have p_ston_pg3 registered although it only runs p_ston_pg1
according to the crm_mon output?

(see also the detailed output for stonith_admin further below)

Cheers,
Martin

______________

[...]
primitive p_ston_pg1 stonith:external/ipmi \
    params hostname=pg1 pcmk_host_list=pg1 pcmk_host_check=static-list \
    ipaddr=10.148.128.35 userid=root \
    passwd="/var/vcap/data/packages/pacemaker/ra-tmp/stonith/PG1-ipmipass" \
    passwd_method=file interface=lan priv=OPERATOR

primitive p_ston_pg2 stonith:external/ipmi \
    params hostname=pg2 pcmk_host_list=pg2 pcmk_host_check=static-list \
    ipaddr=10.148.128.19 userid=root \
    passwd="/var/vcap/data/packages/pacemaker/ra-tmp/stonith/PG2-ipmipass" \
    passwd_method=file interface=lan priv=OPERATOR

primitive p_ston_pg3 stonith:external/ipmi \
    params hostname=pg3 pcmk_host_list=pg3 pcmk_host_check=static-list \
    ipaddr=10.148.128.59 userid=root \
    passwd="/var/vcap/data/packages/pacemaker/ra-tmp/stonith/PG3-ipmipass" \
    passwd_method=file interface=lan priv=OPERATOR
[...]
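
(As an aside: pcmk_host_map, which Ken suggests in the quoted reply below,
would only be needed if the fence device knew the nodes under different names
than the cluster does. A sketch of what that would look like here - the name
pg1-ipmi.example.com is made up purely for illustration:

primitive p_ston_pg1 stonith:external/ipmi \
    params hostname=pg1 pcmk_host_map=pg1:pg1-ipmi.example.com \
    pcmk_host_check=static-list ipaddr=10.148.128.35 userid=root \
    passwd="/var/vcap/data/packages/pacemaker/ra-tmp/stonith/PG1-ipmipass" \
    passwd_method=file interface=lan priv=OPERATOR

In our setup the cluster node names and the names the device expects are the
same, so pcmk_host_list is sufficient.)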


root@dsvt0-resiliency-test-7:~# crm_mon -1rR
Last updated: Wed Jul 20 14:36:13 2016
Last change: Wed Jul 20 14:24:19 2016 by root via cibadmin on pg2
Stack: corosync
Current DC: pg2 (2) (version 1.1.14-70404b0) - partition with quorum
3 nodes and 25 resources configured

Online: [ pg1 (1) pg2 (2) pg3 (3) ]

Full list of resources:

p_ston_pg1 (stonith:external/ipmi): Started pg2
p_ston_pg2 (stonith:external/ipmi): Started pg1
p_ston_pg3 (stonith:external/ipmi): Started pg1

[...]


root@test123:~# for xnode in pg{1..3}; do \
    ssh -q $xnode "echo -en $xnode'\n======\n\n'; \
      for node in pg{1..3}; do \
        echo -en 'Fence node '\$node' with:\n'; stonith_admin -l \$node; echo '--'; \
      done"; \
  done
pg1
======

Fence node pg1 with:
No devices found
--
Fence node pg2 with:
1 devices found
p_ston_pg2
--
Fence node pg3 with:
1 devices found
p_ston_pg3
--
pg2
======

Fence node pg1 with:
1 devices found
p_ston_pg1
--
Fence node pg2 with:
No devices found
--
Fence node pg3 with:
1 devices found
p_ston_pg3
--
pg3
======

Fence node pg1 with:
1 devices found
p_ston_pg1
--
Fence node pg2 with:
1 devices found
p_ston_pg2
--
Fence node pg3 with:
No devices found
--



root@test123:~# for xnode in pg{1..3}; do \
    ssh -q $xnode "echo -en $xnode'\n======\n\n'; stonith_admin -L; echo"; \
  done
pg1
======

2 devices found
p_ston_pg3
p_ston_pg2

pg2
======

2 devices found
p_ston_pg3
p_ston_pg1

pg3
======

2 devices found
p_ston_pg1
p_ston_pg2



> Andrei Borzenkov <arvidjaar at gmail.com> wrote on 20 July 2016 at 08:26:
> 
> On Tue, Jul 19, 2016 at 6:33 PM, Martin Schlegel <martin at nuboreto.org> wrote:
> >> > [...]
> >> >
> >> > primitive p_ston_pg1 stonith:external/ipmi \
> >> > params hostname=pg1 ipaddr=10.148.128.35 userid=root
> >> > passwd="/var/vcap/data/packages/pacemaker/ra-tmp/stonith/PG1-ipmipass"
> >> > passwd_method=file interface=lan priv=OPERATOR
> >> >
> >> > primitive p_ston_pg2 stonith:external/ipmi \
> >> > params hostname=pg2 ipaddr=10.148.128.19 userid=root
> >> > passwd="/var/vcap/data/packages/pacemaker/ra-tmp/stonith/PG2-ipmipass"
> >> > passwd_method=file interface=lan priv=OPERATOR
> >> >
> >> > primitive p_ston_pg3 stonith:external/ipmi \
> >> > params hostname=pg3 ipaddr=10.148.128.59 userid=root
> >> > passwd="/var/vcap/data/packages/pacemaker/ra-tmp/stonith/PG3-ipmipass"
> >> > passwd_method=file interface=lan priv=OPERATOR
> >> >
> >> > location l_pgs_resources { otherstuff p_ston_pg1 p_ston_pg2 p_ston_pg3 }
> >> > resource-discovery=exclusive \
> >> > rule #uname eq pg1 \
> >> > rule #uname eq pg2 \
> >> > rule #uname eq pg3
> >> >
> >> > location l_ston_pg1 p_ston_pg1 -inf: pg1
> >> > location l_ston_pg2 p_ston_pg2 -inf: pg2
> >> > location l_ston_pg3 p_ston_pg3 -inf: pg3
> >>
> >> These constraints prevent each device from running on its intended
> >> target, but they don't limit which nodes each device can fence. For
> >> that, each device needs a pcmk_host_list or pcmk_host_map entry, for
> >> example:
> >>
> >> primitive p_ston_pg1 ... pcmk_host_map=pg1:pg1.ipmi.example.com
> >>
> >> Use pcmk_host_list if the fence device needs the node name as known to
> >> the cluster, and pcmk_host_map if you need to translate a node name to
> >> an address the device understands.
> 
> > We used the parameter "hostname". What does it do if not that?
> 
> hostname is a resource parameter. From Pacemaker's point of view it is an
> opaque string, and only the resource agent knows how to interpret it.
> 
> See the discussion in another part of this thread. The agent is supposed to
> return host information based on the "hostname" parameter, but apparently it
> does not understand the query when Pacemaker asks it.
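
(Illustrative note on checking the agent directly: if I understand the
cluster-glue external plugin interface correctly, you can ask the plugin
which hosts it claims to manage by invoking it with the gethosts action,
with the resource parameters passed as environment variables. A sketch -
the plugin path below is the usual cluster-glue location and may differ on
other installations:

hostname=pg1 ipaddr=10.148.128.35 userid=root \
passwd=/var/vcap/data/packages/pacemaker/ra-tmp/stonith/PG1-ipmipass \
passwd_method=file interface=lan priv=OPERATOR \
/usr/lib/stonith/plugins/external/ipmi gethosts

If this prints pg1, the agent can answer a dynamic host-list query; if it
prints nothing, then pcmk_host_list with pcmk_host_check=static-list is the
right way to tell Pacemaker which node the device fences.)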



