[ClusterLabs] ProfitBricks STONITH fencing agent (plugin) development

Tiago Santos tetreis at gmail.com
Sat May 9 23:24:03 EDT 2015


Hello folks,


I've been developing an initial version of a fencing agent to manage
ProfitBricks VMs (http://profitbricks.com/).

The initial code can be seen at
https://github.com/tetreis/profitbricks_stonith_plugin/blob/master/profitbricks


The fencing agent uses the ProfitBricks SOAP API, which is documented at
https://devops.profitbricks.com/api/soap/

It uses a config file to translate the given node name into a
ProfitBricks server ID. It is called like this:

/usr/lib/stonith/plugins/external/profitbricks action hostname
       Action can be: status, on, off, reset
       Hostname needs to be on config file (/etc/pb.conf)

It works fine in basic manual tests.
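
For illustration only, the hostname-to-server-ID translation described
above might look roughly like the sketch below, assuming a shell
implementation. The config layout (one "hostname server-id" pair per
line, '#' for comments) and the function name lookup_server_id are
assumptions made for this example; they are not taken from the actual
agent or from the real /etc/pb.conf format.

```shell
#!/bin/sh
# Hypothetical config layout (an assumption, not the real /etc/pb.conf):
#   # comment lines start with '#'
#   node1  server-id-for-node1
#   node2  server-id-for-node2

# Print the server ID for hostname $1 found in config file $2.
# Returns 0 when the hostname is listed, 1 when it is not.
lookup_server_id() {
    awk -v host="$1" '
        /^[[:space:]]*#/ { next }                  # skip comment lines
        NF >= 2 && $1 == host {                    # first matching entry wins
            print $2; found = 1; exit
        }
        END { exit found ? 0 : 1 }                 # non-zero if no entry matched
    ' "$2"
}
```

A caller could then run something like
id=$(lookup_server_id "$hostname" /etc/pb.conf) and bail out with an
error when the function returns non-zero, so an unknown hostname never
reaches the API call.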


It also replies correctly to:

# stonith -t external/profitbricks -h


STONITH Device: external/profitbricks - ProfitBricks host
reboot/poweron/poweroff/status

For more information see http://profitbricks.com/

List of valid parameter names for external/profitbricks STONITH device:
hostname
For Config info [-p] syntax, give each of the above parameters in order as
the -p value.
Arguments are separated by white space.
Config file [-F] syntax is the same as -p, except # at the start of a line
denotes a comment



But I'm having a hard time figuring out how to make it work with STONITH
(or maybe just understanding how STONITH works; sorry, I'm a total newbie
at this) and how to configure it with Pacemaker/Corosync.

So I'd like to ask for your help on the following two points:


1. When I run:

# stonith -t external/profitbricks -p status node1

I get a node reset.

And when I run:

# stonith -t external/profitbricks -p node1

I get the default stonith usage help, as if my syntax were wrong.

I've read and researched a lot, but couldn't figure out what I'm doing
wrong here, although it seems to be a pretty basic mistake.



2. I have a setup with Pacemaker/Corosync configured. The virtual/floating
IP (resource) works reliably. Now I'm trying to add STONITH using the
ProfitBricks plugin. I have a file with the following configuration for
crm:

configure
primitive st-node1 stonith:external/profitbricks \
params hostname=node1
primitive st-node2 stonith:external/profitbricks \
params hostname=node2
location l-st-node1 st-node1 -inf: node1
location l-st-node2 st-node2 -inf: node2
commit

I call it like:

crm < file.config


The fencing agent then reports its usage error (emitted when $1 is not
recognised):

Usage: /usr/lib/stonith/plugins/external/profitbricks action hostname
       Action can be: status, on, off, reset
       Hostname needs to be on config file (/etc/pb.conf)
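
The behaviour described above, where any unrecognised first argument
falls through to the usage message, can be sketched as follows. This is
an illustrative shell outline, not the agent's actual code: the function
names dispatch, usage and log_args are hypothetical, the echo stands in
for the real API call, and the argument logging is only a suggested
debugging aid for seeing exactly what the caller passes in.

```shell
#!/bin/sh
# Print the usage text quoted in the email to stderr.
usage() {
    echo "Usage: $0 action hostname" >&2
    echo "       Action can be: status, on, off, reset" >&2
    echo "       Hostname needs to be on config file (/etc/pb.conf)" >&2
}

# Debugging aid (an addition for this sketch): record the argument count
# and the arguments exactly as received.
log_args() {
    echo "called with $# argument(s): $*" >&2
}

# Dispatch on the action in $1; $2 is the hostname.
dispatch() {
    log_args "$@"
    action=$1
    host=$2
    case "$action" in
        status|on|off|reset)
            # A known action still needs a hostname to act on.
            [ -n "$host" ] || { usage; return 1; }
            echo "would run '$action' on '$host'"   # placeholder for the API call
            ;;
        *)
            # Anything else in $1 ends up here and triggers the usage error.
            usage
            return 1
            ;;
    esac
}
```

With this shape, dispatch status node1 succeeds, while a call whose first
argument is anything other than the four listed actions prints the usage
text and returns a non-zero status.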


crm_mon gives me:

# crm_mon -1
Last updated: Sat May  9 20:12:47 2015
Last change: Sat May  9 20:12:40 2015 via cibadmin on node1
Stack: corosync
Current DC: node1 (1084751975) - partition with quorum
Version: 1.1.10-42f2063
2 Nodes configured
3 Resources configured


Online: [ node1 node2 ]


 VIP (ocf::heartbeat:IPaddr2): Started node1

Failed actions:
    st-node2_start_0 (node=node1, call=64, rc=1, status=Error,
last-rc-change=Sat May  9 20:12:41 2015
, queued=3195ms, exec=0ms
): unknown error
    st-node1_start_0 (node=node2, call=18, rc=1, status=Error,
last-rc-change=Sat May  9 20:12:40 2015
, queued=4202ms, exec=0ms
): unknown error


And inside crm both STONITH resources are stopped:

# crm
crm(live)# resource list
 VIP (ocf::heartbeat:IPaddr2): Started
 st-node1 (stonith:external/profitbricks): Stopped
 st-node2 (stonith:external/profitbricks): Stopped


Could you help me figure out what's blocking me from moving forward
here?

I know the fencing agent does not yet obey all the rules described at
https://fedorahosted.org/cluster/wiki/FenceAgentAPI, but that's not the
reason why I'm getting these errors. I would like to understand what's
going on before continuing, in the hope of getting a better grasp of the
whole thing.


Sorry for the long email, and thank you so much in advance for the help.


Cheers,
-- 
*Tiago Santos*