[ClusterLabs Developers] ProfitBricks STONITH fencing agent development

Tiago Santos tetreis at gmail.com
Sun May 10 03:19:49 UTC 2015


Hello folks,


I've been developing an initial version of a fencing agent to manage
ProfitBricks VMs (http://profitbricks.com/).

The initial code can be seen at
https://github.com/tetreis/profitbricks_stonith_plugin/blob/master/profitbricks


The fencing agent uses the ProfitBricks SOAP API, which is documented at
https://devops.profitbricks.com/api/soap/

It uses a config file that maps the provided node name to a
ProfitBricks server ID. It is called like this:

/usr/lib/stonith/plugins/external/profitbricks action hostname
       Action can be: status, on, off, reset
       Hostname needs to be on config file (/etc/pb.conf)
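
As a rough illustration of that name-to-ID lookup (a sketch only, not the
code in the repository): it assumes one "hostname=server-id" pair per line
in the config file, which may not match the real /etc/pb.conf layout. The
PB_CONF variable and the lookup_server_id function are hypothetical names.

```shell
#!/bin/sh
# Hypothetical sketch: resolve a node name to a ProfitBricks server ID.
# ASSUMPTION: the config file holds one "hostname=server-id" pair per line;
# the real /etc/pb.conf format may differ.
PB_CONF="${PB_CONF:-/etc/pb.conf}"

lookup_server_id() {
    # $1 = node name; prints the server ID, returns 1 if it is not listed
    node="$1"
    # "|| [ -n "$name" ]" also handles a last line without a trailing newline
    while IFS='=' read -r name id || [ -n "$name" ]; do
        case "$name" in
            ''|'#'*) continue ;;   # skip blank lines and comments
        esac
        if [ "$name" = "$node" ]; then
            printf '%s\n' "$id"
            return 0
        fi
    done < "$PB_CONF"
    return 1
}
```

With a line such as "node1=abc-123" in the file, "lookup_server_id node1"
would print abc-123, and an unknown host would return a non-zero status.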

It works fine in basic manual tests.


It also replies correctly to:

# stonith -t external/profitbricks -h


STONITH Device: external/profitbricks - ProfitBricks host
reboot/poweron/poweroff/status

For more information see http://profitbricks.com/

List of valid parameter names for external/profitbricks STONITH device:
hostname
For Config info [-p] syntax, give each of the above parameters in order as
the -p value.
Arguments are separated by white space.
Config file [-F] syntax is the same as -p, except # at the start of a line
denotes a comment



But I'm having a hard time figuring out how to make it work with STONITH
(or maybe just understanding how STONITH works; sorry, I'm a total newbie
at this) and how to configure it with Pacemaker/Corosync.

So I'd like to ask for your help with the following two points:


1. When I run:

# stonith -t external/profitbricks -p status node1

I get a node reset.

And when I run:

# stonith -t external/profitbricks -p node1

I get the default stonith usage help, as if my syntax were wrong.

I've read and researched a lot, but couldn't figure out what I'm doing
wrong here, although it seems to be a pretty basic mistake.



2. I have a setup with Pacemaker/Corosync configured. The virtual/floating
IP resource works reliably. Now I'm trying to add STONITH using the
ProfitBricks plugin. I have a file with the following crm configuration:

configure
primitive st-node1 stonith:external/profitbricks \
params hostname=node1
primitive st-node2 stonith:external/profitbricks \
params hostname=node2
location l-st-node1 st-node1 -inf: node1
location l-st-node2 st-node2 -inf: node2
commit

I load it with:

crm < file.config


The fencing agent then reports its usage error (printed when $1 is not
recognised):

Usage: /usr/lib/stonith/plugins/external/profitbricks action hostname
       Action can be: status, on, off, reset
       Hostname needs to be on config file (/etc/pb.conf)
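
To make clear which code path prints this message, here is a minimal
sketch of an "action hostname" dispatch. It is not the actual agent:
dispatch, usage and pb_call are hypothetical names, and pb_call is only a
stand-in for the real SOAP request.

```shell
#!/bin/sh
# Hypothetical sketch of an "action hostname" dispatch; not the real agent.

usage() {
    echo "Usage: $0 action hostname" >&2
    echo "       Action can be: status, on, off, reset" >&2
    return 1
}

pb_call() {
    # Stand-in for the real API request: only reports what would be done.
    echo "would run '$1' on '$2'"
}

dispatch() {
    action="$1"
    host="$2"
    case "$action" in
        status|on|off|reset)
            # A known action still needs a host name to act on
            [ -n "$host" ] || { usage; return 1; }
            pb_call "$action" "$host"
            ;;
        *)
            # Any other first argument ends up in the usage error
            usage
            ;;
    esac
}
```

In this sketch, any first argument outside the four listed actions, or a
missing hostname, ends in the usage message with a non-zero exit status.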


crm_mon then shows:

# crm_mon -1
Last updated: Sat May  9 20:12:47 2015
Last change: Sat May  9 20:12:40 2015 via cibadmin on node1
Stack: corosync
Current DC: node1 (1084751975) - partition with quorum
Version: 1.1.10-42f2063
2 Nodes configured
3 Resources configured


Online: [ node1 node2 ]

 VIP (ocf::heartbeat:IPaddr2): Started node1

Failed actions:
    st-node2_start_0 (node=node1, call=64, rc=1, status=Error,
last-rc-change=Sat May  9 20:12:41 2015
, queued=3195ms, exec=0ms
): unknown error
    st-node1_start_0 (node=node2, call=18, rc=1, status=Error,
last-rc-change=Sat May  9 20:12:40 2015
, queued=4202ms, exec=0ms
): unknown error


And inside crm both STONITH resources are stopped:

# crm
crm(live)# resource list
 VIP (ocf::heartbeat:IPaddr2): Started
 st-node1 (stonith:external/profitbricks): Stopped
 st-node2 (stonith:external/profitbricks): Stopped


Could you help me figure out what's blocking me here?

I know the fencing agent does not yet follow all the rules described at
https://fedorahosted.org/cluster/wiki/FenceAgentAPI, but that's not the
reason I'm getting these errors. I'd like to understand what is going on
before moving ahead, and also to get a better grasp of the whole thing.


Sorry for the long email, and thank you so much in advance for the help.


Cheers,
-- 
*Tiago Santos*