[ClusterLabs] ProfitBricks STONITH fencing agent (plugin) development

Mon May 18 10:56:34 UTC 2015

Hi,

On 05/10/2015 05:24 AM, Tiago Santos wrote:
> Hello folks,
>
>
> I've been developing an initial version of a fencing agent to allow 
> management of ProfitBricks VMs (http://profitbricks.com/).
>
> The initial code can be seen at 
> https://github.com/tetreis/profitbricks_stonith_plugin/blob/master/profitbricks
>
>
> The fencing agent uses the ProfitBricks SOAP API, which can be found at 
> https://devops.profitbricks.com/api/soap/
>
> It uses a config file that translates the provided node name to a 
> ProfitBricks server ID. The call looks like:
>
> /usr/lib/stonith/plugins/external/profitbricks action hostname
>        Action can be: status, on, off, reset
>        Hostname needs to be on config file (/etc/pb.conf)
>
> It works fine on basic manual tests.
In general, we prefer that fence agents do not use configuration files.
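
One way to avoid the config file, sketched under assumptions: with external
plugins, the values set via "params" reach the plugin as environment
variables, so the node-to-server-ID mapping could travel in a parameter
instead of /etc/pb.conf. The parameter name server_map and its "node=id"
format are hypothetical and not part of the existing plugin.

```shell
# Hypothetical sketch: look up a ProfitBricks server ID from a parameter
# passed in the environment rather than from /etc/pb.conf.
# Assumed format: server_map="node1=<id1> node2=<id2>"
lookup_server_id() {
    node="$1"
    for pair in $server_map; do
        case "$pair" in
            # Match "node=..." and print everything after the first "="
            "$node"=*) printf '%s\n' "${pair#*=}"; return 0 ;;
        esac
    done
    # Node not present in the mapping
    return 1
}
```

With server_map="node1=srv-aaa node2=srv-bbb" in the environment,
"lookup_server_id node2" prints srv-bbb and an unknown node returns 1.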

If you have no aversion to Python, it should be fairly simple to 
convert it to our fencing library, which provides everything required 
(metadata, manual page generation, ...). There is already an agent that
uses SOAP (fence_vmware_soap*), so it can be used as a basis.

If you have any questions or problems, feel free to ask.

m,

(*) = it is more complex than your case requires, as it uses 
attributes in a strange way

>
>
> It also replies correctly to:
>
> # stonith -t external/profitbricks -h
>
>
> STONITH Device: external/profitbricks - ProfitBricks host 
> reboot/poweron/poweroff/status
>
> For more information see http://profitbricks.com/
>
> List of valid parameter names for external/profitbricks STONITH device:
> hostname
> For Config info [-p] syntax, give each of the above parameters in order as
> the -p value.
> Arguments are separated by white space.
> Config file [-F] syntax is the same as -p, except # at the start of a line
> denotes a comment
>
>
>
> But I'm having a hard time figuring out how to make it work with 
> STONITH (or maybe understanding how STONITH works - sorry, I'm a total 
> newbie on this) and how to configure it with Pacemaker/Corosync.
>
> All this to request your help on the following two points:
>
>
> 1. When I run:
>
> # stonith -t external/profitbricks -p status node1
>
> I get a node reset.
>
> And when I run:
>
> # stonith -t external/profitbricks -p node1
>
> I get the default stonith usage help, as if my syntax were wrong.
>
> I've read and researched a lot, but couldn't figure out what I'm doing 
> wrong here, although it seems to be a pretty basic mistake.
>
>
>
> 2. I have a setup with Pacemaker/Corosync configured. Virtual/floating 
> IP (resource) works just fine and solid. Then I'm trying to add 
> STONITH using the ProfitBricks plugin. I have a file with the 
> following configuration for crm:
>
> configure
> primitive st-node1 stonith:external/profitbricks \
> params hostname=node1
> primitive st-node2 stonith:external/profitbricks \
> params hostname=node2
> location l-st-node1 st-node1 -inf: node1
> location l-st-node2 st-node2 -inf: node2
> commit
>
> I call it like:
>
> crm < file.config
>
>
> Then the fencing agent reports the usage error (printed when $1 is 
> not recognised):
>
> Usage: /usr/lib/stonith/plugins/external/profitbricks action hostname
>        Action can be: status, on, off, reset
>        Hostname needs to be on config file (/etc/pb.conf)
>
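
A possible explanation for this usage error, assuming the plugin is driven
through the cluster-glue external plugin convention: the plugin is run with
the action as its first argument, configured parameters such as hostname
arrive as environment variables, and the target node for on/off/reset is
passed as the second argument. A plugin that insists on "action hostname"
on the command line would then fall through to its usage message. A
minimal dispatcher sketch under that assumption follows; pb_power is a
hypothetical placeholder for the SOAP call, and the getinfo-* actions are
omitted.

```shell
#!/bin/sh
# Sketch only: pb_power stands in for the real ProfitBricks SOAP request.
pb_power() {
    echo "would request '$1' for '$2'"
}

# $1 = action, $2 = target node (only given for on/off/reset).
# "hostname" is read from the environment, as set by "params hostname=...".
pb_dispatch() {
    action="$1"
    target="${2:-$hostname}"
    case "$action" in
        gethosts)       echo "$hostname" ;;            # nodes this device can fence
        getconfignames) echo "hostname" ;;             # names of accepted parameters
        status)         [ -n "$hostname" ] ;;          # succeed only if configured
        on|off|reset)   pb_power "$action" "$target" ;;
        *)              return 1 ;;                    # unknown action
    esac
}
```

In a real plugin the script would end by calling pb_dispatch "$@" and
exiting with its return code.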
>
> crm_mon will give me:
>
> # crm_mon -1
> Last updated: Sat May  9 20:12:47 2015
> Last change: Sat May  9 20:12:40 2015 via cibadmin on node1
> Stack: corosync
> Current DC: node1 (1084751975) - partition with quorum
> Version: 1.1.10-42f2063
> 2 Nodes configured
> 3 Resources configured
>
>
> Online: [ node1 node2 ]
>
>
>  VIP(ocf::heartbeat:IPaddr2):Started node1
>
> Failed actions:
>     st-node2_start_0 (node=node1, call=64, rc=1, status=Error, 
> last-rc-change=Sat May  9 20:12:41 2015
> , queued=3195ms, exec=0ms
> ): unknown error
>     st-node1_start_0 (node=node2, call=18, rc=1, status=Error, 
> last-rc-change=Sat May  9 20:12:40 2015
> , queued=4202ms, exec=0ms
> ): unknown error
>
>
> And inside crm both services are stopped:
>
> # crm
> crm(live)# resource list
>  VIP(ocf::heartbeat:IPaddr2):Started
>  st-node1(stonith:external/profitbricks):Stopped
>  st-node2(stonith:external/profitbricks):Stopped
>
>
> Would you guys help me figure out what's blocking me from going ahead 
> on this story?
>
> I know the fencing agent is still not obeying all the rules described 
> at https://fedorahosted.org/cluster/wiki/FenceAgentAPI, but that's not 
> the reason why I'm getting the errors. I would like to understand 
> what's going on before going ahead, in the hope of getting a better 
> understanding of the whole thing.
>
>
> Sorry for the long email, and thank you so much in advance for the help.
>
>
> Cheers,
> -- 
> *Tiago Santos*
>
>
>
