[ClusterLabs Developers] ProfitBricks STONITH fencing agent development
Dejan Muhamedagic
dejanmm at fastmail.fm
Wed May 13 04:05:32 EDT 2015
Hi,
On Sun, May 10, 2015 at 12:19:49AM -0300, Tiago Santos wrote:
> Hello folks,
>
>
> I've been developing an initial version of a fencing agent to allow
> management of ProfitBricks VMs (http://profitbricks.com/).
>
> The initial code can be seen at
> https://github.com/tetreis/profitbricks_stonith_plugin/blob/master/profitbricks
>
>
> The fencing agent uses the ProfitBricks SOAP API, which can be found at
> https://devops.profitbricks.com/api/soap/
>
> It uses a config file that translates the provided node name to
> ProfitBricks server ID. The call is like:
>
> /usr/lib/stonith/plugins/external/profitbricks action hostname
> Action can be: status, on, off, reset
> Hostname needs to be on config file (/etc/pb.conf)
>
> It works fine on basic manual tests.
>
>
> It also replies correctly to:
>
> # stonith -t external/profitbricks -h
>
>
> STONITH Device: external/profitbricks - ProfitBricks host
> reboot/poweron/poweroff/status
>
> For more information see http://profitbricks.com/
>
> List of valid parameter names for external/profitbricks STONITH device:
> hostname
> For Config info [-p] syntax, give each of the above parameters in order as
> the -p value.
> Arguments are separated by white space.
> Config file [-F] syntax is the same as -p, except # at the start of a line
> denotes a comment
>
>
>
> But I'm having a hard time figuring out how to make it work with STONITH
> (or maybe understanding how STONITH works - sorry, I'm a total newbie on
> this) and how to configure it with Pacemaker/Corosync.
>
> All this to request your help on the following two points:
>
>
> 1. When I run:
>
> # stonith -t external/profitbricks -p status node1
>
> I get a node reset.
:)
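If you run stonith without arguments, it'll print basic usage.
In your command, -p takes "status" as the parameter value and node1
as the node to act on, and with no -T given the action defaults to
reset. You can get the status like this: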
# stonith -t external/profitbricks hostname=node1 -S
To list the nodes the device can fence:
# stonith -t external/profitbricks hostname=node1 -l
To reset a node (the node name goes last):
# stonith -t external/profitbricks hostname=node1 -T reset node1
> And when I run:
>
> # stonith -t external/profitbricks -p node1
>
> I get the default stonith usage help, as if my syntax were wrong.
The stonith(8) interface takes some time getting used to.
> I've read and researched a lot, but couldn't figure out what I'm doing
> wrong here, although it seems to be a pretty basic mistake.
>
>
>
> 2. I have a setup with Pacemaker/Corosync configured. Virtual/floating IP
> (resource) works just fine and solid. Then I'm trying to add STONITH using
> the ProfitBricks plugin. I have a file with the following configuration for
> crm:
>
> configure
> primitive st-node1 stonith:external/profitbricks \
> params hostname=node1
> primitive st-node2 stonith:external/profitbricks \
> params hostname=node2
> location l-st-node1 st-node1 -inf: node1
> location l-st-node2 st-node2 -inf: node2
> commit
>
> I call it like:
>
> crm < file.config
>
>
> Then I'll see on the fencing agent the usage error (called when $1 is not
> recognised):
>
> Usage: /usr/lib/stonith/plugins/external/profitbricks action hostname
> Action can be: status, on, off, reset
> Hostname needs to be on config file (/etc/pb.conf)
The configuration looks fine, so there's probably something wrong
with the agent itself. I guess you'll need to debug.
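One thing worth checking while debugging: as far as I know, the
cluster does not call an external plugin as "action hostname". Below
is a minimal sketch of the calling convention used by the external
plugins shipped with cluster-glue: the action arrives as $1, the
configured parameters (here hostname) arrive as environment
variables, and only on/off/reset get a node name as $2. The names
pb_dispatch and pb_power are made up for illustration, and pb_power
only stands in for the real ProfitBricks SOAP calls. None of this is
taken from the agent in the repository.

```shell
#!/bin/sh
# Sketch of an external STONITH plugin dispatcher (cluster-glue convention).
# Assumption: the "hostname" parameter is available as $hostname, the
# action is $1, and for on/off/reset the node to fence is $2.

# Hypothetical placeholder for the real ProfitBricks API call.
pb_power() {
    echo "would run '$1' on '$2'"
}

pb_dispatch() {
    action=$1
    target=$2
    case "$action" in
        gethosts)
            # Nodes this device can fence, one per line.
            echo "$hostname"
            ;;
        on|off|reset)
            # The node to fence is the second argument.
            [ -n "$target" ] || return 1
            pb_power "$action" "$target"
            ;;
        status)
            # Device check only; no node name is passed for this action.
            [ -n "$hostname" ] || return 1
            ;;
        getconfignames)
            echo "hostname"
            ;;
        getinfo-devid|getinfo-devname)
            echo "ProfitBricks STONITH device"
            ;;
        getinfo-devdescr)
            echo "ProfitBricks host reboot/poweron/poweroff/status"
            ;;
        getinfo-devurl)
            echo "http://profitbricks.com/"
            ;;
        getinfo-xml)
            cat <<EOF
<parameters>
<parameter name="hostname" unique="1" required="1">
<content type="string" />
<shortdesc lang="en">Hostname</shortdesc>
<longdesc lang="en">Name of the node this device can fence</longdesc>
</parameter>
</parameters>
EOF
            ;;
        *)
            # Unknown action: fail instead of printing usage.
            return 1
            ;;
    esac
}

# A real plugin would finish with:  pb_dispatch "$@"; exit $?
```

If the agent insists on a node name for status, or does not know
gethosts and the getinfo-* actions, a failed start like the one in
the crm_mon output would be a plausible result.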
> crm_mon will give me:
>
> # crm_mon -1
> Last updated: Sat May 9 20:12:47 2015
> Last change: Sat May 9 20:12:40 2015 via cibadmin on node1
> Stack: corosync
> Current DC: node1 (1084751975) - partition with quorum
> Version: 1.1.10-42f2063
> 2 Nodes configured
> 3 Resources configured
>
>
> Online: [ node1 node2 ]
>
> VIP (ocf::heartbeat:IPaddr2): Started node1
>
> Failed actions:
> st-node2_start_0 (node=node1, call=64, rc=1, status=Error,
> last-rc-change=Sat May 9 20:12:41 2015
> , queued=3195ms, exec=0ms
> ): unknown error
> st-node1_start_0 (node=node2, call=18, rc=1, status=Error,
> last-rc-change=Sat May 9 20:12:40 2015
> , queued=4202ms, exec=0ms
> ): unknown error
>
>
> And inside crm both services are stopped:
>
> # crm
> crm(live)# resource list
> VIP (ocf::heartbeat:IPaddr2): Started
> st-node1 (stonith:external/profitbricks): Stopped
> st-node2 (stonith:external/profitbricks): Stopped
>
>
> Would you guys help me figure out what's blocking me from going ahead on
> this story?
>
> I know the fencing agent is still not obeying all the rules described at
> https://fedorahosted.org/cluster/wiki/FenceAgentAPI, but that's not the
> reason why I'm getting the errors. I would like to understand what's going
> wrong before going ahead, also as a way to get to know the whole thing better.
There's a slight confusion here. The agent you wrote conforms to
the Linux-HA stonith API. The Red Hat fence-agents API described on
that page is somewhat different.
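To illustrate the difference: as I understand the FenceAgentAPI page,
an agent of that kind can be given its options on standard input as
name=value lines, rather than as an action argument plus environment
variables. A rough sketch of reading such input follows. The function
name parse_fence_options and the fa_* variables are made up for
illustration, and only the "action" and "port" options are handled.

```shell
#!/bin/sh
# Sketch: read fence-agent style options (name=value, one per line)
# from standard input. Assumption: "action" names the operation and
# "port" names the machine to fence.
parse_fence_options() {
    fa_action=""
    fa_port=""
    while IFS='=' read -r name value; do
        case "$name" in
            ''|'#'*) continue ;;        # skip blank lines and comments
            action)  fa_action=$value ;;
            port)    fa_port=$value ;;
        esac
    done
}
```

Feed it from a file or a here-document rather than a pipe, so that
the variables are set in the calling shell.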
Thanks,
Dejan
>
> Sorry for the long email, and thank you so much in advance for the help.
>
>
> Cheers,
> --
> *Tiago Santos*
> _______________________________________________
> Developers mailing list
> Developers at clusterlabs.org
> http://clusterlabs.org/mailman/listinfo/developers