<div dir="ltr"><span style="font-size:13px">Hello folks,</span><div style="font-size:13px"><br></div><div style="font-size:13px"><br></div><div style="font-size:13px">I've been developing an initial version of a fencing agent for managing ProfitBricks VMs (<a href="http://profitbricks.com/" target="_blank">http://profitbricks.com/</a>).</div><div style="font-size:13px"><br></div><div style="font-size:13px">The initial code is at <a href="https://github.com/tetreis/profitbricks_stonith_plugin/blob/master/profitbricks" target="_blank">https://github.com/tetreis/profitbricks_stonith_plugin/blob/master/profitbricks</a><br></div><div style="font-size:13px"><br></div><div style="font-size:13px"><br></div><div style="font-size:13px">The fencing agent uses the ProfitBricks SOAP API, documented at <a href="https://devops.profitbricks.com/api/soap/" target="_blank">https://devops.profitbricks.com/api/soap/</a></div><div style="font-size:13px"><br></div><div style="font-size:13px">It uses a config file that maps the provided node name to a ProfitBricks server ID. 
It is invoked like this:</div><div style="font-size:13px"><br></div><div style="font-size:13px"><div>/usr/lib/stonith/plugins/external/profitbricks action hostname</div><div> Action can be: status, on, off, reset</div><div> Hostname needs to be on config file (/etc/pb.conf)</div></div><div style="font-size:13px"><br></div><div style="font-size:13px">It works fine in basic manual tests.</div><div style="font-size:13px"><br></div><div style="font-size:13px"><br></div><div style="font-size:13px">It also responds correctly to:</div><div style="font-size:13px"><br></div><div style="font-size:13px"><div># stonith -t external/profitbricks -h</div><div><br></div><div><br></div><div>STONITH Device: external/profitbricks - ProfitBricks host reboot/poweron/poweroff/status</div><div><br></div><div>For more information see <a href="http://profitbricks.com/" target="_blank">http://profitbricks.com/</a></div><div><br></div><div>List of valid parameter names for external/profitbricks STONITH device:</div><div><span style="white-space:pre-wrap"> </span>hostname</div><div><div><div>For Config info [-p] syntax, give each of the above parameters in order as</div><div>the -p value.</div><div>Arguments are separated by white space.</div><div>Config file [-F] syntax is the same as -p, except # at the start of a line</div><div>denotes a comment</div></div><div><br></div><div><br></div><div><br></div><div>But I'm having a hard time figuring out how to make it work with STONITH (or maybe understanding how STONITH works - sorry, I'm a total newbie at this) and how to configure it with Pacemaker/Corosync.</div><div><br></div><div>So I'd like to ask for your help with the following two points:</div><div><br></div><div><br></div><div>1. 
When I run:</div><div><br></div><div># stonith -t external/profitbricks -p status node1<br></div><div><br></div><div>I get a node reset.</div><div><br></div><div>And when I run:</div><div><br></div><div># stonith -t external/profitbricks -p node1<br></div><div><br></div><div>I get the default stonith usage help, as if my syntax were wrong.</div><div><br></div><div>I've read and researched a lot, but couldn't figure out what I'm doing wrong here, although it seems to be a pretty basic mistake.</div><div><br></div><div><br></div><div><br></div><div>2. I have a setup with Pacemaker/Corosync configured. The virtual/floating IP resource works reliably. Now I'm trying to add STONITH using the ProfitBricks plugin. I have a file with the following configuration for crm:</div><div><br></div><div>configure</div></div><div><div><div>primitive st-node1 stonith:external/profitbricks \</div><div>params hostname=node1</div><div>primitive st-node2 stonith:external/profitbricks \</div><div>params hostname=node2</div><div>location l-st-node1 st-node1 -inf: node1</div><div>location l-st-node2 st-node2 -inf: node2</div><div>commit</div></div><div><br></div><div>I load it with:</div><div><br></div><div>crm &lt; file.config</div><div><br></div><div><br></div><div>The fencing agent then prints its usage error (shown when $1 is not recognised):</div><div><br></div><div><div>Usage: /usr/lib/stonith/plugins/external/profitbricks action hostname</div><div> Action can be: status, on, off, reset</div><div> Hostname needs to be on config file (/etc/pb.conf)</div></div><div><br></div><div><br></div><div>crm_mon then shows:</div><div><br></div><div><div># crm_mon -1</div><div>Last updated: Sat May 9 20:12:47 2015</div><div>Last change: Sat May 9 20:12:40 2015 via cibadmin on node1</div><div>Stack: corosync</div><div>Current DC: node1 (1084751975) - partition with quorum</div><div>Version: 1.1.10-42f2063</div><div>2 Nodes configured</div><div>3 Resources 
configured</div><div><br></div><div><br></div><div>Online: [ node1 node2 ]</div></div></div><div><br></div><div><div><br></div><div> VIP<span style="white-space:pre-wrap"> </span>(ocf::heartbeat:IPaddr2):<span style="white-space:pre-wrap"> </span>Started node1</div><div><br></div><div>Failed actions:</div><div> st-node2_start_0 (node=node1, call=64, rc=1, status=Error, last-rc-change=Sat May 9 20:12:41 2015</div><div>, queued=3195ms, exec=0ms</div><div>): unknown error</div><div> st-node1_start_0 (node=node2, call=18, rc=1, status=Error, last-rc-change=Sat May 9 20:12:40 2015</div><div>, queued=4202ms, exec=0ms</div><div>): unknown error</div></div><div><br></div><div><br></div><div>And inside crm both STONITH resources are stopped:</div><div><br></div><div><div># crm</div><div>crm(live)# resource list</div><div> VIP<span style="white-space:pre-wrap"> </span>(ocf::heartbeat:IPaddr2):<span style="white-space:pre-wrap"> </span>Started</div><div> st-node1<span style="white-space:pre-wrap"> </span>(stonith:external/profitbricks):<span style="white-space:pre-wrap"> </span>Stopped</div><div> st-node2<span style="white-space:pre-wrap"> </span>(stonith:external/profitbricks):<span style="white-space:pre-wrap"> </span>Stopped</div></div><div><br></div><div><br></div><div>Could you help me figure out what's blocking me here?</div><div><br></div><div>I know the fencing agent does not yet follow all the rules described at <a href="https://fedorahosted.org/cluster/wiki/FenceAgentAPI" target="_blank" style="color:rgb(31,141,214);text-decoration:none;font-family:proxima-nova,'Helvetica Neue',Arial,sans-serif;font-size:14px;line-height:19.6000003814697px;white-space:pre-wrap">https://fedorahosted.org/cluster/wiki/FenceAgentAPI</a>, but that's not why I'm getting the errors. 
I would like to understand what's going on before going further, in the hope of getting a better grasp of the whole thing.</div><div><br></div><div><br></div><div>Sorry for the long email, and thank you so much in advance for the help.</div><div><br></div><div><br clear="all"><div>Cheers,</div></div></div>-- <br><div class="gmail_signature"><div dir="ltr"><div><div dir="ltr"><b><span style="color:rgb(102,102,102)">Tiago Santos</span></b><br></div></div></div></div>
<br>
</div>