[Pacemaker] How to setup STONITH in a 2-node active/passive linux HA pacemaker cluster?

Mathias Nestler mathias.nestler at barzahlen.de
Tue Mar 20 11:14:57 EDT 2012


Hi Dejan,

On 20.03.2012, at 15:25, Dejan Muhamedagic wrote:

> Hi,
> 
> On Tue, Mar 20, 2012 at 08:52:39AM +0100, Mathias Nestler wrote:
>> On 19.03.2012, at 20:26, Florian Haas wrote:
>> 
>>> On Mon, Mar 19, 2012 at 8:14 PM, Mathias Nestler
>>> <mathias.nestler at barzahlen.de> wrote:
>>>> Hi everyone,
>>>> 
>>>> I am trying to set up an active/passive (2 nodes) Linux-HA cluster with corosync and pacemaker to keep a PostgreSQL database up and running. It works via DRBD and a service IP. If node1 fails, node2 should take over, and likewise if PG runs on node2 and node2 fails. Everything works fine except the STONITH part.
>>>> 
>>>> Between the nodes there is a dedicated HA connection (10.10.10.X), so I have the following interface configuration:
>>>> 
>>>> eth0            eth1            host
>>>> 10.10.10.251    172.10.10.1     node1
>>>> 10.10.10.252    172.10.10.2     node2
>>>> 
>>>> STONITH is enabled and I am testing with an ssh-based agent to kill nodes.
>>>> 
>>>> crm configure property stonith-enabled=true
>>>> crm configure property stonith-action=poweroff
>>>> crm configure rsc_defaults resource-stickiness=100
>>>> crm configure property no-quorum-policy=ignore
>>>> 
>>>> crm configure primitive stonith_postgres stonith:external/ssh \
>>>>              params hostlist="node1 node2"
>>>> crm configure clone fencing_postgres stonith_postgres
>>> 
>>> You're missing location constraints, and doing this with 2 primitives
>>> rather than 1 clone is usually cleaner. The example below is for
>>> external/libvirt rather than external/ssh, but you ought to be able to
>>> apply the concept anyhow:
>>> 
>>> http://www.hastexo.com/resources/hints-and-kinks/fencing-virtual-cluster-nodes
>>> 
>> 
>> As I understand it, the cluster decides which node has to be STONITH'ed. Besides this, I have already tried the following configuration:
>> 
>> crm configure primitive stonith1_postgres stonith:ssh \
>> 	params hostlist="node1" \
>> 	op monitor interval="25" timeout="10"
>> crm configure primitive stonith2_postgres stonith:ssh \
>> 	params hostlist="node2" \
>> 	op monitor interval="25" timeout="10"
>> crm configure location stonith1_not_on_node1 stonith1_postgres \
>> 	-inf: node1
>> crm configure location stonith2_not_on_node2 stonith2_postgres \
>> 	-inf: node2
>> 
>> The result is the same :/
> 
> Neither ssh nor external/ssh is a supported fencing option. Both
> include a sleep before reboot, which makes the window in which
> both nodes can fence each other larger than is usually the case
> with production-quality stonith plugins.

I use this ssh STONITH only for testing. At the moment I am building the cluster in a virtual environment. Besides this, what is the difference between ssh and external/ssh?
My problem is that each node tries to kill the other. But I only want to kill the node that holds the postgres resource if the connection between the nodes breaks.

> 
> As for the configuration, I'd rather use the first one, just not
> cloned. That also helps prevent mutual fencing.
> 

I cloned it because I also want the STONITH feature when postgres lives on the other node. How can I achieve that?
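
If I read your suggestion correctly, the uncloned variant of my first configuration would look roughly like the sketch below. The file name stonith-single.crm and the monitor operation (copied from my second attempt) are my own assumptions, and I have not verified that this avoids the mutual fencing:

```shell
#!/bin/sh
# Sketch only: one uncloned STONITH primitive that lists both nodes.
# Assumptions (mine): the file name stonith-single.crm, the monitor
# operation, and that neither stonith_postgres nor the clone
# fencing_postgres is defined in the cluster yet.
set -eu

# Write the resource definition in crm shell syntax to a file.
cat > stonith-single.crm <<'EOF'
primitive stonith_postgres stonith:external/ssh \
    params hostlist="node1 node2" \
    op monitor interval="25" timeout="10"
EOF

# Load it only where the crm shell is available; otherwise just print it.
if command -v crm >/dev/null 2>&1; then
    crm configure load update stonith-single.crm
else
    cat stonith-single.crm
fi
```

The only difference to my first attempt is that the "crm configure clone fencing_postgres stonith_postgres" line is gone.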

> See also:
> 
> http://www.clusterlabs.org/doc/crm_fencing.html
> http://ourobengr.com/ha
> 

Thank you very much

Best
Mathias

> Thanks,
> 
> Dejan
> 
>>> Hope this helps.
>>> Cheers,
>>> Florian
>>> 
>> 
>> Best
>> Mathias
>> 
> 
>> _______________________________________________
>> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>> 
>> Project Home: http://www.clusterlabs.org
>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>> Bugs: http://bugs.clusterlabs.org
> 
> 
