[Pacemaker] Fencing in Pacemaker with Cyclades

Martin Steigerwald ms at teamix.de
Wed Aug 18 10:41:32 EDT 2010


I have a working fencing setup with heartbeat-1:

somehost1:~# grep ^stonith /etc/ha.d/ha.cf
stonith_host    *       cyclades root              10

So that is the cyclades stonith plugin, the IP address of the Cyclades AlterPath, the login name for the SSH login, and the serial port of the IPDU that should power-cycle the node to be fenced.

Now when I want to configure a stonith plugin in corosync/pacemaker, I can't set the serial port.

There is simply no such parameter in that resource agent shown in pacemaker:

crm(live)# ra info cyclades stonith
<!-- no value --> (stonith:cyclades)

Cyclades AlterPath PM series power switches (via TS/ACS/KVM).

Parameters (* denotes required, [] the default):

ipaddr* (string): IP Address
    The IP address of the STONITH device

login* (string): Login
    The username used for logging in to the STONITH device

stonith-timeout (time, [60s]):
    How long to wait for the STONITH action to complete. Overrides the stonith-timeout cluster property

priority (integer, [0]):
    The priority of the stonith resource. The lower the number, the higher the priority.

Operations' defaults (advisory minimum):

    start         timeout=60
    stop          timeout=15
    status        timeout=60
    monitor_0     interval=3600 timeout=60

So I tried with just:

primitive fencing stonith:cyclades \
        params ipaddr="" login="root" \
        op monitor interval="15s" timeout="60s"

But it doesn't work:

Failed actions:
    fencing:0_start_0 (node=somenode2, call=6, rc=1, status=complete): unknown error
    fencing:1_start_0 (node=somenode1, call=7, rc=1, status=complete): unknown error

I find no hint beyond that "unknown error" in syslog or the crm shell. And pacemaker raised the fail count to infinity after, AFAIR, the second attempt. I did not find a single hint via Google on how to configure stonith with cyclades for a corosync / pacemaker setup either.
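One way to get more than "unknown error" would be to run the plugin by hand, outside of pacemaker, with the same parameters. This is a sketch assuming the stonith(8) command-line tool from cluster-glue is installed; the IP address below is a placeholder:

stonith -d -t cyclades ipaddr=192.168.0.1 login=root -S

-S queries the device status, and -d should enable debug output, so a connection or login problem ought to show up directly on the terminal instead of being swallowed by the resource agent.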

How should it know which serial port I connected the IPDU to?
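The parameter list that "ra info" prints comes from the plugin's own metadata, so it may also help to ask the plugin directly which configuration names it knows about. Again assuming cluster-glue's stonith(8) tool:

stonith -L                 # list all installed stonith plugin types
stonith -t cyclades -n     # print the parameter names this plugin expects

If the installed plugin really only takes ipaddr and login, then the serial port simply cannot be configured through pacemaker with this plugin version.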

Related: why does pacemaker raise the failure count to infinity so quickly? In our old heartbeat setup, heartbeat kept trying to stonith the other host for many attempts and did not give up that quickly. Usually it should work on the first attempt, but with shared storage it can be very dangerous if a cluster partner takes over resources when fencing the unresponsive node did not work.
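As far as I understand, pacemaker treats a failed start as fatal by default (the cluster property start-failure-is-fatal defaults to true), which sets the fail count to INFINITY on the first failed start rather than after a fixed number of retries. A sketch of how that and the retry behaviour could be tuned from the crm shell, assuming a pacemaker version that supports these options (the values are examples, not recommendations):

crm configure property start-failure-is-fatal=false
crm configure rsc_defaults migration-threshold=10 failure-timeout=120s

With migration-threshold the resource is only moved away after the given number of failures, and failure-timeout expires old failures so the count does not stay at its peak forever.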

I am using corosync 1.2.1-1~bpo50+1 and pacemaker from lenny-backports.[1]

[1] http://www.backports.org

Martin Steigerwald - team(ix) GmbH - http://www.teamix.de
gpg: 19E3 8D42 896F D004 08AC A0CA 1E10 C593 0399 AE90
