[Pacemaker] IBM BladeCenter HS22 STONITH

Thu Mar 14 09:47:21 EDT 2013

The problem was my own error: I was separating the entries in pcmk_host_map with "," instead of ";".

The correct format for my STONITH resource is:
# pcs stonith create BladeStonith fence_bladecenter pcmk_host_map="storage1.localdomain:7;storage2.localdomain:8;storage3.localdomain:9" ipaddr=10.48.64.40 login=fence passwd=PASSWD action=off
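
To avoid mistyping the separator, the map value can be assembled in the shell before it is passed to pcs. This is only a minimal sketch: build_host_map is a hypothetical helper of my own, not part of pcs or Pacemaker, and the node:slot pairs are the ones from my setup.

```shell
# Hypothetical helper: join "node:slot" arguments into a pcmk_host_map value.
# Entries are joined with ';' (using ',' between entries was the mistake here).
build_host_map() {
    local IFS=';'          # "$*" joins arguments with the first character of IFS
    printf '%s\n' "$*"
}

host_map=$(build_host_map \
    storage1.localdomain:7 \
    storage2.localdomain:8 \
    storage3.localdomain:9)

echo "$host_map"
```

The resulting string can then be used as pcmk_host_map="$host_map" in the pcs stonith create command above.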

________________________________
 From: Robbie Reese <jr621 at yahoo.com>
To: "pacemaker at oss.clusterlabs.org" <pacemaker at oss.clusterlabs.org> 
Sent: Thursday, March 14, 2013 8:49 AM
Subject: [Pacemaker] IBM BladeCenter HS22 STONITH

Hi

I'm trying to configure STONITH with my BladeCenter HS22 but I'm running into some issues. 

This is on CentOS 6.4 with the following pacemaker packages:

pacemaker-cli-1.1.8-7.el6.x86_64
pacemaker-1.1.8-7.el6.x86_64
pacemaker-libs-1.1.8-7.el6.x86_64
pacemaker-cluster-libs-1.1.8-7.el6.x86_64

I have 3 blades all configured as follows:

# pcs status
Last updated: Thu Mar 14 04:42:37 2013
Last change: Thu Mar 14 04:38:16 2013 via cibadmin on storage1.localdomain
Stack: classic openais (with plugin)
Current DC: storage1.localdomain - partition with quorum
Version: 1.1.8-7.el6-394e906
3 Nodes configured, 3 expected votes
1 Resources configured.

Online: [ storage1.localdomain storage2.localdomain storage3.localdomain ]

Full list of resources:

 BladeStonith   (stonith:fence_bladecenter):    Started storage1.localdomain

The fence resource I've created is as follows:

# pcs stonith create BladeStonith fence_bladecenter pcmk_host_list="storage1.localdomain,storage2.localdomain,storage3.localdomain" pcmk_host_map="storage1.localdomain:7,storage2.localdomain:8,storage3.localdomain:9" ipaddr=10.48.64.40 login=fence passwd=PASSWORD action=off

When I try to test a reboot of a node I get the following error message:

# stonith_admin --reboot storage2.localdomain

Mar 14 04:39:04 storage1 stonith_admin[11450]:   notice: crm_log_args: Invoked: stonith_admin --reboot storage2.localdomain
Mar 14 04:39:04 storage1 stonith-ng[2169]:   notice: handle_request: Client stonith_admin.11450.e3fd7c77 wants to fence (reboot) 'storage2.localdomain' with device '(any)'
Mar 14 04:39:04 storage1 stonith-ng[2169]:   notice: initiate_remote_stonith_op: Initiating remote operation reboot for storage2.localdomain: b99cc9b2-73df-43d4-9e6b-78b83975c54b (0)
Mar 14 04:39:06 storage1 stonith-ng[2169]:    error: log_operation: Operation 'reboot' [11453] (call 0 from stonith_admin.11450) for host 'storage2.localdomain' with device 'BladeStonith' returned: -1001 (Generic Pacemaker error)
Mar 14 04:39:06 storage1 stonith-ng[2169]:  warning: log_operation: BladeStonith:11453 [ Parse error: Ignoring unknown option 'nodename=storage2.localdomain' ]
Mar 14 04:39:06 storage1 stonith-ng[2169]:  warning: log_operation: BladeStonith:11453 [ Failed: Unable to obtain correct plug status or plug is not available ]
Mar 14 04:41:28 storage1 stonith-ng[2169]:    error: remote_op_done: Operation reboot of storage2.localdomain by storage1.localdomain for stonith_admin.11450 at storage1.localdomain.b99cc9b2: Timer expired
Mar 14 04:41:28 storage1 crmd[2173]:   notice: tengine_stonith_notify: Peer storage2.localdomain was not terminated (st_notify_fence) by storage1.localdomain for storage1.localdomain: Timer expired (ref=b99cc9b2-73df-43d4-9e6b-78b83975c54b) by client stonith_admin.11450

I've confirmed that the plug assignments correspond to the correct blades, so I'm not sure why the blades aren't being fenced.

_______________________________________________
Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org