[Pacemaker] How to delete two offline nodes together?
    Fanghao Sha 
    shafanghao at gmail.com
       
    Sat Mar 31 12:01:54 EDT 2012
    
    
  
Hi,

I have a 4-node cluster (CentOS 5.2) running pacemaker-1.0.11 with
heartbeat-3.0.3.
The configuration is:
[root@node-0 ~]# crm configure show
node $id="25b34bc9-06d0-491c-b019-76b7acdfe30f" node-1
node $id="578988ce-5e15-4931-a659-e174fc015785" node-0
node $id="8a5a9f5c-43d1-4752-921f-4f2eebf16b64" node-3
node $id="fd2256ce-027f-4545-b28a-6b73a077e1d2" node-2
primitive failover-ip ocf:heartbeat:IPaddr2 \
        params ip="10.10.5.192" \
        op monitor interval="10s"
primitive master-app-rsc lsb:cluster-master \
        op monitor interval="10s"
primitive node-app-rsc lsb:cluster-node \
        op monitor interval="10s"
group group-dc failover-ip master-app-rsc
clone clone-node-app-rsc node-app-rsc
location rule-group-dc group-dc \
        rule $id="rule-group-dc-rule" -inf: #is_dc eq false
property $id="cib-bootstrap-options" \
        start-failure-is-fatal="false" \
        no-quorum-policy="ignore" \
        symmetric-cluster="true" \
        stonith-enabled="false" \
        dc-version="1.0.11-1554a83db0d3c3e546cfd3aaff6af1184f79ee87" \
        cluster-infrastructure="Heartbeat"
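For reference, the node entries in the output above can be pulled out mechanically, which may help when checking which names and UUIDs the cluster still knows about. This is only a minimal sketch: the function name parse_crm_nodes is hypothetical, and it assumes the node lines always have the form node $id="<uuid>" <name> as shown in this output.

```python
import re

# Matches lines such as: node $id="25b34bc9-..." node-1
# (assumed format, taken from the crm output shown above)
NODE_LINE = re.compile(r'^node\s+\$id="([^"]+)"\s+(\S+)')


def parse_crm_nodes(config_text):
    """Return {node_name: uuid} for every node line in the crm output."""
    nodes = {}
    for line in config_text.splitlines():
        match = NODE_LINE.match(line.strip())
        if match:
            uuid, name = match.groups()
            nodes[name] = uuid
    return nodes


# Two node lines copied from the configuration above.
sample = '''node $id="25b34bc9-06d0-491c-b019-76b7acdfe30f" node-1
node $id="578988ce-5e15-4931-a659-e174fc015785" node-0
primitive node-app-rsc lsb:cluster-node \\
        op monitor interval="10s"
'''

for name, uuid in sorted(parse_crm_nodes(sample).items()):
    print(name, uuid)
```

Lines that are not node entries (primitives, properties and so on) are simply skipped.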
The problem:
node-2 and node-3 have been shut down, and their status has changed to offline.
At first, I tried to delete them one by one.
When I ran "/usr/share/heartbeat/hb_delnode node-3" on node-0,
/var/log/messages showed:
--------------------------------------------
139558 Mar 31 23:43:59 node-0 heartbeat: [13142]: ERROR: HBDoMsg_T_DELNODE:
deletion failed. We don't have all required nodes alive (node-2 is dead)
-------------------------------------------
So I think they have to be deleted together.
I then ran "/usr/share/heartbeat/hb_delnode node-2 node-3" on node-0, but
/var/log/messages showed:
--------------------------------------
Apr  1 00:00:21 node-0 ccm: [13194]: ERROR: ccm_control_process: Node count
from node node-0 does not agree: local count=2, count in message=3
Apr  1 00:00:21 node-0 ccm: [13194]: ERROR: Please make sure ha.cf files on
all nodes have same nodes list or add "autojoin any" to ha.cf
Apr  1 00:00:21 node-0 ccm: [13194]: info: If this problem persists, check
the heartbeat 'hostcache' files in the cluster to look for problems.
--------------------------------------
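The ccm message suggests making sure the ha.cf node lists agree on all nodes. One way to check that could look like the sketch below. It is only an illustration under assumptions: the helper names ha_cf_nodes and disagreements are hypothetical, the ha.cf contents in the example are made up rather than taken from my cluster, and the text of each file would first have to be copied from every node by hand. It looks only at "node" directives in ha.cf and does not read the hostcache files.

```python
def ha_cf_nodes(ha_cf_text):
    """Collect host names from every 'node' directive in ha.cf text."""
    names = set()
    for raw in ha_cf_text.splitlines():
        # Drop trailing comments, then split the directive into words.
        words = raw.split("#", 1)[0].split()
        if words and words[0] == "node":
            names.update(words[1:])
    return names


def disagreements(ha_cf_by_host):
    """Map each host to the node names missing from its ha.cf,
    compared with the union of names seen across all hosts."""
    parsed = {host: ha_cf_nodes(text) for host, text in ha_cf_by_host.items()}
    union = set().union(*parsed.values()) if parsed else set()
    return {
        host: sorted(union - names)
        for host, names in parsed.items()
        if union - names
    }


# Hypothetical ha.cf contents, for illustration only.
example = {
    "node-0": "autojoin none\nnode node-0 node-1 node-2 node-3\n",
    "node-1": "node node-0  # first node\nnode node-1\n",
}

print(disagreements(example))
```

An empty result would mean every ha.cf that was checked lists the same node names.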
Both approaches failed. :(
What should I do?
Any help is appreciated.