[ClusterLabs] Phantom Node

Andrew Beekhof andrew at beekhof.net
Mon Aug 17 00:11:49 UTC 2015


> On 14 Aug 2015, at 7:53 am, Allan Brand <allan.brand at gmail.com> wrote:
> 
> I can't seem to track this down and am hoping someone has seen this or can tell me what's happening.

Try this:

- shut down the cluster
- remove the stray node entry from the cib (/var/lib/pacemaker/cib/cib.xml)
- delete the .sig file (/var/lib/pacemaker/cib/cib.xml.sig)
- clear the logs
- start the cluster

If you see the node come back, send us the logs and we should be able to determine where it's coming from :)
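
To spot the stray entry before editing, something like the following could compare the node names in the CIB with those in cluster.conf. It is only a sketch. The sample XML is made up to mirror the names in this thread (the id values in particular are placeholders), the helper names are hypothetical, and it assumes node entries appear as <node uname="..."/> under <configuration>/<nodes>. On a real system you would feed it the contents of /var/lib/pacemaker/cib/cib.xml and /etc/cluster/cluster.conf, with the cluster stopped.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data modelled on this thread; the id values are placeholders.
SAMPLE_CIB = """<cib>
  <configuration>
    <nodes>
      <node id="node01" uname="node01"/>
      <node id="node01.private" uname="node01.private"/>
      <node id="node02.private" uname="node02.private"/>
    </nodes>
  </configuration>
</cib>"""

SAMPLE_CLUSTER_CONF = """<cluster config_version="8" name="cluster.private">
  <clusternodes>
    <clusternode name="node01.private" nodeid="1"/>
    <clusternode name="node02.private" nodeid="2"/>
  </clusternodes>
</cluster>"""


def cib_node_names(cib_xml):
    """Return the uname of every <node> under <configuration>/<nodes>."""
    root = ET.fromstring(cib_xml)
    nodes = root.findall("./configuration/nodes/node")
    return {n.get("uname") for n in nodes if n.get("uname")}


def cluster_conf_node_names(conf_xml):
    """Return the name of every <clusternode> in a cluster.conf document."""
    root = ET.fromstring(conf_xml)
    nodes = root.findall("./clusternodes/clusternode")
    return {n.get("name") for n in nodes if n.get("name")}


def stray_nodes(cib_xml, conf_xml):
    """Names present in the CIB but absent from cluster.conf, sorted."""
    return sorted(cib_node_names(cib_xml) - cluster_conf_node_names(conf_xml))


print(stray_nodes(SAMPLE_CIB, SAMPLE_CLUSTER_CONF))  # ['node01']
```

This only reports candidates. The entry still has to be removed by hand as described above.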

One possibility: does uname -n return node01 or node01.private? Same for node02?
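
A rough way to run that check on each node is sketched below. The function name and the result labels are my own invention, and the configured names are simply copied from the cluster.conf quoted further down. platform.node() normally reports the same string as uname -n on Linux, but treat that as an assumption and compare against the real command if in doubt.

```python
import platform

# Node names as they appear in the quoted cluster.conf.
CONFIGURED = {"node01.private", "node02.private"}


def match_local_name(local_name, configured_names):
    """Classify how a local host name relates to the configured node names.

    Returns a (kind, configured_name) tuple where kind is one of:
      "exact"         - the local name is itself a configured name
      "short-vs-fqdn" - only the part before the first dot matches
      "unknown"       - no configured name resembles the local name
    """
    if local_name in configured_names:
        return ("exact", local_name)
    local_short = local_name.split(".", 1)[0]
    for name in sorted(configured_names):
        if name.split(".", 1)[0] == local_short:
            return ("short-vs-fqdn", name)
    return ("unknown", None)


if __name__ == "__main__":
    local = platform.node()
    print(local, match_local_name(local, CONFIGURED))
```

A "short-vs-fqdn" result on node01 would fit the extra node01 entry in the pcs output, though the logs would still be needed to confirm where it comes from.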

> 
> I have a 2 node test cluster, node01.private and node02.private.
> 
> [root@node01 ~]# cat /etc/hosts
> 127.0.0.1   localhost
> ::1         localhost
> 
> 192.168.168.9   node01.private
> 192.168.168.10  node02.private
> 192.168.168.14  cluster.private
> 
> The issue is when I run 'pcs status' it shows both nodes online but a 3rd node, node01, to be offline:
> 
> [root@node01 ~]# pcs status
> Cluster name: cluster.private
> Last updated: Thu Aug 13 16:41:54 2015
> Last change: Wed Aug 12 18:23:22 2015
> Stack: cman
> Current DC: node01.private - partition with quorum
> Version: 1.1.11-97629de
> 3 Nodes configured
> 1 Resources configured
> 
> 
> Online: [ node01.private node02.private ]
> OFFLINE: [ node01 ]
> 
> Full list of resources:
> 
>  privateIP      (ocf::heartbeat:IPaddr2):       Started node01.private
> 
> [root@node01 ~]#
> [root@node01 ~]# pcs config
> Cluster Name: cluster.private
> Corosync Nodes:
>  node01.private node02.private
> Pacemaker Nodes:
>  node01 node01.private node02.private
> 
> Resources:
>  Resource: privateIP (class=ocf provider=heartbeat type=IPaddr2)
>   Attributes: ip=192.168.168.14 cidr_netmask=32
>   Operations: start interval=0s timeout=20s (privateIP-start-interval-0s)
>               stop interval=0s timeout=20s (privateIP-stop-interval-0s)
>               monitor interval=30s (privateIP-monitor-interval-30s)
> 
> Stonith Devices:
> Fencing Levels:
> 
> Location Constraints:
>   Resource: privateIP
>     Enabled on: node01.private (score:INFINITY) (id:location-privateIP-node01.private-INFINITY)
> Ordering Constraints:
> Colocation Constraints:
> 
> Resources Defaults:
>  No defaults set
> Operations Defaults:
>  No defaults set
> 
> Cluster Properties:
>  cluster-infrastructure: cman
>  dc-version: 1.1.11-97629de
>  expected-quorum-votes: 2
>  no-quorum-policy: ignore
>  stonith-enabled: false
> [root@node01 ~]#
> [root@node01 ~]# cat /etc/cluster/cluster.conf
> <cluster config_version="8" name="cluster.private">
>   <fence_daemon/>
>   <clusternodes>
>     <clusternode name="node01.private" nodeid="1">
>       <fence>
>         <method name="pcmk-redirect">
>           <device name="pcmk" port="node01.private"/>
>         </method>
>       </fence>
>     </clusternode>
>     <clusternode name="node02.private" nodeid="2">
>       <fence>
>         <method name="pcmk-redirect">
>           <device name="pcmk" port="node02.private"/>
>         </method>
>       </fence>
>     </clusternode>
>   </clusternodes>
>   <cman/>
>   <fencedevices>
>     <fencedevice agent="fence_pcmk" name="pcmk"/>
>   </fencedevices>
>   <rm>
>     <failoverdomains/>
>     <resources/>
>   </rm>
> </cluster>
> [root@node01 ~]#
> 
> 
> Everything appears to be working correctly, just that phantom offline node shows up.
> 
> Thanks,
> Allan




