[Pacemaker] crm_mon on Node-2 shows both Node-1 & Node-2 as online but crm_mon on Node-1 shows Node-2 as offline

Dan Frincu df.cluster at gmail.com
Thu Apr 19 09:51:42 EDT 2012


On Thu, Apr 19, 2012 at 3:56 PM, Parshvi <parshvi.17 at gmail.com> wrote:
> 1) What is the use of ssh without pass key between cluster nodes in pacemaker ?
>  a. Use case:
>    i. Two nodes in a cluster (Call them Node-1 and Node-2)
>    ii. One interface configured in corosync.conf for its heartbeat or
> messaging. Eg. Bind net addr :
>    iii. Another interface configured in /etc/hosts for hostname resolution.
>    Eg. IP: Hostname: Node-1
>    Eg. IP: Hostname: Node-2
>    iv. Hence for all ssh communication between the two nodes, hostname resolves
> to subnet 129 address.
>    v. 12 services configured in active/passive mode
>    vi. 1 service configured in master/slave mode
>    vii. 8 services are non-sticky (they failback) in active/passive
>    viii. 4 services are sticky (do not failback) in active/passive
>    ix. Distribution: Node-1 is primary for 8 services (of which 4 are non-
> sticky), Node-2 is preferred for 4 services of a total 12 (non-sticky)
>  b. Observations:
>    i. On Node-2, the interface was down over which IP: Hostname:
> Node-2 was configured.
>    ii. On Node-1 all interfaces were up.
>    iii. Interface used by corosync for heartbeat/messaging was up at all times
> (Bind net addr :
>    iv. In crm_mon: Node-1 sees Node-2 as offline
>        cibadmin --query fails to work (remote node did not respond)
>    v. In crm_mon: Node-2 sees Node-1 as online
>    vi. All the services were seen active on Node-1 (including those that were
> preferred for Node-2). Observed in crm_mon output.
>    vii. The 4 services for which Node-2 was preferred were also seen active on
> Node-2 (hence 4 services active on both nodes).
>    Observed in crm_mon output: Only 4 services were shown active; the status of
> the rest of the services active on Node-1 was not reflected in crm_mon,
>    even though crm_mon on Node-2 sees Node-1 as “online”.
>  c. Errors in log file:
>    i. On Node-2:
>      1. Resource ocf::RscRA:rsc appears to be active on 2 nodes
>      2. The above error appears for all the resources configured in pacemaker.
> Query:
> 1) For what purpose does Pacemaker require “ssh without a pass key” to be
> enabled between the nodes in a cluster ?


> 2) For what purpose does Pacemaker use the node “hostname”? How does the node
> “hostname” come into the picture?

When choosing where to allocate resources not explicitly tied to a node. See




> 3) Let’s say in a two node cluster two communication paths are available between
> the two nodes.
>  a. Eth1 and eth2.
>  b. The hostname of the node resolves to IP Address on eth1.
>  c. Consider, eth1 (network cable disconnected) goes down.
>  d. Eth2 is up, but hostname does not resolve to the IP on eth2 (resolves to
> eth1 addr).

Inter-node communication is usually specified by IP address, and
redundant connections (as in your case) are recommended.
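
To make the distinction between the hostname's address and the cluster
communication address concrete, here is a minimal sketch (not part of
Pacemaker or corosync) that checks whether a given address lies on the
network corosync is bound to. The helper name `on_ring_network` and all
addresses are hypothetical, since the real addresses are not shown in
this thread.

```python
import ipaddress


def on_ring_network(address: str, ring_network: str) -> bool:
    """Return True if `address` belongs to `ring_network` (CIDR notation)."""
    return ipaddress.ip_address(address) in ipaddress.ip_network(ring_network)


# Hypothetical values; the real addresses are not shown in this thread.
RING_NETWORK = "10.0.0.0/24"         # network behind corosync's bind address (eth2)
HOSTNAME_ADDRESS = "192.168.129.11"  # address the node's hostname resolves to (eth1)
RING_ADDRESS = "10.0.0.11"           # the same node's address on eth2

for label, addr in (
    ("hostname address", HOSTNAME_ADDRESS),
    ("ring address", RING_ADDRESS),
):
    print(label, addr, "on ring network:", on_ring_network(addr, RING_NETWORK))
```

In a real check, the hostname's address could be obtained with
`socket.gethostbyname`. If it falls outside the ring network, as in the
scenario described, losing the interface behind the hostname should not
by itself interrupt traffic on the interface corosync is bound to.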

>  e. Will this (hostname) have any issue ?
> _______________________________________________
> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org

Dan Frincu
