[ClusterLabs] Cannot add a node with pcs

Piotr Szafarczyk piotr-l at netexpert.pl
Tue Jul 12 06:34:26 EDT 2022


Hi,

I used to have a working cluster with 3 nodes (and stonith disabled). 
After an unexpected restart of one node, the cluster split. Node #2 
started to see the others as unclean. Nodes #1 and #3 were cooperating 
with each other, showing #2 as offline. There were no network connection 
problems.

I removed #2 (operating from #1) with
pcs cluster node remove n2

I verified that this removed all configuration from #2, both for 
corosync and for pacemaker. The cluster appears to be working correctly 
with two nodes (and no traces of #2).

Now I am trying to add the third node back.
pcs cluster node add n2
Disabling SBD service...
n2: sbd disabled
Sending 'corosync authkey', 'pacemaker authkey' to 'n2'
n2: successful distribution of the file 'corosync authkey'
n2: successful distribution of the file 'pacemaker authkey'
Sending updated corosync.conf to nodes...
n3: Succeeded
n2: Succeeded
n1: Succeeded
n3: Corosync configuration reloaded

I am able to start #2, operating from #1:

pcs cluster pcsd-status
   n2: Online
   n3: Online
   n1: Online

pcs cluster enable n2
pcs cluster start n2

I can see that corosync's configuration has been updated, but 
pacemaker's has not.

_Checking from #1:_

pcs config
Cluster Name: n
Corosync Nodes:
  n1 n3 n2
Pacemaker Nodes:
  n1 n3
[...]

pcs status
   * 2 nodes configured
Node List:
   * Online: [ n1 n3 ]
[...]

pcs cluster cib scope=nodes
<nodes>
   <node id="1" uname="n1"/>
   <node id="3" uname="n3"/>
</nodes>
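
The mismatch between the two node lists in the `pcs config` output can be checked mechanically. What follows is only a minimal sketch, assuming the output keeps the "Corosync Nodes:" / "Pacemaker Nodes:" layout shown above, with node names on the indented line after each heading. The function name `missing_from_pacemaker` is hypothetical and not part of pcs.

```shell
#!/bin/sh
# Hypothetical helper (not part of pcs): reads `pcs config` output on stdin
# and prints every name listed under "Corosync Nodes:" that does not also
# appear under "Pacemaker Nodes:".
# Assumption: node names are on indented lines directly below each heading.
missing_from_pacemaker() {
    awk '
        /^Corosync Nodes:/  { section = "corosync";  next }
        /^Pacemaker Nodes:/ { section = "pacemaker"; next }
        # Indented, non-empty lines inside a known section hold node names.
        /^[ \t]/ && NF > 0 {
            if (section == "") next
            for (i = 1; i <= NF; i++) {
                if (section == "corosync") corosync[++n] = $i
                else pacemaker[$i] = 1
            }
            next
        }
        # Any other line (blank or unindented) ends the current section.
        { section = "" }
        END {
            for (i = 1; i <= n; i++)
                if (!(corosync[i] in pacemaker)) print corosync[i]
        }
    '
}
```

It would be used as `pcs config | missing_from_pacemaker`. Fed the excerpt from #1 above, it prints `n2`; an empty result means both lists agree.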

_#2 sees the state differently:_

pcs config
Cluster Name: n
Corosync Nodes:
  n1 n3 n2
Pacemaker Nodes:
  n1 n2 n3

pcs status
   * 3 nodes configured
Node List:
   * Online: [ n2 ]
   * OFFLINE: [ n1 n3 ]
Full List of Resources:
   * No resources
[...]
(there are resources configured on #1 and #3)

pcs cluster cib scope=nodes
<nodes>
   <node id="1" uname="n1"/>
   <node id="3" uname="n3"/>
   <node id="2" uname="n2"/>
</nodes>
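
The two views of the CIB nodes section can also be compared side by side. This is a rough sketch, assuming the output of `pcs cluster cib scope=nodes` from each node has been saved to a file and that every `<node .../>` element sits on its own line, as in the listings above. The function names and file paths are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: prints the uname attribute of every <node ...> element
# read on stdin, one per line, sorted.
# Assumption: one <node> element per line; the trailing space in "<node "
# keeps the enclosing <nodes> tag from matching.
cib_node_names() {
    sed -n 's/.*<node [^>]*uname="\([^"]*\)".*/\1/p' | sort
}

# Hypothetical helper: given two saved CIB node dumps, prints the node names
# that appear only in the second one.
cib_nodes_only_in_second() {
    a=$(mktemp)
    b=$(mktemp)
    cib_node_names < "$1" > "$a"
    cib_node_names < "$2" > "$b"
    # comm -13 hides lines unique to the first file and lines common to both.
    comm -13 "$a" "$b"
    rm -f "$a" "$b"
}
```

For example, after saving the dump taken on #1 as `/tmp/cib-n1.xml` and the one taken on #2 as `/tmp/cib-n2.xml` (both paths are placeholders), `cib_nodes_only_in_second /tmp/cib-n1.xml /tmp/cib-n2.xml` lists the nodes that only #2 knows about.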

Please help me diagnose this. Where should I look for the problem? (I have 
already tried a few more things: I see nothing helpful in the log files, 
pcs --debug shows nothing suspicious, and I even tried editing the CIB 
manually.)

Best regards,

Piotr Szafarczyk