[Pacemaker] No communication between nodes (setup problem)

Hans Bert dadeda2002 at yahoo.de
Wed Jan 30 10:06:31 UTC 2013


Hello,

we had to move from Fedora 16 to Fedora 18 and wanted to set up Corosync with Pacemaker, using PCS as the management tool.
With F16 our cluster was running well, but with F18, after 5 days, we have reached the point where we have
run out of ideas about what the problem(s) might be.


The cluster is built from two servers (server1=192.168.100.111; server2=192.168.100.112).

Based on the Howto for F18 with pcs we created the following corosync.conf:

totem {
  version: 2
  secauth: off
  cluster_name: mcscluster
  transport: udpu
}

nodelist {
  node {
    ring0_addr: 192.168.100.111
  }
  node {
    ring0_addr: 192.168.100.112
  }
}

quorum {
  provider: corosync_votequorum
}

logging {
  fileline: off
  to_stderr: no
  to_logfile: yes
  to_syslog: yes
  logfile: /var/log/cluster/corosync.log
  debug: on
  timestamp: on
}
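
For comparison, here is a sketch of how the nodelist and quorum sections might look with explicit node IDs and the two-node quorum option. It is an assumption, not something taken from the Howto: `nodeid` and `two_node` are options described in the corosync.conf(5) and votequorum(5) man pages, and the ID values 1 and 2 are arbitrary. It would only affect how quorum is calculated once both nodes are members; it would not by itself explain why the nodes do not see each other.

```
nodelist {
  node {
    ring0_addr: 192.168.100.111
    # assumption: explicit ID instead of the auto-generated one
    nodeid: 1
  }
  node {
    ring0_addr: 192.168.100.112
    nodeid: 2
  }
}

quorum {
  provider: corosync_votequorum
  # assumption: two-node mode, so a single surviving node can keep quorum
  two_node: 1
}
```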



After we started the servers, a status check shows:


[root at server1 corosync]#pcs status corosync

Membership information
----------------------
    Nodeid      Votes Name
1868867776          1 server1 (local)

[root at server1 ~]# pcs status
Last updated: Wed Jan 30 10:45:17 2013
Last change: Wed Jan 30 10:18:56 2013 via cibadmin on server1
Stack: corosync
Current DC: server1 (1868867776) - partition WITHOUT quorum
Version: 1.1.8-3.fc18-394e906
1 Nodes configured, unknown expected votes
0 Resources configured.


Online: [ server1 ]

Full list of resources:



And on the other server:


[root at server2 corosync]# pcs status corosync

Membership information
----------------------
    Nodeid      Votes Name
1885644992          1 server2 (local)

[root at server2 corosync]# pcs status
Last updated: Wed Jan 30 10:44:40 2013
Last change: Wed Jan 30 10:19:36 2013 via cibadmin on server2
Stack: corosync
Current DC: server2 (1885644992) - partition WITHOUT quorum
Version: 1.1.8-3.fc18-394e906
1 Nodes configured, unknown expected votes
0 Resources configured.


Online: [ server2 ]
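
A side note on the node IDs shown above: since the nodelist sets no `nodeid`, they appear to be derived from the ring0 IPv4 addresses. The sketch below decodes them under the assumption that the ID stores the four address octets with the first octet in the lowest byte; the function name `nodeid_to_ip` is made up for illustration.

```shell
#!/bin/sh
# Decode a corosync node ID into a dotted-quad IPv4 address.
# Assumption: the ID holds the four octets with the first octet in the
# lowest byte, which is what the IDs in the status output suggest.
nodeid_to_ip() {
    id=$1
    echo "$(( $id & 255 )).$(( ($id >> 8) & 255 )).$(( ($id >> 16) & 255 )).$(( ($id >> 24) & 255 ))"
}

nodeid_to_ip 1868867776   # ID reported by server1
nodeid_to_ip 1885644992   # ID reported by server2
```

For the two IDs above this yields 192.168.100.111 and 192.168.100.112, so each node seems to identify itself with the configured address rather than with a loopback address.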






The only warnings and errors in the logfile are:

[root at server1 ~]# cat /var/log/cluster/corosync.log | egrep "warning|error"
Jan 30 10:25:59 [1608] server1       crmd:  warning: do_log:    FSA: Input I_DC_TIMEOUT from crm_timer_popped() received in state S_PENDING
Jan 30 10:25:59 [1607] server1    pengine:  warning: cluster_status:    We do not have quorum - fencing and resource management disabled
Jan 30 10:28:25 [1525] server1 corosync debug   [QUORUM] getinfo response error: 1
Jan 30 10:40:59 [1607] server1    pengine:  warning: cluster_status:    We do not have quorum - fencing and resource management disabled


[root at server2 corosync]# cat /var/log/cluster/corosync.log | egrep "warning|error"
Jan 30 10:27:18 [1458] server2       crmd:  warning: do_log:    FSA: Input I_DC_TIMEOUT from crm_timer_popped() received in state S_PENDING
Jan 30 10:27:18 [1457] server2    pengine:  warning: cluster_status:    We do not have quorum - fencing and resource management disabled
Jan 30 10:29:19 [1349] server2 corosync debug   [QUORUM] getinfo response error: 1
Jan 30 10:42:18 [1457] server2    pengine:  warning: cluster_status:    We do not have quorum - fencing and resource management disabled
Jan 30 10:44:36 [1349] server2 corosync debug   [QUORUM] getinfo response error: 1




We have installed the following packages:

corosync-2.2.0-1.fc18.i686
corosynclib-2.2.0-1.fc18.i686
drbd-bash-completion-8.3.13-1.fc18.i686
drbd-pacemaker-8.3.13-1.fc18.i686
drbd-utils-8.3.13-1.fc18.i686
pacemaker-1.1.8-3.fc18.i686
pacemaker-cli-1.1.8-3.fc18.i686
pacemaker-cluster-libs-1.1.8-3.fc18.i686
pacemaker-libs-1.1.8-3.fc18.i686
pcs-0.9.27-3.fc18.i686



Firewalls are disabled; ping and SSH between the two servers work without any problems.

With best regards



