[Pacemaker] Nodes unable to connect / find each other

Regendoerp, Achim Achim.Regendoerp at galacoral.com
Wed Mar 14 18:43:40 UTC 2012


Hi,

Below is an excerpt from the tcpdump run on both boxes; the output is identical on both.
The traffic only appears if I set bindnetaddr in /etc/corosync/corosync.conf to each machine's individual IP instead of to 10.26.29.0 (as advised by the howtos).
With the latter as bindnetaddr, there is no traffic at all.
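
For reference, before the capture below, here is a quick way to double-check which network address corosync should bind to (a minimal sketch; eth0, the /24 prefix, and the ipcalc utility from initscripts are assumptions, substitute the actual cluster interface):

ip -4 addr show eth0 | grep inet
#   e.g. "inet 10.26.29.238/24 ..." -> network address 10.26.29.0, so bindnetaddr: 10.26.29.0
# If the prefix is not /24 (e.g. /23 or /25), the network address changes and
# bindnetaddr must match it, otherwise corosync may not bind to the intended NIC.
ipcalc -n 10.26.29.238/24
# should print NETWORK=10.26.29.0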

tcpdump -envv "port 5405"

18:39:58.293962 00:50:56:88:00:f3 > 01:00:5e:5e:01:01, ethertype IPv4 (0x0800), length 124: (tos 0x0, ttl 1, id 0, offset 0, flags [DF], proto UDP (17), length 110)
    10.26.29.238.hpoms-dps-lstn > 226.94.1.1.netsupport: [udp sum ok] UDP, length 82
18:39:58.463288 00:50:56:88:00:f3 > 00:50:56:88:00:cd, ethertype IPv4 (0x0800), length 112: (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto UDP (17), length 98)
    10.26.29.238.hpoms-dps-lstn > 10.26.29.239.netsupport: [udp sum ok] UDP, length 70
18:39:58.463365 00:50:56:88:00:cd > 00:50:56:88:00:f3, ethertype IPv4 (0x0800), length 112: (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto UDP (17), length 98)
    10.26.29.239.hpoms-dps-lstn > 10.26.29.238.netsupport: [bad udp cksum 692c!] UDP, length 70
18:39:58.653150 00:50:56:88:00:f3 > 00:50:56:88:00:cd, ethertype IPv4 (0x0800), length 112: (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto UDP (17), length 98)
    10.26.29.238.hpoms-dps-lstn > 10.26.29.239.netsupport: [udp sum ok] UDP, length 70
18:39:58.653251 00:50:56:88:00:cd > 00:50:56:88:00:f3, ethertype IPv4 (0x0800), length 112: (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto UDP (17), length 98)
    10.26.29.239.hpoms-dps-lstn > 10.26.29.238.netsupport: [bad udp cksum 692a!] UDP, length 70
18:39:58.673924 00:50:56:88:00:f3 > 01:00:5e:5e:01:01, ethertype IPv4 (0x0800), length 124: (tos 0x0, ttl 1, id 0, offset 0, flags [DF], proto UDP (17), length 110)
    10.26.29.238.hpoms-dps-lstn > 226.94.1.1.netsupport: [udp sum ok] UDP, length 82
18:39:58.843272 00:50:56:88:00:f3 > 00:50:56:88:00:cd, ethertype IPv4 (0x0800), length 112: (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto UDP (17), length 98)
    10.26.29.238.hpoms-dps-lstn > 10.26.29.239.netsupport: [udp sum ok] UDP, length 70
18:39:58.843367 00:50:56:88:00:cd > 00:50:56:88:00:f3, ethertype IPv4 (0x0800), length 112: (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto UDP (17), length 98)
    10.26.29.239.hpoms-dps-lstn > 10.26.29.238.netsupport: [bad udp cksum 6928!] UDP, length 70
18:39:59.033082 00:50:56:88:00:f3 > 00:50:56:88:00:cd, ethertype IPv4 (0x0800), length 112: (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto UDP (17), length 98)
    10.26.29.238.hpoms-dps-lstn > 10.26.29.239.netsupport: [udp sum ok] UDP, length 70
18:39:59.033171 00:50:56:88:00:cd > 00:50:56:88:00:f3, ethertype IPv4 (0x0800), length 112: (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto UDP (17), length 98)
    10.26.29.239.hpoms-dps-lstn > 10.26.29.238.netsupport: [bad udp cksum 6926!] UDP, length 70
18:39:59.053776 00:50:56:88:00:f3 > 01:00:5e:5e:01:01, ethertype IPv4 (0x0800), length 124: (tos 0x0, ttl 1, id 0, offset 0, flags [DF], proto UDP (17), length 110)
    10.26.29.238.hpoms-dps-lstn > 226.94.1.1.netsupport: [udp sum ok] UDP, length 82
18:39:59.222927 00:50:56:88:00:f3 > 00:50:56:88:00:cd, ethertype IPv4 (0x0800), length 112: (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto UDP (17), length 98)
    10.26.29.238.hpoms-dps-lstn > 10.26.29.239.netsupport: [udp sum ok] UDP, length 70
18:39:59.223032 00:50:56:88:00:cd > 00:50:56:88:00:f3, ethertype IPv4 (0x0800), length 112: (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto UDP (17), length 98)
    10.26.29.239.hpoms-dps-lstn > 10.26.29.238.netsupport: [bad udp cksum 6924!] UDP, length 70
18:39:59.412658 00:50:56:88:00:f3 > 00:50:56:88:00:cd, ethertype IPv4 (0x0800), length 112: (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto UDP (17), length 98)
    10.26.29.238.hpoms-dps-lstn > 10.26.29.239.netsupport: [udp sum ok] UDP, length 70
18:39:59.412758 00:50:56:88:00:cd > 00:50:56:88:00:f3, ethertype IPv4 (0x0800), length 112: (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto UDP (17), length 98)
    10.26.29.239.hpoms-dps-lstn > 10.26.29.238.netsupport: [bad udp cksum 6922!] UDP, length 70
18:39:59.432230 00:50:56:88:00:f3 > 01:00:5e:5e:01:01, ethertype IPv4 (0x0800), length 124: (tos 0x0, ttl 1, id 0, offset 0, flags [DF], proto UDP (17), length 110)
    10.26.29.238.hpoms-dps-lstn > 226.94.1.1.netsupport: [udp sum ok] UDP, length 82
18:39:59.602550 00:50:56:88:00:f3 > 00:50:56:88:00:cd, ethertype IPv4 (0x0800), length 112: (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto UDP (17), length 98)
    10.26.29.238.hpoms-dps-lstn > 10.26.29.239.netsupport: [udp sum ok] UDP, length 70
18:39:59.602706 00:50:56:88:00:cd > 00:50:56:88:00:f3, ethertype IPv4 (0x0800), length 112: (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto UDP (17), length 98)
    10.26.29.239.hpoms-dps-lstn > 10.26.29.238.netsupport: [bad udp cksum 6920!] UDP, length 70
18:39:59.792582 00:50:56:88:00:f3 > 00:50:56:88:00:cd, ethertype IPv4 (0x0800), length 112: (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto UDP (17), length 98)
    10.26.29.238.hpoms-dps-lstn > 10.26.29.239.netsupport: [udp sum ok] UDP, length 70
18:39:59.792685 00:50:56:88:00:cd > 00:50:56:88:00:f3, ethertype IPv4 (0x0800), length 112: (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto UDP (17), length 98)
    10.26.29.239.hpoms-dps-lstn > 10.26.29.238.netsupport: [bad udp cksum 691e!] UDP, length 70
18:39:59.812182 00:50:56:88:00:f3 > 01:00:5e:5e:01:01, ethertype IPv4 (0x0800), length 124: (tos 0x0, ttl 1, id 0, offset 0, flags [DF], proto UDP (17), length 110)
    10.26.29.238.hpoms-dps-lstn > 226.94.1.1.netsupport: [udp sum ok] UDP, length 82
18:39:59.982452 00:50:56:88:00:f3 > 00:50:56:88:00:cd, ethertype IPv4 (0x0800), length 112: (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto UDP (17), length 98)
    10.26.29.238.hpoms-dps-lstn > 10.26.29.239.netsupport: [udp sum ok] UDP, length 70
18:39:59.982541 00:50:56:88:00:cd > 00:50:56:88:00:f3, ethertype IPv4 (0x0800), length 112: (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto UDP (17), length 98)
    10.26.29.239.hpoms-dps-lstn > 10.26.29.238.netsupport: [bad udp cksum 691c!] UDP, length 70
18:40:00.172345 00:50:56:88:00:f3 > 00:50:56:88:00:cd, ethertype IPv4 (0x0800), length 112: (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto UDP (17), length 98)
    10.26.29.238.hpoms-dps-lstn > 10.26.29.239.netsupport: [udp sum ok] UDP, length 70
18:40:00.172491 00:50:56:88:00:cd > 00:50:56:88:00:f3, ethertype IPv4 (0x0800), length 112: (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto UDP (17), length 98)
    10.26.29.239.hpoms-dps-lstn > 10.26.29.238.netsupport: [bad udp cksum 691a!] UDP, length 70
18:40:00.192138 00:50:56:88:00:f3 > 01:00:5e:5e:01:01, ethertype IPv4 (0x0800), length 124: (tos 0x0, ttl 1, id 0, offset 0, flags [DF], proto UDP (17), length 110)
    10.26.29.238.hpoms-dps-lstn > 226.94.1.1.netsupport: [udp sum ok] UDP, length 82
18:40:00.362559 00:50:56:88:00:f3 > 00:50:56:88:00:cd, ethertype IPv4 (0x0800), length 112: (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto UDP (17), length 98)
    10.26.29.238.hpoms-dps-lstn > 10.26.29.239.netsupport: [udp sum ok] UDP, length 70
18:40:00.362662 00:50:56:88:00:cd > 00:50:56:88:00:f3, ethertype IPv4 (0x0800), length 112: (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto UDP (17), length 98)
    10.26.29.239.hpoms-dps-lstn > 10.26.29.238.netsupport: [bad udp cksum 6918!] UDP, length 70
18:40:00.552256 00:50:56:88:00:f3 > 00:50:56:88:00:cd, ethertype IPv4 (0x0800), length 112: (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto UDP (17), length 98)
    10.26.29.238.hpoms-dps-lstn > 10.26.29.239.netsupport: [udp sum ok] UDP, length 70
18:40:00.552372 00:50:56:88:00:cd > 00:50:56:88:00:f3, ethertype IPv4 (0x0800), length 112: (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto UDP (17), length 98)
    10.26.29.239.hpoms-dps-lstn > 10.26.29.238.netsupport: [bad udp cksum 6916!] UDP, length 70
18:40:00.572941 00:50:56:88:00:f3 > 01:00:5e:5e:01:01, ethertype IPv4 (0x0800), length 124: (tos 0x0, ttl 1, id 0, offset 0, flags [DF], proto UDP (17), length 110)
    10.26.29.238.hpoms-dps-lstn > 226.94.1.1.netsupport: [udp sum ok] UDP, length 82
18:40:00.742324 00:50:56:88:00:f3 > 00:50:56:88:00:cd, ethertype IPv4 (0x0800), length 112: (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto UDP (17), length 98)
    10.26.29.238.hpoms-dps-lstn > 10.26.29.239.netsupport: [udp sum ok] UDP, length 70
18:40:00.742498 00:50:56:88:00:cd > 00:50:56:88:00:f3, ethertype IPv4 (0x0800), length 112: (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto UDP (17), length 98)
    10.26.29.239.hpoms-dps-lstn > 10.26.29.238.netsupport: [bad udp cksum 6914!] UDP, length 70
18:40:00.932293 00:50:56:88:00:f3 > 00:50:56:88:00:cd, ethertype IPv4 (0x0800), length 112: (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto UDP (17), length 98)
    10.26.29.238.hpoms-dps-lstn > 10.26.29.239.netsupport: [udp sum ok] UDP, length 70
18:40:00.932379 00:50:56:88:00:cd > 00:50:56:88:00:f3, ethertype IPv4 (0x0800), length 112: (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto UDP (17), length 98)
    10.26.29.239.hpoms-dps-lstn > 10.26.29.238.netsupport: [bad udp cksum 6912!] UDP, length 70
18:40:00.952936 00:50:56:88:00:f3 > 01:00:5e:5e:01:01, ethertype IPv4 (0x0800), length 124: (tos 0x0, ttl 1, id 0, offset 0, flags [DF], proto UDP (17), length 110)
    10.26.29.238.hpoms-dps-lstn > 226.94.1.1.netsupport: [udp sum ok] UDP, length 82
18:40:01.122346 00:50:56:88:00:f3 > 00:50:56:88:00:cd, ethertype IPv4 (0x0800), length 112: (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto UDP (17), length 98)
    10.26.29.238.hpoms-dps-lstn > 10.26.29.239.netsupport: [udp sum ok] UDP, length 70
18:40:01.122444 00:50:56:88:00:cd > 00:50:56:88:00:f3, ethertype IPv4 (0x0800), length 112: (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto UDP (17), length 98)


Thanks,

Achim


From: David Coulson [mailto:david at davidcoulson.net]
Sent: 14 March 2012 18:28
To: The Pacemaker cluster resource manager
Subject: Re: [Pacemaker] Nodes unable to connect / find each other

run tcpdump on each side for port 5405. Do you see traffic from the other box?



On Mar 14, 2012, at 2:24 PM, Regendoerp, Achim wrote:


Greetings everyone,

I am currently having trouble getting two cluster nodes in a corosync + pacemaker setup to find and talk to each other and form a cluster properly.

I have read various howtos on the net and compared against a setup I did last year, but to no avail.

Basically I have two VMs in a VLAN with all the necessary ports open inside the VLAN; both are supposed to be clustered for an NFS service (which will involve DRBD as well).
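
Since the corosync configuration further down uses multicast, a quick way to confirm that multicast actually flows between the two VMs in that VLAN is omping (a sketch; it assumes the omping package is available, and the multicast address and node IPs are taken from the configuration and netstat output below):

# Run on both nodes at the same time; each should report unicast and multicast replies
omping -c 10 -m 226.94.1.1 -p 5405 10.26.29.238 10.26.29.239
# Unicast replies with 100% multicast loss would point at the switch/vSwitch
# dropping multicast (e.g. IGMP snooping), which would match the symptoms.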

So far the problem appears to be at the corosync / pacemaker level: the nodes can apparently see each other (judging by the logs), yet when I run 'crm configure' and set various properties, the commit always fails with an error that the remote node cannot be found. I am not able to get past this problem and am running out of ideas.

Short and FQDN hostnames are set up on both nodes too.
'crm configure' is run as root, which is also a member of the 'haclient' group.

/etc/hosts.allow and /etc/hosts.deny do not contain any entries that would prevent connections.

I'd be grateful for any ideas / advice / help etc. :)
Please let me know if any further logs or missing configuration printouts are needed. Thanks.

The package versions are:

rpm -qa | egrep "corosync|pacemaker|cluster|resource"

corosynclib-1.4.1-4.el6.x86_64
clusterlib-3.0.12.1-23.el6.x86_64
pacemaker-cluster-libs-1.1.6-3.el6.x86_64
resource-agents-3.9.2-7.el6.x86_64
cluster-glue-libs-1.0.5-2.el6.x86_64
pacemaker-libs-1.1.6-3.el6.x86_64
corosync-1.4.1-4.el6.x86_64
pacemaker-cli-1.1.6-3.el6.x86_64
cluster-glue-1.0.5-2.el6.x86_64
pacemaker-1.1.6-3.el6.x86_64

Below are the various configuration files and log messages:

### /etc/corosync/corosync.conf ###

compatibility: whitetank

totem {
        version: 2
        secauth: off
        threads: 0
        join:   1000
        consensus: 7500
        max_messages: 20
        interface {
                ringnumber: 0
                bindnetaddr: 10.26.29.0   # a colleague set this to .238 / .239 respectively (the nodes' eth0 IPs) to test whether it makes any difference
                mcastaddr: 226.94.1.1
                mcastport: 5405
                #ttl: 1
        }
}

logging {
        fileline: off
        to_stderr: off
        to_logfile: yes
        to_syslog: yes
        logfile: /var/log/cluster/corosync.log
        debug: on
        timestamp: on
        logger_subsys {
                subsys: AMF
                debug: off
        }
}

amf {
        mode: disabled
}
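
If multicast turns out to be blocked inside the VLAN, corosync 1.4 also supports unicast (UDPU) transport as an alternative; a minimal sketch of a totem section under that assumption, using the two node addresses mentioned in the comment above:

totem {
        version: 2
        secauth: off
        transport: udpu
        interface {
                ringnumber: 0
                bindnetaddr: 10.26.29.0
                member {
                        memberaddr: 10.26.29.238
                }
                member {
                        memberaddr: 10.26.29.239
                }
        }
}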

### /etc/corosync/service.d/pcmk  ###

service
{
        name: pacemaker
        ver: 1
        use_mgmtd: no
        use_logd: yes
}

### /etc/sysconfig/pacemaker  (all default) ###

# Variables for running child daemons under valgrind and/or checking for memory problems
#export G_SLICE=always-malloc
#export MALLOC_PERTURB_=221 # or 0
#export MALLOC_CHECK_=3     # or 0,1,2
#export HA_valgrind_enabled=yes
#export HA_valgrind_enabled=cib,crmd
#export HA_callgrind_enabled=yes
#export HA_callgrind_enabled=cib,crmd
#export VALGRIND_OPTS="--leak-check=full --trace-children=no --num-callers=25 --log-file=/tmp/pacemaker-%p.valgrind"

# Variables that control logging
#export PCMK_trace_functions=
#export PCMK_trace_formats=
#export PCMK_trace_files=


### crm error ###

crm(live)configure# property stonith-enabled="false"
crm(live)configure# commit
Call cib_replace failed (-41): Remote node did not respond
<null>
ERROR: could not replace cib
INFO: offending xml: <configuration>
        <crm_config>
                <cluster_property_set id="cib-bootstrap-options">
                        <nvpair id="cib-bootstrap-options-stonith-enabled" name="stonith-enabled" value="false"/>
                </cluster_property_set>
        </crm_config>
        <nodes/>
        <resources/>
        <constraints/>
</configuration>
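
Before committing through crm it may help to confirm that both nodes are actually part of the same totem membership (a sketch; these tools ship with corosync 1.4 / pacemaker 1.1):

corosync-cfgtool -s              # ring status; should report "ring 0 active with no faults"
corosync-objctl | grep member    # runtime.totem.pg.mrp.srp.members should list both node IPs
crm_mon -1                       # should show both nodes configured and online once membership forms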

### netstat node 1 ###
netstat -tulpen
udp        0      0 10.26.29.238:5404           0.0.0.0:*                               0          141073     9501/corosync
udp        0      0 10.26.29.238:5405           0.0.0.0:*                               0          141074     9501/corosync
udp        0      0 226.94.1.1:5405             0.0.0.0:*                               0          141072     9501/corosync
(and the same on node 2 with its respective IP, 10.26.29.239)

netstat -nlpa | grep corosync

udp        0      0 10.26.29.238:5404           0.0.0.0:*                               9501/corosync
udp        0      0 10.26.29.238:5405           0.0.0.0:*                               9501/corosync
udp        0      0 226.94.1.1:5405             0.0.0.0:*                               9501/corosync
unix  2      [ ACC ]     STREAM     LISTENING     141067 9501/corosync       @corosync.ipc
unix  3      [ ]         STREAM     CONNECTED     141236 9501/corosync       @corosync.ipc
unix  3      [ ]         STREAM     CONNECTED     141225 9501/corosync       @corosync.ipc
unix  3      [ ]         STREAM     CONNECTED     141200 9501/corosync       @corosync.ipc
unix  3      [ ]         STREAM     CONNECTED     141161 9501/corosync       @corosync.ipc
unix  3      [ ]         STREAM     CONNECTED     141152 9501/corosync       @corosync.ipc
unix  3      [ ]         STREAM     CONNECTED     141132 9501/corosync       @corosync.ipc
unix  3      [ ]         STREAM     CONNECTED     141124 9501/corosync       @corosync.ipc
unix  2      [ ]         DGRAM                    141064 9501/corosync
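
The 226.94.1.1:5405 socket above suggests corosync has joined the multicast group; a quick way to double-check from the kernel side (a sketch; eth0 is an assumption):

ip maddr show dev eth0    # 226.94.1.1 should appear among the inet multicast groups
ip -s link show eth0      # the RX "mcast" counter should keep climbing while corosync runs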


### /var/log/messages ###

Mar 14 18:01:23 wkse13p1xynfs01 corosync[9501]:   [MAIN  ] Corosync Cluster Engine ('1.4.1'): started and ready to provide service.
Mar 14 18:01:23 wkse13p1xynfs01 corosync[9501]:   [MAIN  ] Corosync built-in features: nss dbus rdma snmp
Mar 14 18:01:23 wkse13p1xynfs01 corosync[9501]:   [MAIN  ] Successfully read main configuration file '/etc/corosync/corosync.conf'.
Mar 14 18:01:23 wkse13p1xynfs01 corosync[9501]:   [TOTEM ] Initializing transport (UDP/IP Multicast).
Mar 14 18:01:23 wkse13p1xynfs01 corosync[9501]:   [TOTEM ] Initializing transmit/receive security: libtomcrypt SOBER128/SHA1HMAC (mode 0).
Mar 14 18:01:23 wkse13p1xynfs01 corosync[9501]:   [TOTEM ] The network interface [10.26.29.238] is now up.
Mar 14 18:01:23 wkse13p1xynfs01 corosync[9501]:   [SERV  ] Service engine loaded: corosync extended virtual synchrony service
Mar 14 18:01:23 wkse13p1xynfs01 corosync[9501]:   [SERV  ] Service engine loaded: corosync configuration service
Mar 14 18:01:23 wkse13p1xynfs01 corosync[9501]:   [SERV  ] Service engine loaded: corosync cluster closed process group service v1.01
Mar 14 18:01:23 wkse13p1xynfs01 corosync[9501]:   [SERV  ] Service engine loaded: corosync cluster config database access v1.01
Mar 14 18:01:23 wkse13p1xynfs01 corosync[9501]:   [SERV  ] Service engine loaded: corosync profile loading service
Mar 14 18:01:23 wkse13p1xynfs01 corosync[9501]:   [SERV  ] Service engine loaded: corosync cluster quorum service v0.1
Mar 14 18:01:23 wkse13p1xynfs01 corosync[9501]:   [MAIN  ] Compatibility mode set to whitetank.  Using V1 and V2 of the synchronization engine.
Mar 14 18:01:23 wkse13p1xynfs01 corosync[9501]:   [TOTEM ] A processor joined or left the membership and a new membership was formed.
Mar 14 18:01:23 wkse13p1xynfs01 corosync[9501]:   [CPG   ] chosen downlist: sender r(0) ip(10.26.29.238) ; members(old:0 left:0)
Mar 14 18:01:23 wkse13p1xynfs01 corosync[9501]:   [MAIN  ] Completed service synchronization, ready to provide service.
Mar 14 18:01:23 wkse13p1xynfs01 corosync[9501]:   [TOTEM ] A processor joined or left the membership and a new membership was formed.
Mar 14 18:01:23 wkse13p1xynfs01 corosync[9501]:   [CPG   ] chosen downlist: sender r(0) ip(10.26.29.238) ; members(old:1 left:0)
Mar 14 18:01:23 wkse13p1xynfs01 corosync[9501]:   [MAIN  ] Completed service synchronization, ready to provide service.
Mar 14 18:01:29 wkse13p1xynfs01 pacemakerd: [9514]: info: Invoked: pacemakerd
Mar 14 18:01:29 wkse13p1xynfs01 pacemakerd: [9514]: info: crm_log_init_worker: Changed active directory to /var/lib/heartbeat/cores/root
Mar 14 18:01:29 wkse13p1xynfs01 pacemakerd: [9514]: info: config_find_next: No additional configuration supplied for: service
Mar 14 18:01:29 wkse13p1xynfs01 pacemakerd: [9514]: info: config_find_next: No additional configuration supplied for: quorum
Mar 14 18:01:29 wkse13p1xynfs01 pacemakerd: [9514]: info: get_config_opt: No default for option: provider
Mar 14 18:01:29 wkse13p1xynfs01 pacemakerd: [9514]: info: get_cluster_type: Detected an active 'corosync' cluster
Mar 14 18:01:29 wkse13p1xynfs01 pacemakerd: [9514]: info: read_config: Reading configure for stack: corosync
Mar 14 18:01:29 wkse13p1xynfs01 pacemakerd: [9514]: info: config_find_next: Processing additional logging options...
Mar 14 18:01:29 wkse13p1xynfs01 pacemakerd: [9514]: info: get_config_opt: Found 'on' for option: debug
Mar 14 18:01:29 wkse13p1xynfs01 pacemakerd: [9514]: info: get_config_opt: Found 'yes' for option: to_logfile
Mar 14 18:01:29 wkse13p1xynfs01 pacemakerd: [9514]: info: get_config_opt: Found '/var/log/cluster/corosync.log' for option: logfile
Mar 14 18:01:29 wkse13p1xynfs01 pacemakerd: [9514]: info: get_config_opt: Found 'yes' for option: to_syslog
Mar 14 18:01:29 wkse13p1xynfs01 pacemakerd: [9514]: info: get_config_opt: Defaulting to 'daemon' for option: syslog_facility
Mar 14 18:01:29 wkse13p1xynfs01 pacemakerd: [9517]: info: crm_log_init_worker: Changed active directory to /var/lib/heartbeat/cores/root
Mar 14 18:01:29 wkse13p1xynfs01 pacemakerd: [9517]: info: main: Starting Pacemaker 1.1.6-3.el6 (Build: a02c0f19a00c1eb2527ad38f146ebc0834814558):  generated-manpages agent-manpages ascii-docs publican-docs ncurses trace-logging cman corosync-quorum corosync
Mar 14 18:01:29 wkse13p1xynfs01 pacemakerd: [9517]: info: main: Maximum core file size is: 18446744073709551615
Mar 14 18:01:29 wkse13p1xynfs01 pacemakerd: [9517]: info: update_node_processes: 0x1f41220 Node 3994884618 now known as wkse13p1xynfs01 (was: (null))
Mar 14 18:01:29 wkse13p1xynfs01 pacemakerd: [9517]: info: update_node_processes: Node wkse13p1xynfs01 now has process list: 00000000000000000000000000000002 (was 00000000000000000000000000000000)
Mar 14 18:01:29 wkse13p1xynfs01 pacemakerd: [9517]: info: G_main_add_SignalHandler: Added signal handler for signal 17
Mar 14 18:01:29 wkse13p1xynfs01 pacemakerd: [9517]: info: start_child: Forked child 9521 for process stonith-ng
Mar 14 18:01:29 wkse13p1xynfs01 pacemakerd: [9517]: info: update_node_processes: Node wkse13p1xynfs01 now has process list: 00000000000000000000000000100002 (was 00000000000000000000000000000002)
Mar 14 18:01:29 wkse13p1xynfs01 pacemakerd: [9517]: info: start_child: Forked child 9522 for process cib
Mar 14 18:01:29 wkse13p1xynfs01 pacemakerd: [9517]: info: update_node_processes: Node wkse13p1xynfs01 now has process list: 00000000000000000000000000100102 (was 00000000000000000000000000100002)
Mar 14 18:01:29 wkse13p1xynfs01 pacemakerd: [9517]: info: start_child: Forked child 9523 for process lrmd
Mar 14 18:01:29 wkse13p1xynfs01 pacemakerd: [9517]: info: update_node_processes: Node wkse13p1xynfs01 now has process list: 00000000000000000000000000100112 (was 00000000000000000000000000100102)
Mar 14 18:01:29 wkse13p1xynfs01 pacemakerd: [9517]: info: start_child: Forked child 9524 for process attrd
Mar 14 18:01:29 wkse13p1xynfs01 pacemakerd: [9517]: info: update_node_processes: Node wkse13p1xynfs01 now has process list: 00000000000000000000000000101112 (was 00000000000000000000000000100112)
Mar 14 18:01:29 wkse13p1xynfs01 pacemakerd: [9517]: info: start_child: Forked child 9525 for process pengine
Mar 14 18:01:29 wkse13p1xynfs01 pacemakerd: [9517]: info: update_node_processes: Node wkse13p1xynfs01 now has process list: 00000000000000000000000000111112 (was 00000000000000000000000000101112)
Mar 14 18:01:29 wkse13p1xynfs01 pacemakerd: [9517]: info: start_child: Forked child 9526 for process crmd
Mar 14 18:01:29 wkse13p1xynfs01 pacemakerd: [9517]: info: update_node_processes: Node wkse13p1xynfs01 now has process list: 00000000000000000000000000111312 (was 00000000000000000000000000111112)
Mar 14 18:01:29 wkse13p1xynfs01 pacemakerd: [9517]: info: main: Starting mainloop
Mar 14 18:01:29 wkse13p1xynfs01 pacemakerd: [9517]: info: update_node_processes: 0x1f43510 Node 4011661834 now known as wkse13p1xynfs02 (was: (null))
Mar 14 18:01:29 wkse13p1xynfs01 pacemakerd: [9517]: info: update_node_processes: Node wkse13p1xynfs02 now has process list: 00000000000000000000000000000002 (was 00000000000000000000000000000000)
Mar 14 18:01:29 wkse13p1xynfs01 pacemakerd: [9517]: info: update_node_processes: Node wkse13p1xynfs02 now has process list: 00000000000000000000000000100002 (was 00000000000000000000000000000002)
Mar 14 18:01:29 wkse13p1xynfs01 pacemakerd: [9517]: info: update_node_processes: Node wkse13p1xynfs02 now has process list: 00000000000000000000000000100102 (was 00000000000000000000000000100002)
Mar 14 18:01:29 wkse13p1xynfs01 pacemakerd: [9517]: info: update_node_processes: Node wkse13p1xynfs02 now has process list: 00000000000000000000000000100112 (was 00000000000000000000000000100102)
Mar 14 18:01:29 wkse13p1xynfs01 pacemakerd: [9517]: info: update_node_processes: Node wkse13p1xynfs02 now has process list: 00000000000000000000000000101112 (was 00000000000000000000000000100112)
Mar 14 18:01:29 wkse13p1xynfs01 lrmd: [9523]: info: G_main_add_SignalHandler: Added signal handler for signal 15
Mar 14 18:01:29 wkse13p1xynfs01 pacemakerd: [9517]: info: update_node_processes: Node wkse13p1xynfs02 now has process list: 00000000000000000000000000111112 (was 00000000000000000000000000101112)
Mar 14 18:01:29 wkse13p1xynfs01 stonith-ng: [9521]: info: Invoked: /usr/lib64/heartbeat/stonithd
Mar 14 18:01:29 wkse13p1xynfs01 pacemakerd: [9517]: info: update_node_processes: Node wkse13p1xynfs02 now has process list: 00000000000000000000000000111312 (was 00000000000000000000000000111112)
Mar 14 18:01:29 wkse13p1xynfs01 stonith-ng: [9521]: info: crm_log_init_worker: Changed active directory to /var/lib/heartbeat/cores/root
Mar 14 18:01:29 wkse13p1xynfs01 stonith-ng: [9521]: info: G_main_add_SignalHandler: Added signal handler for signal 17
Mar 14 18:01:29 wkse13p1xynfs01 cib: [9522]: info: crm_log_init_worker: Changed active directory to /var/lib/heartbeat/cores/hacluster
Mar 14 18:01:29 wkse13p1xynfs01 cib: [9522]: info: G_main_add_TriggerHandler: Added signal manual handler
Mar 14 18:01:29 wkse13p1xynfs01 cib: [9522]: info: G_main_add_SignalHandler: Added signal handler for signal 17
Mar 14 18:01:29 wkse13p1xynfs01 lrmd: [9523]: info: G_main_add_SignalHandler: Added signal handler for signal 17
Mar 14 18:01:29 wkse13p1xynfs01 lrmd: [9523]: info: enabling coredumps
Mar 14 18:01:29 wkse13p1xynfs01 lrmd: [9523]: info: G_main_add_SignalHandler: Added signal handler for signal 10
Mar 14 18:01:29 wkse13p1xynfs01 lrmd: [9523]: info: G_main_add_SignalHandler: Added signal handler for signal 12
Mar 14 18:01:29 wkse13p1xynfs01 lrmd: [9523]: info: Started.
Mar 14 18:01:29 wkse13p1xynfs01 stonith-ng: [9521]: info: get_cluster_type: Cluster type is: 'corosync'
Mar 14 18:01:29 wkse13p1xynfs01 stonith-ng: [9521]: notice: crm_cluster_connect: Connecting to cluster infrastructure: corosync
Mar 14 18:01:29 wkse13p1xynfs01 attrd: [9524]: info: Invoked: /usr/lib64/heartbeat/attrd
Mar 14 18:01:29 wkse13p1xynfs01 cib: [9522]: info: retrieveCib: Reading cluster configuration from: /var/lib/heartbeat/crm/cib.xml (digest: /var/lib/heartbeat/crm/cib.xml.sig)
Mar 14 18:01:29 wkse13p1xynfs01 cib: [9522]: info: validate_with_relaxng: Creating RNG parser context
Mar 14 18:01:29 wkse13p1xynfs01 attrd: [9524]: info: crm_log_init_worker: Changed active directory to /var/lib/heartbeat/cores/hacluster
Mar 14 18:01:29 wkse13p1xynfs01 attrd: [9524]: info: main: Starting up
Mar 14 18:01:29 wkse13p1xynfs01 attrd: [9524]: info: get_cluster_type: Cluster type is: 'corosync'
Mar 14 18:01:29 wkse13p1xynfs01 attrd: [9524]: notice: crm_cluster_connect: Connecting to cluster infrastructure: corosync
Mar 14 18:01:29 wkse13p1xynfs01 crmd: [9526]: info: Invoked: /usr/lib64/heartbeat/crmd
Mar 14 18:01:29 wkse13p1xynfs01 crmd: [9526]: info: crm_log_init_worker: Changed active directory to /var/lib/heartbeat/cores/hacluster
Mar 14 18:01:29 wkse13p1xynfs01 pengine: [9525]: info: Invoked: /usr/lib64/heartbeat/pengine
Mar 14 18:01:29 wkse13p1xynfs01 pengine: [9525]: info: crm_log_init_worker: Changed active directory to /var/lib/heartbeat/cores/hacluster
Mar 14 18:01:29 wkse13p1xynfs01 crmd: [9526]: info: main: CRM Hg Version: a02c0f19a00c1eb2527ad38f146ebc0834814558
Mar 14 18:01:29 wkse13p1xynfs01 crmd: [9526]: info: crmd_init: Starting crmd
Mar 14 18:01:29 wkse13p1xynfs01 crmd: [9526]: info: G_main_add_SignalHandler: Added signal handler for signal 17
Mar 14 18:01:29 wkse13p1xynfs01 pengine: [9525]: info: main: Starting pengine
Mar 14 18:01:29 wkse13p1xynfs01 stonith-ng: [9521]: info: init_ais_connection_once: Connection to 'corosync': established
Mar 14 18:01:29 wkse13p1xynfs01 stonith-ng: [9521]: info: crm_new_peer: Node wkse13p1xynfs01 now has id: 3994884618
Mar 14 18:01:29 wkse13p1xynfs01 stonith-ng: [9521]: info: crm_new_peer: Node 3994884618 is now known as wkse13p1xynfs01
Mar 14 18:01:29 wkse13p1xynfs01 stonith-ng: [9521]: info: main: Starting stonith-ng mainloop
Mar 14 18:01:29 wkse13p1xynfs01 stonith-ng: [9521]: info: crm_update_peer: Node wkse13p1xynfs01: id=3994884618 state=unknown addr=(null) votes=0 born=0 seen=0 proc=00000000000000000000000000111312 (new)
Mar 14 18:01:29 wkse13p1xynfs01 stonith-ng: [9521]: info: crm_new_peer: Node 0 is now known as wkse13p1xynfs02
Mar 14 18:01:29 wkse13p1xynfs01 attrd: [9524]: info: init_ais_connection_once: Connection to 'corosync': established
Mar 14 18:01:29 wkse13p1xynfs01 attrd: [9524]: info: crm_new_peer: Node wkse13p1xynfs01 now has id: 3994884618
Mar 14 18:01:29 wkse13p1xynfs01 attrd: [9524]: info: crm_new_peer: Node 3994884618 is now known as wkse13p1xynfs01
Mar 14 18:01:29 wkse13p1xynfs01 attrd: [9524]: info: main: Cluster connection active
Mar 14 18:01:29 wkse13p1xynfs01 attrd: [9524]: info: main: Accepting attribute updates
Mar 14 18:01:29 wkse13p1xynfs01 attrd: [9524]: notice: main: Starting mainloop...
Mar 14 18:01:29 wkse13p1xynfs01 attrd: [9524]: info: crm_update_peer: Node wkse13p1xynfs01: id=3994884618 state=unknown addr=(null) votes=0 born=0 seen=0 proc=00000000000000000000000000111312 (new)
Mar 14 18:01:29 wkse13p1xynfs01 attrd: [9524]: info: crm_new_peer: Node 0 is now known as wkse13p1xynfs02
Mar 14 18:01:29 wkse13p1xynfs01 cib: [9522]: info: startCib: CIB Initialization completed successfully
Mar 14 18:01:29 wkse13p1xynfs01 cib: [9522]: info: get_cluster_type: Cluster type is: 'corosync'
Mar 14 18:01:29 wkse13p1xynfs01 cib: [9522]: notice: crm_cluster_connect: Connecting to cluster infrastructure: corosync
Mar 14 18:01:29 wkse13p1xynfs01 cib: [9522]: info: init_ais_connection_once: Connection to 'corosync': established
Mar 14 18:01:29 wkse13p1xynfs01 cib: [9522]: info: crm_new_peer: Node wkse13p1xynfs01 now has id: 3994884618
Mar 14 18:01:29 wkse13p1xynfs01 cib: [9522]: info: crm_new_peer: Node 3994884618 is now known as wkse13p1xynfs01
Mar 14 18:01:29 wkse13p1xynfs01 cib: [9522]: info: cib_init: Starting cib mainloop
Mar 14 18:01:29 wkse13p1xynfs01 cib: [9522]: info: crm_update_peer: Node wkse13p1xynfs01: id=3994884618 state=unknown addr=(null) votes=0 born=0 seen=0 proc=00000000000000000000000000111312 (new)
Mar 14 18:01:29 wkse13p1xynfs01 cib: [9522]: info: crm_new_peer: Node 0 is now known as wkse13p1xynfs02
Mar 14 18:01:29 wkse13p1xynfs01 cib: [9522]: info: Managed write_cib_contents process 9530 exited with return code 0.
Mar 14 18:01:30 wkse13p1xynfs01 crmd: [9526]: info: do_cib_control: CIB connection established
Mar 14 18:01:30 wkse13p1xynfs01 crmd: [9526]: info: get_cluster_type: Cluster type is: 'corosync'
Mar 14 18:01:30 wkse13p1xynfs01 crmd: [9526]: notice: crm_cluster_connect: Connecting to cluster infrastructure: corosync
Mar 14 18:01:30 wkse13p1xynfs01 crmd: [9526]: info: init_ais_connection_once: Connection to 'corosync': established
Mar 14 18:01:30 wkse13p1xynfs01 crmd: [9526]: info: crm_new_peer: Node wkse13p1xynfs01 now has id: 3994884618
Mar 14 18:01:30 wkse13p1xynfs01 crmd: [9526]: info: crm_new_peer: Node 3994884618 is now known as wkse13p1xynfs01
Mar 14 18:01:30 wkse13p1xynfs01 crmd: [9526]: info: ais_status_callback: status: wkse13p1xynfs01 is now unknown
Mar 14 18:01:30 wkse13p1xynfs01 crmd: [9526]: info: init_quorum_connection: Configuring Pacemaker to obtain quorum from Corosync
Mar 14 18:01:30 wkse13p1xynfs01 crmd: [9526]: notice: init_quorum_connection: Quorum acquired
Mar 14 18:01:30 wkse13p1xynfs01 crmd: [9526]: info: do_ha_control: Connected to the cluster
Mar 14 18:01:30 wkse13p1xynfs01 crmd: [9526]: info: do_started: Delaying start, no membership data (0000000000100000)
Mar 14 18:01:30 wkse13p1xynfs01 crmd: [9526]: info: crmd_init: Starting crmd's mainloop
Mar 14 18:01:30 wkse13p1xynfs01 crmd: [9526]: info: config_query_callback: Shutdown escalation occurs after: 1200000ms
Mar 14 18:01:30 wkse13p1xynfs01 crmd: [9526]: info: config_query_callback: Checking for expired actions every 900000ms
Mar 14 18:01:30 wkse13p1xynfs01 crmd: [9526]: notice: crmd_peer_update: Status update: Client wkse13p1xynfs01/crmd now has status [online] (DC=<null>)
Mar 14 18:01:30 wkse13p1xynfs01 crmd: [9526]: info: crm_update_peer: Node wkse13p1xynfs01: id=3994884618 state=unknown addr=(null) votes=0 born=0 seen=0 proc=00000000000000000000000000111312 (new)
Mar 14 18:01:30 wkse13p1xynfs01 crmd: [9526]: info: crm_new_peer: Node 0 is now known as wkse13p1xynfs02
Mar 14 18:01:30 wkse13p1xynfs01 crmd: [9526]: info: ais_status_callback: status: wkse13p1xynfs02 is now unknown
Mar 14 18:01:30 wkse13p1xynfs01 crmd: [9526]: info: pcmk_quorum_notification: Membership 0: quorum retained (0)
Mar 14 18:01:30 wkse13p1xynfs01 crmd: [9526]: info: do_started: Delaying start, no membership data (0000000000100000)
Mar 14 18:01:34 wkse13p1xynfs01 attrd: [9524]: info: cib_connect: Connected to the CIB after 1 signon attempts
Mar 14 18:01:34 wkse13p1xynfs01 attrd: [9524]: info: cib_connect: Sending full refresh
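
The "Delaying start, no membership data" and "Node 0 is now known as wkse13p1xynfs02" lines above suggest the second node never appears in a totem membership on this box; a sketch of how one might watch for that in the corosync log (log path taken from the config above):

grep -E "TOTEM|new membership" /var/log/cluster/corosync.log | tail
# On a working two-node cluster a membership containing both 10.26.29.238 and
# 10.26.29.239 should eventually form; with debug: on, the member details are logged as well.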



Cheers,

Achim
