[Pacemaker] Multiple thread after rebooting server: the node doesn't go online
Giovanni Di Milia
gdimilia at cfa.harvard.edu
Thu Nov 12 18:21:41 EST 2009
I set up a cluster of two CentOS 5.4 x86_64 servers with Pacemaker
1.0.6 and Corosync 1.1.2.
I only installed the x86_64 packages (yum install pacemaker also tries
to install the 32-bit ones).
I configured a shared cluster IP (a public IP) and a cluster
website.
Everything works fine if I stop corosync on one of the two servers
(the services move from one machine to the other without problems),
but if I reboot one server, it cannot go back online in the cluster
once it is up again.
I also noticed that there are several corosync threads running, and if
I kill all of them and then start corosync again, everything works fine
again.
I don't know what is happening, and I have not been able to reproduce
the problem on virtual servers!
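To tell several separate corosync processes apart from one daemon that simply runs several threads, the process table can be read directly. The following is only a minimal sketch, assuming a Linux /proc filesystem and a Python interpreter on the node; the function name corosync_tasks is hypothetical and not part of Corosync or Pacemaker.

```python
import os

def corosync_tasks(name="corosync", proc_root="/proc"):
    """Map each PID whose command name equals `name` to its thread count.

    Assumes the Linux /proc layout: /proc/<pid>/stat holds "pid (comm) ..."
    and /proc/<pid>/task/ has one entry per thread.
    """
    result = {}
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # only numeric directories describe processes
        try:
            with open(os.path.join(proc_root, entry, "stat")) as fh:
                stat = fh.read()
            threads = len(os.listdir(os.path.join(proc_root, entry, "task")))
        except OSError:
            continue  # the process exited or cannot be read
        # The command name sits between the first "(" and the last ")".
        comm = stat[stat.find("(") + 1 : stat.rfind(")")]
        if comm == name:
            result[int(entry)] = threads
    return result
```

Calling corosync_tasks() on the rebooted node returns one dictionary entry per corosync process. More than one PID would suggest that the daemon was started more than once, whereas a single PID with several threads may be normal.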
Thanks,
Giovanni
The corosync configuration is the following:
##############################################
# Please read the corosync.conf.5 manual page
compatibility: whitetank
aisexec {
# Run as root - this is necessary to be able to manage resources with Pacemaker
user: root
group: root
}
service {
# Load the Pacemaker Cluster Resource Manager
ver: 0
name: pacemaker
use_mgmtd: yes
use_logd: yes
}
totem {
version: 2
# How long before declaring a token lost (ms)
token: 5000
# How many token retransmits before forming a new configuration
token_retransmits_before_loss_const: 10
# How long to wait for join messages in the membership protocol (ms)
join: 1000
# How long to wait for consensus to be achieved before starting a new round of membership configuration (ms)
consensus: 2500
# Turn off the virtual synchrony filter
vsftype: none
# Number of messages that may be sent by one processor on receipt of the token
max_messages: 20
# Stagger sending the node join messages by 1..send_join ms
send_join: 45
# Limit generated nodeids to 31-bits (positive signed integers)
clear_node_high_bit: yes
# Disable encryption
secauth: off
# How many threads to use for encryption/decryption
threads: 0
# Optionally assign a fixed node id (integer)
# nodeid: 1234
interface {
ringnumber: 0
# The following values need to be set based on your environment
# (here I put the right IP for my configuration)
bindnetaddr: XXX.XXX.XXX.0
mcastaddr: 226.94.1.1
mcastport: 4000
}
}
logging {
fileline: off
to_stderr: yes
to_logfile: yes
to_syslog: yes
logfile: /tmp/corosync.log
debug: off
timestamp: on
logger_subsys {
subsys: AMF
debug: off
}
}
amf {
mode: disabled
}
##################################################
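When a configuration like this is pasted into a mail or copied back out of one, wrapped comment lines can lose their leading "#" and end up as stray text inside a section. The following is a rough sketch of a checker for that, assuming only the line forms visible above ("name {", "}", "key: value" and "#" comments). It is not Corosync's own parser, and the function name parse_corosync_conf is hypothetical.

```python
def parse_corosync_conf(text):
    """Parse a corosync.conf-style text into nested dicts.

    Returns (config, stray), where stray lists (line_number, text) for
    lines that are not a comment, a section open/close or "key: value".
    Simplifications: a repeated section name overwrites the earlier one,
    and a wrapped comment that happens to contain ":" is read as a key.
    """
    root = {}
    stack = [root]          # stack[-1] is the section being filled
    stray = []
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue        # blank line or comment
        if line.endswith("{"):
            section = {}
            stack[-1][line[:-1].strip()] = section
            stack.append(section)
        elif line == "}":
            if len(stack) > 1:
                stack.pop()
            else:
                stray.append((lineno, line))  # unmatched closing brace
        elif ":" in line:
            key, value = line.split(":", 1)
            stack[-1][key.strip()] = value.strip()
        else:
            stray.append((lineno, line))      # e.g. a wrapped comment
    return root, stray
```

Running it on the file actually installed on each node (rather than on the mail text) and checking that stray is empty is a quick sanity test; values come back as strings, for example config["totem"]["token"].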