[ClusterLabs] getting "Totem is unable to form a cluster" error
Muhammad Sharfuddin
M.Sharfuddin at nds.com.pk
Thu Apr 7 18:24:13 UTC 2016
pacemaker 1.1.12-11.12
openais 1.1.4-5.24.5
corosync 1.4.7-0.23.5
It's a two-node active/passive cluster, and we just upgraded from SLES 11
SP3 to SLES 11 SP4 (nothing else changed). When we try to start the
cluster service we get the following error:
"Totem is unable to form a cluster because of an operating system or
network fault."
The firewall is stopped and disabled on both nodes, and the nodes can
ping/ssh/vnc each other.
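For reference, the checks on each node looked roughly like this (shown here
from the 192.168.150.12 side; the same was done in the other direction;
exact commands are from memory, so treat this as a sketch):

    iptables -L -n                        # no firewall rules loaded
    chkconfig --list | grep -i firewall   # firewall services disabled at boot
    ping -c 3 192.168.150.13              # peer answers
    ssh 192.168.150.13 hostname           # ssh to the peer works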
corosync.conf:
aisexec {
    group: root
    user: root
}
service {
    use_mgmtd: yes
    use_logd: yes
    ver: 0
    name: pacemaker
}
totem {
    rrp_mode: none
    join: 60
    max_messages: 20
    vsftype: none
    token: 5000
    consensus: 6000
    interface {
        bindnetaddr: 192.168.150.0
        member {
            memberaddr: 192.168.150.12
        }
        member {
            memberaddr: 192.168.150.13
        }
        mcastport: 5405
        ringnumber: 0
    }
    secauth: off
    version: 2
    transport: udpu
    token_retransmits_before_loss_const: 10
    clear_node_high_bit: new
}
logging {
    to_logfile: no
    to_syslog: yes
    debug: off
    timestamp: off
    to_stderr: no
    fileline: off
    syslog_facility: daemon
}
amf {
    mode: disable
}
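In case it matters: as far as I understand, with transport udpu totem binds
to the local interface whose network matches bindnetaddr, so an address on
the 192.168.150.0 network (assuming a /24 netmask) has to be up on each node
when corosync starts. A rough sketch of how that, and the ring state once
corosync is running, can be cross-checked:

    ip addr show | grep 192.168.150   # an address on the bindnetaddr network is up
    ip route get 192.168.150.13       # the peer is reached via that interface
    corosync-cfgtool -s               # ring 0 status, once corosync has started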
/var/log/messages:
Apr 6 17:51:49 prd1 corosync[8672]: [MAIN ] Corosync Cluster Engine
('1.4.7'): started and ready to provide service.
Apr 6 17:51:49 prd1 corosync[8672]: [MAIN ] Corosync built-in
features: nss
Apr 6 17:51:49 prd1 corosync[8672]: [MAIN ] Successfully configured
openais services to load
Apr 6 17:51:49 prd1 corosync[8672]: [MAIN ] Successfully read main
configuration file '/etc/corosync/corosync.conf'.
Apr 6 17:51:49 prd1 corosync[8672]: [TOTEM ] Initializing transport
(UDP/IP Unicast).
Apr 6 17:51:49 prd1 corosync[8672]: [TOTEM ] Initializing
transmit/receive security: libtomcrypt SOBER128/SHA1HMAC (mode 0).
Apr 6 17:51:49 prd1 corosync[8672]: [TOTEM ] The network interface is
down.
Apr 6 17:51:49 prd1 corosync[8672]: [SERV ] Service engine loaded:
openais cluster membership service B.01.01
Apr 6 17:51:49 prd1 corosync[8672]: [SERV ] Service engine loaded:
openais event service B.01.01
Apr 6 17:51:49 prd1 corosync[8672]: [SERV ] Service engine loaded:
openais checkpoint service B.01.01
Apr 6 17:51:49 prd1 corosync[8672]: [SERV ] Service engine loaded:
openais availability management framework B.01.01
Apr 6 17:51:49 prd1 corosync[8672]: [SERV ] Service engine loaded:
openais message service B.03.01
Apr 6 17:51:49 prd1 corosync[8672]: [SERV ] Service engine loaded:
openais distributed locking service B.03.01
Apr 6 17:51:49 prd1 corosync[8672]: [SERV ] Service engine loaded:
openais timer service A.01.01
Apr 6 17:51:49 prd1 corosync[8672]: [pcmk ] info: process_ais_conf:
Reading configure
Apr 6 17:51:49 prd1 corosync[8672]: [pcmk ] info: config_find_init:
Local handle: 7685269064754659330 for logging
Apr 6 17:51:49 prd1 corosync[8672]: [pcmk ] info: config_find_next:
Processing additional logging options...
Apr 6 17:51:49 prd1 corosync[8672]: [pcmk ] info: get_config_opt:
Found 'off' for option: debug
Apr 6 17:51:49 prd1 corosync[8672]: [pcmk ] info: get_config_opt:
Found 'no' for option: to_logfile
Apr 6 17:51:49 prd1 corosync[8672]: [pcmk ] info: get_config_opt:
Found 'yes' for option: to_syslog
Apr 6 17:51:49 prd1 corosync[8672]: [pcmk ] info: get_config_opt:
Found 'daemon' for option: syslog_facility
Apr 6 17:51:49 prd1 corosync[8672]: [pcmk ] info: config_find_init:
Local handle: 8535092201842016259 for quorum
Apr 6 17:51:49 prd1 corosync[8672]: [pcmk ] info: config_find_next:
No additional configuration supplied for: quorum
Apr 6 17:51:49 prd1 corosync[8672]: [pcmk ] info: get_config_opt: No
default for option: provider
Apr 6 17:51:49 prd1 corosync[8672]: [pcmk ] info: config_find_init:
Local handle: 8054506479773810692 for service
Apr 6 17:51:49 prd1 corosync[8672]: [pcmk ] info: config_find_next:
Processing additional service options...
Apr 6 17:51:49 prd1 corosync[8672]: [pcmk ] info: config_find_next:
Processing additional service options...
Apr 6 17:51:49 prd1 corosync[8672]: [pcmk ] info: config_find_next:
Processing additional service options...
Apr 6 17:51:49 prd1 corosync[8672]: [pcmk ] info: config_find_next:
Processing additional service options...
Apr 6 17:51:49 prd1 corosync[8672]: [pcmk ] info: config_find_next:
Processing additional service options...
Apr 6 17:51:49 prd1 corosync[8672]: [pcmk ] info: config_find_next:
Processing additional service options...
Apr 6 17:51:49 prd1 corosync[8672]: [pcmk ] info: config_find_next:
Processing additional service options...
Apr 6 17:51:49 prd1 corosync[8672]: [pcmk ] info: config_find_next:
Processing additional service options...
Apr 6 17:51:49 prd1 corosync[8672]: [pcmk ] info: get_config_opt:
Found '0' for option: ver
Apr 6 17:51:49 prd1 corosync[8672]: [pcmk ] info: get_config_opt:
Defaulting to 'pcmk' for option: clustername
Apr 6 17:51:49 prd1 corosync[8672]: [pcmk ] info: get_config_opt:
Found 'yes' for option: use_logd
Apr 6 17:51:49 prd1 corosync[8672]: [pcmk ] info: get_config_opt:
Found 'yes' for option: use_mgmtd
Apr 6 17:51:49 prd1 corosync[8672]: [pcmk ] info: pcmk_startup: CRM:
Initialized
Apr 6 17:51:49 prd1 corosync[8672]: [pcmk ] Logging: Initialized
pcmk_startup
Apr 6 17:51:49 prd1 corosync[8672]: [pcmk ] info: pcmk_startup:
Maximum core file size is: 18446744073709551615
Apr 6 17:51:49 prd1 corosync[8672]: [pcmk ] info: pcmk_startup:
Service: 9
Apr 6 17:51:49 prd1 corosync[8672]: [pcmk ] info: pcmk_startup: Local
hostname: prd1
Apr 6 17:51:49 prd1 corosync[8672]: [pcmk ] info: pcmk_update_nodeid:
Local node id: 2130706433
Apr 6 17:51:49 prd1 corosync[8672]: [pcmk ] info: update_member:
Creating entry for node 2130706433 born on 0
Apr 6 17:51:49 prd1 corosync[8672]: [pcmk ] info: update_member:
0x64c9c0 Node 2130706433 now known as prd1 (was: (null))
Apr 6 17:51:49 prd1 corosync[8672]: [pcmk ] info: update_member: Node
prd1 now has 1 quorum votes (was 0)
Apr 6 17:51:49 prd1 corosync[8672]: [pcmk ] info: update_member: Node
2130706433/prd1 is now: member
Apr 6 17:51:49 prd1 corosync[8672]: [pcmk ] info: spawn_child: Using
uid=90 and group=90 for process cib
Apr 6 17:51:49 prd1 corosync[8672]: [pcmk ] info: spawn_child: Forked
child 8677 for process cib
Apr 6 17:51:49 prd1 corosync[8672]: [pcmk ] info: spawn_child: Forked
child 8678 for process stonith-ng
Apr 6 17:51:49 prd1 corosync[8672]: [pcmk ] info: spawn_child: Forked
child 8679 for process lrmd
Apr 6 17:51:49 prd1 corosync[8672]: [pcmk ] info: spawn_child: Using
uid=90 and group=90 for process attrd
Apr 6 17:51:49 prd1 corosync[8672]: [pcmk ] info: spawn_child: Forked
child 8680 for process attrd
Apr 6 17:51:49 prd1 corosync[8672]: [pcmk ] info: spawn_child: Using
uid=90 and group=90 for process pengine
Apr 6 17:51:49 prd1 corosync[8672]: [pcmk ] info: spawn_child: Forked
child 8681 for process pengine
Apr 6 17:51:49 prd1 corosync[8672]: [pcmk ] info: spawn_child: Using
uid=90 and group=90 for process crmd
Apr 6 17:51:49 prd1 corosync[8672]: [pcmk ] info: spawn_child: Forked
child 8682 for process crmd
Apr 6 17:51:49 prd1 corosync[8672]: [pcmk ] info: spawn_child: Forked
child 8683 for process mgmtd
Apr 6 17:51:49 prd1 corosync[8672]: [SERV ] Service engine loaded:
Pacemaker Cluster Manager 1.1.12
Apr 6 17:51:49 prd1 corosync[8672]: [SERV ] Service engine loaded:
corosync extended virtual synchrony service
Apr 6 17:51:49 prd1 corosync[8672]: [SERV ] Service engine loaded:
corosync configuration service
Apr 6 17:51:49 prd1 corosync[8672]: [SERV ] Service engine loaded:
corosync cluster closed process group service v1.01
Apr 6 17:51:49 prd1 corosync[8672]: [SERV ] Service engine loaded:
corosync cluster config database access v1.01
Apr 6 17:51:49 prd1 corosync[8672]: [SERV ] Service engine loaded:
corosync profile loading service
Apr 6 17:51:49 prd1 corosync[8672]: [SERV ] Service engine loaded:
corosync cluster quorum service v0.1
Apr 6 17:51:49 prd1 corosync[8672]: [MAIN ] Compatibility mode set to
whitetank. Using V1 and V2 of the synchronization engine.
Apr 6 17:51:49 prd1 corosync[8672]: [TOTEM ] adding new UDPU member
{192.168.150.12}
Apr 6 17:51:49 prd1 corosync[8672]: [TOTEM ] adding new UDPU member
{192.168.150.13}
Apr 6 17:51:50 prd1 lrmd[8679]: notice: crm_add_logfile: Additional
logging available in /var/log/pacemaker.log
Apr 6 17:51:50 prd1 mgmtd: [8683]: info: Pacemaker-mgmt Git Version:
969d213
Apr 6 17:51:50 prd1 mgmtd: [8683]: WARN: Core dumps could be lost if
multiple dumps occur.
Apr 6 17:51:50 prd1 mgmtd: [8683]: WARN: Consider setting non-default
value in /proc/sys/kernel/core_pattern (or equivalent) for maximum
supportability
Apr 6 17:51:50 prd1 mgmtd: [8683]: WARN: Consider setting
/proc/sys/kernel/core_uses_pid (or equivalent) to 1 for maximum
supportability
Apr 6 17:51:50 prd1 attrd[8680]: notice: crm_add_logfile: Additional
logging available in /var/log/pacemaker.log
Apr 6 17:51:50 prd1 pengine[8681]: notice: crm_add_logfile:
Additional logging available in /var/log/pacemaker.log
Apr 6 17:51:50 prd1 attrd[8680]: notice: crm_cluster_connect:
Connecting to cluster infrastructure: classic openais (with plugin)
Apr 6 17:51:50 prd1 cib[8677]: notice: crm_add_logfile: Additional
logging available in /var/log/pacemaker.log
Apr 6 17:51:50 prd1 crmd[8682]: notice: crm_add_logfile: Additional
logging available in /var/log/pacemaker.log
Apr 6 17:51:50 prd1 attrd[8680]: notice: get_node_name: Defaulting to
uname -n for the local classic openais (with plugin) node name
Apr 6 17:51:50 prd1 corosync[8672]: [pcmk ] info: pcmk_ipc: Recorded
connection 0x7f944c04acf0 for attrd/8680
Apr 6 17:51:50 prd1 crmd[8682]: notice: main: CRM Git Version: f47ea56
Apr 6 17:51:50 prd1 attrd[8680]: notice: get_node_name: Defaulting to
uname -n for the local classic openais (with plugin) node name
Apr 6 17:51:50 prd1 attrd[8680]: notice: main: Starting mainloop...
Apr 6 17:51:50 prd1 stonith-ng[8678]: notice: crm_add_logfile:
Additional logging available in /var/log/pacemaker.log
Apr 6 17:51:50 prd1 stonith-ng[8678]: notice: crm_cluster_connect:
Connecting to cluster infrastructure: classic openais (with plugin)
Apr 6 17:51:50 prd1 stonith-ng[8678]: notice: get_node_name:
Defaulting to uname -n for the local classic openais (with plugin) node name
Apr 6 17:51:50 prd1 corosync[8672]: [pcmk ] info: pcmk_ipc: Recorded
connection 0x658190 for stonith-ng/8678
Apr 6 17:51:50 prd1 corosync[8672]: [pcmk ] info: update_member: Node
prd1 now has process list: 00000000000000000000000000151312 (1381138)
Apr 6 17:51:50 prd1 corosync[8672]: [pcmk ] info: pcmk_ipc: Sending
membership update 0 to stonith-ng
Apr 6 17:51:50 prd1 stonith-ng[8678]: notice: get_node_name:
Defaulting to uname -n for the local classic openais (with plugin) node name
Apr 6 17:51:50 prd1 cib[8677]: notice: crm_cluster_connect:
Connecting to cluster infrastructure: classic openais (with plugin)
Apr 6 17:51:50 prd1 cib[8677]: notice: get_node_name: Defaulting to
uname -n for the local classic openais (with plugin) node name
Apr 6 17:51:50 prd1 corosync[8672]: [pcmk ] info: pcmk_ipc: Recorded
connection 0x65d450 for cib/8677
Apr 6 17:51:50 prd1 corosync[8672]: [pcmk ] info: pcmk_ipc: Sending
membership update 0 to cib
Apr 6 17:51:50 prd1 cib[8677]: notice: get_node_name: Defaulting to
uname -n for the local classic openais (with plugin) node name
Apr 6 17:51:50 prd1 cib[8677]: notice: crm_update_peer_state:
cib_peer_update_callback: Node prd1[2130706433] - state is now lost (was
(null))
Apr 6 17:51:50 prd1 cib[8677]: notice: crm_update_peer_state:
plugin_handle_membership: Node prd1[2130706433] - state is now member
(was lost)
Apr 6 17:51:50 prd1 mgmtd: [8683]: info: Started.
Apr 6 17:51:51 prd1 crmd[8682]: notice: crm_cluster_connect:
Connecting to cluster infrastructure: classic openais (with plugin)
Apr 6 17:51:51 prd1 crmd[8682]: notice: get_node_name: Defaulting to
uname -n for the local classic openais (with plugin) node name
Apr 6 17:51:51 prd1 corosync[8672]: [pcmk ] info: pcmk_ipc: Recorded
connection 0x661b00 for crmd/8682
Apr 6 17:51:51 prd1 corosync[8672]: [pcmk ] info: pcmk_ipc: Sending
membership update 0 to crmd
Apr 6 17:51:51 prd1 crmd[8682]: notice: get_node_name: Defaulting to
uname -n for the local classic openais (with plugin) node name
Apr 6 17:51:51 prd1 stonith-ng[8678]: notice: setup_cib: Watching for
stonith topology changes
Apr 6 17:51:51 prd1 stonith-ng[8678]: notice: crm_update_peer_state:
st_peer_update_callback: Node prd1[2130706433] - state is now lost (was
(null))
Apr 6 17:51:51 prd1 stonith-ng[8678]: notice: crm_update_peer_state:
plugin_handle_membership: Node prd1[2130706433] - state is now member
(was lost)
Apr 6 17:51:51 prd1 crmd[8682]: notice: crm_update_peer_state:
plugin_handle_membership: Node prd1[2130706433] - state is now member
(was (null))
Apr 6 17:51:51 prd1 crmd[8682]: notice: do_started: The local CRM is
operational
Apr 6 17:51:51 prd1 crmd[8682]: notice: do_state_transition: State
transition S_STARTING -> S_PENDING [ input=I_PENDING
cause=C_FSA_INTERNAL origin=do_started ]
Apr 6 17:51:51 prd1 stonith-ng[8678]: notice: unpack_config: On loss
of CCM Quorum: Ignore
Apr 6 17:52:12 prd1 crmd[8682]: warning: do_log: FSA: Input
I_DC_TIMEOUT from crm_timer_popped() received in state S_PENDING
Apr 6 17:52:35 prd1 corosync[8672]: [MAIN ] Totem is unable to form a
cluster because of an operating system or network fault. The most common
cause of this message is that the local firewall is configured improperly.
Apr 6 17:52:36 prd1 corosync[8672]: [MAIN ] Totem is unable to form a
cluster because of an operating system or network fault. The most common
cause of this message is that the local firewall is configured improperly.
--
Regards,
Muhammad Sharfuddin
<http://www.nds.com.pk>