[Pacemaker] OFFLINE node after cluster upgrade

Carlos Molina dr.chamberlain at gmail.com
Thu Nov 8 13:50:05 UTC 2012


ruslan usifov <ruslan.usifov at ...> writes:

> I solved this problem! On one node I found the following error message
> in the log:
>
>   slv009 ....   peer is not part of our cluster
>
> So I stopped Pacemaker on that host (I use v1 for pacemaker):
>
>   /etc/pacemaker stop
>   /etc/corosync stop
>
> Then I removed all CIB info from /var/lib/heartbeat/crm and cleaned up
> the /var/lib/pengine dir, then restarted the cluster on that node. And
> voilà, everything began working as expected. But I still have a
> question: why does this happen? Why do nodes begin to think that other
> nodes are not part of the cluster?
>
> 2012/2/24 ruslan usifov <ruslan.usifov at gmail.com>
> Hello
>
> I have a 3-node cluster setup. After an OS upgrade, one node is
> permanently in the OFFLINE state.
>
> OS: ubuntu 10.0.4
> pacemaker: 1.1.6-9971ebba4494012a93c03b40a2c58ec0eb60f50c
>
> On the OFFLINE node I see the following in the log:
>
> Feb 24 20:27:45 slv009 crmd: [9125]: info: do_dc_release: DC role released
> Feb 24 20:27:45 slv009 crmd: [9125]: info: do_te_control: Transitioner is now inactive
> Feb 24 20:28:05 slv009 crmd: [9125]: info: crm_timer_popped: Election Trigger (I_DC_TIMEOUT) just popped (20000ms)
> Feb 24 20:28:05 slv009 crmd: [9125]: WARN: do_log: FSA: Input I_DC_TIMEOUT from crm_timer_popped() received in state S_PENDING
> Feb 24 20:28:05 slv009 crmd: [9125]: info: do_state_transition: State transition S_PENDING -> S_ELECTION [ input=I_DC_TIMEOUT cause=C_TIMER_POPPED origin=crm_timer_popped ]
> Feb 24 20:28:05 slv009 crmd: [9125]: info: do_state_transition: State transition S_ELECTION -> S_PENDING [ input=I_PENDING cause=C_FSA_INTERNAL origin=do_election_count_vote ]
> Feb 24 20:28:05 slv009 crmd: [9125]: info: do_dc_release: DC role released
> Feb 24 20:28:05 slv009 crmd: [9125]: info: do_te_control: Transitioner is now inactive
> Feb 24 20:28:25 slv009 crmd: [9125]: info: crm_timer_popped: Election Trigger (I_DC_TIMEOUT) just popped (20000ms)
> Feb 24 20:28:25 slv009 crmd: [9125]: WARN: do_log: FSA: Input I_DC_TIMEOUT from crm_timer_popped() received in state S_PENDING
> Feb 24 20:28:25 slv009 crmd: [9125]: info: do_state_transition: State transition S_PENDING -> S_ELECTION [ input=I_DC_TIMEOUT cause=C_TIMER_POPPED origin=crm_timer_popped ]
> Feb 24 20:28:25 slv009 crmd: [9125]: info: do_state_transition: State transition S_ELECTION -> S_PENDING [ input=I_PENDING cause=C_FSA_INTERNAL origin=do_election_count_vote ]
> Feb 24 20:28:25 slv009 crmd: [9125]: info: do_dc_release: DC role released
> Feb 24 20:28:25 slv009 crmd: [9125]: info: do_te_control: Transitioner is now inactive
> Feb 24 20:28:45 slv009 crmd: [9125]: info: crm_timer_popped: Election Trigger (I_DC_TIMEOUT) just popped (20000ms)
> Feb 24 20:28:45 slv009 crmd: [9125]: WARN: do_log: FSA: Input I_DC_TIMEOUT from crm_timer_popped() received in state S_PENDING
> Feb 24 20:28:45 slv009 crmd: [9125]: info: do_state_transition: State transition S_PENDING -> S_ELECTION [ input=I_DC_TIMEOUT cause=C_TIMER_POPPED origin=crm_timer_popped ]
> Feb 24 20:28:45 slv009 crmd: [9125]: info: do_state_transition: State transition S_ELECTION -> S_PENDING [ input=I_PENDING cause=C_FSA_INTERNAL origin=do_election_count_vote ]
> Feb 24 20:28:45 slv009 crmd: [9125]: info: do_dc_release: DC role released
> Feb 24 20:28:45 slv009 crmd: [9125]: info: do_te_control: Transitioner is now inactive
> Feb 24 20:29:05 slv009 crmd: [9125]: info: crm_timer_popped: Election Trigger (I_DC_TIMEOUT) just popped (20000ms)
> Feb 24 20:29:05 slv009 crmd: [9125]: WARN: do_log: FSA: Input I_DC_TIMEOUT from crm_timer_popped() received in state S_PENDING
> Feb 24 20:29:05 slv009 crmd: [9125]: info: do_state_transition: State transition S_PENDING -> S_ELECTION [ input=I_DC_TIMEOUT cause=C_TIMER_POPPED origin=crm_timer_popped ]
> Feb 24 20:29:05 slv009 crmd: [9125]: info: do_state_transition: State transition S_ELECTION -> S_PENDING [ input=I_PENDING cause=C_FSA_INTERNAL origin=do_election_count_vote ]
> Feb 24 20:29:05 slv009 crmd: [9125]: info: do_dc_release: DC role released
> Feb 24 20:29:05 slv009 crmd: [9125]: info: do_te_control: Transitioner is now inactive
> Feb 24 20:29:25 slv009 crmd: [9125]: info: crm_timer_popped: Election Trigger (I_DC_TIMEOUT) just popped (20000ms)
> Feb 24 20:29:25 slv009 crmd: [9125]: WARN: do_log: FSA: Input I_DC_TIMEOUT from crm_timer_popped() received in state S_PENDING
> Feb 24 20:29:25 slv009 crmd: [9125]: info: do_state_transition: State transition S_PENDING -> S_ELECTION [ input=I_DC_TIMEOUT cause=C_TIMER_POPPED origin=crm_timer_popped ]
> Feb 24 20:29:25 slv009 crmd: [9125]: info: do_state_transition: State transition S_ELECTION -> S_PENDING [ input=I_PENDING cause=C_FSA_INTERNAL origin=do_election_count_vote ]
> Feb 24 20:29:25 slv009 crmd: [9125]: info: do_dc_release: DC role released
> Feb 24 20:29:25 slv009 crmd: [9125]: info: do_te_control: Transitioner is now inactive
> Feb 24 20:29:45 slv009 crmd: [9125]: info: crm_timer_popped: Election Trigger (I_DC_TIMEOUT) just popped (20000ms)
> Feb 24 20:29:45 slv009 crmd: [9125]: WARN: do_log: FSA: Input I_DC_TIMEOUT from crm_timer_popped() received in state S_PENDING
> Feb 24 20:29:45 slv009 crmd: [9125]: info: do_state_transition: State transition S_PENDING -> S_ELECTION [ input=I_DC_TIMEOUT cause=C_TIMER_POPPED origin=crm_timer_popped ]
> Feb 24 20:29:45 slv009 crmd: [9125]: info: do_state_transition: State transition S_ELECTION -> S_PENDING [ input=I_PENDING cause=C_FSA_INTERNAL origin=do_election_count_vote ]
> Feb 24 20:29:45 slv009 crmd: [9125]: info: do_dc_release: DC role released
> Feb 24 20:29:45 slv009 crmd: [9125]: info: do_te_control: Transitioner is now inactive
> Feb 24 20:30:05 slv009 crmd: [9125]: info: crm_timer_popped: Election Trigger (I_DC_TIMEOUT) just popped (20000ms)
> Feb 24 20:30:05 slv009 crmd: [9125]: WARN: do_log: FSA: Input I_DC_TIMEOUT from crm_timer_popped() received in state S_PENDING
> Feb 24 20:30:05 slv009 crmd: [9125]: info: do_state_transition: State transition S_PENDING -> S_ELECTION [ input=I_DC_TIMEOUT cause=C_TIMER_POPPED origin=crm_timer_popped ]
> Feb 24 20:30:05 slv009 crmd: [9125]: info: do_state_transition: State transition S_ELECTION -> S_PENDING [ input=I_PENDING cause=C_FSA_INTERNAL origin=do_election_count_vote ]
>
> I have the following crm conf:
>
> node slv008
> node slv009
> node slv010
> primitive http_173.192.214.78_eth1 ocf:heartbeat:IPaddr2 \
>         params ip="173.192.214.78" nic="eth1:1" cidr_netmask="30" \
>         op monitor interval="10s"
> primitive http_nginx ocf:heartbeat:nginx \
>         op monitor interval="10s" timeout="120s"
> group http http_173.192.214.78_eth1 http_nginx \
>         meta target-role="Started" is-managed="true"
> property $id="cib-bootstrap-options" \
>         dc-version="1.1.6-9971ebba4494012a93c03b40a2c58ec0eb60f50c" \
>         cluster-infrastructure="openais" \
>         expected-quorum-votes="3" \
>         stonith-enabled="false"
> rsc_defaults $id="rsc-options" \
>         resource-stickiness="100"
>
> Also, I can't restart Pacemaker on that node cleanly, i.e. through the
> init.d script (it just hangs).


Thanks a lot. I was having this same issue after months of running a healthy
cluster: two of the nodes became aware only of each other and forgot the rest.

This trick brought them all together again.
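
For anyone who lands here with the same symptoms, the steps from the quoted message can be sketched roughly as the script below. Treat it as a sketch under assumptions, not a tested procedure: the init script locations (/etc/init.d/pacemaker and /etc/init.d/corosync) are my guess, since the quoted message only says "/etc/pacemaker stop", and the restart order (corosync first, then pacemaker) is likewise assumed. The function and variable names are made up for illustration. By default it only prints what it would do. The small count_election_loops helper just counts the "S_ELECTION -> S_PENDING" transitions seen in the quoted log; it does not by itself prove that a node is stuck.

```shell
#!/bin/sh
# Sketch of the recovery steps from the quoted message, to be run on the
# affected node only. With DRY_RUN=1 (the default) nothing is changed; the
# commands are only printed.

DRY_RUN="${DRY_RUN:-1}"

# ASSUMPTION: init script locations. The quoted message wrote
# "/etc/pacemaker stop", so these paths are a guess; adjust for your system.
PACEMAKER_INIT="${PACEMAKER_INIT:-/etc/init.d/pacemaker}"
COROSYNC_INIT="${COROSYNC_INIT:-/etc/init.d/corosync}"

# Directories named in the quoted message.
CIB_DIR="${CIB_DIR:-/var/lib/heartbeat/crm}"
PENGINE_DIR="${PENGINE_DIR:-/var/lib/pengine}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Count S_ELECTION -> S_PENDING transitions in a log file (path given as $1).
count_election_loops() {
    grep -c 'State transition S_ELECTION -> S_PENDING' "$1"
}

# Stop the stack, clear the local CIB and pengine files, start it again.
# -mindepth and -delete are GNU find options; they remove the directory
# contents but keep the directories themselves.
recover_node() {
    run "$PACEMAKER_INIT" stop
    run "$COROSYNC_INIT" stop
    run find "$CIB_DIR" -mindepth 1 -delete
    run find "$PENGINE_DIR" -mindepth 1 -delete
    run "$COROSYNC_INIT" start    # ASSUMPTION: corosync before pacemaker
    run "$PACEMAKER_INIT" start
}
```

Sourcing the file and calling recover_node prints the six commands. Setting DRY_RUN=0 (as root) would actually run them, so it is worth copying the two directories somewhere first.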





