[Pacemaker] OFFLINE node after cluster upgrade

ruslan usifov ruslan.usifov at gmail.com
Fri Feb 24 16:28:54 EST 2012


I solved this problem!


On one node I found the following error message in the log:

slv009 ....   peer is not part of our cluster
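
To check whether a node's log contains this complaint, a small helper like the one below may be enough. It is only a sketch: the function name is made up for illustration, the search string is the wording quoted above, and the log location differs between systems (/var/log/syslog in the usage comment is an assumption).

```shell
#!/bin/sh
# Count log lines that contain the membership complaint.
# count_membership_errors is a hypothetical helper name, not a Pacemaker tool.
count_membership_errors() {
    logfile="$1"
    # grep -c prints the number of matching lines (0 if there are none).
    grep -c 'not part of our cluster' "$logfile"
}

# Usage (the log path is an assumption, adjust it for your system):
#   count_membership_errors /var/log/syslog
```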

So I stopped pacemaker on that host (I use v1 for pacemaker):

/etc/init.d/pacemaker stop
/etc/init.d/corosync stop


Then I removed all CIB info from /var/lib/heartbeat/crm and cleaned up the
/var/lib/pengine dir, then restarted the cluster on that node. And voila, it all
began working as expected.
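
The steps above can be sketched as a script. This is a cautious sketch, not a tested procedure: the init script paths under /etc/init.d and the start order (corosync first, then pacemaker) are assumptions, the two directories are the ones named above, and the helper names run and reset_node_state are made up for illustration. By default it only prints the commands; set DRY_RUN=0 to really run them, as root, on the affected node only.

```shell
#!/bin/sh
# Sketch of the recovery steps described above, for the affected node only.
# Assumptions: init scripts live under /etc/init.d, and corosync is started
# before pacemaker. DRY_RUN defaults to 1, so nothing is changed by default.
DRY_RUN="${DRY_RUN:-1}"
CIB_DIR="${CIB_DIR:-/var/lib/heartbeat/crm}"
PE_DIR="${PE_DIR:-/var/lib/pengine}"

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

reset_node_state() {
    # Stop the cluster stack on this node.
    run /etc/init.d/pacemaker stop
    run /etc/init.d/corosync stop
    # Delete the local CIB copies and the saved policy engine files.
    run find "$CIB_DIR" -type f -delete
    run find "$PE_DIR" -type f -delete
    # Start the stack again so the node can rejoin its peers.
    run /etc/init.d/corosync start
    run /etc/init.d/pacemaker start
}

reset_node_state
```

Copying the two directories somewhere safe before deleting anything is probably a good idea, in case the old state is needed to find the root cause.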


But I still have a question: why does this happen? Why do nodes begin to think
that other nodes are not part of the cluster?


2012/2/24 ruslan usifov <ruslan.usifov at gmail.com>

> Hello
>
> I have a 3-node cluster setup. After upgrading the OS, one node is
> permanently in the OFFLINE state.
>
>
> OS: Ubuntu 10.04
> pacemaker: 1.1.6-9971ebba4494012a93c03b40a2c58ec0eb60f50c
>
>
>
> On the OFFLINE node I see the following in the log:
>
> Feb 24 20:27:45 slv009 crmd: [9125]: info: do_dc_release: DC role released
> Feb 24 20:27:45 slv009 crmd: [9125]: info: do_te_control: Transitioner is now inactive
> Feb 24 20:28:05 slv009 crmd: [9125]: info: crm_timer_popped: Election Trigger (I_DC_TIMEOUT) just popped (20000ms)
> Feb 24 20:28:05 slv009 crmd: [9125]: WARN: do_log: FSA: Input I_DC_TIMEOUT from crm_timer_popped() received in state S_PENDING
> Feb 24 20:28:05 slv009 crmd: [9125]: info: do_state_transition: State transition S_PENDING -> S_ELECTION [ input=I_DC_TIMEOUT cause=C_TIMER_POPPED origin=crm_timer_popped ]
> Feb 24 20:28:05 slv009 crmd: [9125]: info: do_state_transition: State transition S_ELECTION -> S_PENDING [ input=I_PENDING cause=C_FSA_INTERNAL origin=do_election_count_vote ]
> Feb 24 20:28:05 slv009 crmd: [9125]: info: do_dc_release: DC role released
> Feb 24 20:28:05 slv009 crmd: [9125]: info: do_te_control: Transitioner is now inactive
> Feb 24 20:28:25 slv009 crmd: [9125]: info: crm_timer_popped: Election Trigger (I_DC_TIMEOUT) just popped (20000ms)
> Feb 24 20:28:25 slv009 crmd: [9125]: WARN: do_log: FSA: Input I_DC_TIMEOUT from crm_timer_popped() received in state S_PENDING
> Feb 24 20:28:25 slv009 crmd: [9125]: info: do_state_transition: State transition S_PENDING -> S_ELECTION [ input=I_DC_TIMEOUT cause=C_TIMER_POPPED origin=crm_timer_popped ]
> Feb 24 20:28:25 slv009 crmd: [9125]: info: do_state_transition: State transition S_ELECTION -> S_PENDING [ input=I_PENDING cause=C_FSA_INTERNAL origin=do_election_count_vote ]
> Feb 24 20:28:25 slv009 crmd: [9125]: info: do_dc_release: DC role released
> Feb 24 20:28:25 slv009 crmd: [9125]: info: do_te_control: Transitioner is now inactive
> Feb 24 20:28:45 slv009 crmd: [9125]: info: crm_timer_popped: Election Trigger (I_DC_TIMEOUT) just popped (20000ms)
> Feb 24 20:28:45 slv009 crmd: [9125]: WARN: do_log: FSA: Input I_DC_TIMEOUT from crm_timer_popped() received in state S_PENDING
> Feb 24 20:28:45 slv009 crmd: [9125]: info: do_state_transition: State transition S_PENDING -> S_ELECTION [ input=I_DC_TIMEOUT cause=C_TIMER_POPPED origin=crm_timer_popped ]
> Feb 24 20:28:45 slv009 crmd: [9125]: info: do_state_transition: State transition S_ELECTION -> S_PENDING [ input=I_PENDING cause=C_FSA_INTERNAL origin=do_election_count_vote ]
> Feb 24 20:28:45 slv009 crmd: [9125]: info: do_dc_release: DC role released
> Feb 24 20:28:45 slv009 crmd: [9125]: info: do_te_control: Transitioner is now inactive
> Feb 24 20:29:05 slv009 crmd: [9125]: info: crm_timer_popped: Election Trigger (I_DC_TIMEOUT) just popped (20000ms)
> Feb 24 20:29:05 slv009 crmd: [9125]: WARN: do_log: FSA: Input I_DC_TIMEOUT from crm_timer_popped() received in state S_PENDING
> Feb 24 20:29:05 slv009 crmd: [9125]: info: do_state_transition: State transition S_PENDING -> S_ELECTION [ input=I_DC_TIMEOUT cause=C_TIMER_POPPED origin=crm_timer_popped ]
> Feb 24 20:29:05 slv009 crmd: [9125]: info: do_state_transition: State transition S_ELECTION -> S_PENDING [ input=I_PENDING cause=C_FSA_INTERNAL origin=do_election_count_vote ]
> Feb 24 20:29:05 slv009 crmd: [9125]: info: do_dc_release: DC role released
> Feb 24 20:29:05 slv009 crmd: [9125]: info: do_te_control: Transitioner is now inactive
> Feb 24 20:29:25 slv009 crmd: [9125]: info: crm_timer_popped: Election Trigger (I_DC_TIMEOUT) just popped (20000ms)
> Feb 24 20:29:25 slv009 crmd: [9125]: WARN: do_log: FSA: Input I_DC_TIMEOUT from crm_timer_popped() received in state S_PENDING
> Feb 24 20:29:25 slv009 crmd: [9125]: info: do_state_transition: State transition S_PENDING -> S_ELECTION [ input=I_DC_TIMEOUT cause=C_TIMER_POPPED origin=crm_timer_popped ]
> Feb 24 20:29:25 slv009 crmd: [9125]: info: do_state_transition: State transition S_ELECTION -> S_PENDING [ input=I_PENDING cause=C_FSA_INTERNAL origin=do_election_count_vote ]
> Feb 24 20:29:25 slv009 crmd: [9125]: info: do_dc_release: DC role released
> Feb 24 20:29:25 slv009 crmd: [9125]: info: do_te_control: Transitioner is now inactive
> Feb 24 20:29:45 slv009 crmd: [9125]: info: crm_timer_popped: Election Trigger (I_DC_TIMEOUT) just popped (20000ms)
> Feb 24 20:29:45 slv009 crmd: [9125]: WARN: do_log: FSA: Input I_DC_TIMEOUT from crm_timer_popped() received in state S_PENDING
> Feb 24 20:29:45 slv009 crmd: [9125]: info: do_state_transition: State transition S_PENDING -> S_ELECTION [ input=I_DC_TIMEOUT cause=C_TIMER_POPPED origin=crm_timer_popped ]
> Feb 24 20:29:45 slv009 crmd: [9125]: info: do_state_transition: State transition S_ELECTION -> S_PENDING [ input=I_PENDING cause=C_FSA_INTERNAL origin=do_election_count_vote ]
> Feb 24 20:29:45 slv009 crmd: [9125]: info: do_dc_release: DC role released
> Feb 24 20:29:45 slv009 crmd: [9125]: info: do_te_control: Transitioner is now inactive
> Feb 24 20:30:05 slv009 crmd: [9125]: info: crm_timer_popped: Election Trigger (I_DC_TIMEOUT) just popped (20000ms)
> Feb 24 20:30:05 slv009 crmd: [9125]: WARN: do_log: FSA: Input I_DC_TIMEOUT from crm_timer_popped() received in state S_PENDING
> Feb 24 20:30:05 slv009 crmd: [9125]: info: do_state_transition: State transition S_PENDING -> S_ELECTION [ input=I_DC_TIMEOUT cause=C_TIMER_POPPED origin=crm_timer_popped ]
> Feb 24 20:30:05 slv009 crmd: [9125]: info: do_state_transition: State transition S_ELECTION -> S_PENDING [ input=I_PENDING cause=C_FSA_INTERNAL origin=do_election_count_vote ]
>
>
> I have the following crm configuration:
>
> node slv008
> node slv009
> node slv010
> primitive http_173.192.214.78_eth1 ocf:heartbeat:IPaddr2 \
>         params ip="173.192.214.78" nic="eth1:1" cidr_netmask="30" \
>         op monitor interval="10s"
> primitive http_nginx ocf:heartbeat:nginx \
>         op monitor interval="10s" timeout="120s"
> group http http_173.192.214.78_eth1 http_nginx \
>         meta target-role="Started" is-managed="true"
> property $id="cib-bootstrap-options" \
>         dc-version="1.1.6-9971ebba4494012a93c03b40a2c58ec0eb60f50c" \
>         cluster-infrastructure="openais" \
>         expected-quorum-votes="3" \
>         stonith-enabled="false"
> rsc_defaults $id="rsc-options" \
>         resource-stickiness="100"
>
>
>
>
>
> Also, I can't restart pacemaker on that node cleanly, i.e. through the
> init.d script (it just hangs).
>
>
>
>