[Pacemaker] OFFLINE node after cluster upgrade

ruslan usifov ruslan.usifov at gmail.com
Mon Nov 19 13:51:11 EST 2012


I now think that this solution is not a good one :-))) A better way is
to fully restart the cluster. Before you restart it, you must put it into
maintenance mode (in this mode Pacemaker will not try to manage services
that are already started, so you don't get any downtime):

property maintenance-mode=true
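
For anyone following along, the whole sequence might look roughly like the sketch below. It is not a tested procedure: it assumes the crm shell (crmsh) is installed and that corosync and pacemaker are controlled by init scripts under /etc/init.d, which may not match your system. The helper names (run, enter_maintenance, restart_stack, leave_maintenance) are made up for illustration, and by default the script only prints the commands instead of running them.

```shell
#!/bin/sh
# Sketch only: wrap a full cluster restart in maintenance mode.
# Assumptions: crmsh ("crm") is installed, and corosync/pacemaker are
# controlled by init scripts under /etc/init.d (adjust for your system).

# With DRY_RUN=1 (the default) commands are printed, not executed.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Tell Pacemaker to stop managing resources; running services stay up.
enter_maintenance() {
    run crm configure property maintenance-mode=true
}

# Restart the cluster stack on the local node (repeat on every node).
restart_stack() {
    run /etc/init.d/pacemaker stop
    run /etc/init.d/corosync stop
    run /etc/init.d/corosync start
    run /etc/init.d/pacemaker start
}

# Hand resource management back to Pacemaker once all nodes are online.
leave_maintenance() {
    run crm configure property maintenance-mode=false
}
```

The intended order is enter_maintenance once, restart_stack on each node, then leave_maintenance once every node shows as online again. Set DRY_RUN=0 only after checking that the printed commands fit your installation.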

2012/11/8 Carlos Molina <dr.chamberlain at gmail.com>:
> ruslan usifov <ruslan.usifov at ...> writes:
>
>>
>>
>> I solved this problem! On one node I found the following error message in
>> the log:
>>
>> slv009 ....   peer is not part of our cluster
>>
>> So I stopped pacemaker on that host (I use v1 for pacemaker):
>>
>> /etc/init.d/pacemaker stop
>> /etc/init.d/corosync stop
>>
>> Then I removed all CIB info from /var/lib/heartbeat/crm and cleaned up the
>> /var/lib/pengine dir, then restarted the cluster on that node. And voila,
>> everything began working as expected. But I still have a question: why does
>> this happen? Why do nodes begin to think that other nodes are not part of
>> the cluster?
>>
>> 2012/2/24 ruslan usifov <ruslan.usifov at gmail.com>
>>
>> Hello
>>
>> I have a 3-node cluster setup. After upgrading the OS, one node is
>> permanently in the OFFLINE state.
>>
>> OS: ubuntu 10.0.4
>> pacemaker: 1.1.6-9971ebba4494012a93c03b40a2c58ec0eb60f50c
>>
>> On the OFFLINE node I see the following in the log:
>>
>> Feb 24 20:27:45 slv009 crmd: [9125]: info: do_dc_release: DC role released
>> Feb 24 20:27:45 slv009 crmd: [9125]: info: do_te_control: Transitioner is now inactive
>> Feb 24 20:28:05 slv009 crmd: [9125]: info: crm_timer_popped: Election Trigger (I_DC_TIMEOUT) just popped (20000ms)
>> Feb 24 20:28:05 slv009 crmd: [9125]: WARN: do_log: FSA: Input I_DC_TIMEOUT from crm_timer_popped() received in state S_PENDING
>> Feb 24 20:28:05 slv009 crmd: [9125]: info: do_state_transition: State transition S_PENDING -> S_ELECTION [ input=I_DC_TIMEOUT cause=C_TIMER_POPPED origin=crm_timer_popped ]
>> Feb 24 20:28:05 slv009 crmd: [9125]: info: do_state_transition: State transition S_ELECTION -> S_PENDING [ input=I_PENDING cause=C_FSA_INTERNAL origin=do_election_count_vote ]
>> Feb 24 20:28:05 slv009 crmd: [9125]: info: do_dc_release: DC role released
>> Feb 24 20:28:05 slv009 crmd: [9125]: info: do_te_control: Transitioner is now inactive
>> Feb 24 20:28:25 slv009 crmd: [9125]: info: crm_timer_popped: Election Trigger (I_DC_TIMEOUT) just popped (20000ms)
>> Feb 24 20:28:25 slv009 crmd: [9125]: WARN: do_log: FSA: Input I_DC_TIMEOUT from crm_timer_popped() received in state S_PENDING
>> Feb 24 20:28:25 slv009 crmd: [9125]: info: do_state_transition: State transition S_PENDING -> S_ELECTION [ input=I_DC_TIMEOUT cause=C_TIMER_POPPED origin=crm_timer_popped ]
>> Feb 24 20:28:25 slv009 crmd: [9125]: info: do_state_transition: State transition S_ELECTION -> S_PENDING [ input=I_PENDING cause=C_FSA_INTERNAL origin=do_election_count_vote ]
>> Feb 24 20:28:25 slv009 crmd: [9125]: info: do_dc_release: DC role released
>> Feb 24 20:28:25 slv009 crmd: [9125]: info: do_te_control: Transitioner is now inactive
>> Feb 24 20:28:45 slv009 crmd: [9125]: info: crm_timer_popped: Election Trigger (I_DC_TIMEOUT) just popped (20000ms)
>> Feb 24 20:28:45 slv009 crmd: [9125]: WARN: do_log: FSA: Input I_DC_TIMEOUT from crm_timer_popped() received in state S_PENDING
>> Feb 24 20:28:45 slv009 crmd: [9125]: info: do_state_transition: State transition S_PENDING -> S_ELECTION [ input=I_DC_TIMEOUT cause=C_TIMER_POPPED origin=crm_timer_popped ]
>> Feb 24 20:28:45 slv009 crmd: [9125]: info: do_state_transition: State transition S_ELECTION -> S_PENDING [ input=I_PENDING cause=C_FSA_INTERNAL origin=do_election_count_vote ]
>> Feb 24 20:28:45 slv009 crmd: [9125]: info: do_dc_release: DC role released
>> Feb 24 20:28:45 slv009 crmd: [9125]: info: do_te_control: Transitioner is now inactive
>> Feb 24 20:29:05 slv009 crmd: [9125]: info: crm_timer_popped: Election Trigger (I_DC_TIMEOUT) just popped (20000ms)
>> Feb 24 20:29:05 slv009 crmd: [9125]: WARN: do_log: FSA: Input I_DC_TIMEOUT from crm_timer_popped() received in state S_PENDING
>> Feb 24 20:29:05 slv009 crmd: [9125]: info: do_state_transition: State transition S_PENDING -> S_ELECTION [ input=I_DC_TIMEOUT cause=C_TIMER_POPPED origin=crm_timer_popped ]
>> Feb 24 20:29:05 slv009 crmd: [9125]: info: do_state_transition: State transition S_ELECTION -> S_PENDING [ input=I_PENDING cause=C_FSA_INTERNAL origin=do_election_count_vote ]
>> Feb 24 20:29:05 slv009 crmd: [9125]: info: do_dc_release: DC role released
>> Feb 24 20:29:05 slv009 crmd: [9125]: info: do_te_control: Transitioner is now inactive
>> Feb 24 20:29:25 slv009 crmd: [9125]: info: crm_timer_popped: Election Trigger (I_DC_TIMEOUT) just popped (20000ms)
>> Feb 24 20:29:25 slv009 crmd: [9125]: WARN: do_log: FSA: Input I_DC_TIMEOUT from crm_timer_popped() received in state S_PENDING
>> Feb 24 20:29:25 slv009 crmd: [9125]: info: do_state_transition: State transition S_PENDING -> S_ELECTION [ input=I_DC_TIMEOUT cause=C_TIMER_POPPED origin=crm_timer_popped ]
>> Feb 24 20:29:25 slv009 crmd: [9125]: info: do_state_transition: State transition S_ELECTION -> S_PENDING [ input=I_PENDING cause=C_FSA_INTERNAL origin=do_election_count_vote ]
>> Feb 24 20:29:25 slv009 crmd: [9125]: info: do_dc_release: DC role released
>> Feb 24 20:29:25 slv009 crmd: [9125]: info: do_te_control: Transitioner is now inactive
>> Feb 24 20:29:45 slv009 crmd: [9125]: info: crm_timer_popped: Election Trigger (I_DC_TIMEOUT) just popped (20000ms)
>> Feb 24 20:29:45 slv009 crmd: [9125]: WARN: do_log: FSA: Input I_DC_TIMEOUT from crm_timer_popped() received in state S_PENDING
>> Feb 24 20:29:45 slv009 crmd: [9125]: info: do_state_transition: State transition S_PENDING -> S_ELECTION [ input=I_DC_TIMEOUT cause=C_TIMER_POPPED origin=crm_timer_popped ]
>> Feb 24 20:29:45 slv009 crmd: [9125]: info: do_state_transition: State transition S_ELECTION -> S_PENDING [ input=I_PENDING cause=C_FSA_INTERNAL origin=do_election_count_vote ]
>> Feb 24 20:29:45 slv009 crmd: [9125]: info: do_dc_release: DC role released
>> Feb 24 20:29:45 slv009 crmd: [9125]: info: do_te_control: Transitioner is now inactive
>> Feb 24 20:30:05 slv009 crmd: [9125]: info: crm_timer_popped: Election Trigger (I_DC_TIMEOUT) just popped (20000ms)
>> Feb 24 20:30:05 slv009 crmd: [9125]: WARN: do_log: FSA: Input I_DC_TIMEOUT from crm_timer_popped() received in state S_PENDING
>> Feb 24 20:30:05 slv009 crmd: [9125]: info: do_state_transition: State transition S_PENDING -> S_ELECTION [ input=I_DC_TIMEOUT cause=C_TIMER_POPPED origin=crm_timer_popped ]
>> Feb 24 20:30:05 slv009 crmd: [9125]: info: do_state_transition: State transition S_ELECTION -> S_PENDING [ input=I_PENDING cause=C_FSA_INTERNAL origin=do_election_count_vote ]
>>
>> I have the following crm conf:
>>
>> node slv008
>> node slv009
>> node slv010
>> primitive http_173.192.214.78_eth1 ocf:heartbeat:IPaddr2 \
>>         params ip="173.192.214.78" nic="eth1:1" cidr_netmask="30" \
>>         op monitor interval="10s"
>> primitive http_nginx ocf:heartbeat:nginx \
>>         op monitor interval="10s" timeout="120s"
>> group http http_173.192.214.78_eth1 http_nginx \
>>         meta target-role="Started" is-managed="true"
>> property $id="cib-bootstrap-options" \
>>         dc-version="1.1.6-9971ebba4494012a93c03b40a2c58ec0eb60f50c" \
>>         cluster-infrastructure="openais" \
>>         expected-quorum-votes="3" \
>>         stonith-enabled="false"
>> rsc_defaults $id="rsc-options" \
>>         resource-stickiness="100"
>>
>> Also, I can't restart pacemaker on that node cleanly, i.e. through the
>> init.d script (it just hangs).
>>
>> _______________________________________________
>> Pacemaker mailing list: Pacemaker at ...
>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>>
>> Project Home: http://www.clusterlabs.org
>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>> Bugs: http://bugs.clusterlabs.org
>>
>
>
> Thanks a lot. I was having this same issue after months of running a healthy
> cluster: two of the nodes became aware only of each other and forgot the rest.
>
> This trick brought them all together again.
>