[Pacemaker] crmd - set dc unset dc loop

Philippe Carbonnier Philippe.Carbonnier at vif.fr
Thu Jan 19 06:21:16 EST 2012


Hello Andreas, and others,

I'm sorry not having well explain my problem (I'm french and don't speak 
english fluently...).
The loop has arrived after putting ujoss1 (not ujboss2) standy, this is 
why I put more log for ujboss1 and not ujboss2.

I'll try to answer your questions :
the network setup is simple : ujboss1 ans ujboss2 are on the same lan, 
without firewall between. Perhaps it's not what you was looking for?

corosync.conf :
aisexec {
         user:   root
         group:  root
}
service {
         name: pacemaker
         ver:  0
}
compatibility: whitetank
totem {
         version: 2
         secauth: on
         threads: 0
         interface {
                 ringnumber: 0
                 bindnetaddr: 128.1.0.0
                 mcastaddr: 226.94.1.1
                 mcastport: 5405
         }
}
logging {
         fileline: off
         to_stderr: no
         to_logfile: yes
         to_syslog: no
         logfile: /var/log/cluster/corosync.log
         debug: off
         timestamp: on
         logger_subsys {
                 subsys: AMF
                 debug: off
         }
}
amf {
         mode: disabled
}

and crm configure show :
node ujbossterrena1 \
         attributes standby="off"
node ujbossterrena2 \
         attributes standby="off"
primitive clusterIP ocf:heartbeat:IPaddr2 \
         params ip="128.1.13.9" cidr_netmask="32" iflabel="jbossfailover" \
         op monitor interval="30s"
primitive routing-jboss lsb:routing-jboss \
         op monitor interval="30s"
group vifGroup clusterIP routing-jboss \
         meta target-role="Started"
location prefer-clusterIP clusterIP 50: ujboss1
property $id="cib-bootstrap-options" \
         dc-version="1.0.10-da7075976b5ff0bee71074385f8fd02f296ec8a3" \
         cluster-infrastructure="openais" \
         expected-quorum-votes="2" \
         stonith-enabled="false" \
         no-quorum-policy="ignore"
rsc_defaults $id="rsc-options" \
         resource-stickiness="20"


and the previous log for ujboss2:
Jan 17 12:51:46 corosync [TOTEM ] A processor failed, forming new 
configuration.
Jan 17 12:51:47 corosync [pcmk  ] notice: pcmk_peer_update: Transitional 
membership event on ring 564: memb=1, new=0, lost=1
Jan 17 12:51:47 corosync [pcmk  ] info: pcmk_peer_update: memb: 
ujbossterrena2 51184000
Jan 17 12:51:47 corosync [pcmk  ] info: pcmk_peer_update: lost: 
ujbossterrena1 34406784
Jan 17 12:51:47 corosync [pcmk  ] notice: pcmk_peer_update: Stable 
membership event on ring 564: memb=1, new=0, lost=0
Jan 17 12:51:47 corosync [pcmk  ] info: pcmk_peer_update: MEMB: 
ujbossterrena2 51184000
Jan 17 12:51:47 corosync [pcmk  ] info: ais_mark_unseen_peer_dead: Node 
ujbossterrena1 was not seen in the previous transition
Jan 17 12:51:47 corosync [pcmk  ] info: update_member: Node 
34406784/ujbossterrena1 is now: lost
Jan 17 12:51:47 corosync [pcmk  ] info: send_member_notification: 
Sending membership update 564 to 2 children
Jan 17 12:51:47 corosync [TOTEM ] A processor joined or left the 
membership and a new membership was formed.
Jan 17 12:51:47 corosync [MAIN  ] Completed service synchronization, 
ready to provide service.
Jan 17 12:51:47 ujbossterrena2 crmd: [18796]: notice: ais_dispatch: 
Membership 564: quorum lost
Jan 17 12:51:47 ujbossterrena2 crmd: [18796]: info: crm_update_peer: 
Node ujbossterrena1: id=34406784 state=lost (new) addr=r(
0) ip(128.1.13.2)  votes=1 born=488 seen=556 
proc=00000000000000000000000000013312
Jan 17 12:51:47 ujbossterrena2 crmd: [18796]: WARN: check_dead_member: 
Our DC node (ujbossterrena1) left the cluster
Jan 17 12:51:47 ujbossterrena2 cib: [18792]: notice: ais_dispatch: 
Membership 564: quorum lost
Jan 17 12:51:47 ujbossterrena2 cib: [18792]: info: crm_update_peer: Node 
ujbossterrena1: id=34406784 state=lost (new) addr=r(0
) ip(128.1.13.2)  votes=1 born=488 seen=556 
proc=00000000000000000000000000013312
Jan 17 12:51:47 ujbossterrena2 crmd: [18796]: info: do_state_transition: 
State transition S_NOT_DC -> S_ELECTION [ input=I_ELE
CTION cause=C_FSA_INTERNAL origin=check_dead_member ]
Jan 17 12:51:47 ujbossterrena2 crmd: [18796]: info: update_dc: Unset DC 
ujbossterrena1
Jan 17 12:51:47 ujbossterrena2 crmd: [18796]: info: do_state_transition: 
State transition S_ELECTION -> S_INTEGRATION [ input=
I_ELECTION_DC cause=C_FSA_INTERNAL origin=do_election_check ]
Jan 17 12:51:47 ujbossterrena2 crmd: [18796]: info: do_te_control: 
Registering TE UUID: 9a4e45a9-9436-4a23-b4e9-9875ba334906
Jan 17 12:51:47 ujbossterrena2 crmd: [18796]: WARN: 
cib_client_add_notify_callback: Callback already present
Jan 17 12:51:47 ujbossterrena2 crmd: [18796]: info: set_graph_functions: 
Setting custom graph functions
Jan 17 12:51:47 ujbossterrena2 crmd: [18796]: info: unpack_graph: 
Unpacked transition -1: 0 actions in 0 synapses
Jan 17 12:51:47 ujbossterrena2 crmd: [18796]: info: do_dc_takeover: 
Taking over DC status for this partition
Jan 17 12:51:47 ujbossterrena2 cib: [18792]: info: 
cib_process_readwrite: We are now in R/W mode
Jan 17 12:51:47 ujbossterrena2 cib: [18792]: info: cib_process_request: 
Operation complete: op cib_master for section 'all' (o
rigin=local/crmd/73, version=0.233.5): ok (rc=0)
Jan 17 12:51:47 ujbossterrena2 cib: [18792]: info: cib_process_request: 
Operation complete: op cib_modify for section cib (ori
gin=local/crmd/74, version=0.233.5): ok (rc=0)
Jan 17 12:51:47 ujbossterrena2 cib: [18792]: info: cib_process_request: 
Operation complete: op cib_modify for section crm_conf
ig (origin=local/crmd/76, version=0.233.5): ok (rc=0)
Jan 17 12:51:47 ujbossterrena2 crmd: [18796]: info: join_make_offer: 
Making join offers based on membership 564
Jan 17 12:51:47 ujbossterrena2 cib: [18792]: info: cib_process_request: 
Operation complete: op cib_modify for section crm_conf
ig (origin=local/crmd/78, version=0.233.5): ok (rc=0)
Jan 17 12:51:47 ujbossterrena2 crmd: [18796]: info: 
do_dc_join_offer_all: join-3: Waiting on 1 outstanding join acks
Jan 17 12:51:47 ujbossterrena2 crmd: [18796]: info: ais_dispatch: 
Membership 564: quorum still lost
Jan 17 12:51:47 ujbossterrena2 crmd: [18796]: info: crm_ais_dispatch: 
Setting expected votes to 2
Jan 17 12:51:47 ujbossterrena2 crmd: [18796]: info: 
config_query_callback: Checking for expired actions every 900000ms
Jan 17 12:51:47 ujbossterrena2 crmd: [18796]: info: 
config_query_callback: Sending expected-votes=2 to corosync
Jan 17 12:51:47 ujbossterrena2 cib: [18792]: info: cib_process_request: 
Operation complete: op cib_modify for section crm_conf
ig (origin=local/crmd/81, version=0.233.5): ok (rc=0)
Jan 17 12:51:47 ujbossterrena2 crmd: [18796]: info: update_dc: Set DC to 
ujbossterrena2 (3.0.1)
Jan 17 12:51:47 ujbossterrena2 crmd: [18796]: info: ais_dispatch: 
Membership 564: quorum still lost
Jan 17 12:51:47 ujbossterrena2 crmd: [18796]: info: crm_ais_dispatch: 
Setting expected votes to 2
Jan 17 12:51:47 ujbossterrena2 cib: [18792]: info: cib_process_request: 
Operation complete: op cib_modify for section crm_conf
ig (origin=local/crmd/84, version=0.233.5): ok (rc=0)
Jan 17 12:51:47 ujbossterrena2 crmd: [18796]: info: do_state_transition: 
State transition S_INTEGRATION -> S_FINALIZE_JOIN [ i
nput=I_INTEGRATED cause=C_FSA_INTERNAL origin=check_join_state ]
Jan 17 12:51:47 ujbossterrena2 crmd: [18796]: info: do_state_transition: 
All 1 cluster nodes responded to the join offer.
Jan 17 12:51:47 ujbossterrena2 crmd: [18796]: info: do_dc_join_finalize: 
join-3: Syncing the CIB from ujbossterrena2 to the re
st of the cluster
Jan 17 12:51:47 ujbossterrena2 crmd: [18796]: info: te_connect_stonith: 
Attempting connection to fencing daemon...
Jan 17 12:51:47 ujbossterrena2 cib: [18792]: info: cib_process_request: 
Operation complete: op cib_sync for section 'all' (ori
gin=local/crmd/85, version=0.233.5): ok (rc=0)
Jan 17 12:51:48 ujbossterrena2 crmd: [18796]: info: te_connect_stonith: 
Connected
Jan 17 12:51:48 ujbossterrena2 cib: [18792]: info: cib_process_request: 
Operation complete: op cib_modify for section nodes (o
rigin=local/crmd/86, version=0.233.5): ok (rc=0)
Jan 17 12:51:48 corosync [pcmk  ] notice: pcmk_peer_update: Transitional 
membership event on ring 568: memb=1, new=0, lost=0
Jan 17 12:51:48 corosync [pcmk  ] info: pcmk_peer_update: memb: 
ujbossterrena2 51184000
Jan 17 12:51:48 corosync [pcmk  ] notice: pcmk_peer_update: Stable 
membership event on ring 568: memb=2, new=1, lost=0
Jan 17 12:51:48 corosync [pcmk  ] info: update_member: Node 
34406784/ujbossterrena1 is now: member
Jan 17 12:51:48 corosync [pcmk  ] info: pcmk_peer_update: NEW:  
ujbossterrena1 34406784
Jan 17 12:51:48 corosync [pcmk  ] info: pcmk_peer_update: MEMB: 
ujbossterrena1 34406784
Jan 17 12:51:48 corosync [pcmk  ] info: pcmk_peer_update: MEMB: 
ujbossterrena2 51184000
Jan 17 12:51:48 corosync [pcmk  ] info: send_member_notification: 
Sending membership update 568 to 2 children
Jan 17 12:51:48 ujbossterrena2 crmd: [18796]: notice: ais_dispatch: 
Membership 568: quorum acquired
Jan 17 12:51:48 ujbossterrena2 crmd: [18796]: info: ais_status_callback: 
status: ujbossterrena1 is now member (was lost)
Jan 17 12:51:48 ujbossterrena2 crmd: [18796]: info: crm_update_peer: 
Node ujbossterrena1: id=34406784 state=member (new) addr=
r(0) ip(128.1.13.2)  votes=1 born=488 seen=568 
proc=00000000000000000000000000013312
Jan 17 12:51:48 ujbossterrena2 cib: [18792]: notice: ais_dispatch: 
Membership 568: quorum acquired
Jan 17 12:51:48 ujbossterrena2 cib: [18792]: info: crm_update_peer: Node 
ujbossterrena1: id=34406784 state=member (new) addr=r
(0) ip(128.1.13.2)  votes=1 born=488 seen=568 
proc=00000000000000000000000000013312
Jan 17 12:51:48 ujbossterrena2 cib: [18792]: info: cib_process_request: 
Operation complete: op cib_delete for section //node_s
tate[@uname='ujbossterrena1']/lrm (origin=local/crmd/87, 
version=0.233.6): ok (rc=0)
Jan 17 12:51:48 corosync [TOTEM ] A processor joined or left the 
membership and a new membership was formed.
Jan 17 12:51:48 ujbossterrena2 crmd: [18796]: info: crm_update_quorum: 
Updating quorum status to true (call=91)
Jan 17 12:51:48 ujbossterrena2 cib: [18792]: info: cib_process_request: 
Operation complete: op cib_delete for section //node_s
tate[@uname='ujbossterrena1']/transient_attributes 
(origin=local/crmd/88, version=0.233.7): ok (rc=0)
Jan 17 12:51:48 ujbossterrena2 cib: [18792]: info: cib_process_request: 
Operation complete: op cib_modify for section nodes (o
rigin=local/crmd/89, version=0.233.7): ok (rc=0)
Jan 17 12:51:48 ujbossterrena2 cib: [18792]: info: log_data_element: 
cib:diff: - <cib dc-uuid="ujbossterrena1" admin_epoch="0"
  epoch="233" num_updates="7" />
Jan 17 12:51:48 ujbossterrena2 cib: [18792]: info: log_data_element: 
cib:diff: + <cib dc-uuid="ujbossterrena2" admin_epoch="0"
  epoch="234" num_updates="1" />
Jan 17 12:51:48 ujbossterrena2 cib: [18792]: info: cib_process_request: 
Operation complete: op cib_modify for section cib (ori
gin=local/crmd/91, version=0.234.1): ok (rc=0)
Jan 17 12:51:48 ujbossterrena2 crmd: [18796]: info: crm_ais_dispatch: 
Setting expected votes to 2
Jan 17 12:51:48 ujbossterrena2 crmd: [18796]: info: 
erase_xpath_callback: Deletion of "//node_state[@uname='ujbossterrena1']/l
rm": ok (rc=0)
Jan 17 12:51:48 ujbossterrena2 crmd: [18796]: info: 
erase_xpath_callback: Deletion of "//node_state[@uname='ujbossterrena1']/t
ransient_attributes": ok (rc=0)
Jan 17 12:51:48 ujbossterrena2 cib: [18792]: info: cib_process_request: 
Operation complete: op cib_modify for section crm_conf
ig (origin=local/crmd/93, version=0.234.1): ok (rc=0)
Jan 17 12:51:48 ujbossterrena2 cib: [20766]: info: write_cib_contents: 
Archived previous version as /var/lib/heartbeat/crm/cib
-88.raw
Jan 17 12:51:48 ujbossterrena2 cib: [20766]: info: write_cib_contents: 
Wrote version 0.234.0 of the CIB to disk (digest: 62f91
5518fc318be93098daf9a5fecca)
Jan 17 12:51:48 ujbossterrena2 cib: [20766]: info: retrieveCib: Reading 
cluster configuration from: /var/lib/heartbeat/crm/cib
.4oDZVT (digest: /var/lib/heartbeat/crm/cib.i5iINe)
Jan 17 12:51:48 ujbossterrena2 cib: [18792]: WARN: cib_process_diff: 
Diff 0.233.5 -> 0.234.1 not applied to 0.234.1: current "
epoch" is greater than required
Jan 17 12:51:48 ujbossterrena2 crmd: [18796]: info: do_lrm_rsc_op: 
Performing key=8:8777:0:39671e48-9519-4b61-b781-2efcd379df7
a op=clusterIP_start_0 )
Jan 17 12:51:48 ujbossterrena2 crmd: [18796]: info: do_lrm_rsc_op: 
Discarding attempt to perform action start on clusterIP in
state S_FINALIZE_JOIN
Jan 17 12:51:48 ujbossterrena2 crmd: [18796]: info: send_direct_ack: 
ACK'ing resource op clusterIP_start_0 from 8:8777:0:39671
e48-9519-4b61-b781-2efcd379df7a: lrm_invoke-lrmd-1326801108-25
Jan 17 12:51:48 ujbossterrena2 crmd: [18796]: info: process_te_message: 
Processing (N)ACK lrm_invoke-lrmd-1326801108-25 from u
jbossterrena2
Jan 17 12:51:48 ujbossterrena2 crmd: [18796]: info: process_graph_event: 
Action clusterIP_start_0/8 (4:99;8:8777:0:39671e48-95
19-4b61-b781-2efcd379df7a) initiated by a different transitioner
Jan 17 12:51:48 ujbossterrena2 crmd: [18796]: info: 
abort_transition_graph: process_graph_event:456 - Triggered transition abo
rt (complete=1, tag=lrm_rsc_op, id=clusterIP_start_0, 
magic=4:99;8:8777:0:39671e48-9519-4b61-b781-2efcd379df7a) : Foreign even
t
Jan 17 12:51:48 ujbossterrena2 crmd: [18796]: info: update_dc: Unset DC 
ujbossterrena2
Jan 17 12:51:48 ujbossterrena2 crmd: [18796]: WARN: do_log: FSA: Input 
I_RELEASE_DC from do_election_count_vote() received in
state S_FINALIZE_JOIN
Jan 17 12:51:48 ujbossterrena2 crmd: [18796]: info: do_state_transition: 
State transition S_FINALIZE_JOIN -> S_RELEASE_DC [ in
put=I_RELEASE_DC cause=C_FSA_INTERNAL origin=do_election_count_vote ]
Jan 17 12:51:48 ujbossterrena2 crmd: [18796]: info: do_dc_release: DC 
role released
Jan 17 12:51:48 ujbossterrena2 crmd: [18796]: info: 
pe_connection_destroy: Connection to the Policy Engine released
Jan 17 12:51:48 ujbossterrena2 crmd: [18796]: info: do_te_control: 
Transitioner is now inactive
Jan 17 12:51:48 ujbossterrena2 crmd: [18796]: info: do_te_control: 
Disconnecting STONITH...
Jan 17 12:51:48 ujbossterrena2 crmd: [18796]: info: 
tengine_stonith_connection_destroy: Fencing daemon disconnected
Jan 17 12:51:48 ujbossterrena2 crmd: [18796]: notice: Not currently 
connected.
Jan 17 12:51:48 ujbossterrena2 crmd: [18796]: info: do_state_transition: 
State transition S_RELEASE_DC -> S_PENDING [ input=I_
RELEASE_SUCCESS cause=C_FSA_INTERNAL origin=do_dc_release ]
Jan 17 12:51:48 ujbossterrena2 cib: [18792]: info: 
cib_process_readwrite: We are now in R/O mode
Jan 17 12:51:49 corosync [MAIN  ] Completed service synchronization, 
ready to provide service.
Jan 17 12:51:49 ujbossterrena2 crmd: [18796]: info: update_dc: Set DC to 
ujbossterrena1 (3.0.1)
Jan 17 12:51:49 ujbossterrena2 crmd: [18796]: info: update_dc: Unset DC 
ujbossterrena1
Jan 17 12:51:49 ujbossterrena2 crmd: [18796]: info: update_dc: Set DC to 
ujbossterrena1 (3.0.1)
Jan 17 12:51:49 ujbossterrena2 crmd: [18796]: info: update_dc: Unset DC 
ujbossterrena1
Jan 17 12:51:49 ujbossterrena2 crmd: [18796]: info: update_dc: Set DC to 
ujbossterrena1 (3.0.1)

Best regards,
Philippe




More information about the Pacemaker mailing list