[Pacemaker] Pacemaker failover problem

Erich Weiler weiler at soe.ucsc.edu
Tue Mar 9 15:04:45 UTC 2010


Thanks for the reply!  Yes, I have checked that my LSB scripts are 
compliant.  In case it provides any insight, here is a clip from 
/var/log/messages on genome-ldap2 when genome-ldap1 goes down (with the 
order constraint in place):

Mar  9 06:59:22 genome-ldap2 crmd: [2038]: info: 
handle_shutdown_request: Creating shutdown request for genome-ldap1 
(state=S_IDLE)
Mar  9 06:59:22 genome-ldap2 crmd: [2038]: info: abort_transition_graph: 
te_update_diff:146 - Triggered transition abort (complete=1, 
tag=transient_attributes, id=genome-ldap1, magic=NA, cib=0.117.3) : 
Transient attribute: update
Mar  9 06:59:22 genome-ldap2 crmd: [2038]: info: do_state_transition: 
State transition S_IDLE -> S_POLICY_ENGINE [ input=I_PE_CALC 
cause=C_FSA_INTERNAL origin=abort_transition_graph ]
Mar  9 06:59:22 genome-ldap2 crmd: [2038]: info: do_state_transition: 
All 2 cluster nodes are eligible to run resources.
Mar  9 06:59:22 genome-ldap2 crmd: [2038]: info: do_pe_invoke: Query 
126: Requesting the current CIB: S_POLICY_ENGINE
Mar  9 06:59:22 genome-ldap2 crmd: [2038]: info: do_pe_invoke_callback: 
Invoking the PE: query=126, ref=pe_calc-dc-1268146762-101, seq=304, 
quorate=1
Mar  9 06:59:22 genome-ldap2 pengine: [2037]: notice: unpack_config: On 
loss of CCM Quorum: Ignore
Mar  9 06:59:22 genome-ldap2 pengine: [2037]: info: unpack_config: Node 
scores: 'red' = -INFINITY, 'yellow' = 0, 'green' = 0
Mar  9 06:59:22 genome-ldap2 pengine: [2037]: info: 
determine_online_status: Node genome-ldap2 is online
Mar  9 06:59:22 genome-ldap2 pengine: [2037]: info: 
determine_online_status: Node genome-ldap1 is shutting down
Mar  9 06:59:22 genome-ldap2 pengine: [2037]: notice: native_print: 
LDAP-IP     (ocf::heartbeat:IPaddr2):       Started genome-ldap1
Mar  9 06:59:22 genome-ldap2 pengine: [2037]: notice: clone_print: 
Clone Set: LDAP-clone
Mar  9 06:59:22 genome-ldap2 pengine: [2037]: notice: short_print: 
Started: [ genome-ldap2 genome-ldap1 ]
Mar  9 06:59:22 genome-ldap2 pengine: [2037]: info: get_failcount: 
LDAP-clone has failed 1 times on genome-ldap2
Mar  9 06:59:22 genome-ldap2 pengine: [2037]: notice: 
common_apply_stickiness: LDAP-clone can fail 999999 more times on 
genome-ldap2 before being forced off
Mar  9 06:59:22 genome-ldap2 pengine: [2037]: WARN: native_color: 
Resource LDAP:1 cannot run anywhere
Mar  9 06:59:22 genome-ldap2 pengine: [2037]: notice: RecurringOp: 
Start recurring monitor (30s) for LDAP-IP on genome-ldap2
Mar  9 06:59:22 genome-ldap2 pengine: [2037]: info: stage6: Scheduling 
Node genome-ldap1 for shutdown
Mar  9 06:59:22 genome-ldap2 pengine: [2037]: notice: LogActions: Move 
resource LDAP-IP (Started genome-ldap1 -> genome-ldap2)
Mar  9 06:59:22 genome-ldap2 pengine: [2037]: notice: LogActions: 
Restart resource LDAP:0       (Started genome-ldap2)
Mar  9 06:59:22 genome-ldap2 pengine: [2037]: notice: LogActions: Stop 
resource LDAP:1  (genome-ldap1)
Mar  9 06:59:22 genome-ldap2 crmd: [2038]: info: do_state_transition: 
State transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE [ 
input=I_PE_SUCCESS cause=C_IPC_MESSAGE origin=handle_response ]
Mar  9 06:59:22 genome-ldap2 crmd: [2038]: info: unpack_graph: Unpacked 
transition 65: 12 actions in 12 synapses
Mar  9 06:59:22 genome-ldap2 crmd: [2038]: info: do_te_invoke: 
Processing graph 65 (ref=pe_calc-dc-1268146762-101) derived from 
/var/lib/pengine/pe-warn-98.bz2
Mar  9 06:59:22 genome-ldap2 crmd: [2038]: info: te_pseudo_action: 
Pseudo action 16 fired and confirmed
Mar  9 06:59:22 genome-ldap2 crmd: [2038]: info: te_rsc_command: 
Initiating action 13: stop LDAP:1_stop_0 on genome-ldap1
Mar  9 06:59:22 genome-ldap2 pengine: [2037]: WARN: process_pe_message: 
Transition 65: WARNINGs found during PE processing. PEngine Input stored 
in: /var/lib/pengine/pe-warn-98.bz2
Mar  9 06:59:22 genome-ldap2 pengine: [2037]: info: process_pe_message: 
Configuration WARNINGs found during PE processing.  Please run 
"crm_verify -L" to identify issues.
Mar  9 06:59:22 genome-ldap2 crmd: [2038]: info: match_graph_event: 
Action LDAP:1_stop_0 (13) confirmed on genome-ldap1 (rc=0)
Mar  9 06:59:22 genome-ldap2 crmd: [2038]: info: te_pseudo_action: 
Pseudo action 17 fired and confirmed
Mar  9 06:59:22 genome-ldap2 crmd: [2038]: info: te_rsc_command: 
Initiating action 8: stop LDAP-IP_stop_0 on genome-ldap1
Mar  9 06:59:22 genome-ldap2 crmd: [2038]: info: match_graph_event: 
Action LDAP-IP_stop_0 (8) confirmed on genome-ldap1 (rc=0)
Mar  9 06:59:22 genome-ldap2 crmd: [2038]: info: te_rsc_command: 
Initiating action 9: start LDAP-IP_start_0 on genome-ldap2 (local)
Mar  9 06:59:22 genome-ldap2 crmd: [2038]: info: do_lrm_rsc_op: 
Performing key=9:65:0:0cc80735-d478-48c0-8260-02b627bed719 
op=LDAP-IP_start_0 )
Mar  9 06:59:22 genome-ldap2 lrmd: [2035]: info: rsc:LDAP-IP:12: start
Mar  9 06:59:22 genome-ldap2 crmd: [2038]: info: te_pseudo_action: 
Pseudo action 4 fired and confirmed
Mar  9 06:59:22 genome-ldap2 crmd: [2038]: info: te_crm_command: 
Executing crm-event (20): do_shutdown on genome-ldap1
Mar  9 06:59:22 genome-ldap2 IPaddr2[4277]: INFO: ip -f inet addr add 
10.1.1.83/16 brd 10.1.255.255 dev eth0
Mar  9 06:59:22 genome-ldap2 lrmd: [2035]: info: RA output: 
(LDAP-IP:start:stderr) 2010/03/09_06:59:22 INFO: ip -f inet addr add 
10.1.1.83/16 brd 10.1.255.255 dev eth0
Mar  9 06:59:22 genome-ldap2 IPaddr2[4277]: INFO: ip link set eth0 up
Mar  9 06:59:22 genome-ldap2 lrmd: [2035]: info: RA output: 
(LDAP-IP:start:stderr) 2010/03/09_06:59:22 INFO: ip link set eth0 up
Mar  9 06:59:22 genome-ldap2 IPaddr2[4277]: INFO: 
/usr/lib64/heartbeat/send_arp -i 200 -r 5 -p 
/var/run/heartbeat/rsctmp/send_arp/send_arp-10.1.1.83 eth0 10.1.1.83 
auto not_used not_used
Mar  9 06:59:22 genome-ldap2 lrmd: [2035]: info: RA output: 
(LDAP-IP:start:stderr) 2010/03/09_06:59:22 INFO: 
/usr/lib64/heartbeat/send_arp -i 200 -r 5 -p 
/var/run/heartbeat/rsctmp/send_arp/send_arp-10.1.1.83 eth0 10.1.1.83 
auto not_used not_used
Mar  9 06:59:22 genome-ldap2 crmd: [2038]: info: process_lrm_event: LRM 
operation LDAP-IP_start_0 (call=12, rc=0, cib-update=127, confirmed=true) ok
Mar  9 06:59:22 genome-ldap2 crmd: [2038]: info: match_graph_event: 
Action LDAP-IP_start_0 (9) confirmed on genome-ldap2 (rc=0)
Mar  9 06:59:22 genome-ldap2 crmd: [2038]: info: te_rsc_command: 
Initiating action 10: monitor LDAP-IP_monitor_30000 on genome-ldap2 (local)
Mar  9 06:59:22 genome-ldap2 crmd: [2038]: info: do_lrm_rsc_op: 
Performing key=10:65:0:0cc80735-d478-48c0-8260-02b627bed719 
op=LDAP-IP_monitor_30000 )
Mar  9 06:59:22 genome-ldap2 lrmd: [2035]: info: rsc:LDAP-IP:13: monitor
Mar  9 06:59:22 genome-ldap2 crmd: [2038]: info: te_pseudo_action: 
Pseudo action 14 fired and confirmed
Mar  9 06:59:22 genome-ldap2 crmd: [2038]: info: te_rsc_command: 
Initiating action 12: start LDAP:0_start_0 on genome-ldap2 (local)
Mar  9 06:59:22 genome-ldap2 crmd: [2038]: info: do_lrm_rsc_op: 
Performing key=12:65:0:0cc80735-d478-48c0-8260-02b627bed719 
op=LDAP:0_start_0 )
Mar  9 06:59:22 genome-ldap2 lrmd: [2035]: info: rsc:LDAP:0:14: start
Mar  9 06:59:22 genome-ldap2 lrmd: [4333]: WARN: For LSB init script, no 
additional parameters are needed.
Mar  9 06:59:22 genome-ldap2 crmd: [2038]: info: process_lrm_event: LRM 
operation LDAP-IP_monitor_30000 (call=13, rc=0, cib-update=128, 
confirmed=false) ok
Mar  9 06:59:22 genome-ldap2 crmd: [2038]: info: match_graph_event: 
Action LDAP-IP_monitor_30000 (10) confirmed on genome-ldap2 (rc=0)
Mar  9 06:59:23 genome-ldap2 lrmd: [2035]: info: RA output: 
(LDAP:0:start:stdout) Checking configuration files for slapd:
Mar  9 06:59:23 genome-ldap2 lrmd: [2035]: info: RA output: 
(LDAP:0:start:stderr) config file testing succeeded
Mar  9 06:59:23 genome-ldap2 lrmd: [2035]: info: RA output: 
(LDAP:0:start:stdout) [
Mar  9 06:59:23 genome-ldap2 lrmd: [2035]: info: RA output: 
(LDAP:0:start:stdout)   OK  ]
Mar  9 06:59:23 genome-ldap2 lrmd: [2035]: info: RA output: 
(LDAP:0:start:stdout)
Mar  9 06:59:23 genome-ldap2 lrmd: [2035]: info: RA output: 
(LDAP:0:start:stdout)
Mar  9 06:59:23 genome-ldap2 lrmd: [2035]: info: RA output: 
(LDAP:0:start:stdout) Starting slapd:
Mar  9 06:59:23 genome-ldap2 lrmd: [2035]: info: RA output: 
(LDAP:0:start:stdout) [
Mar  9 06:59:23 genome-ldap2 lrmd: [2035]: info: RA output: 
(LDAP:0:start:stdout) FAILED
Mar  9 06:59:23 genome-ldap2 lrmd: [2035]: info: RA output: 
(LDAP:0:start:stdout) ]
Mar  9 06:59:23 genome-ldap2 lrmd: [2035]: info: RA output: 
(LDAP:0:start:stdout)
Mar  9 06:59:23 genome-ldap2 lrmd: [2035]: info: RA output: 
(LDAP:0:start:stdout)
Mar  9 06:59:23 genome-ldap2 crmd: [2038]: info: process_lrm_event: LRM 
operation LDAP:0_start_0 (call=14, rc=1, cib-update=129, confirmed=true) 
unknown error
Mar  9 06:59:23 genome-ldap2 crmd: [2038]: WARN: status_from_rc: Action 
12 (LDAP:0_start_0) on genome-ldap2 failed (target: 0 vs. rc: 1): Error
Mar  9 06:59:23 genome-ldap2 crmd: [2038]: WARN: update_failcount: 
Updating failcount for LDAP:0 on genome-ldap2 after failed start: rc=1 
(update=INFINITY, time=1268146763)
Mar  9 06:59:23 genome-ldap2 crmd: [2038]: info: abort_transition_graph: 
match_graph_event:272 - Triggered transition abort (complete=0, 
tag=lrm_rsc_op, id=LDAP:0_start_0, 
magic=0:1;12:65:0:0cc80735-d478-48c0-8260-02b627bed719, cib=0.117.8) : 
Event failed
Mar  9 06:59:23 genome-ldap2 crmd: [2038]: info: update_abort_priority: 
Abort priority upgraded from 0 to 1
Mar  9 06:59:23 genome-ldap2 crmd: [2038]: info: update_abort_priority: 
Abort action done superceeded by restart
Mar  9 06:59:23 genome-ldap2 crmd: [2038]: info: match_graph_event: 
Action LDAP:0_start_0 (12) confirmed on genome-ldap2 (rc=4)
Mar  9 06:59:23 genome-ldap2 crmd: [2038]: info: te_pseudo_action: 
Pseudo action 15 fired and confirmed
Mar  9 06:59:23 genome-ldap2 crmd: [2038]: info: run_graph: 
====================================================
Mar  9 06:59:23 genome-ldap2 crmd: [2038]: notice: run_graph: Transition 
65 (Complete=11, Pending=0, Fired=0, Skipped=1, Incomplete=0, 
Source=/var/lib/pengine/pe-warn-98.bz2): Stopped
Mar  9 06:59:23 genome-ldap2 crmd: [2038]: info: te_graph_trigger: 
Transition 65 is now complete
Mar  9 06:59:23 genome-ldap2 crmd: [2038]: info: do_state_transition: 
State transition S_TRANSITION_ENGINE -> S_POLICY_ENGINE [ 
input=I_PE_CALC cause=C_FSA_INTERNAL origin=notify_crmd ]
Mar  9 06:59:23 genome-ldap2 crmd: [2038]: info: do_state_transition: 
All 2 cluster nodes are eligible to run resources.
Mar  9 06:59:23 genome-ldap2 crmd: [2038]: info: do_pe_invoke: Query 
130: Requesting the current CIB: S_POLICY_ENGINE
Mar  9 06:59:23 genome-ldap2 attrd: [2036]: info: attrd_trigger_update: 
Sending flush op to all hosts for: fail-count-LDAP:0 (INFINITY)
Mar  9 06:59:23 genome-ldap2 cib: [2034]: info: 
cib_process_shutdown_req: Shutdown REQ from genome-ldap1
Mar  9 06:59:23 genome-ldap2 cib: [2034]: info: cib_process_request: 
Operation complete: op cib_shutdown_req for section 'all' 
(origin=genome-ldap1/genome-ldap1/(null), version=0.117.8): ok (rc=0)
Mar  9 06:59:23 genome-ldap2 attrd: [2036]: info: attrd_perform_update: 
Sent update 31: fail-count-LDAP:0=INFINITY
Mar  9 06:59:23 genome-ldap2 attrd: [2036]: info: attrd_trigger_update: 
Sending flush op to all hosts for: last-failure-LDAP:0 (1268146763)
Mar  9 06:59:23 genome-ldap2 attrd: [2036]: info: attrd_perform_update: 
Sent update 33: last-failure-LDAP:0=1268146763
Mar  9 06:59:23 genome-ldap2 crmd: [2038]: info: do_pe_invoke_callback: 
Invoking the PE: query=130, ref=pe_calc-dc-1268146763-108, seq=304, 
quorate=1
Mar  9 06:59:23 genome-ldap2 pengine: [2037]: notice: unpack_config: On 
loss of CCM Quorum: Ignore
Mar  9 06:59:23 genome-ldap2 pengine: [2037]: info: unpack_config: Node 
scores: 'red' = -INFINITY, 'yellow' = 0, 'green' = 0
Mar  9 06:59:23 genome-ldap2 crmd: [2038]: info: abort_transition_graph: 
te_update_diff:146 - Triggered transition abort (complete=1, 
tag=transient_attributes, id=genome-ldap2, magic=NA, cib=0.117.9) : 
Transient attribute: update
Mar  9 06:59:23 genome-ldap2 pengine: [2037]: info: 
determine_online_status: Node genome-ldap2 is online
Mar  9 06:59:23 genome-ldap2 crmd: [2038]: info: abort_transition_graph: 
te_update_diff:146 - Triggered transition abort (complete=1, 
tag=transient_attributes, id=genome-ldap2, magic=NA, cib=0.117.10) : 
Transient attribute: update
Mar  9 06:59:23 genome-ldap2 pengine: [2037]: WARN: unpack_rsc_op: 
Processing failed op LDAP:0_start_0 on genome-ldap2: unknown error (1)
Mar  9 06:59:23 genome-ldap2 crmd: [2038]: info: do_pe_invoke: Query 
131: Requesting the current CIB: S_POLICY_ENGINE
Mar  9 06:59:23 genome-ldap2 pengine: [2037]: info: 
determine_online_status: Node genome-ldap1 is shutting down
Mar  9 06:59:23 genome-ldap2 crmd: [2038]: info: do_pe_invoke: Query 
132: Requesting the current CIB: S_POLICY_ENGINE
Mar  9 06:59:23 genome-ldap2 pengine: [2037]: notice: native_print: 
LDAP-IP     (ocf::heartbeat:IPaddr2):       Started genome-ldap2
Mar  9 06:59:23 genome-ldap2 pengine: [2037]: notice: clone_print: 
Clone Set: LDAP-clone
Mar  9 06:59:23 genome-ldap2 pengine: [2037]: notice: native_print: 
  LDAP:0 (lsb:ldap):     Started genome-ldap2 FAILED
Mar  9 06:59:23 genome-ldap2 pengine: [2037]: notice: short_print: 
Stopped: [ LDAP:1 ]
Mar  9 06:59:23 genome-ldap2 pengine: [2037]: info: get_failcount: 
LDAP-clone has failed 1 times on genome-ldap2
Mar  9 06:59:23 genome-ldap2 pengine: [2037]: notice: 
common_apply_stickiness: LDAP-clone can fail 999999 more times on 
genome-ldap2 before being forced off
Mar  9 06:59:23 genome-ldap2 pengine: [2037]: WARN: native_color: 
Resource LDAP:1 cannot run anywhere
Mar  9 06:59:23 genome-ldap2 pengine: [2037]: notice: RecurringOp: 
Start recurring monitor (10s) for LDAP:0 on genome-ldap2
Mar  9 06:59:23 genome-ldap2 pengine: [2037]: info: stage6: Scheduling 
Node genome-ldap1 for shutdown
Mar  9 06:59:23 genome-ldap2 pengine: [2037]: notice: LogActions: Leave 
resource LDAP-IP        (Started genome-ldap2)
Mar  9 06:59:23 genome-ldap2 pengine: [2037]: notice: LogActions: 
Recover resource LDAP:0       (Started genome-ldap2)
Mar  9 06:59:23 genome-ldap2 pengine: [2037]: notice: LogActions: Leave 
resource LDAP:1 (Stopped)
Mar  9 06:59:23 genome-ldap2 crmd: [2038]: info: do_pe_invoke_callback: 
Invoking the PE: query=132, ref=pe_calc-dc-1268146763-109, seq=304, 
quorate=1
Mar  9 06:59:23 genome-ldap2 crmd: [2038]: info: handle_response: 
pe_calc calculation pe_calc-dc-1268146763-108 is obsolete
Mar  9 06:59:23 genome-ldap2 pengine: [2037]: WARN: process_pe_message: 
Transition 66: WARNINGs found during PE processing. PEngine Input stored 
in: /var/lib/pengine/pe-warn-99.bz2
Mar  9 06:59:23 genome-ldap2 pengine: [2037]: info: process_pe_message: 
Configuration WARNINGs found during PE processing.  Please run 
"crm_verify -L" to identify issues.
Mar  9 06:59:23 genome-ldap2 pengine: [2037]: notice: unpack_config: On 
loss of CCM Quorum: Ignore
Mar  9 06:59:23 genome-ldap2 pengine: [2037]: info: unpack_config: Node 
scores: 'red' = -INFINITY, 'yellow' = 0, 'green' = 0
Mar  9 06:59:23 genome-ldap2 pengine: [2037]: info: 
determine_online_status: Node genome-ldap2 is online
Mar  9 06:59:23 genome-ldap2 pengine: [2037]: WARN: unpack_rsc_op: 
Processing failed op LDAP:0_start_0 on genome-ldap2: unknown error (1)
Mar  9 06:59:23 genome-ldap2 pengine: [2037]: info: 
determine_online_status: Node genome-ldap1 is shutting down
Mar  9 06:59:23 genome-ldap2 pengine: [2037]: notice: native_print: 
LDAP-IP     (ocf::heartbeat:IPaddr2):       Started genome-ldap2
Mar  9 06:59:23 genome-ldap2 pengine: [2037]: notice: clone_print: 
Clone Set: LDAP-clone
Mar  9 06:59:23 genome-ldap2 pengine: [2037]: notice: native_print: 
  LDAP:0 (lsb:ldap):     Started genome-ldap2 FAILED
Mar  9 06:59:23 genome-ldap2 pengine: [2037]: notice: short_print: 
Stopped: [ LDAP:1 ]
Mar  9 06:59:23 genome-ldap2 pengine: [2037]: info: get_failcount: 
LDAP-clone has failed 1000000 times on genome-ldap2
Mar  9 06:59:23 genome-ldap2 pengine: [2037]: WARN: 
common_apply_stickiness: Forcing LDAP-clone away from genome-ldap2 after 
1000000 failures (max=1000000)
Mar  9 06:59:23 genome-ldap2 pengine: [2037]: info: 
native_merge_weights: LDAP-clone: Rolling back scores from LDAP-IP
Mar  9 06:59:23 genome-ldap2 pengine: [2037]: WARN: native_color: 
Resource LDAP:1 cannot run anywhere
Mar  9 06:59:23 genome-ldap2 pengine: [2037]: WARN: native_color: 
Resource LDAP:0 cannot run anywhere
Mar  9 06:59:23 genome-ldap2 pengine: [2037]: WARN: native_color: 
Resource LDAP-IP cannot run anywhere
Mar  9 06:59:23 genome-ldap2 pengine: [2037]: info: stage6: Scheduling 
Node genome-ldap1 for shutdown
Mar  9 06:59:23 genome-ldap2 pengine: [2037]: notice: LogActions: Stop 
resource LDAP-IP (genome-ldap2)
Mar  9 06:59:23 genome-ldap2 pengine: [2037]: notice: LogActions: Stop 
resource LDAP:0  (genome-ldap2)
Mar  9 06:59:23 genome-ldap2 pengine: [2037]: notice: LogActions: Leave 
resource LDAP:1 (Stopped)
Mar  9 06:59:23 genome-ldap2 crmd: [2038]: info: do_state_transition: 
State transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE [ 
input=I_PE_SUCCESS cause=C_IPC_MESSAGE origin=handle_response ]
Mar  9 06:59:23 genome-ldap2 crmd: [2038]: info: unpack_graph: Unpacked 
transition 67: 6 actions in 6 synapses
Mar  9 06:59:23 genome-ldap2 crmd: [2038]: info: do_te_invoke: 
Processing graph 67 (ref=pe_calc-dc-1268146763-109) derived from 
/var/lib/pengine/pe-warn-100.bz2
Mar  9 06:59:23 genome-ldap2 crmd: [2038]: info: te_pseudo_action: 
Pseudo action 10 fired and confirmed
Mar  9 06:59:23 genome-ldap2 crmd: [2038]: info: te_crm_command: 
Executing crm-event (14): do_shutdown on genome-ldap1
Mar  9 06:59:23 genome-ldap2 crmd: [2038]: info: te_rsc_command: 
Initiating action 2: stop LDAP:0_stop_0 on genome-ldap2 (local)
Mar  9 06:59:23 genome-ldap2 crmd: [2038]: info: do_lrm_rsc_op: 
Performing key=2:67:0:0cc80735-d478-48c0-8260-02b627bed719 
op=LDAP:0_stop_0 )
Mar  9 06:59:23 genome-ldap2 lrmd: [2035]: info: rsc:LDAP:0:15: stop
Mar  9 06:59:23 genome-ldap2 crmd: [2038]: info: process_lrm_event: LRM 
operation LDAP:0_monitor_10000 (call=11, status=1, cib-update=0, 
confirmed=true) Cancelled
Mar  9 06:59:23 genome-ldap2 corosync[1989]:   [CLM   ] CLM 
CONFIGURATION CHANGE
Mar  9 06:59:23 genome-ldap2 corosync[1989]:   [CLM   ] New Configuration:
Mar  9 06:59:23 genome-ldap2 cib: [2034]: notice: ais_dispatch: 
Membership 308: quorum lost
Mar  9 06:59:23 genome-ldap2 crmd: [2038]: notice: ais_dispatch: 
Membership 308: quorum lost
Mar  9 06:59:23 genome-ldap2 lrmd: [4390]: WARN: For LSB init script, no 
additional parameters are needed.
Mar  9 06:59:23 genome-ldap2 corosync[1989]:   [CLM   ]         r(0) 
ip(10.1.1.85)
Mar  9 06:59:23 genome-ldap2 pengine: [2037]: WARN: process_pe_message: 
Transition 67: WARNINGs found during PE processing. PEngine Input stored 
in: /var/lib/pengine/pe-warn-100.bz2
Mar  9 06:59:23 genome-ldap2 cib: [2034]: info: crm_update_peer: Node 
genome-ldap1: id=1409351946 state=lost (new) addr=r(0) ip(10.1.1.84) 
votes=1 born=304 seen=304 proc=00000000000000000000000000013312
Mar  9 06:59:23 genome-ldap2 crmd: [2038]: info: ais_status_callback: 
status: genome-ldap1 is now lost (was member)
Mar  9 06:59:23 genome-ldap2 corosync[1989]:   [CLM   ] Members Left:
Mar  9 06:59:23 genome-ldap2 pengine: [2037]: info: process_pe_message: 
Configuration WARNINGs found during PE processing.  Please run 
"crm_verify -L" to identify issues.
Mar  9 06:59:23 genome-ldap2 crmd: [2038]: info: crm_update_peer: Node 
genome-ldap1: id=1409351946 state=lost (new) addr=r(0) ip(10.1.1.84) 
votes=1 born=304 seen=304 proc=00000000000000000000000000013312
Mar  9 06:59:23 genome-ldap2 corosync[1989]:   [CLM   ]         r(0) 
ip(10.1.1.84)
Mar  9 06:59:23 genome-ldap2 crmd: [2038]: info: erase_node_from_join: 
Removed node genome-ldap1 from join calculations: welcomed=0 itegrated=0 
finalized=0 confirmed=1
Mar  9 06:59:23 genome-ldap2 corosync[1989]:   [CLM   ] Members Joined:
Mar  9 06:59:23 genome-ldap2 crmd: [2038]: info: crm_update_quorum: 
Updating quorum status to false (call=135)
Mar  9 06:59:23 genome-ldap2 lrmd: [2035]: info: RA output: 
(LDAP:0:stop:stdout) Stopping slapd:
Mar  9 06:59:23 genome-ldap2 cib: [2034]: info: cib_process_request: 
Operation complete: op cib_modify for section nodes 
(origin=local/crmd/133, version=0.117.10): ok (rc=0)
Mar  9 06:59:23 genome-ldap2 corosync[1989]:   [pcmk  ] notice: 
pcmk_peer_update: Transitional membership event on ring 308: memb=1, 
new=0, lost=1
Mar  9 06:59:23 genome-ldap2 corosync[1989]:   [pcmk  ] info: 
pcmk_peer_update: memb: genome-ldap2 1426129162
Mar  9 06:59:23 genome-ldap2 corosync[1989]:   [pcmk  ] info: 
pcmk_peer_update: lost: genome-ldap1 1409351946
Mar  9 06:59:23 genome-ldap2 corosync[1989]:   [CLM   ] CLM 
CONFIGURATION CHANGE
Mar  9 06:59:23 genome-ldap2 corosync[1989]:   [CLM   ] New Configuration:
Mar  9 06:59:23 genome-ldap2 corosync[1989]:   [CLM   ]         r(0) 
ip(10.1.1.85)
Mar  9 06:59:23 genome-ldap2 corosync[1989]:   [CLM   ] Members Left:
Mar  9 06:59:23 genome-ldap2 corosync[1989]:   [CLM   ] Members Joined:
Mar  9 06:59:23 genome-ldap2 corosync[1989]:   [pcmk  ] notice: 
pcmk_peer_update: Stable membership event on ring 308: memb=1, new=0, lost=0
Mar  9 06:59:23 genome-ldap2 corosync[1989]:   [pcmk  ] info: 
pcmk_peer_update: MEMB: genome-ldap2 1426129162
Mar  9 06:59:23 genome-ldap2 corosync[1989]:   [pcmk  ] info: 
ais_mark_unseen_peer_dead: Node genome-ldap1 was not seen in the 
previous transition
Mar  9 06:59:23 genome-ldap2 corosync[1989]:   [pcmk  ] info: 
update_member: Node 1409351946/genome-ldap1 is now: lost
Mar  9 06:59:23 genome-ldap2 corosync[1989]:   [pcmk  ] info: 
send_member_notification: Sending membership update 308 to 2 children
Mar  9 06:59:23 genome-ldap2 corosync[1989]:   [TOTEM ] A processor 
joined or left the membership and a new membership was formed.
Mar  9 06:59:23 genome-ldap2 corosync[1989]:   [MAIN  ] Completed 
service synchronization, ready to provide service.
Mar  9 06:59:23 genome-ldap2 cib: [2034]: info: log_data_element: 
cib:diff: - <cib have-quorum="1" admin_epoch="0" epoch="117" 
num_updates="11" />
Mar  9 06:59:23 genome-ldap2 cib: [2034]: info: log_data_element: 
cib:diff: + <cib have-quorum="0" admin_epoch="0" epoch="118" 
num_updates="1" />
Mar  9 06:59:23 genome-ldap2 cib: [2034]: info: cib_process_request: 
Operation complete: op cib_modify for section cib 
(origin=local/crmd/135, version=0.118.1): ok (rc=0)
Mar  9 06:59:23 genome-ldap2 crmd: [2038]: info: abort_transition_graph: 
need_abort:59 - Triggered transition abort (complete=0) : Non-status change
Mar  9 06:59:23 genome-ldap2 crmd: [2038]: info: update_abort_priority: 
Abort priority upgraded from 0 to 1000000
Mar  9 06:59:23 genome-ldap2 crmd: [2038]: info: update_abort_priority: 
Abort action done superceeded by restart
Mar  9 06:59:23 genome-ldap2 crmd: [2038]: info: need_abort: Aborting on 
change to have-quorum
Mar  9 06:59:23 genome-ldap2 cib: [4397]: info: write_cib_contents: 
Archived previous version as /var/lib/heartbeat/crm/cib-42.raw
Mar  9 06:59:23 genome-ldap2 cib: [4397]: info: write_cib_contents: 
Wrote version 0.118.0 of the CIB to disk (digest: 
5ab39d8c6134011247378d7b7a8e8cb9)
Mar  9 06:59:23 genome-ldap2 cib: [4397]: info: retrieveCib: Reading 
cluster configuration from: /var/lib/heartbeat/crm/cib.3Fcmhq (digest: 
/var/lib/heartbeat/crm/cib.sbJeBo)
Mar  9 06:59:23 genome-ldap2 cib: [2034]: info: cib_process_request: 
Operation complete: op cib_modify for section crm_config 
(origin=local/crmd/137, version=0.118.1): ok (rc=0)
Mar  9 06:59:23 genome-ldap2 lrmd: [2035]: info: RA output: 
(LDAP:0:stop:stdout) [
Mar  9 06:59:23 genome-ldap2 lrmd: [2035]: info: RA output: 
(LDAP:0:stop:stdout)   OK
Mar  9 06:59:23 genome-ldap2 lrmd: [2035]: info: RA output: 
(LDAP:0:stop:stdout) ]
Mar  9 06:59:23 genome-ldap2 lrmd: [2035]: info: RA output: 
(LDAP:0:stop:stdout)
Mar  9 06:59:23 genome-ldap2 lrmd: [2035]: info: RA output: 
(LDAP:0:stop:stdout)
Mar  9 06:59:23 genome-ldap2 crmd: [2038]: info: process_lrm_event: LRM 
operation LDAP:0_stop_0 (call=15, rc=0, cib-update=138, confirmed=true) ok
Mar  9 06:59:23 genome-ldap2 crmd: [2038]: info: match_graph_event: 
Action LDAP:0_stop_0 (2) confirmed on genome-ldap2 (rc=0)
Mar  9 06:59:23 genome-ldap2 crmd: [2038]: info: te_pseudo_action: 
Pseudo action 11 fired and confirmed
Mar  9 06:59:23 genome-ldap2 crmd: [2038]: info: run_graph: 
====================================================
Mar  9 06:59:23 genome-ldap2 crmd: [2038]: notice: run_graph: Transition 
67 (Complete=4, Pending=0, Fired=0, Skipped=2, Incomplete=0, 
Source=/var/lib/pengine/pe-warn-100.bz2): Stopped
Mar  9 06:59:23 genome-ldap2 crmd: [2038]: info: te_graph_trigger: 
Transition 67 is now complete
Mar  9 06:59:23 genome-ldap2 crmd: [2038]: info: do_state_transition: 
State transition S_TRANSITION_ENGINE -> S_POLICY_ENGINE [ 
input=I_PE_CALC cause=C_FSA_INTERNAL origin=notify_crmd ]
Mar  9 06:59:23 genome-ldap2 crmd: [2038]: info: do_state_transition: 
All 1 cluster nodes are eligible to run resources.
Mar  9 06:59:23 genome-ldap2 crmd: [2038]: info: do_pe_invoke: Query 
139: Requesting the current CIB: S_POLICY_ENGINE
Mar  9 06:59:23 genome-ldap2 crmd: [2038]: info: do_pe_invoke_callback: 
Invoking the PE: query=139, ref=pe_calc-dc-1268146763-113, seq=308, 
quorate=0
Mar  9 06:59:23 genome-ldap2 pengine: [2037]: notice: unpack_config: On 
loss of CCM Quorum: Ignore
Mar  9 06:59:23 genome-ldap2 pengine: [2037]: info: unpack_config: Node 
scores: 'red' = -INFINITY, 'yellow' = 0, 'green' = 0
Mar  9 06:59:23 genome-ldap2 pengine: [2037]: info: 
determine_online_status: Node genome-ldap2 is online
Mar  9 06:59:23 genome-ldap2 pengine: [2037]: WARN: unpack_rsc_op: 
Processing failed op LDAP:0_start_0 on genome-ldap2: unknown error (1)
Mar  9 06:59:23 genome-ldap2 pengine: [2037]: notice: native_print: 
LDAP-IP     (ocf::heartbeat:IPaddr2):       Started genome-ldap2
Mar  9 06:59:23 genome-ldap2 pengine: [2037]: notice: clone_print: 
Clone Set: LDAP-clone
Mar  9 06:59:23 genome-ldap2 pengine: [2037]: notice: short_print: 
Stopped: [ LDAP:0 LDAP:1 ]
Mar  9 06:59:23 genome-ldap2 pengine: [2037]: info: get_failcount: 
LDAP-clone has failed 1000000 times on genome-ldap2
Mar  9 06:59:23 genome-ldap2 pengine: [2037]: WARN: 
common_apply_stickiness: Forcing LDAP-clone away from genome-ldap2 after 
1000000 failures (max=1000000)
Mar  9 06:59:23 genome-ldap2 pengine: [2037]: info: 
native_merge_weights: LDAP-clone: Rolling back scores from LDAP-IP
Mar  9 06:59:23 genome-ldap2 pengine: [2037]: WARN: native_color: 
Resource LDAP:0 cannot run anywhere
Mar  9 06:59:23 genome-ldap2 pengine: [2037]: WARN: native_color: 
Resource LDAP:1 cannot run anywhere
Mar  9 06:59:23 genome-ldap2 pengine: [2037]: WARN: native_color: 
Resource LDAP-IP cannot run anywhere
Mar  9 06:59:23 genome-ldap2 pengine: [2037]: notice: LogActions: Stop 
resource LDAP-IP (genome-ldap2)
Mar  9 06:59:23 genome-ldap2 pengine: [2037]: notice: LogActions: Leave 
resource LDAP:0 (Stopped)
Mar  9 06:59:23 genome-ldap2 pengine: [2037]: notice: LogActions: Leave 
resource LDAP:1 (Stopped)
Mar  9 06:59:23 genome-ldap2 crmd: [2038]: info: do_state_transition: 
State transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE [ 
input=I_PE_SUCCESS cause=C_IPC_MESSAGE origin=handle_response ]
Mar  9 06:59:23 genome-ldap2 crmd: [2038]: info: unpack_graph: Unpacked 
transition 68: 2 actions in 2 synapses
Mar  9 06:59:23 genome-ldap2 crmd: [2038]: info: do_te_invoke: 
Processing graph 68 (ref=pe_calc-dc-1268146763-113) derived from 
/var/lib/pengine/pe-warn-101.bz2
Mar  9 06:59:23 genome-ldap2 crmd: [2038]: info: te_rsc_command: 
Initiating action 5: stop LDAP-IP_stop_0 on genome-ldap2 (local)
Mar  9 06:59:23 genome-ldap2 crmd: [2038]: info: do_lrm_rsc_op: 
Performing key=5:68:0:0cc80735-d478-48c0-8260-02b627bed719 
op=LDAP-IP_stop_0 )
Mar  9 06:59:23 genome-ldap2 lrmd: [2035]: info: rsc:LDAP-IP:16: stop
Mar  9 06:59:23 genome-ldap2 crmd: [2038]: info: process_lrm_event: LRM 
operation LDAP-IP_monitor_30000 (call=13, status=1, cib-update=0, 
confirmed=true) Cancelled
Mar  9 06:59:23 genome-ldap2 pengine: [2037]: WARN: process_pe_message: 
Transition 68: WARNINGs found during PE processing. PEngine Input stored 
in: /var/lib/pengine/pe-warn-101.bz2
Mar  9 06:59:23 genome-ldap2 pengine: [2037]: info: process_pe_message: 
Configuration WARNINGs found during PE processing.  Please run 
"crm_verify -L" to identify issues.
Mar  9 06:59:23 genome-ldap2 IPaddr2[4401]: INFO: ip -f inet addr delete 
10.1.1.83/16 dev eth0
Mar  9 06:59:23 genome-ldap2 lrmd: [2035]: info: RA output: 
(LDAP-IP:stop:stderr) 2010/03/09_06:59:23 INFO: ip -f inet addr delete 
10.1.1.83/16 dev eth0
Mar  9 06:59:23 genome-ldap2 IPaddr2[4401]: INFO: ip -o -f inet addr 
show eth0
Mar  9 06:59:23 genome-ldap2 lrmd: [2035]: info: RA output: 
(LDAP-IP:stop:stderr) 2010/03/09_06:59:23 INFO: ip -o -f inet addr show 
eth0
Mar  9 06:59:23 genome-ldap2 crmd: [2038]: info: process_lrm_event: LRM 
operation LDAP-IP_stop_0 (call=16, rc=0, cib-update=140, confirmed=true) ok
Mar  9 06:59:23 genome-ldap2 crmd: [2038]: info: match_graph_event: 
Action LDAP-IP_stop_0 (5) confirmed on genome-ldap2 (rc=0)
Mar  9 06:59:23 genome-ldap2 crmd: [2038]: info: te_pseudo_action: 
Pseudo action 2 fired and confirmed
Mar  9 06:59:23 genome-ldap2 crmd: [2038]: info: run_graph: 
====================================================
Mar  9 06:59:23 genome-ldap2 crmd: [2038]: notice: run_graph: Transition 
68 (Complete=2, Pending=0, Fired=0, Skipped=0, Incomplete=0, 
Source=/var/lib/pengine/pe-warn-101.bz2): Complete
Mar  9 06:59:23 genome-ldap2 crmd: [2038]: info: te_graph_trigger: 
Transition 68 is now complete
Mar  9 06:59:23 genome-ldap2 crmd: [2038]: info: notify_crmd: Transition 
68 status: done - <null>
Mar  9 06:59:23 genome-ldap2 crmd: [2038]: info: do_state_transition: 
State transition S_TRANSITION_ENGINE -> S_IDLE [ input=I_TE_SUCCESS 
cause=C_FSA_INTERNAL origin=notify_crmd ]
Mar  9 06:59:23 genome-ldap2 crmd: [2038]: info: do_state_transition: 
Starting PEngine Recheck Timer
Mar  9 06:59:26 genome-ldap2 lrmd: [2035]: info: RA output: 
(LDAP-IP:start:stderr) ARPING 10.1.1.83 from 10.1.1.83 eth0 Sent 5 
probes (5 broadcast(s)) Received 0 response(s)

 From the log it looks like LDAP:1 is stopped on genome-ldap1, the IP moves 
over, and then the cluster tries to start LDAP:0 on genome-ldap2 even though 
slapd is already running there, so the start fails (rc=1) and the fail-count 
jumps to INFINITY.  I also thought the expected behavior was to stop LDAP 
first, move the IP, then restart LDAP, and I'd rather have it behave that 
way.  Any clue as to what might be causing the problem?
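
For what it's worth, a quick way to re-check the init script's exit codes by 
hand (assuming lsb:ldap maps to /etc/init.d/ldap, which is how Pacemaker 
resolves lsb: resources) would be something like the following; per the LSB 
spec a second "start" on an already-running service must still exit 0:

# on genome-ldap2, where slapd is already running
/etc/init.d/ldap start  ; echo $?    # already running: should print 0
/etc/init.d/ldap status ; echo $?    # running: should print 0
/etc/init.d/ldap stop   ; echo $?    # should print 0
/etc/init.d/ldap status ; echo $?    # stopped: should print 3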

Here again is my config:

node genome-ldap1
node genome-ldap2
primitive LDAP lsb:ldap \
         op monitor interval="10s" timeout="15s" \
         meta target-role="Started"
primitive LDAP-IP ocf:heartbeat:IPaddr2 \
         params ip="10.1.1.83" nic="eth0" cidr_netmask="16" \
         op monitor interval="30s" timeout="20s" \
         meta target-role="Started"
clone LDAP-clone LDAP \
         meta clone-max="2" clone-node-max="1" globally-unique="false"
location LDAP-IP-placement-1 LDAP-IP 100: genome-ldap1
location LDAP-IP-placement-2 LDAP-IP 50: genome-ldap2
location LDAP-placement-1 LDAP-clone 100: genome-ldap1
location LDAP-placement-2 LDAP-clone 100: genome-ldap2
colocation LDAP-with-IP inf: LDAP-IP LDAP-clone
order LDAP-after-IP inf: LDAP-IP LDAP-clone
property $id="cib-bootstrap-options" \
         dc-version="1.0.7-d3fa20fc76c7947d6de66db7e52526dc6bd7d782" \
         cluster-infrastructure="openais" \
         expected-quorum-votes="2" \
         stonith-enabled="false" \
         symmetric-cluster="false" \
         no-quorum-policy="ignore" \
         last-lrm-refresh="1268087338"
[root@genome-ldap2 ~]#
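
Side note: once the LDAP:0 start fails, its fail-count goes to INFINITY and 
the clone is forced off genome-ldap2 (visible in the log above), so between 
tests the failure record has to be cleared before the resources will run 
there again.  The crm shell's cleanup should reset it, something along these 
lines:

# clear the recorded failure so LDAP-clone is allowed back on genome-ldap2
crm resource cleanup LDAP-clone
# a node can also be given: crm resource cleanup LDAP-clone genome-ldap2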

Thanks!

-erich

Andrew Beekhof wrote:
> On Tue, Mar 9, 2010 at 12:27 AM, Erich Weiler <weiler at soe.ucsc.edu> wrote:
>> I think I may have found an answer.  I had this in my config:
>>
>> order LDAP-after-IP inf: LDAP-IP LDAP-clone
>>
>> And, according to the logs, it *looks* like what happens when genome-ldap1
>> goes down is that the IP goes over to genome-ldap2, AND THEN tries to start LDAP
>> there, even though LDAP is already started there because it is an anonymous
>> clone.  LDAP cannot start (because it is already started) and throws an
>> error exit code, and presumably pacemaker freaks out because of that and
>> shuts down LDAP on all nodes.  Then the floating IP disappears because of
>> the line:
>>
>> colocation LDAP-with-IP inf: LDAP-IP LDAP-clone
>>
>> which is expected at that point.  It seems that when I tested this with
>> older versions of pacemaker, this didn't happen.  Should 'order' statements
>> be avoided entirely when dealing with anonymous clones?  Is that behavior
>> expected?
> 
> The ordering constraint should have caused the cluster to stop LDAP first.
> Have you checked both scripts are fully LSB compliant?
>    http://www.clusterlabs.org/doc/en-US/Pacemaker/1.0/html/Pacemaker_Explained/ap-lsb.html



