[Pacemaker] attrd "stuck" for 10 hours on heartbeat stop (probably not heartbeat specific)

Lars Ellenberg lars.ellenberg at linbit.com
Wed Jan 27 08:53:20 UTC 2010


This was first posted on linux-ha-dev; reposting on pacemaker
to increase the chance of beekhof commenting on this ;-)


This is a CTS StopOnebyOne run on three nodes: rum, kugel, kokos.

It happened on killing attrd, while the cib had some "A-Sync reply to crmd" pending.
That in itself is not unusual, though.

But attrd sits there doing nothing for ~ 10 hours (!), until a new
"attrd_trigger_update" happens (which may or may not be related),
and finally the TERM handler is invoked.
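
To make the suspected mechanism concrete, here is a toy model in Python. It is only a sketch under assumptions, not attrd, heartbeat, or glib code: it assumes the signal handler merely records the signal, and that the real handler is run from the mainloop the next time the loop wakes up. The names simulate, EVENTS and wake_on_signal are made up for illustration; 35856 seconds is the gap between the kill and the handler invocation in the kokos logs further down.

```python
# Toy model only: NOT Pacemaker, heartbeat, or glib code.
# Assumption: the async signal handler just records the signal, and the
# "real" handler is dispatched from the mainloop when the loop is awake.

def simulate(events, wake_on_signal):
    """Return a list of (time, signal) pairs showing when handlers ran.

    events: time-ordered (time, kind, payload) tuples, where kind is
            "signal" or "message".
    wake_on_signal: hypothetical switch; True means a caught signal also
            wakes the sleeping mainloop, False means it does not.
    """
    pending = []      # signals caught but not yet dispatched
    dispatched = []   # (time, signal) for every handler that ran
    for now, kind, payload in events:
        if kind == "signal":
            pending.append(payload)
            if not wake_on_signal:
                continue  # loop keeps sleeping; nothing is dispatched yet
        # The loop is awake (a message arrived, or the signal woke it):
        # run the deferred signal handlers.
        while pending:
            dispatched.append((now, pending.pop(0)))
    return dispatched


# Times in seconds since heartbeat sent SIGTERM to attrd on kokos.
EVENTS = [
    (0, "signal", "SIGTERM"),
    (35856, "message", "flush"),
]

if __name__ == "__main__":
    print(simulate(EVENTS, wake_on_signal=False))
    print(simulate(EVENTS, wake_on_signal=True))
```

With wake_on_signal=False the handler runs only when the flush message arrives, which is what the logs show. With wake_on_signal=True it runs right away, which is what one would expect from a mainloop that notices signals by itself.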

Full logs are below; first comes a commented and snipped log that I think may help
to understand the issue.

Stopping rum (current DC) first:

Jan 25 22:15:20 sepp CTS: Running test StopOnebyOne           (kokos)       [108]
Jan 25 22:15:20 sepp CTS: debug: MARK: test StopOnebyOne start 1264454120
Jan 25 22:15:20 sepp CTS: debug: Setup: SimulStartLite
Jan 25 22:15:20 sepp CTS: debug: MARK: test StopOnebyOne start 1264454120
Jan 25 22:15:20 sepp CTS: debug: Stopping crm-lha on node rum
Jan 25 22:15:21 rum heartbeat: [11260]: info: killing /usr/lib/heartbeat/crmd process group 11278 with signal 15
Jan 25 22:15:21 rum crmd: [11278]: info: crm_signal_dispatch: Invoking handler for signal 15: Terminated
Jan 25 22:15:21 rum crmd: [11278]: info: crm_shutdown: Requesting shutdown
Jan 25 22:15:21 rum crmd: [11278]: info: do_state_transition: State transition S_IDLE -> S_POLICY_ENGINE [ input=I_SHUTDOWN cause=C_SHUTDOWN origin=crm_shutdown ]
Jan 25 22:15:21 rum crmd: [11278]: info: do_state_transition: All 3 cluster nodes are eligible to run resources.
Jan 25 22:15:21 rum crmd: [11278]: info: do_shutdown_req: Sending shutdown request to DC: rum
Jan 25 22:15:21 rum crmd: [11278]: info: handle_shutdown_request: Creating shutdown request for rum (state=S_POLICY_ENGINE)
Jan 25 22:15:21 rum attrd: [11277]: info: attrd_trigger_update: Sending flush op to all hosts for: shutdown (1264454121)
Jan 25 22:15:21 rum attrd: [11277]: info: attrd_perform_update: Sent update 29: shutdown=1264454121
Jan 25 22:15:21 rum crmd: [11278]: info: abort_transition_graph: te_update_diff:146 - Triggered transition abort (complete=1, tag=transient_attributes, id=2adce750-3a3e-4cc5-81ef-f62e23675909, magic=NA, cib=1.172.83) : Transient attribute: update
Jan 25 22:15:21 rum crmd: [11278]: info: do_pe_invoke: Query 75: Requesting the current CIB: S_POLICY_ENGINE
Jan 25 22:15:21 rum crmd: [11278]: info: do_pe_invoke_callback: Invoking the PE: ref=pe_calc-dc-1264454121-103, seq=3, quorate=1

...

Jan 25 22:15:24 rum crmd: [11278]: info: run_graph: ====================================================
Jan 25 22:15:24 rum crmd: [11278]: notice: run_graph: Transition 7 (Complete=9, Pending=0, Fired=0, Skipped=11, Incomplete=0, Source=/var/lib/pengine/pe-warn-519.bz2): Stopped
Jan 25 22:15:24 rum crmd: [11278]: info: te_graph_trigger: Transition 7 is now complete
Jan 25 22:15:24 rum crmd: [11278]: info: do_state_transition: State transition S_TRANSITION_ENGINE -> S_STOPPING [ input=I_STOP cause=C_FSA_INTERNAL origin=notify_crmd ]
Jan 25 22:15:24 rum crmd: [11278]: info: do_dc_release: DC role released

Jan 25 22:15:24 rum cib: [11274]: info: cib_process_disconnect: All clients disconnected...
Jan 25 22:15:24 rum cib: [11274]: info: initiate_exit: Sending disconnect notification to 3 peers...

Jan 25 22:15:25 rum cib: [11274]: info: cib_process_shutdown_req: Shutdown ACK from kokos
Jan 25 22:15:25 rum cib: [11274]: info: main: Done
Jan 25 22:15:25 rum ccm: [11273]: info: client (pid=11274) removed from ccm
Jan 25 22:15:25 rum heartbeat: [11260]: info: killing /usr/lib/heartbeat/ccm process group 11273 with signal 15
Jan 25 22:15:25 rum ccm: [11273]: info: received SIGTERM, going to shut down

Jan 25 22:15:26 rum heartbeat: [11260]: info: killing HBWRITE process 11264 with signal 15
Jan 25 22:15:26 rum heartbeat: [11260]: info: killing HBREAD process 11265 with signal 15
Jan 25 22:15:26 rum heartbeat: [11260]: info: killing HBWRITE process 11266 with signal 15
Jan 25 22:15:26 rum heartbeat: [11260]: info: killing HBREAD process 11267 with signal 15
Jan 25 22:15:26 rum heartbeat: [11260]: info: killing HBWRITE process 11268 with signal 15
Jan 25 22:15:26 rum heartbeat: [11260]: info: killing HBREAD process 11269 with signal 15
Jan 25 22:15:26 rum heartbeat: [11260]: info: killing HBFIFO process 11263 with signal 15
Jan 25 22:15:26 rum heartbeat: [11260]: info: Core process 11264 exited. 7 remaining
Jan 25 22:15:26 rum heartbeat: [11260]: info: Core process 11265 exited. 6 remaining
Jan 25 22:15:26 rum heartbeat: [11260]: info: Core process 11266 exited. 5 remaining
Jan 25 22:15:26 rum heartbeat: [11260]: info: Core process 11267 exited. 4 remaining
Jan 25 22:15:26 rum heartbeat: [11260]: info: Core process 11268 exited. 3 remaining
Jan 25 22:15:26 rum heartbeat: [11260]: info: Core process 11269 exited. 2 remaining
Jan 25 22:15:26 rum heartbeat: [11260]: info: Core process 11263 exited. 1 remaining
Jan 25 22:15:26 rum heartbeat: [11260]: info: rum Heartbeat shutdown complete.
Jan 25 22:15:27 sepp CTS: debug: cmd: target=rum, rc=0: /etc/init.d/heartbeat stop  > /dev/null 2>&1

Jan 25 22:15:35 kokos crmd: [16162]: info: update_dc: Set DC to kugel (3.0.1)

Jan 25 22:15:36 kugel crmd: [22412]: info: update_dc: Set DC to kugel (3.0.1)

Jan 25 22:15:42 kugel crmd: [22412]: info: do_state_transition: State transition S_TRANSITION_ENGINE -> S_IDLE [ input=I_TE_SUCCESS cause=C_FSA_INTERNAL origin=notify_crmd ]

Jan 25 22:15:45 kokos crmd: [16162]: info: handle_request: Current ping state: S_NOT_DC
Jan 25 22:15:46 kugel crmd: [22412]: info: handle_request: Current ping state: S_IDLE


Great, that went well, the shutdown itself took about 5 seconds.
New DC is kugel.

Now stopping kokos:

Jan 25 22:15:46 sepp CTS: debug: Stopping crm-lha on node kokos
Jan 25 22:15:46 kokos heartbeat: [16145]: info: killing /usr/lib/heartbeat/crmd process group 16162 with signal 15
Jan 25 22:15:46 kokos crmd: [16162]: info: crm_signal_dispatch: Invoking handler for signal 15: Terminated
Jan 25 22:15:46 kokos crmd: [16162]: info: crm_shutdown: Requesting shutdown
Jan 25 22:15:46 kokos crmd: [16162]: info: do_shutdown_req: Sending shutdown request to DC: kugel
Jan 25 22:15:47 kugel crmd: [22412]: info: handle_shutdown_request: Creating shutdown request for kokos (state=S_IDLE)
Jan 25 22:15:47 kokos attrd: [16161]: info: attrd_ha_callback: Update relayed from kugel
Jan 25 22:15:47 kokos attrd: [16161]: info: attrd_trigger_update: Sending flush op to all hosts for: shutdown (1264454147)
Jan 25 22:15:47 kokos attrd: [16161]: info: attrd_perform_update: Sent update 55: shutdown=1264454147
Jan 25 22:15:47 kokos LSBDummy[17253]: [17264]: INFO:  Running OK
Jan 25 22:15:48 kugel attrd: [22411]: info: attrd_ha_callback: flush message from kokos
Jan 25 22:15:48 kugel crmd: [22412]: info: abort_transition_graph: te_update_diff:146 - Triggered transition abort (complete=1, tag=transient_attributes, id=05cb2ad1-6947-4a40-ab19-ade191871b09, magic=NA, cib=1.173.15) : Transient attribute: update
Jan 25 22:15:48 kugel crmd: [22412]: info: do_state_transition: State transition S_IDLE -> S_POLICY_ENGINE [ input=I_PE_CALC cause=C_FSA_INTERNAL origin=abort_transition_graph ]
Jan 25 22:15:48 kugel crmd: [22412]: info: do_state_transition: All 2 cluster nodes are eligible to run resources.
Jan 25 22:15:48 kugel crmd: [22412]: info: do_pe_invoke: Query 61: Requesting the current CIB: S_POLICY_ENGINE
Jan 25 22:15:48 kugel crmd: [22412]: info: do_pe_invoke_callback: Invoking the PE: ref=pe_calc-dc-1264454148-38, seq=4, quorate=1

Resource migrations take place...

Jan 25 22:15:55 kugel pengine: [23138]: info: stage6: Scheduling Node kokos for shutdown

Jan 25 22:15:55 kokos crmd: [16162]: info: handle_request: Shutting down
Jan 25 22:15:55 kokos crmd: [16162]: info: do_state_transition: State transition S_NOT_DC -> S_STOPPING [ input=I_STOP cause=C_HA_MESSAGE origin=route_message ]
Jan 25 22:15:55 kokos crmd: [16162]: info: do_shutdown: All subsystems stopped, continuing
Jan 25 22:15:55 kokos crmd: [16162]: info: do_lrm_control: Disconnected from the LRM
Jan 25 22:15:55 kokos ccm: [16157]: info: client (pid=16162) removed from ccm
Jan 25 22:15:55 kokos crmd: [16162]: info: do_ha_control: Disconnected from Heartbeat
Jan 25 22:15:55 kokos crmd: [16162]: info: do_cib_control: Disconnecting CIB
Jan 25 22:15:55 kokos crmd: [16162]: info: crmd_cib_connection_destroy: Connection to the CIB terminated...
Jan 25 22:15:55 kokos crmd: [16162]: info: do_exit: Performing A_EXIT_0 - gracefully exiting the CRMd
Jan 25 22:15:55 kokos crmd: [16162]: info: free_mem: Dropping I_TERMINATE: [ state=S_STOPPING cause=C_FSA_INTERNAL origin=do_stop ]
Jan 25 22:15:55 kokos crmd: [16162]: info: do_exit: [crmd] stopped (0)
Jan 25 22:15:55 kokos heartbeat: [16145]: info: killing /usr/lib/heartbeat/attrd process group 16161 with signal 15
Jan 25 22:15:55 kokos cib: [16158]: WARN: send_ipc_message: IPC Channel to 16162 is not connected
Jan 25 22:15:55 kokos cib: [16158]: WARN: send_via_callback_channel: Delivery of reply to client 16162/31f8621c-9491-419a-929b-42a1d7d7d3f9 failed
Jan 25 22:15:55 kokos cib: [16158]: WARN: do_local_notify: A-Sync reply to crmd failed: reply failed
Jan 25 22:15:55 kokos cib: [16158]: WARN: send_ipc_message: IPC Channel to 16162 is not connected
Jan 25 22:15:55 kokos cib: [16158]: WARN: send_via_callback_channel: Delivery of reply to client 16162/31f8621c-9491-419a-929b-42a1d7d7d3f9 failed
Jan 25 22:15:55 kokos cib: [16158]: WARN: do_local_notify: A-Sync reply to crmd failed: reply failed

Jan 25 22:15:55 kugel crmd: [22412]: notice: crmd_client_status_callback: Status update: Client kokos/crmd now has status [offline] (DC=true)
Jan 25 22:15:55 kugel crmd: [22412]: info: crm_update_peer_proc: kokos.crmd is now offline
Jan 25 22:15:55 kugel crmd: [22412]: info: erase_node_from_join: Removed node kokos from join calculations: welcomed=0 itegrated=0 finalized=0 confirmed=1

Jan 25 22:22:30 kokos cib: [16158]: info: cib_stats: Processed 325 operations (338.00us average, 0% utilization) in the last 10min

(That's the last log message from kokos for 10 hours.
 The remaining node, kugel, happily hums along...)

Then, at some point, some attribute change happens:


sepp happens to be the CTS coordinator, and was definitely reachable,
as it also collected these logs you are now reading...
And it did not lose any messages either, as far as a comparison with the node-local messages shows.
So there is apparently some other bogon in pingd, in the generated pingd config, or elsewhere.
No matter, it triggers an attribute update:

Jan 26 08:13:24 kugel pingd: [22534]: info: stand_alone_ping: Node sepp is unreachable (read)
Jan 26 08:13:25 kugel pingd: [22534]: info: stand_alone_ping: Node sepp is unreachable (read)
Jan 26 08:13:31 kugel attrd: [22411]: info: attrd_trigger_update: Sending flush op to all hosts for: connected (1)
Jan 26 08:13:31 kugel attrd: [22411]: info: attrd_ha_callback: flush message from kugel

and YES,
that message awakens attrd, which now,
finally back in its mainloop, calls the event handler
for the signal caught 10 hours earlier.

Jan 26 08:13:31 kokos attrd: [16161]: info: crm_signal_dispatch: Invoking handler for signal 15: Terminated
Jan 26 08:13:31 kokos attrd: [16161]: info: attrd_shutdown: Exiting
Jan 26 08:13:31 kokos attrd: [16161]: info: attrd_ha_callback: flush message from kugel
Jan 26 08:13:31 kokos attrd: [16161]: info: main: Exiting...
Jan 26 08:13:31 kokos attrd: [16161]: info: attrd_cib_connection_destroy: Connection to the CIB terminated...
Jan 26 08:13:31 kokos heartbeat: [16145]: info: killing /usr/lib/heartbeat/stonithd process group 16160 with signal 15
Jan 26 08:13:31 kokos stonithd: [16160]: notice: /usr/lib/heartbeat/stonithd normally quit.
Jan 26 08:13:31 kokos heartbeat: [16145]: info: killing /usr/lib/heartbeat/lrmd -r process group 16159 with signal 15
Jan 26 08:13:31 kokos lrmd: [16159]: info: lrmd is shutting down
Jan 26 08:13:31 kokos heartbeat: [16145]: info: killing /usr/lib/heartbeat/cib process group 16158 with signal 15
Jan 26 08:13:31 kokos cib: [16158]: info: crm_signal_dispatch: Invoking handler for signal 15: Terminated
Jan 26 08:13:31 kokos cib: [16158]: info: cib_shutdown: Disconnected 0 clients
Jan 26 08:13:31 kokos cib: [16158]: info: cib_process_disconnect: All clients disconnected...
Jan 26 08:13:31 kokos cib: [16158]: info: initiate_exit: Sending disconnect notification to 2 peers...
Jan 26 08:13:32 kugel cib: [22408]: info: cib_process_shutdown_req: Shutdown REQ from kokos
Jan 26 08:13:32 kugel cib: [22408]: info: cib_process_request: Operation complete: op cib_shutdown_req for section 'all' (origin=kokos/kokos/(null), version=1.173.42): ok (rc=0)
Jan 26 08:13:32 kokos cib: [16158]: info: cib_process_shutdown_req: Shutdown ACK from kugel
Jan 26 08:13:32 kokos cib: [16158]: info: terminate_cib: cib_process_shutdown_req: Disconnecting heartbeat
Jan 26 08:13:32 kokos cib: [16158]: info: terminate_cib: Exiting...
Jan 26 08:13:32 kokos cib: [16158]: info: cib_process_request: Operation complete: op cib_shutdown_req for section 'all' (origin=kugel/kugel/(null), version=0.0.0): ok (rc=0)
Jan 26 08:13:32 kokos cib: [16158]: info: ha_msg_dispatch: Lost connection to heartbeat service.
Jan 26 08:13:32 kokos cib: [16158]: info: main: Done
Jan 26 08:13:32 kokos ccm: [16157]: info: client (pid=16158) removed from ccm
Jan 26 08:13:32 kokos heartbeat: [16145]: info: killing /usr/lib/heartbeat/ccm process group 16157 with signal 15
Jan 26 08:13:32 kokos ccm: [16157]: info: received SIGTERM, going to shut down

Jan 26 08:13:32 kugel crmd: [22412]: info: mem_handle_event: Got an event OC_EV_MS_INVALID from ccm

huh? what exactly was invalid now?

...

Jan 26 08:13:33 kokos heartbeat: [16145]: info: killing HBFIFO process 16148 with signal 15
Jan 26 08:13:33 kokos heartbeat: [16145]: info: killing HBWRITE process 16149 with signal 15
Jan 26 08:13:33 kokos heartbeat: [16145]: info: killing HBREAD process 16150 with signal 15
Jan 26 08:13:33 kokos heartbeat: [16145]: info: killing HBWRITE process 16151 with signal 15
Jan 26 08:13:33 kokos heartbeat: [16145]: info: killing HBREAD process 16152 with signal 15
Jan 26 08:13:33 kokos heartbeat: [16145]: info: killing HBWRITE process 16153 with signal 15
Jan 26 08:13:33 kokos heartbeat: [16145]: info: killing HBREAD process 16154 with signal 15
Jan 26 08:13:33 kokos heartbeat: [16145]: info: Core process 16148 exited. 7 remaining
Jan 26 08:13:33 kokos heartbeat: [16145]: info: Core process 16149 exited. 6 remaining
Jan 26 08:13:33 kokos heartbeat: [16145]: info: Core process 16150 exited. 5 remaining
Jan 26 08:13:33 kokos heartbeat: [16145]: info: Core process 16151 exited. 4 remaining
Jan 26 08:13:33 kokos heartbeat: [16145]: info: Core process 16152 exited. 3 remaining
Jan 26 08:13:33 kokos heartbeat: [16145]: info: Core process 16153 exited. 2 remaining
Jan 26 08:13:33 kokos heartbeat: [16145]: info: Core process 16154 exited. 1 remaining
Jan 26 08:13:33 kokos heartbeat: [16145]: info: kokos Heartbeat shutdown complete.
Jan 26 08:13:33 sepp CTS: debug: cmd: target=kokos, rc=0: /etc/init.d/heartbeat stop  > /dev/null 2>&1

Yippiyayeah!
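
The delay can also be pulled out of the logs mechanically. The following Python sketch pairs each heartbeat "killing ... process group PID with signal 15" line with the matching "Invoking handler for signal 15" line from that PID, and reports the time between the two. The sample lines are copied from the logs in this mail; the function name signal_latencies, the regular expressions, and the year=2010 default (syslog timestamps carry no year) are assumptions for this sketch, not part of any Pacemaker or CTS tool.

```python
import re
from datetime import datetime

# Sample lines copied from the logs in this mail.
SAMPLE = """\
Jan 25 22:15:55 kokos heartbeat: [16145]: info: killing /usr/lib/heartbeat/attrd process group 16161 with signal 15
Jan 26 08:13:31 kokos attrd: [16161]: info: crm_signal_dispatch: Invoking handler for signal 15: Terminated
Jan 26 08:13:39 kugel heartbeat: [22395]: info: killing /usr/lib/heartbeat/attrd process group 22411 with signal 15
Jan 26 08:13:39 kugel attrd: [22411]: info: crm_signal_dispatch: Invoking handler for signal 15: Terminated
"""

# Syslog prefix: timestamp and host name.
STAMP = r"(?P<ts>\w{3} +\d+ \d\d:\d\d:\d\d) (?P<host>\S+) "
# heartbeat sending SIGTERM to a process group.
KILL = re.compile(
    STAMP + r"heartbeat: \[\d+\]: info: killing .+? process group "
    r"(?P<pid>\d+) with signal 15"
)
# The daemon finally running its SIGTERM handler.
HANDLED = re.compile(
    STAMP + r"\S+: \[(?P<pid>\d+)\]: info: crm_signal_dispatch: "
    r"Invoking handler for signal 15"
)


def parse(ts, year):
    """Parse a syslog timestamp; the year must be supplied by the caller."""
    return datetime.strptime(f"{year} {ts}", "%Y %b %d %H:%M:%S")


def signal_latencies(lines, year=2010):
    """Map (host, pid) to the delay between SIGTERM sent and handler run."""
    sent = {}       # (host, pid) -> time heartbeat sent the signal
    latencies = {}  # (host, pid) -> timedelta until the handler ran
    for line in lines:
        m = KILL.match(line)
        if m:
            sent[(m["host"], m["pid"])] = parse(m["ts"], year)
            continue
        m = HANDLED.match(line)
        if m and (m["host"], m["pid"]) in sent:
            key = (m["host"], m["pid"])
            latencies[key] = parse(m["ts"], year) - sent.pop(key)
    return latencies


if __name__ == "__main__":
    for (host, pid), delay in signal_latencies(SAMPLE.splitlines()).items():
        print(host, pid, delay)
```

On the sample lines this gives a bit under ten hours for attrd on kokos and no measurable delay for attrd on kugel. Fed with the full log, anything beyond a second or two would be worth a closer look.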


And the last shutdown:

Jan 26 08:13:38 sepp CTS: debug: Stopping crm-lha on node kugel
Jan 26 08:13:38 kugel heartbeat: [22395]: WARN: node kokos: is dead
Jan 26 08:13:38 kugel heartbeat: [22395]: info: Link kokos:eth0 dead.
Jan 26 08:13:38 kugel crmd: [22412]: notice: crmd_ha_status_callback: Status update: Node kokos now has status [dead] (DC=true)
Jan 26 08:13:38 kugel crmd: [22412]: info: crm_update_peer_proc: kokos.ais is now offline
Jan 26 08:13:38 kugel crmd: [22412]: WARN: match_down_event: No match for shutdown action on 05cb2ad1-6947-4a40-ab19-ade191871b09
Jan 26 08:13:38 kugel crmd: [22412]: info: te_update_diff: Stonith/shutdown of 05cb2ad1-6947-4a40-ab19-ade191871b09 not matched
Jan 26 08:13:38 kugel crmd: [22412]: info: abort_transition_graph: te_update_diff:191 - Triggered transition abort (complete=1, tag=node_state, id=05cb2ad1-6947-4a40-ab19-ade191871b09, magic=NA, cib=1.174.16) : Node failure
Jan 26 08:13:38 kugel crmd: [22412]: info: do_state_transition: State transition S_IDLE -> S_POLICY_ENGINE [ input=I_PE_CALC cause=C_FSA_INTERNAL origin=abort_transition_graph ]
Jan 26 08:13:38 kugel crmd: [22412]: info: do_state_transition: All 1 cluster nodes are eligible to run resources.
Jan 26 08:13:38 kugel crmd: [22412]: info: do_pe_invoke: Query 137: Requesting the current CIB: S_POLICY_ENGINE
Jan 26 08:13:38 kugel crmd: [22412]: info: do_pe_invoke_callback: Invoking the PE: ref=pe_calc-dc-1264490018-122, seq=5, quorate=0

...

Jan 26 08:13:39 kugel crmd: [22412]: info: crmd_cib_connection_destroy: Connection to the CIB terminated...
Jan 26 08:13:39 kugel crmd: [22412]: info: do_exit: Performing A_EXIT_0 - gracefully exiting the CRMd
Jan 26 08:13:39 kugel crmd: [22412]: info: free_mem: Dropping I_TERMINATE: [ state=S_STOPPING cause=C_FSA_INTERNAL origin=do_stop ]
Jan 26 08:13:39 kugel crmd: [22412]: info: do_exit: [crmd] stopped (0)
Jan 26 08:13:39 kugel heartbeat: [22395]: info: killing /usr/lib/heartbeat/attrd process group 22411 with signal 15
Jan 26 08:13:39 kugel attrd: [22411]: info: crm_signal_dispatch: Invoking handler for signal 15: Terminated
Jan 26 08:13:39 kugel heartbeat: [22395]: info: killing /usr/lib/heartbeat/stonithd process group 22410 with signal 15
Jan 26 08:13:39 kugel cib: [22408]: info: cib_process_readwrite: We are now in R/O mode
Jan 26 08:13:39 kugel stonithd: [22410]: notice: /usr/lib/heartbeat/stonithd normally quit.
Jan 26 08:13:39 kugel attrd: [22411]: info: attrd_shutdown: Exiting
Jan 26 08:13:39 kugel cib: [22408]: WARN: send_ipc_message: IPC Channel to 22412 is not connected
Jan 26 08:13:39 kugel attrd: [22411]: info: main: Exiting...
Jan 26 08:13:39 kugel cib: [22408]: WARN: send_via_callback_channel: Delivery of reply to client 22412/fbe649b6-54d7-4617-a3f6-f95004698069 failed
Jan 26 08:13:39 kugel attrd: [22411]: info: attrd_cib_connection_destroy: Connection to the CIB terminated...
Jan 26 08:13:39 kugel cib: [22408]: WARN: do_local_notify: A-Sync reply to crmd failed: reply failed
Jan 26 08:13:39 kugel heartbeat: [22395]: info: killing /usr/lib/heartbeat/lrmd -r process group 22409 with signal 15
Jan 26 08:13:39 kugel cib: [22408]: info: crm_signal_dispatch: Invoking handler for signal 15: Terminated
Jan 26 08:13:39 kugel lrmd: [22409]: info: lrmd is shutting down
Jan 26 08:13:39 kugel heartbeat: [22395]: info: killing /usr/lib/heartbeat/cib process group 22408 with signal 15
Jan 26 08:13:39 kugel cib: [22408]: info: cib_shutdown: Disconnected 0 clients
Jan 26 08:13:39 kugel cib: [22408]: info: cib_process_disconnect: All clients disconnected...
Jan 26 08:13:39 kugel cib: [22408]: info: terminate_cib: initiate_exit: Disconnecting heartbeat
Jan 26 08:13:39 kugel cib: [22408]: info: terminate_cib: Exiting...
Jan 26 08:13:39 kugel cib: [22408]: info: main: Done
Jan 26 08:13:39 kugel ccm: [22407]: info: client (pid=22408) removed from ccm
Jan 26 08:13:39 kugel heartbeat: [22395]: info: killing /usr/lib/heartbeat/ccm process group 22407 with signal 15
Jan 26 08:13:39 kugel ccm: [22407]: info: received SIGTERM, going to shut down
Jan 26 08:13:40 kugel heartbeat: [22395]: info: killing HBFIFO process 22398 with signal 15
Jan 26 08:13:40 kugel heartbeat: [22395]: info: killing HBWRITE process 22399 with signal 15
Jan 26 08:13:40 kugel heartbeat: [22395]: info: killing HBREAD process 22400 with signal 15
Jan 26 08:13:40 kugel heartbeat: [22395]: info: killing HBWRITE process 22401 with signal 15
Jan 26 08:13:40 kugel heartbeat: [22395]: info: killing HBREAD process 22402 with signal 15
Jan 26 08:13:40 kugel heartbeat: [22395]: info: killing HBWRITE process 22403 with signal 15
Jan 26 08:13:40 kugel heartbeat: [22395]: info: killing HBREAD process 22404 with signal 15
Jan 26 08:13:40 kugel heartbeat: [22395]: info: Core process 22398 exited. 7 remaining
Jan 26 08:13:40 kugel heartbeat: [22395]: info: Core process 22399 exited. 6 remaining
Jan 26 08:13:40 kugel heartbeat: [22395]: info: Core process 22401 exited. 5 remaining
Jan 26 08:13:40 kugel heartbeat: [22395]: info: Core process 22402 exited. 4 remaining
Jan 26 08:13:40 kugel heartbeat: [22395]: info: Core process 22403 exited. 3 remaining
Jan 26 08:13:40 kugel heartbeat: [22395]: info: Core process 22404 exited. 2 remaining
Jan 26 08:13:40 kugel heartbeat: [22395]: info: Core process 22400 exited. 1 remaining
Jan 26 08:13:40 kugel heartbeat: [22395]: info: kugel Heartbeat shutdown complete.
Jan 26 08:13:41 sepp CTS: debug: cmd: target=kugel, rc=0: /etc/init.d/heartbeat stop  > /dev/null 2>&1


Soooo.

Random, or not so random, thoughts?


Full log attached.

-- 
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com

DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: full-log.txt.bz2
Type: application/octet-stream
Size: 132210 bytes
Desc: not available
URL: <https://lists.clusterlabs.org/pipermail/pacemaker/attachments/20100127/8cd8a97e/attachment-0001.obj>
