<div dir="ltr"><div><div><div><div><div>Hi,<br><br></div>I have a hardware crash in a two-node DRBD cluster.<br></div>The active node has a hardware failure and is actually down.<br><br></div>I am wondering why my 2nd node doesn't migrate/move the resources.<br>
</div>The 2nd node wants to fence the failed node, but that's not possible (it's down).<br><br><br></div><div>How can I enable the services on the last "good" node?<br></div><div>And how can I optimize my config to handle that kind of error?<br>
<br></div><div>crm status<br><br>Last updated: Tue Mar 18 12:01:07 2014<br>Last change: Tue Mar 18 11:28:22 2014 via crmd on linux02<br>Stack: classic openais (with plugin)<br>Current DC: linux02 - partition WITHOUT quorum<br>
Version: 1.1.10-14.el6_5.2-368c726<br>2 Nodes configured, 2 expected votes<br>21 Resources configured<br><br><br>Node linux01: UNCLEAN (offline)<br>Online: [ linux02 ]<br><br> Resource Group: mysql<br> mysql_fs (ocf::heartbeat:Filesystem): Started linux01<br>
mysql_ip (ocf::heartbeat:IPaddr2): Started linux01 <br><br></div><div>.... and so on<br><br><br><br></div>cluster.log<br><br><br><div><div><div><div>Mar 18 11:54:43 [2234] linux02 crmd: notice: tengine_stonith_callback: Stonith operation 17 for linux01 failed (Timer expired): aborting transition.<br>
Mar 18 11:54:43 [2234] linux02 crmd: info: abort_transition_graph: tengine_stonith_callback:463 - Triggered transition abort (complete=0) : Stonith failed<br>Mar 18 11:54:43 [2234] linux02 crmd: notice: run_graph: Transition 15 (Complete=9, Pending=0, Fired=0, Skipped=36, Incomplete=19, Source=/var/lib/pacemaker/pengine/pe-warn-63.bz2): Stopped<br>
Mar 18 11:54:43 [2234] linux02 crmd: notice: too_many_st_failures: Too many failures to fence linux01 (16), giving up<br>Mar 18 11:54:43 [2234] linux02 crmd: info: do_log: FSA: Input I_TE_SUCCESS from notify_crmd() received in state S_TRANSITION_ENGINE<br>
Mar 18 11:54:43 [2234] linux02 crmd: notice: do_state_transition: State transition S_TRANSITION_ENGINE -> S_IDLE [ input=I_TE_SUCCESS cause=C_FSA_INTERNAL origin=notify_crmd ]<br>Mar 18 11:54:43 [2230] linux02 stonith-ng: info: stonith_command: Processed st_notify reply from linux02: OK (0)<br>
Mar 18 11:54:43 [2234] linux02 crmd: notice: tengine_stonith_notify: Peer linux01 was not terminated (reboot) by linux02 for linux02: Timer expired (ref=7939b264-699c-4d00-a89c-07e7e0193a80) by client crmd.2234<br>
Mar 18 11:54:44 [2229] linux02 cib: info: crm_client_new: Connecting 0x155ac00 for uid=0 gid=0 pid=23360 id=b88b2690-0c3f-48ac-b8b4-3a47b7f9114a<br>Mar 18 11:54:44 [2229] linux02 cib: info: cib_process_request: Completed cib_query operation for section 'all': OK (rc=0, origin=local/crm_mon/2, version=0.125.2)<br>
Mar 18 11:54:44 [2229] linux02 cib: info: crm_client_destroy: Destroying 0 events<br>Mar 18 11:55:03 [2229] linux02 cib: info: crm_client_new: Connecting 0x155ac00 for uid=0 gid=0 pid=23415 id=62e7a9d8-588e-427f-8178-85febce00151<br>
Mar 18 11:55:03 [2229] linux02 cib: info: crm_client_new: Connecting 0x1585de0 for uid=0 gid=0 pid=23416 id=79795042-699b-4347-abcb-4c7c96ed2291<br>Mar 18 11:55:03 [2229] linux02 cib: info: cib_process_request: Completed cib_query operation for section nodes: OK (rc=0, origin=local/crm_attribute/2, version=0.125.2)<br>
Mar 18 11:55:03 [2229] linux02 cib: info: cib_process_request: Completed cib_query operation for section nodes: OK (rc=0, origin=local/crm_attribute/2, version=0.125.2)<br>Mar 18 11:55:03 [2229] linux02 cib: info: crm_client_destroy: Destroying 0 events<br>
Mar 18 11:55:03 [2229] linux02 cib: info: crm_client_destroy: Destroying 0 events<br>Mar 18 11:55:43 [2230] linux02 stonith-ng: error: remote_op_done: Already sent notifications for 'reboot of linux01 by linux02' (for=crmd.2234@linux02.7939b264, state=4): Timer expired<br>
Mar 18 11:55:59 [2229] linux02 cib: info: crm_client_new: Connecting 0x155ac00 for uid=0 gid=0 pid=23468 id=8dea3cab-9103-42fc-9747-76018c4a0500<br>Mar 18 11:55:59 [2229] linux02 cib: info: cib_process_request: Completed cib_query operation for section 'all': OK (rc=0, origin=local/crm_mon/2, version=0.125.2)<br>
Mar 18 11:55:59 [2229] linux02 cib: info: crm_client_destroy: Destroying 0 events<br>Mar 18 11:56:03 [2229] linux02 cib: info: crm_client_new: Connecting 0x155ac00 for uid=0 gid=0 pid=23523 id=b681390a-51a3-4d68-abf1-514ee8ab9351<br>
Mar 18 11:56:03 [2229] linux02 cib: info: crm_client_new: Connecting 0x1585de0 for uid=0 gid=0 pid=23524 id=005421e4-b079-4a16-b4cc-0fc2c8c73246<br>Mar 18 11:56:03 [2229] linux02 cib: info: cib_process_request: Completed cib_query operation for section nodes: OK (rc=0, origin=local/crm_attribute/2, version=0.125.2)<br>
Mar 18 11:56:03 [2229] linux02 cib: info: cib_process_request: Completed cib_query operation for section nodes: OK (rc=0, origin=local/crm_attribute/2, version=0.125.2)<br>Mar 18 11:56:03 [2229] linux02 cib: info: crm_client_destroy: Destroying 0 events<br>
Mar 18 11:56:03 [2229] linux02 cib: info: crm_client_destroy: Destroying 0 events<br><br></div><div>thanks<br></div><div>beo<br></div></div></div></div></div>