[Pacemaker] An internal error occurred in crmd

Kazunori INOUE kazunori.inoue3 at gmail.com
Mon Oct 21 00:59:34 EDT 2013


Hi,

I'm using pacemaker-1.1 (b6d42ed, the latest devel).

After starting corosync and pacemaker on three nodes,
I loaded the configuration.
An internal error then occurred in crmd, and crmd exited.

$ crm configure load update 3vm+2stonith.cli
$ for i in n{6..8};do ssh $i 'grep error: /var/log/ha-log';done
Oct 21 11:19:43 bl460g1n6 pengine[7684]:    error: unpack_resources: Resource start-up disabled since no STONITH resources have been defined
Oct 21 11:19:43 bl460g1n6 pengine[7684]:    error: unpack_resources: Either configure some or disable STONITH with the stonith-enabled option
Oct 21 11:19:43 bl460g1n6 pengine[7684]:    error: unpack_resources: NOTE: Clusters with shared data need STONITH to ensure data integrity
Oct 21 11:20:51 bl460g1n6 crmd[7685]:    error: crm_element_value: Couldn't find lrmd_callid in NULL
Oct 21 11:20:51 bl460g1n6 crmd[7685]:    error: crm_abort: crm_element_value: Triggered assert at xml.c:3336 : data != NULL
Oct 21 11:20:51 bl460g1n6 crmd[7685]:    error: crm_element_value: Couldn't find lrmd_rc in NULL
Oct 21 11:20:51 bl460g1n6 crmd[7685]:    error: crm_abort: crm_element_value: Triggered assert at xml.c:3336 : data != NULL
Oct 21 11:20:53 bl460g1n6 crmd[7685]:    error: internal_ipc_get_reply: Discarding old reply 90 (need 91)

Oct 21 11:20:51 bl460g1n7 crmd[12487]:    error: lrmd_send_command: Couldn't perform lrmd_rsc_info operation (timeout=30000): -11: Connection timed out (110)
Oct 21 11:20:52 bl460g1n7 crmd[12487]:    error: lrmd_send_command: Couldn't perform lrmd_rsc_register operation (timeout=0): -114: Connection timed out (110)
Oct 21 11:20:52 bl460g1n7 crmd[12487]:    error: lrmd_send_command: Couldn't perform lrmd_rsc_info operation (timeout=30000): -114: Connection timed out (110)
Oct 21 11:20:52 bl460g1n7 crmd[12487]:    error: get_lrm_resource: Could not add resource prmStonith6-2 to LRM
Oct 21 11:20:52 bl460g1n7 crmd[12487]:    error: do_lrm_invoke: Invalid resource definition
Oct 21 11:20:52 bl460g1n7 crmd[12487]:    error: do_log: FSA: Input I_TERMINATE from do_recover() received in state S_RECOVERY
Oct 21 11:20:52 bl460g1n7 crmd[12487]:    error: lrm_state_verify_stopped: 4 pending LRM operations at shutdown
Oct 21 11:20:52 bl460g1n7 crmd[12487]:    error: lrm_state_verify_stopped: Pending action: prmVM3:13 (prmVM3_monitor_0)
Oct 21 11:20:52 bl460g1n7 crmd[12487]:    error: lrm_state_verify_stopped: Pending action: prmVM2:9 (prmVM2_monitor_0)
Oct 21 11:20:52 bl460g1n7 crmd[12487]:    error: lrm_state_verify_stopped: Pending action: prmVM1:5 (prmVM1_monitor_0)
Oct 21 11:20:52 bl460g1n7 crmd[12487]:    error: lrm_state_verify_stopped: Pending action: prmStonith6-1:17 (prmStonith6-1_monitor_0)
Oct 21 11:20:52 bl460g1n7 crmd[12487]:    error: crmd_fast_exit: Could not recover from internal error
Oct 21 11:20:52 bl460g1n7 pacemakerd[12477]:    error: pcmk_child_exit: Child process crmd (12487) exited: Generic Pacemaker error (201)

Oct 21 11:20:51 bl460g1n8 crmd[1600]:    error: lrmd_send_command: Couldn't perform lrmd_rsc_info operation (timeout=30000): -11: Connection timed out (110)
Oct 21 11:20:52 bl460g1n8 crmd[1600]:    error: lrmd_send_command: Couldn't perform lrmd_rsc_register operation (timeout=0): -114: Connection timed out (110)
Oct 21 11:20:52 bl460g1n8 crmd[1600]:    error: lrmd_send_command: Couldn't perform lrmd_rsc_info operation (timeout=30000): -114: Connection timed out (110)
Oct 21 11:20:52 bl460g1n8 crmd[1600]:    error: get_lrm_resource: Could not add resource prmStonith6-2 to LRM
Oct 21 11:20:52 bl460g1n8 crmd[1600]:    error: do_lrm_invoke: Invalid resource definition
Oct 21 11:20:52 bl460g1n8 crmd[1600]:    error: do_log: FSA: Input I_TERMINATE from do_recover() received in state S_RECOVERY
Oct 21 11:20:52 bl460g1n8 crmd[1600]:    error: lrm_state_verify_stopped: 4 pending LRM operations at shutdown
Oct 21 11:20:52 bl460g1n8 crmd[1600]:    error: lrm_state_verify_stopped: Pending action: prmVM3:13 (prmVM3_monitor_0)
Oct 21 11:20:52 bl460g1n8 crmd[1600]:    error: lrm_state_verify_stopped: Pending action: prmVM2:9 (prmVM2_monitor_0)
Oct 21 11:20:52 bl460g1n8 crmd[1600]:    error: lrm_state_verify_stopped: Pending action: prmVM1:5 (prmVM1_monitor_0)
Oct 21 11:20:52 bl460g1n8 crmd[1600]:    error: lrm_state_verify_stopped: Pending action: prmStonith6-1:17 (prmStonith6-1_monitor_0)
Oct 21 11:20:52 bl460g1n8 crmd[1600]:    error: crmd_fast_exit: Could not recover from internal error
Oct 21 11:20:52 bl460g1n8 pacemakerd[1591]:    error: pcmk_child_exit: Child process crmd (1600) exited: Generic Pacemaker error (201)
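
The per-node output above may be easier to compare when the errors are
counted by host, daemon, and reporting function. The following is only a
minimal sketch: the function name summarize_errors is hypothetical, and it
assumes each log entry is on a single line in the layout shown above
(month, day, time, host, daemon[pid]:, error:, function:).

```shell
# Count "error:" entries per host, daemon, and reporting function.
# Assumed line layout (as in the ha-log excerpts above):
#   Mon DD HH:MM:SS host daemon[pid]:    error: function: message
# summarize_errors is a hypothetical helper name, not part of Pacemaker.
summarize_errors() {
    awk '$6 == "error:" {
        daemon = $5; sub(/\[[0-9]+\]:$/, "", daemon)   # crmd[7685]: -> crmd
        fn = $7;     sub(/:$/, "", fn)                 # crm_abort:  -> crm_abort
        count[$4 " " daemon " " fn]++
    }
    END { for (k in count) print count[k], k }' |
    sort -k2,2 -k1,1nr    # group by host, most frequent first
}
```

It could be fed from the same loop used above, for example:
for i in n{6..8};do ssh $i 'grep error: /var/log/ha-log';done | summarize_errors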

Best Regards,
Kazunori INOUE
-------------- next part --------------
A non-text attachment was scrubbed...
Name: crmd_internal_error.tar.bz2
Type: application/x-bzip2
Size: 345346 bytes
Desc: not available
URL: <http://lists.clusterlabs.org/pipermail/pacemaker/attachments/20131021/3e47f98f/attachment-0002.bz2>