<div dir="ltr">Thanks, your explanation is very helpful considering that it happens rarely and only on the first boot after VMs are created.<br></div><div class="gmail_extra"><br><div class="gmail_quote">On Mon, May 22, 2017 at 9:34 PM, Ken Gaillot <span dir="ltr"><<a href="mailto:kgaillot@redhat.com" target="_blank">kgaillot@redhat.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class="">On 05/19/2017 02:03 PM, Radoslaw Garbacz wrote:<br>

> Hi,<br>

><br>

</span><span class="">> I have some more information regarding this issue (pacemaker debug logs).<br>

><br>

> Firstly, I have not mentioned probably important facts:<br>

> 1) this happen rarely<br>

> 2) this happen only on first boot<br>

> 3) turning on debug in corosync/pacemaker significantly reduced<br>

> frequency of this happening, i.e. without debug every ~7 cluster<br>

> creation, with debug every ~66 cluster creation.<br>

><br>

> This is a 3 nodes cluster on Azure Cloud and it does not seem like the<br>

> resource agent is reporting an error, because all nodes logs proper "not<br>

> running" results:<br>

><br>

> The resource in question name is "dbx_head_head".<br>

><br>

> node1)<br>

> May 19 13:15:41 [6872] olegdbx39-vm-0 stonith-ng:    debug:<br>

> xml_patch_version_check:    Can apply patch 2.5.32 to 2.5.31<br>

</span>> <a href="http://head.ocf.sh" rel="noreferrer" target="_blank">head.ocf.sh</a> <<a href="http://head.ocf.sh" rel="noreferrer" target="_blank">http://head.ocf.sh</a>>(dbx_head_<wbr>head)[7717]:<br>

<span class="">> 2017/05/19_13:15:42 DEBUG: head_monitor: return 7<br>

> May 19 13:15:42 [6873] olegdbx39-vm-0       lrmd:    debug:<br>

> operation_finished:    dbx_head_head_monitor_0:7717 - exited with rc=7<br>

> May 19 13:15:42 [6873] olegdbx39-vm-0       lrmd:    debug:<br>

> operation_finished:    dbx_head_head_monitor_0:7717:<wbr>stderr [ -- empty -- ]<br>

> May 19 13:15:42 [6873] olegdbx39-vm-0       lrmd:    debug:<br>

> operation_finished:    dbx_head_head_monitor_0:7717:<wbr>stdout [ -- empty -- ]<br>

> May 19 13:15:42 [6873] olegdbx39-vm-0       lrmd:    debug:<br>

> log_finished:    finished - rsc:dbx_head_head action:monitor call_id:14<br>

> pid:7717 exit-code:7 exec-time:932ms queue-time:0ms<br>

><br>

><br>

> node2)<br>

> May 19 13:15:41 [6266] olegdbx39-vm000002 stonith-ng:    debug:<br>

> xml_patch_version_check:    Can apply patch 2.5.31 to 2.5.30<br>

</span>> <a href="http://head.ocf.sh" rel="noreferrer" target="_blank">head.ocf.sh</a> <<a href="http://head.ocf.sh" rel="noreferrer" target="_blank">http://head.ocf.sh</a>>(dbx_head_<wbr>head)[6485]:<br>

<div><div class="h5">> 2017/05/19_13:15:41 DEBUG: head_monitor: return 7<br>

> May 19 13:15:41 [6267] olegdbx39-vm000002       lrmd:    debug:<br>

> operation_finished:    dbx_head_head_monitor_0:6485 - exited with rc=7<br>

> May 19 13:15:41 [6267] olegdbx39-vm000002       lrmd:    debug:<br>

> operation_finished:    dbx_head_head_monitor_0:6485:<wbr>stderr [ -- empty -- ]<br>

> May 19 13:15:41 [6267] olegdbx39-vm000002       lrmd:    debug:<br>

> operation_finished:    dbx_head_head_monitor_0:6485:<wbr>stdout [ -- empty -- ]<br>

> May 19 13:15:41 [6267] olegdbx39-vm000002       lrmd:    debug:<br>

> log_finished:    finished - rsc:dbx_head_head action:monitor call_id:14<br>

> pid:6485 exit-code:7 exec-time:790ms queue-time:0ms<br>

> May 19 13:15:41 [6266] olegdbx39-vm000002 stonith-ng:    debug:<br>

> xml_patch_version_check:    Can apply patch 2.5.32 to 2.5.31<br>

> May 19 13:15:41 [6266] olegdbx39-vm000002 stonith-ng:    debug:<br>

> xml_patch_version_check:    Can apply patch 2.5.33 to 2.5.32<br>

><br>

><br>

> node3)<br>

> ==  the logs here are different - there is no probing, just stop attempt<br>

> (with proper exit code) ==<br>

><br>

> == reporting not existing resource ==<br>

><br>

> May 19 13:15:29 [6293] olegdbx39-vm000003       lrmd:    debug:<br>

> process_lrmd_message:    Processed lrmd_rsc_info operation from<br>

> d2c8a871-410a-4006-be52-<wbr>ee684c0a5f38: rc=0, reply=0, notify=0<br>

> May 19 13:15:29 [6293] olegdbx39-vm000003       lrmd:    debug:<br>

> process_lrmd_message:    Processed lrmd_rsc_exec operation from<br>

> d2c8a871-410a-4006-be52-<wbr>ee684c0a5f38: rc=10, reply=1, notify=0<br>

> May 19 13:15:29 [6293] olegdbx39-vm000003       lrmd:    debug:<br>

> log_execute:    executing - rsc:dbx_first_datas action:monitor call_id:10<br>

> May 19 13:15:29 [6293] olegdbx39-vm000003       lrmd:     info:<br>

> process_lrmd_get_rsc_info:    Resource 'dbx_head_head' not found (2<br>

> active resources)<br>

<br>

</div></div>FYI, this is normal. It just means the lrmd hasn't been asked to do<br>

anything with this resource before, so it's not found in the lrmd's memory.<br>

<span class=""><br>

> May 19 13:15:29 [6293] olegdbx39-vm000003       lrmd:    debug:<br>

> process_lrmd_message:    Processed lrmd_rsc_info operation from<br>

> d2c8a871-410a-4006-be52-<wbr>ee684c0a5f38: rc=0, reply=0, notify=0<br>

> May 19 13:15:29 [6293] olegdbx39-vm000003       lrmd:     info:<br>

> process_lrmd_rsc_register:    Added 'dbx_head_head' to the rsc list (3<br>

> active resources)<br>

> May 19 13:15:40 [6293] olegdbx39-vm000003       lrmd:    debug:<br>

> process_lrmd_message:    Processed lrmd_rsc_register operation from<br>

> d2c8a871-410a-4006-be52-<wbr>ee684c0a5f38: rc=0, reply=1, notify=1<br>

> May 19 13:15:29 [6292] olegdbx39-vm000003 stonith-ng:    debug:<br>

> xml_patch_version_check:    Can apply patch 2.5.9 to 2.5.8<br>

> May 19 13:15:40 [6292] olegdbx39-vm000003 stonith-ng:    debug:<br>

> xml_patch_version_check:    Can apply patch 2.5.10 to 2.5.9<br>

> May 19 13:15:40 [6292] olegdbx39-vm000003 stonith-ng:    debug:<br>

> xml_patch_version_check:    Can apply patch 2.5.11 to 2.5.10<br>

> May 19 13:15:40 [6292] olegdbx39-vm000003 stonith-ng:    debug:<br>

> xml_patch_version_check:    Can apply patch 2.5.12 to 2.5.11<br>

> May 19 13:15:40 [6293] olegdbx39-vm000003       lrmd:    debug:<br>

> process_lrmd_message:    Processed lrmd_rsc_info operation from<br>

> d2c8a871-410a-4006-be52-<wbr>ee684c0a5f38: rc=0, reply=0, notify=0<br>

><br>

><br>

> == and much later it tries to stop the resource on node3, even though it<br>

> has never been started, nor probed here ==<br>

<br>

</span>At any given time, one node in the cluster has been elected DC. The DC<br>

schedules all actions. So, the logs on the DC (not this node) will have<br>

the explanation for why there was no probe, and why there was a stop.<br>

<span class=""><br>

><br>

> May 19 13:15:45 [6292] olegdbx39-vm000003 stonith-ng:    debug:<br>

> xml_patch_version_check:    Can apply patch 2.5.59 to 2.5.58<br>

> May 19 13:15:45 [6293] olegdbx39-vm000003       lrmd:    debug:<br>

> process_lrmd_message:    Processed lrmd_rsc_info operation from<br>

> 069e399c-6c10-47cf-b55f-<wbr>de46985562b3: rc=0, reply=0, notify=0<br>

> May 19 13:15:45 [6293] olegdbx39-vm000003       lrmd:    debug:<br>

> process_lrmd_message:    Processed lrmd_rsc_exec operation from<br>

> 069e399c-6c10-47cf-b55f-<wbr>de46985562b3: rc=77, reply=1, notify=0<br>

> May 19 13:15:45 [6293] olegdbx39-vm000003       lrmd:     info:<br>

> log_execute:    executing - rsc:dbx_head_head action:stop call_id:77<br>

</span>> <a href="http://head.ocf.sh" rel="noreferrer" target="_blank">head.ocf.sh</a> <<a href="http://head.ocf.sh" rel="noreferrer" target="_blank">http://head.ocf.sh</a>>(dbx_head_<wbr>head)[7202]:<br>

<span class="">> 2017/05/19_13:15:45 INFO: head_stop: Stopping resource<br>

</span>> <a href="http://head.ocf.sh" rel="noreferrer" target="_blank">head.ocf.sh</a> <<a href="http://head.ocf.sh" rel="noreferrer" target="_blank">http://head.ocf.sh</a>>(dbx_head_<wbr>head)[7202]:<br>

<span class="">> 2017/05/19_13:15:47 DEBUG: head_monitor: return 7<br>

</span>> <a href="http://head.ocf.sh" rel="noreferrer" target="_blank">head.ocf.sh</a> <<a href="http://head.ocf.sh" rel="noreferrer" target="_blank">http://head.ocf.sh</a>>(dbx_head_<wbr>head)[7202]:<br>

<span class="">> 2017/05/19_13:15:47 INFO: Resource is already stopped<br>

> May 19 13:15:47 [6293] olegdbx39-vm000003       lrmd:    debug:<br>

> operation_finished:    dbx_head_head_stop_0:7202 - exited with rc=0<br>

> May 19 13:15:47 [6293] olegdbx39-vm000003       lrmd:    debug:<br>

> operation_finished:    dbx_head_head_stop_0:7202:<wbr>stderr [ -- empty -- ]<br>

> May 19 13:15:47 [6293] olegdbx39-vm000003       lrmd:    debug:<br>

> operation_finished:    dbx_head_head_stop_0:7202:<wbr>stdout [ -- empty -- ]<br>

> May 19 13:15:47 [6293] olegdbx39-vm000003       lrmd:     info:<br>

> log_finished:    finished - rsc:dbx_head_head action:stop call_id:77<br>

> pid:7202 exit-code:0 exec-time:1359ms queue-time:0ms<br>

> May 19 13:15:47 [6292] olegdbx39-vm000003 stonith-ng:    debug:<br>

> xml_patch_version_check:    Can apply patch 2.5.60 to 2.5.59<br>

><br>

><br>

><br>

><br>

> node1)<br>

> == reports an error on node3, even though the resource has never been<br>

> even configured, probed nor started here ==<br>

><br>

> The error is reported on node1 that it happened on node3 (acquired from<br>

> 'crm_mon'):<br>

><br>

> Failed Actions:<br>

> * dbx_head_head_monitor_0 on olegdbx39-vm000003 'not configured' (6):<br>

> call=999999999, status=Error, exitreason='none',<br>

>     last-rc-change='Fri May 19 13:15:40 2017', queued=0ms, exec=0ms<br>

<br>

</span>The combination of 'not configured' and call=999999999 has only one<br>

possible source: the crmd could not register the resource with the lrmd.<br>

<br>

Usually this means the resource definition in the CIB is inherently<br>

invalid. However, it can also be reported when there is any kind of<br>

error in the communication between the two daemons: the connection<br>

dropped, the machine was out of memory, the communication timed out<br>

(e.g. due to high load), etc.<br>

<br>

The "Added 'dbx_head_head' to the rsc list" message means that the lrmd<br>

did receive and complete the register request, so the problem occurred<br>

sometime between there and crmd getting the result back.<br>

<br>

This explains why you don't see a probe being executed. The crmd<br>

couldn't register the resource with the lrmd, so it immediately failed<br>

the probe on its side.<br>

<br>

The root cause is a mystery at this point, but hopefully that gives you<br>

some hints as to what to look for. Check the logs on the DC, and check<br>

the host's system logs for any signs of trouble at that time.<br>

<div class="HOEnZb"><div class="h5"><br>

> == Node1 logs "not configured" even though all return codes logged<br>

> earlier are OK, below are logs for thread, which reports this error: ==<br>

><br>

> May 19 13:15:42 [8114] olegdbx39-vm-0    crm_mon:     info:<br>

> crm_xml_cleanup:    Cleaning up memory from libxml2<br>

> May 19 13:15:42 [8114] olegdbx39-vm-0    crm_mon:     info: main:<br>

> Starting crm_mon<br>

> May 19 13:15:42 [8114] olegdbx39-vm-0    crm_mon:    debug:<br>

> qb_rb_open_2:    shm size:524301; real_size:528384; rb->word_size:132096<br>

> May 19 13:15:42 [8114] olegdbx39-vm-0    crm_mon:    debug:<br>

> qb_rb_open_2:    shm size:524301; real_size:528384; rb->word_size:132096<br>

> May 19 13:15:42 [8114] olegdbx39-vm-0    crm_mon:    debug:<br>

> qb_rb_open_2:    shm size:524301; real_size:528384; rb->word_size:132096<br>

> May 19 13:15:42 [8114] olegdbx39-vm-0    crm_mon:    debug:<br>

> cib_native_signon_raw:    Connection to CIB successful<br>

> May 19 13:15:42 [8114] olegdbx39-vm-0    crm_mon:    debug:<br>

> unpack_config:    STONITH timeout: 60000<br>

> May 19 13:15:42 [8114] olegdbx39-vm-0    crm_mon:    debug:<br>

> unpack_config:    STONITH of failed nodes is disabled<br>

> May 19 13:15:42 [8114] olegdbx39-vm-0    crm_mon:    debug:<br>

> unpack_config:    Concurrent fencing is disabled<br>

> May 19 13:15:42 [8114] olegdbx39-vm-0    crm_mon:    debug:<br>

> unpack_config:    Stop all active resources: false<br>

> May 19 13:15:42 [8114] olegdbx39-vm-0    crm_mon:    debug:<br>

> unpack_config:    Cluster is symmetric - resources can run anywhere by<br>

> default<br>

> May 19 13:15:42 [8114] olegdbx39-vm-0    crm_mon:    debug:<br>

> unpack_config:    Default stickiness: 0<br>

> May 19 13:15:42 [8114] olegdbx39-vm-0    crm_mon:   notice:<br>

> unpack_config:    On loss of CCM Quorum: Ignore<br>

> May 19 13:15:42 [8114] olegdbx39-vm-0    crm_mon:    debug:<br>

> unpack_config:    Node scores: 'red' = -INFINITY, 'yellow' = 0, 'green' = 0<br>

> May 19 13:15:42 [8114] olegdbx39-vm-0    crm_mon:     info:<br>

> determine_online_status:    Node olegdbx39-vm000003 is online<br>

> May 19 13:15:42 [8114] olegdbx39-vm-0    crm_mon:     info:<br>

> determine_online_status:    Node olegdbx39-vm-0 is online<br>

> May 19 13:15:42 [8114] olegdbx39-vm-0    crm_mon:     info:<br>

> determine_online_status:    Node olegdbx39-vm000002 is online<br>

> May 19 13:15:42 [8114] olegdbx39-vm-0    crm_mon:    debug:<br>

> determine_op_status:    dbx_head_head_monitor_0 on olegdbx39-vm000003<br>

> returned 'not configured' (6) instead of the expected value: 'not<br>

> running' (7)<br>

> May 19 13:15:42 [8114] olegdbx39-vm-0    crm_mon:  warning:<br>

> unpack_rsc_op_failure:    Processing failed op monitor for dbx_head_head<br>

> on olegdbx39-vm000003: not configured (6)<br>

> May 19 13:15:42 [8114] olegdbx39-vm-0    crm_mon:    error:<br>

> unpack_rsc_op:    Preventing dbx_head_head from re-starting anywhere:<br>

> operation monitor failed 'not configured' (6)<br>

> May 19 13:15:42 [8114] olegdbx39-vm-0    crm_mon:    debug:<br>

> determine_op_status:    dbx_head_head_monitor_0 on olegdbx39-vm000003<br>

> returned 'not configured' (6) instead of the expected value: 'not<br>

> running' (7)<br>

> May 19 13:15:42 [8114] olegdbx39-vm-0    crm_mon:  warning:<br>

> unpack_rsc_op_failure:    Processing failed op monitor for dbx_head_head<br>

> on olegdbx39-vm000003: not configured (6)<br>

> May 19 13:15:42 [8114] olegdbx39-vm-0    crm_mon:    error:<br>

> unpack_rsc_op:    Preventing dbx_head_head from re-starting anywhere:<br>

> operation monitor failed 'not configured' (6)<br>

> May 19 13:15:42 [8114] olegdbx39-vm-0    crm_mon:    debug:<br>

> find_anonymous_clone:    Internally renamed dbx_bind_mounts_nodes on<br>

> olegdbx39-vm000003 to dbx_bind_mounts_nodes:0<br>

> May 19 13:15:42 [8114] olegdbx39-vm-0    crm_mon:    debug:<br>

> determine_op_status:    dbx_bind_mounts_nodes_monitor_<wbr>0 on<br>

> olegdbx39-vm000003 returned 'unknown' (189) instead of the expected<br>

> value: 'not running' (7)<br>

> May 19 13:15:42 [8114] olegdbx39-vm-0    crm_mon:  warning:<br>

> unpack_rsc_op_failure:    Processing failed op monitor for<br>

> dbx_bind_mounts_nodes:0 on olegdbx39-vm000003: unknown (189)<br>

> May 19 13:15:42 [8114] olegdbx39-vm-0    crm_mon:    debug:<br>

> determine_op_status:    dbx_bind_mounts_nodes_monitor_<wbr>0 on<br>

> olegdbx39-vm000003 returned 'unknown' (189) instead of the expected<br>

> value: 'not running' (7)<br>

> May 19 13:15:42 [8114] olegdbx39-vm-0    crm_mon:  warning:<br>

> unpack_rsc_op_failure:    Processing failed op monitor for<br>

> dbx_bind_mounts_nodes:0 on olegdbx39-vm000003: unknown (189)<br>

> May 19 13:15:42 [8114] olegdbx39-vm-0    crm_mon:    debug:<br>

> find_anonymous_clone:    Internally renamed dbx_swap_nodes on<br>

> olegdbx39-vm000003 to dbx_swap_nodes:0<br>

> May 19 13:15:42 [8114] olegdbx39-vm-0    crm_mon:    debug:<br>

> determine_op_status:    dbx_swap_nodes_monitor_0 on olegdbx39-vm000003<br>

> returned 'unknown' (189) instead of the expected value: 'not running' (7)<br>

> May 19 13:15:42 [8114] olegdbx39-vm-0    crm_mon:  warning:<br>

> unpack_rsc_op_failure:    Processing failed op monitor for<br>

> dbx_swap_nodes:0 on olegdbx39-vm000003: unknown (189)<br>

> May 19 13:15:42 [8114] olegdbx39-vm-0    crm_mon:    debug:<br>

> find_anonymous_clone:    Internally renamed dbx_nfs_nodes on<br>

> olegdbx39-vm000003 to dbx_nfs_nodes:0<br>

> May 19 13:15:42 [8114] olegdbx39-vm-0    crm_mon:    debug:<br>

> determine_op_status:    dbx_nfs_nodes_monitor_0 on olegdbx39-vm000003<br>

> returned 'unknown' (189) instead of the expected value: 'not running' (7)<br>

> May 19 13:15:42 [8114] olegdbx39-vm-0    crm_mon:  warning:<br>

> unpack_rsc_op_failure:    Processing failed op monitor for<br>

> dbx_nfs_nodes:0 on olegdbx39-vm000003: unknown (189)<br>

> May 19 13:15:42 [8114] olegdbx39-vm-0    crm_mon:    debug:<br>

> find_anonymous_clone:    Internally renamed dbx_first_datas on<br>

> olegdbx39-vm000003 to dbx_first_datas:0<br>

> May 19 13:15:42 [8114] olegdbx39-vm-0    crm_mon:    debug:<br>

> determine_op_status:    dbx_first_datas_monitor_0 on olegdbx39-vm000003<br>

> returned 'unknown' (189) instead of the expected value: 'not running' (7)<br>

> May 19 13:15:42 [8114] olegdbx39-vm-0    crm_mon:  warning:<br>

> unpack_rsc_op_failure:    Processing failed op monitor for<br>

> dbx_first_datas:0 on olegdbx39-vm000003: unknown (189)<br>

> May 19 13:15:42 [8114] olegdbx39-vm-0    crm_mon:    debug:<br>

> determine_op_status:    dbx_first_datas_monitor_0 on olegdbx39-vm000003<br>

> returned 'unknown' (189) instead of the expected value: 'not running' (7)<br>

> May 19 13:15:42 [8114] olegdbx39-vm-0    crm_mon:  warning:<br>

> unpack_rsc_op_failure:    Processing failed op monitor for<br>

> dbx_first_datas:0 on olegdbx39-vm000003: unknown (189)<br>

> May 19 13:15:42 [8114] olegdbx39-vm-0    crm_mon:    debug:<br>

> determine_op_status:    dbx_first_head_monitor_0 on olegdbx39-vm000003<br>

> returned 'unknown' (189) instead of the expected value: 'not running' (7)<br>

> May 19 13:15:42 [8114] olegdbx39-vm-0    crm_mon:  warning:<br>

> unpack_rsc_op_failure:    Processing failed op monitor for<br>

> dbx_first_head on olegdbx39-vm000003: unknown (189)<br>

> May 19 13:15:42 [6872] olegdbx39-vm-0 stonith-ng:    debug:<br>

> xml_patch_version_check:    Can apply patch 2.5.47 to 2.5.46<br>

> May 19 13:15:42 [8114] olegdbx39-vm-0    crm_mon:    debug:<br>

> determine_op_status:    dbx_first_head_monitor_0 on olegdbx39-vm000003<br>

> returned 'unknown' (189) instead of the expected value: 'not running' (7)<br>

> May 19 13:15:42 [8114] olegdbx39-vm-0    crm_mon:  warning:<br>

> unpack_rsc_op_failure:    Processing failed op monitor for<br>

> dbx_first_head on olegdbx39-vm000003: unknown (189)<br>

> May 19 13:15:42 [8114] olegdbx39-vm-0    crm_mon:    debug:<br>

> find_anonymous_clone:    Internally renamed dbx_mounts_nodes on<br>

> olegdbx39-vm000003 to dbx_mounts_nodes:0<br>

> May 19 13:15:42 [8114] olegdbx39-vm-0    crm_mon:    debug:<br>

> determine_op_status:    dbx_mounts_nodes_monitor_0 on olegdbx39-vm000003<br>

> returned 'unknown' (189) instead of the expected value: 'not running' (7)<br>

> May 19 13:15:42 [8114] olegdbx39-vm-0    crm_mon:  warning:<br>

> unpack_rsc_op_failure:    Processing failed op monitor for<br>

> dbx_mounts_nodes:0 on olegdbx39-vm000003: unknown (189)<br>

> May 19 13:15:42 [8114] olegdbx39-vm-0    crm_mon:    debug:<br>

> determine_op_status:    dbx_mounts_nodes_monitor_0 on olegdbx39-vm000003<br>

> returned 'unknown' (189) instead of the expected value: 'not running' (7)<br>

> May 19 13:15:42 [8114] olegdbx39-vm-0    crm_mon:  warning:<br>

> unpack_rsc_op_failure:    Processing failed op monitor for<br>

> dbx_mounts_nodes:0 on olegdbx39-vm000003: unknown (189)<br>

> May 19 13:15:42 [8114] olegdbx39-vm-0    crm_mon:    debug:<br>

> find_anonymous_clone:    Internally renamed dbx_nfs_mounts_datas on<br>

> olegdbx39-vm000003 to dbx_nfs_mounts_datas:0<br>

> May 19 13:15:42 [8114] olegdbx39-vm-0    crm_mon:    debug:<br>

> determine_op_status:    dbx_nfs_mounts_datas_monitor_0 on<br>

> olegdbx39-vm000003 returned 'unknown' (189) instead of the expected<br>

> value: 'not running' (7)<br>

> May 19 13:15:42 [8114] olegdbx39-vm-0    crm_mon:  warning:<br>

> unpack_rsc_op_failure:    Processing failed op monitor for<br>

> dbx_nfs_mounts_datas:0 on olegdbx39-vm000003: unknown (189)<br>

> May 19 13:15:42 [8114] olegdbx39-vm-0    crm_mon:    debug:<br>

> find_anonymous_clone:    Internally renamed dbx_ready_primary on<br>

> olegdbx39-vm000003 to dbx_ready_primary:0<br>

> May 19 13:15:42 [8114] olegdbx39-vm-0    crm_mon:    debug:<br>

> find_anonymous_clone:    Internally renamed dbx_first_datas on<br>

> olegdbx39-vm-0 to dbx_first_datas:1<br>

> May 19 13:15:42 [8114] olegdbx39-vm-0    crm_mon:    debug:<br>

> find_anonymous_clone:    Internally renamed dbx_swap_nodes on<br>

> olegdbx39-vm-0 to dbx_swap_nodes:0<br>

> May 19 13:15:42 [8114] olegdbx39-vm-0    crm_mon:    debug:<br>

> find_anonymous_clone:    Internally renamed dbx_mounts_nodes on<br>

> olegdbx39-vm-0 to dbx_mounts_nodes:1<br>

> May 19 13:15:42 [8114] olegdbx39-vm-0    crm_mon:    debug:<br>

> find_anonymous_clone:    Internally renamed dbx_bind_mounts_nodes on<br>

> olegdbx39-vm-0 to dbx_bind_mounts_nodes:1<br>

> May 19 13:15:42 [8114] olegdbx39-vm-0    crm_mon:    debug:<br>

> find_anonymous_clone:    Internally renamed dbx_nfs_mounts_datas on<br>

> olegdbx39-vm-0 to dbx_nfs_mounts_datas:0<br>

> May 19 13:15:42 [8114] olegdbx39-vm-0    crm_mon:    debug:<br>

> find_anonymous_clone:    Internally renamed dbx_nfs_nodes on<br>

> olegdbx39-vm-0 to dbx_nfs_nodes:0<br>

> May 19 13:15:42 [8114] olegdbx39-vm-0    crm_mon:    debug:<br>

> find_anonymous_clone:    Internally renamed dbx_ready_primary on<br>

> olegdbx39-vm-0 to dbx_ready_primary:0<br>

> May 19 13:15:42 [8114] olegdbx39-vm-0    crm_mon:    debug:<br>

> find_anonymous_clone:    Internally renamed dbx_first_datas on<br>

> olegdbx39-vm000002 to dbx_first_datas:1<br>

> May 19 13:15:42 [8114] olegdbx39-vm-0    crm_mon:    debug:<br>

> find_anonymous_clone:    Internally renamed dbx_swap_nodes on<br>

> olegdbx39-vm000002 to dbx_swap_nodes:0<br>

> May 19 13:15:42 [8114] olegdbx39-vm-0    crm_mon:    debug:<br>

> find_anonymous_clone:    Internally renamed dbx_nfs_mounts_datas on<br>

> olegdbx39-vm000002 to dbx_nfs_mounts_datas:0<br>

> May 19 13:15:42 [8114] olegdbx39-vm-0    crm_mon:    debug:<br>

> find_anonymous_clone:    Internally renamed dbx_mounts_nodes on<br>

> olegdbx39-vm000002 to dbx_mounts_nodes:1<br>

> May 19 13:15:42 [8114] olegdbx39-vm-0    crm_mon:    debug:<br>

> find_anonymous_clone:    Internally renamed dbx_bind_mounts_nodes on<br>

> olegdbx39-vm000002 to dbx_bind_mounts_nodes:1<br>

> May 19 13:15:42 [8114] olegdbx39-vm-0    crm_mon:    debug:<br>

> find_anonymous_clone:    Internally renamed dbx_nfs_nodes on<br>

> olegdbx39-vm000002 to dbx_nfs_nodes:0<br>

> May 19 13:15:42 [8114] olegdbx39-vm-0    crm_mon:    debug:<br>

> find_anonymous_clone:    Internally renamed dbx_ready_primary on<br>

> olegdbx39-vm000002 to dbx_ready_primary:0<br>

> [...]<br>

><br>

><br>

> Thanks in advance,<br>

><br>

><br>

> On Thu, May 18, 2017 at 4:37 PM, Ken Gaillot <<a href="mailto:kgaillot@redhat.com">kgaillot@redhat.com</a><br>

</div></div><div class="HOEnZb"><div class="h5">> <mailto:<a href="mailto:kgaillot@redhat.com">kgaillot@redhat.com</a>>> wrote:<br>

><br>

>     On 05/17/2017 06:10 PM, Radoslaw Garbacz wrote:<br>

>     > Hi,<br>

>     ><br>

>     > I have a question regarding '<lrm_resource><lrm_rsc_op> 'op-status<br>

>     > attribute getting value 4.<br>

>     ><br>

>     > In my case I have a strange behavior, when resources get those "monitor"<br>

>     > operation entries in the CIB with op-status=4, and they do not seem to<br>

>     > be called (exec-time=0).<br>

>     ><br>

>     > What does 'op-status' = 4 mean?<br>

><br>

>     The action had an error status<br>

><br>

>     ><br>

>     > I would appreciate some elaboration regarding this, since this is<br>

>     > interpreted by pacemaker as an error, which causes logs:<br>

>     > crm_mon:    error: unpack_rsc_op:    Preventing dbx_head_head from<br>

>     > re-starting anywhere: operation monitor failed 'not configured' (6)<br>

><br>

>     The rc-code="6" is the more interesting number; it's the result returned<br>

>     by the resource agent. As you can see above, it means "not configured".<br>

>     What that means exactly is up to the resource agent's interpretation.<br>

><br>

>     > and I am pretty sure the resource agent was not called (no logs,<br>

>     > exec-time=0)<br>

><br>

>     Normally this could only come from the resource agent.<br>

><br>

>     However there are two cases where pacemaker generates this error itself:<br>

>     if the resource definition in the CIB is invalid; and if your version of<br>

>     pacemaker was compiled with support for reading sensitive parameter<br>

>     values from a file, but that file could not be read.<br>

><br>

>     It doesn't sound like your case is either one of those though, since<br>

>     they would prevent the resource from even starting. Most likely it's<br>

>     coming from the resource agent. I'd look at the resource agent source<br>

>     code and see where it can return OCF_ERR_CONFIGURED.<br>

><br>

>     > There are two aspects of this:<br>

>     ><br>

>     > 1) harmless (pacemaker seems to not bother about it), which I guess<br>

>     > indicates cancelled monitoring operations:<br>

>     > op-status=4, rc-code=189<br>

><br>

>     This error means the connection between the crmd and lrmd daemons was<br>

>     lost -- most commonly, that shows up for operations that were pending at<br>

>     shutdown.<br>

><br>

>     ><br>

>     > * Example:<br>

>     > <lrm_rsc_op id="dbx_first_datas_last_<wbr>failure_0"<br>

>     > operation_key="dbx_first_<wbr>datas_monitor_0" operation="monitor"<br>

>     > crm-debug-origin="do_update_<wbr>resource" crm_feature_set="3.0.12"<br>

>     > transition-key="38:0:7:<wbr>c8b63d9d-9c70-4f99-aa1b-<wbr>e993de6e4739"<br>

>     > transition-magic="4:189;38:0:<wbr>7:c8b63d9d-9c70-4f99-aa1b-<wbr>e993de6e4739"<br>

>     > on_node="olegdbx61-vm000001" call-id="10" rc-code="189" op-status="4"<br>

>     > interval="0" last-run="1495057378" last-rc-change="1495057378"<br>

>     > exec-time="0" queue-time="0"<br>

>     op-digest="<wbr>f6bd1386a336e8e6ee25ecb651a9ef<wbr>b6"/><br>

>     ><br>

>     ><br>

>     > 2) error level one (op-status=4, rc-code=6), which generates logs:<br>

>     > crm_mon:    error: unpack_rsc_op:    Preventing dbx_head_head from<br>

>     > re-starting anywhere: operation monitor failed 'not configured' (6)<br>

>     ><br>

>     > * Example:<br>

>     > <lrm_rsc_op id="dbx_head_head_last_<wbr>failure_0"<br>

>     > operation_key="dbx_head_head_<wbr>monitor_0" operation="monitor"<br>

>     > crm-debug-origin="do_update_<wbr>resource" crm_feature_set="3.0.12"<br>

>     > transition-key="39:0:7:<wbr>c8b63d9d-9c70-4f99-aa1b-<wbr>e993de6e4739"<br>

>     > transition-magic="4:6;39:0:7:<wbr>c8b63d9d-9c70-4f99-aa1b-<wbr>e993de6e4739"<br>

>     > on_node="olegdbx61-vm000001" call-id="999999999" rc-code="6"<br>

>     > op-status="4" interval="0" last-run="1495057389"<br>

>     > last-rc-change="1495057389" exec-time="0" queue-time="0"<br>

>     > op-digest="<wbr>60cdc9db1c5b77e8dba698d3d0c8cd<wbr>a8"/><br>

>     ><br>

>     ><br>

>     > Could it be some hardware (VM hyperviser) issue?<br>

>     ><br>

>     ><br>

>     > Thanks in advance,<br>

>     ><br>

>     > --<br>

>     > Best Regards,<br>

>     ><br>

>     > Radoslaw Garbacz<br>

>     > XtremeData Incorporated<br>

</div></div></blockquote></div><br><br clear="all"><br>-- <br><div class="gmail_signature" data-smartmail="gmail_signature"><div dir="ltr"><div><div dir="ltr"><div>Best Regards,<br><br>Radoslaw Garbacz<br></div>XtremeData Incorporated<br></div></div></div></div>

</div>