<div dir="ltr"><div><div>Hi Andrew,<br><br></div>Here is the output of the verbose crm_failcount.<br><br> trace: set_crm_log_level: New log level: 8<br> trace: cib_native_signon_raw: Connecting cib_rw channel<br> trace: pick_ipc_buffer: Using max message size of 524288<br> debug: qb_rb_open_2: shm size:524301; real_size:528384; rb->word_size:132096<br> debug: qb_rb_open_2: shm size:524301; real_size:528384; rb->word_size:132096<br> debug: qb_rb_open_2: shm size:524301; real_size:528384; rb->word_size:132096<br> trace: mainloop_add_fd: Added connection 1 for cib_rw[0x1fd79c0].4<br> trace: pick_ipc_buffer: Using max message size of 51200<br> trace: crm_ipc_send: Sending from client: cib_rw request id: 1 bytes: 131 timeout:-1 msg...<br> trace: crm_ipc_send: Recieved response 1, size=140, rc=140, text: <cib_common_callback_worker cib_op="register" cib_clientid="f8cfae2d-51e6-4cd7-97f8-2d6f49bf1f17"/><br> trace: cib_native_signon_raw: reg-reply <cib_common_callback_worker cib_op="register" cib_clientid="f8cfae2d-51e6-4cd7-97f8-2d6f49bf1f17"/><br> debug: cib_native_signon_raw: Connection to CIB successful<br> trace: cib_create_op: Sending call options: 00001100, 4352<br> trace: cib_native_perform_op_delegate: Sending cib_query message to CIB service (timeout=120s)<br> trace: crm_ipc_send: Sending from client: cib_rw request id: 2 bytes: 211 timeout:120000 msg...<br> trace: internal_ipc_get_reply: client cib_rw waiting on reply to msg id 2<br> trace: crm_ipc_send: Recieved response 2, size=944, rc=944, text: <cib-reply t="cib" cib_op="cib_query" cib_callid="2" cib_clientid="f8cfae2d-51e6-4cd7-97f8-2d6f49bf1f17" cib_callopt="4352" cib_rc="0"><cib_calldata><nodes><node uname="<a href="http://node2.domain.com">node2.domain.com</a>" id="o<br> trace: cib_native_perform_op_delegate: Reply <cib-reply t="cib" cib_op="cib_query" cib_callid="2" cib_clientid="f8cfae2d-51e6-4cd7-97f8-2d6f49bf1f17" cib_callopt="4352" cib_rc="0"><br> trace: cib_native_perform_op_delegate: Reply <cib_calldata><br> trace: cib_native_perform_op_delegate: Reply <nodes><br> trace: cib_native_perform_op_delegate: Reply <node uname="<a href="http://node2.domain.com">node2.domain.com</a>" id="<a href="http://node2.domain.com">node2.domain.com</a>"><br> trace: cib_native_perform_op_delegate: Reply <instance_attributes id="<a href="http://nodes-node2.domain.com">nodes-node2.domain.com</a>"><br> trace: cib_native_perform_op_delegate: Reply <nvpair id="nodes-node2.domain.com-postgres_msg-data-status" name="postgres_msg-data-status" value="STREAMING|SYNC"/><br> trace: cib_native_perform_op_delegate: Reply <nvpair id="nodes-node2.domain.com-standby" name="standby" value="off"/><br> trace: cib_native_perform_op_delegate: Reply </instance_attributes><br> trace: cib_native_perform_op_delegate: Reply </node><br> trace: cib_native_perform_op_delegate: Reply <node uname="<a href="http://node1.domain.com">node1.domain.com</a>" id="<a href="http://node1.domain.com">node1.domain.com</a>"><br> trace: cib_native_perform_op_delegate: Reply <instance_attributes id="<a href="http://nodes-node1.domain.com">nodes-node1.domain.com</a>"><br> trace: cib_native_perform_op_delegate: Reply <nvpair id="nodes-node1.domain.com-postgres_msg-data-status" name="postgres_msg-data-status" value="LATEST"/><br> trace: cib_native_perform_op_delegate: Reply <nvpair id="nodes-node1.domain.com-standby" name="standby" value="off"/><br> trace: cib_native_perform_op_delegate: Reply </instance_attributes><br> trace: cib_native_perform_op_delegate: Reply </node><br> trace: 
cib_native_perform_op_delegate: Reply </nodes><br> trace: cib_native_perform_op_delegate: Reply </cib_calldata><br> trace: cib_native_perform_op_delegate: Reply </cib-reply><br> trace: cib_native_perform_op_delegate: Syncronous reply 2 received<br> debug: get_cluster_node_uuid: Result section <nodes><br> debug: get_cluster_node_uuid: Result section <node uname="<a href="http://node2.domain.com">node2.domain.com</a>" id="<a href="http://node2.domain.com">node2.domain.com</a>"><br> debug: get_cluster_node_uuid: Result section <instance_attributes id="<a href="http://nodes-node2.domain.com">nodes-node2.domain.com</a>"><br> debug: get_cluster_node_uuid: Result section <nvpair id="nodes-node2.domain.com-postgres_msg-data-status" name="postgres_msg-data-status" value="STREAMING|SYNC"/><br> debug: get_cluster_node_uuid: Result section <nvpair id="nodes-node2.domain.com-standby" name="standby" value="off"/><br> debug: get_cluster_node_uuid: Result section </instance_attributes><br> debug: get_cluster_node_uuid: Result section </node><br> debug: get_cluster_node_uuid: Result section <node uname="<a href="http://node1.domain.com">node1.domain.com</a>" id="<a href="http://node1.domain.com">node1.domain.com</a>"><br> debug: get_cluster_node_uuid: Result section <instance_attributes id="<a href="http://nodes-node1.domain.com">nodes-node1.domain.com</a>"><br> debug: get_cluster_node_uuid: Result section <nvpair id="nodes-node1.domain.com-postgres_msg-data-status" name="postgres_msg-data-status" value="LATEST"/><br> debug: get_cluster_node_uuid: Result section <nvpair id="nodes-node1.domain.com-standby" name="standby" value="off"/><br> debug: get_cluster_node_uuid: Result section </instance_attributes><br> debug: get_cluster_node_uuid: Result section </node><br> debug: get_cluster_node_uuid: Result section </nodes><br> info: query_node_uuid: Mapped <a href="http://node1.domain.com">node1.domain.com</a> to <a href="http://node1.domain.com">node1.domain.com</a><br> trace: pick_ipc_buffer: Using max message size of 51200<br> info: attrd_update_delegate: Connecting to cluster... 
5 retries remaining<br> debug: qb_rb_open_2: shm size:51213; real_size:53248; rb->word_size:13312<br> debug: qb_rb_open_2: shm size:51213; real_size:53248; rb->word_size:13312<br> debug: qb_rb_open_2: shm size:51213; real_size:53248; rb->word_size:13312<br> trace: crm_ipc_send: Sending from client: attrd request id: 3 bytes: 168 timeout:5000 msg...<br> trace: internal_ipc_get_reply: client attrd waiting on reply to msg id 3<br> trace: crm_ipc_send: Recieved response 3, size=88, rc=88, text: <ack function="attrd_ipc_dispatch" line="129"/><br> debug: attrd_update_delegate: Sent update: (null)=(null) for <a href="http://node1.domain.com">node1.domain.com</a><br> info: main: Update (null)=<none> sent via attrd<br> debug: cib_native_signoff: Signing out of the CIB Service<br> trace: mainloop_del_fd: Removing client cib_rw[0x1fd79c0]<br> trace: mainloop_gio_destroy: Destroying client cib_rw[0x1fd79c0]<br> trace: crm_ipc_close: Disconnecting cib_rw IPC connection 0x1fdb020 (0x1fdb1a0.(nil))<br> debug: qb_ipcc_disconnect: qb_ipcc_disconnect()<br> trace: qb_rb_close: ENTERING qb_rb_close()<br> debug: qb_rb_close: Closing ringbuffer: /dev/shm/qb-cib_rw-request-8347-9344-14-header<br> trace: qb_rb_close: ENTERING qb_rb_close()<br> debug: qb_rb_close: Closing ringbuffer: /dev/shm/qb-cib_rw-response-8347-9344-14-header<br> trace: qb_rb_close: ENTERING qb_rb_close()<br> debug: qb_rb_close: Closing ringbuffer: /dev/shm/qb-cib_rw-event-8347-9344-14-header<br> trace: cib_native_destroy: destroying 0x1fd7910<br> trace: crm_ipc_destroy: Destroying IPC connection to cib_rw: 0x1fdb020<br> trace: mainloop_gio_destroy: Destroyed client cib_rw[0x1fd79c0]<br> trace: crm_exit: cleaning up libxml<br> info: crm_xml_cleanup: Cleaning up memory from libxml2<br> trace: crm_exit: exit 0<br><br></div>I hope it helps.<br></div><div class="gmail_extra"><br><div class="gmail_quote">2015-05-20 6:34 GMT+02:00 Andrew Beekhof <span dir="ltr"><<a href="mailto:andrew@beekhof.net" target="_blank">andrew@beekhof.net</a>></span>:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class=""><br>
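
For reference, the stale values can also be read straight from the status section of the CIB, and a targeted cleanup would look roughly like this. This is only a sketch: the resource and node names are examples taken from the migration summary quoted below, and crm_resource option spellings vary a little between 1.1 releases.

    # Dump the transient attributes the cluster still has recorded
    cibadmin -Q -o status | grep -E 'fail-count|last-failure'

    # Ask pacemaker to forget one resource's failures on one node
    crm_resource --cleanup --resource documents_drbd --node host1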

2015-05-20 6:34 GMT+02:00 Andrew Beekhof <andrew@beekhof.net>:

> > On 4 May 2015, at 6:43 pm, Alexandre <alxgomz@gmail.com> wrote:
> >
> > Hi,
> >
> > I have a pacemaker / corosync / cman cluster running on Red Hat 6.6.
> > Although the cluster is working as expected, I have some traces of old failures (from several months ago) that I can't get rid of.
> > Basically I have set cluster-recheck-interval="300" and failure-timeout="600" (in rsc_defaults) as shown below:
> >
> > property $id="cib-bootstrap-options" \
> >     dc-version="1.1.10-14.el6-368c726" \
> >     cluster-infrastructure="cman" \
> >     expected-quorum-votes="2" \
> >     no-quorum-policy="ignore" \
> >     stonith-enabled="false" \
> >     last-lrm-refresh="1429702408" \
> >     maintenance-mode="false" \
> >     cluster-recheck-interval="300"
> > rsc_defaults $id="rsc-options" \
> >     failure-timeout="600"
> >
> > So I would expect old failures to have been purged from the CIB long ago, but this is what I actually see when issuing crm_mon -frA1:

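Just to spell out my expectation: with failure-timeout=600 a failure should become eligible for expiry 10 minutes after it last occurred, and cluster-recheck-interval=300 makes the policy engine re-run at least every 5 minutes even without cluster events, so I would expect a stale fail-count to be cleared at most roughly 15 minutes after the last failure. A quick sanity check of the value actually active in the CIB, as a sketch (option support may differ slightly in 1.1.10):

    crm_attribute -t crm_config -n cluster-recheck-interval -G

and the failure-timeout default shows up under the rsc-options section of crm configure show.
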
> I think automatic deletion didn't arrive until later.

> >
> > Migration summary:
> > * Node host1:
> >    etc_ml_drbd: migration-threshold=1000000 fail-count=244 last-failure='Sat Feb 14 17:04:05 2015'
> >    spool_postfix_drbd_msg: migration-threshold=1000000 fail-count=244 last-failure='Sat Feb 14 17:04:05 2015'
> >    lib_ml_drbd: migration-threshold=1000000 fail-count=244 last-failure='Sat Feb 14 17:04:05 2015'
> >    lib_imap_drbd: migration-threshold=1000000 fail-count=244 last-failure='Sat Feb 14 17:04:05 2015'
> >    spool_imap_drbd: migration-threshold=1000000 fail-count=11654 last-failure='Sat Feb 14 17:04:05 2015'
> >    spool_ml_drbd: migration-threshold=1000000 fail-count=244 last-failure='Sat Feb 14 17:04:05 2015'
> >    documents_drbd: migration-threshold=1000000 fail-count=248 last-failure='Sat Feb 14 17:58:55 2015'
> > * Node host2:
> >    documents_drbd: migration-threshold=1000000 fail-count=548 last-failure='Sat Feb 14 16:26:33 2015'
> >
> > I have tried crm_failcount -D on the resources and also tried a cleanup... but it's still there!
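
The sort of commands meant here, as a sketch only (the resource name is one from the summary above, and exact crm_failcount option spellings differ between releases):

    # run this on host1 so it targets the local node's attribute
    crm_failcount -D -r documents_drbd

    # full cleanup of the resource's failure history via crmsh
    crm resource cleanup documents_drbd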

> Oh? Can you re-run with -VVVVVV and show us the result?

> > How can I get rid of those records (so my monitoring tools stop complaining)?
> >
> > Regards.

> _______________________________________________
> Users mailing list: Users@clusterlabs.org
> http://clusterlabs.org/mailman/listinfo/users
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org