[ClusterLabs] Cannot clean history

Alexandre alxgomz at gmail.com
Mon Jun 8 20:02:48 UTC 2015


I still have these pending records, which break the monitoring tool I use
(check_crm.pl).

Any help much appreciated.
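
For reference, the cleanup attempts discussed below boil down to commands of
roughly this form (a sketch only; <resource> and <node> are placeholders, and
exact option names can differ slightly between Pacemaker releases):

    # wipe the resource's operation history and failures on one node
    crm_resource --cleanup --resource <resource> --node <node>

    # delete the fail count for the resource on that node
    crm_failcount --delete --resource <resource> --node <node>
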
On 26 May 2015 at 10:58, "Alexandre" <alxgomz at gmail.com> wrote:

> Hi Andrew,
>
> Here is the output of the verbose crm_failcount.
>
>    trace: set_crm_log_level:     New log level: 8
>    trace: cib_native_signon_raw:     Connecting cib_rw channel
>    trace: pick_ipc_buffer:     Using max message size of 524288
>    debug: qb_rb_open_2:     shm size:524301; real_size:528384;
> rb->word_size:132096
>    debug: qb_rb_open_2:     shm size:524301; real_size:528384;
> rb->word_size:132096
>    debug: qb_rb_open_2:     shm size:524301; real_size:528384;
> rb->word_size:132096
>    trace: mainloop_add_fd:     Added connection 1 for cib_rw[0x1fd79c0].4
>    trace: pick_ipc_buffer:     Using max message size of 51200
>    trace: crm_ipc_send:     Sending from client: cib_rw request id: 1
> bytes: 131 timeout:-1 msg...
>    trace: crm_ipc_send:     Recieved response 1, size=140, rc=140, text:
> <cib_common_callback_worker cib_op="register"
> cib_clientid="f8cfae2d-51e6-4cd7-97f8-2d6f49bf1f17"/>
>    trace: cib_native_signon_raw:     reg-reply
> <cib_common_callback_worker cib_op="register"
> cib_clientid="f8cfae2d-51e6-4cd7-97f8-2d6f49bf1f17"/>
>    debug: cib_native_signon_raw:     Connection to CIB successful
>    trace: cib_create_op:     Sending call options: 00001100, 4352
>    trace: cib_native_perform_op_delegate:     Sending cib_query message to
> CIB service (timeout=120s)
>    trace: crm_ipc_send:     Sending from client: cib_rw request id: 2
> bytes: 211 timeout:120000 msg...
>    trace: internal_ipc_get_reply:     client cib_rw waiting on reply to
> msg id 2
>    trace: crm_ipc_send:     Recieved response 2, size=944, rc=944, text:
> <cib-reply t="cib" cib_op="cib_query" cib_callid="2"
> cib_clientid="f8cfae2d-51e6-4cd7-97f8-2d6f49bf1f17" cib_callopt="4352"
> cib_rc="0"><cib_calldata><nodes><node uname="node2.domain.com" id="o
>    trace: cib_native_perform_op_delegate:     Reply   <cib-reply t="cib"
> cib_op="cib_query" cib_callid="2"
> cib_clientid="f8cfae2d-51e6-4cd7-97f8-2d6f49bf1f17" cib_callopt="4352"
> cib_rc="0">
>    trace: cib_native_perform_op_delegate:     Reply     <cib_calldata>
>    trace: cib_native_perform_op_delegate:     Reply       <nodes>
>    trace: cib_native_perform_op_delegate:     Reply         <node uname="
> node2.domain.com" id="node2.domain.com">
>    trace: cib_native_perform_op_delegate:     Reply
> <instance_attributes id="nodes-node2.domain.com">
>    trace: cib_native_perform_op_delegate:     Reply             <nvpair
> id="nodes-node2.domain.com-postgres_msg-data-status"
> name="postgres_msg-data-status" value="STREAMING|SYNC"/>
>    trace: cib_native_perform_op_delegate:     Reply             <nvpair
> id="nodes-node2.domain.com-standby" name="standby" value="off"/>
>    trace: cib_native_perform_op_delegate:     Reply
> </instance_attributes>
>    trace: cib_native_perform_op_delegate:     Reply         </node>
>    trace: cib_native_perform_op_delegate:     Reply         <node uname="
> node1.domain.com" id="node1.domain.com">
>    trace: cib_native_perform_op_delegate:     Reply
> <instance_attributes id="nodes-node1.domain.com">
>    trace: cib_native_perform_op_delegate:     Reply             <nvpair
> id="nodes-node1.domain.com-postgres_msg-data-status"
> name="postgres_msg-data-status" value="LATEST"/>
>    trace: cib_native_perform_op_delegate:     Reply             <nvpair
> id="nodes-node1.domain.com-standby" name="standby" value="off"/>
>    trace: cib_native_perform_op_delegate:     Reply
> </instance_attributes>
>    trace: cib_native_perform_op_delegate:     Reply         </node>
>    trace: cib_native_perform_op_delegate:     Reply       </nodes>
>    trace: cib_native_perform_op_delegate:     Reply     </cib_calldata>
>    trace: cib_native_perform_op_delegate:     Reply   </cib-reply>
>    trace: cib_native_perform_op_delegate:     Syncronous reply 2 received
>    debug: get_cluster_node_uuid:     Result section   <nodes>
>    debug: get_cluster_node_uuid:     Result section     <node uname="
> node2.domain.com" id="node2.domain.com">
>    debug: get_cluster_node_uuid:     Result section
> <instance_attributes id="nodes-node2.domain.com">
>    debug: get_cluster_node_uuid:     Result section         <nvpair
> id="nodes-node2.domain.com-postgres_msg-data-status"
> name="postgres_msg-data-status" value="STREAMING|SYNC"/>
>    debug: get_cluster_node_uuid:     Result section         <nvpair
> id="nodes-node2.domain.com-standby" name="standby" value="off"/>
>    debug: get_cluster_node_uuid:     Result section
> </instance_attributes>
>    debug: get_cluster_node_uuid:     Result section     </node>
>    debug: get_cluster_node_uuid:     Result section     <node uname="
> node1.domain.com" id="node1.domain.com">
>    debug: get_cluster_node_uuid:     Result section
> <instance_attributes id="nodes-node1.domain.com">
>    debug: get_cluster_node_uuid:     Result section         <nvpair
> id="nodes-node1.domain.com-postgres_msg-data-status"
> name="postgres_msg-data-status" value="LATEST"/>
>    debug: get_cluster_node_uuid:     Result section         <nvpair
> id="nodes-node1.domain.com-standby" name="standby" value="off"/>
>    debug: get_cluster_node_uuid:     Result section
> </instance_attributes>
>    debug: get_cluster_node_uuid:     Result section     </node>
>    debug: get_cluster_node_uuid:     Result section   </nodes>
>     info: query_node_uuid:     Mapped node1.domain.com to node1.domain.com
>    trace: pick_ipc_buffer:     Using max message size of 51200
>     info: attrd_update_delegate:     Connecting to cluster... 5 retries
> remaining
>    debug: qb_rb_open_2:     shm size:51213; real_size:53248;
> rb->word_size:13312
>    debug: qb_rb_open_2:     shm size:51213; real_size:53248;
> rb->word_size:13312
>    debug: qb_rb_open_2:     shm size:51213; real_size:53248;
> rb->word_size:13312
>    trace: crm_ipc_send:     Sending from client: attrd request id: 3
> bytes: 168 timeout:5000 msg...
>    trace: internal_ipc_get_reply:     client attrd waiting on reply to msg
> id 3
>    trace: crm_ipc_send:     Recieved response 3, size=88, rc=88, text:
> <ack function="attrd_ipc_dispatch" line="129"/>
>    debug: attrd_update_delegate:     Sent update: (null)=(null) for
> node1.domain.com
>     info: main:     Update (null)=<none> sent via attrd
>    debug: cib_native_signoff:     Signing out of the CIB Service
>    trace: mainloop_del_fd:     Removing client cib_rw[0x1fd79c0]
>    trace: mainloop_gio_destroy:     Destroying client cib_rw[0x1fd79c0]
>    trace: crm_ipc_close:     Disconnecting cib_rw IPC connection 0x1fdb020
> (0x1fdb1a0.(nil))
>    debug: qb_ipcc_disconnect:     qb_ipcc_disconnect()
>    trace: qb_rb_close:     ENTERING qb_rb_close()
>    debug: qb_rb_close:     Closing ringbuffer:
> /dev/shm/qb-cib_rw-request-8347-9344-14-header
>    trace: qb_rb_close:     ENTERING qb_rb_close()
>    debug: qb_rb_close:     Closing ringbuffer:
> /dev/shm/qb-cib_rw-response-8347-9344-14-header
>    trace: qb_rb_close:     ENTERING qb_rb_close()
>    debug: qb_rb_close:     Closing ringbuffer:
> /dev/shm/qb-cib_rw-event-8347-9344-14-header
>    trace: cib_native_destroy:     destroying 0x1fd7910
>    trace: crm_ipc_destroy:     Destroying IPC connection to cib_rw:
> 0x1fdb020
>    trace: mainloop_gio_destroy:     Destroyed client cib_rw[0x1fd79c0]
>    trace: crm_exit:     cleaning up libxml
>     info: crm_xml_cleanup:     Cleaning up memory from libxml2
>    trace: crm_exit:     exit 0
>
> I hope it helps.
>
> 2015-05-20 6:34 GMT+02:00 Andrew Beekhof <andrew at beekhof.net>:
>
>>
>> > On 4 May 2015, at 6:43 pm, Alexandre <alxgomz at gmail.com> wrote:
>> >
>> > Hi,
>> >
>> > I have a pacemaker / corosync / cman cluster running on Red Hat 6.6.
>> > Although the cluster is working as expected, I still have traces of old
>> > failures (from several months ago) that I can't get rid of.
>> > Basically I have set cluster-recheck-interval="300" and
>> > failure-timeout="600" (in rsc_defaults), as shown below:
>> >
>> > property $id="cib-bootstrap-options" \
>> >     dc-version="1.1.10-14.el6-368c726" \
>> >     cluster-infrastructure="cman" \
>> >     expected-quorum-votes="2" \
>> >     no-quorum-policy="ignore" \
>> >     stonith-enabled="false" \
>> >     last-lrm-refresh="1429702408" \
>> >     maintenance-mode="false" \
>> >     cluster-recheck-interval="300"
>> > rsc_defaults $id="rsc-options" \
>> >     failure-timeout="600"
>> >
>> > So I would expect the old failures to have been purged from the CIB long
>> > ago, but I still get the following when issuing crm_mon -frA1.
>>
>> I think automatic deletion didn't arrive until later.
>>
>> >
>> > Migration summary:
>> > * Node host1:
>> >    etc_ml_drbd: migration-threshold=1000000 fail-count=244
>> last-failure='Sat Feb 14 17:04:05 2015'
>> >    spool_postfix_drbd_msg: migration-threshold=1000000 fail-count=244
>> last-failure='Sat Feb 14 17:04:05 2015'
>> >    lib_ml_drbd: migration-threshold=1000000 fail-count=244
>> last-failure='Sat Feb 14 17:04:05 2015'
>> >    lib_imap_drbd: migration-threshold=1000000 fail-count=244
>> last-failure='Sat Feb 14 17:04:05 2015'
>> >    spool_imap_drbd: migration-threshold=1000000 fail-count=11654
>> last-failure='Sat Feb 14 17:04:05 2015'
>> >    spool_ml_drbd: migration-threshold=1000000 fail-count=244
>> last-failure='Sat Feb 14 17:04:05 2015'
>> >    documents_drbd: migration-threshold=1000000 fail-count=248
>> last-failure='Sat Feb 14 17:58:55 2015'
>> > * Node host2
>> >    documents_drbd: migration-threshold=1000000 fail-count=548
>> last-failure='Sat Feb 14 16:26:33 2015'
>> >
>> > I have tried crm_failcount -D on the resources and also tried a
>> > cleanup... but the records are still there!
>>
>> Oh?  Can you re-run with -VVVVVV and show us the result?
>>
>> > How can I get rid of those records (so my monitoring tools stop
>> > complaining)?
>> >
>> > Regards.
>>
>>
>
>
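
A note on the verbose trace quoted above: the attrd update goes out as
"(null)=(null)", i.e. no attribute name appears to be resolved, which would
explain why the delete never touches the stored fail counts. And since, per
Andrew's reply, automatic deletion of expired failures only arrived in later
Pacemaker releases, the failure-timeout / cluster-recheck-interval settings
alone will not remove them on 1.1.10. One possible workaround is to delete the
transient fail-count attribute directly from the status section. A rough
sketch, assuming the usual fail-count-<resource> attribute naming (the
resource and node names are placeholders, not values from this thread):

    # delete the transient fail-count attribute for one resource on one node
    crm_attribute --type status --node <node> --name fail-count-<resource> --delete

    # optionally clear the matching last-failure timestamp as well
    crm_attribute --type status --node <node> --name last-failure-<resource> --delete

followed by a crm_resource --cleanup on the same resource so its operation
history is re-probed.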

