[ClusterLabs] Cannot clean history
Alexandre
alxgomz at gmail.com
Mon Jun 8 20:02:48 UTC 2015
I still have these pending records, which break the monitoring tool I use
(check_crm.pl).
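For reference, the leftover records live in the status section of the CIB as
transient node attributes named after each resource. A minimal sketch of how
to list them, assuming the usual fail-count-<resource> / last-failure-<resource>
naming (exact option spellings may vary between pacemaker versions):

    # Dump the status section of the CIB and look for leftover failure attributes
    cibadmin -Q -o status | grep -E 'fail-count|last-failure'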
Any help much appreciated.
On 26 May 2015 at 10:58, "Alexandre" <alxgomz at gmail.com> wrote:
> Hi Andrew,
>
> Here is the output of the verbose crm_failcount.
>
> trace: set_crm_log_level: New log level: 8
> trace: cib_native_signon_raw: Connecting cib_rw channel
> trace: pick_ipc_buffer: Using max message size of 524288
> debug: qb_rb_open_2: shm size:524301; real_size:528384;
> rb->word_size:132096
> debug: qb_rb_open_2: shm size:524301; real_size:528384;
> rb->word_size:132096
> debug: qb_rb_open_2: shm size:524301; real_size:528384;
> rb->word_size:132096
> trace: mainloop_add_fd: Added connection 1 for cib_rw[0x1fd79c0].4
> trace: pick_ipc_buffer: Using max message size of 51200
> trace: crm_ipc_send: Sending from client: cib_rw request id: 1
> bytes: 131 timeout:-1 msg...
> trace: crm_ipc_send: Recieved response 1, size=140, rc=140, text:
> <cib_common_callback_worker cib_op="register"
> cib_clientid="f8cfae2d-51e6-4cd7-97f8-2d6f49bf1f17"/>
> trace: cib_native_signon_raw: reg-reply
> <cib_common_callback_worker cib_op="register"
> cib_clientid="f8cfae2d-51e6-4cd7-97f8-2d6f49bf1f17"/>
> debug: cib_native_signon_raw: Connection to CIB successful
> trace: cib_create_op: Sending call options: 00001100, 4352
> trace: cib_native_perform_op_delegate: Sending cib_query message to
> CIB service (timeout=120s)
> trace: crm_ipc_send: Sending from client: cib_rw request id: 2
> bytes: 211 timeout:120000 msg...
> trace: internal_ipc_get_reply: client cib_rw waiting on reply to
> msg id 2
> trace: crm_ipc_send: Recieved response 2, size=944, rc=944, text:
> <cib-reply t="cib" cib_op="cib_query" cib_callid="2"
> cib_clientid="f8cfae2d-51e6-4cd7-97f8-2d6f49bf1f17" cib_callopt="4352"
> cib_rc="0"><cib_calldata><nodes><node uname="node2.domain.com" id="o
> trace: cib_native_perform_op_delegate: Reply <cib-reply t="cib"
> cib_op="cib_query" cib_callid="2"
> cib_clientid="f8cfae2d-51e6-4cd7-97f8-2d6f49bf1f17" cib_callopt="4352"
> cib_rc="0">
> trace: cib_native_perform_op_delegate: Reply <cib_calldata>
> trace: cib_native_perform_op_delegate: Reply <nodes>
> trace: cib_native_perform_op_delegate: Reply <node uname="
> node2.domain.com" id="node2.domain.com">
> trace: cib_native_perform_op_delegate: Reply
> <instance_attributes id="nodes-node2.domain.com">
> trace: cib_native_perform_op_delegate: Reply <nvpair
> id="nodes-node2.domain.com-postgres_msg-data-status"
> name="postgres_msg-data-status" value="STREAMING|SYNC"/>
> trace: cib_native_perform_op_delegate: Reply <nvpair
> id="nodes-node2.domain.com-standby" name="standby" value="off"/>
> trace: cib_native_perform_op_delegate: Reply
> </instance_attributes>
> trace: cib_native_perform_op_delegate: Reply </node>
> trace: cib_native_perform_op_delegate: Reply <node uname="
> node1.domain.com" id="node1.domain.com">
> trace: cib_native_perform_op_delegate: Reply
> <instance_attributes id="nodes-node1.domain.com">
> trace: cib_native_perform_op_delegate: Reply <nvpair
> id="nodes-node1.domain.com-postgres_msg-data-status"
> name="postgres_msg-data-status" value="LATEST"/>
> trace: cib_native_perform_op_delegate: Reply <nvpair
> id="nodes-node1.domain.com-standby" name="standby" value="off"/>
> trace: cib_native_perform_op_delegate: Reply
> </instance_attributes>
> trace: cib_native_perform_op_delegate: Reply </node>
> trace: cib_native_perform_op_delegate: Reply </nodes>
> trace: cib_native_perform_op_delegate: Reply </cib_calldata>
> trace: cib_native_perform_op_delegate: Reply </cib-reply>
> trace: cib_native_perform_op_delegate: Syncronous reply 2 received
> debug: get_cluster_node_uuid: Result section <nodes>
> debug: get_cluster_node_uuid: Result section <node uname="
> node2.domain.com" id="node2.domain.com">
> debug: get_cluster_node_uuid: Result section
> <instance_attributes id="nodes-node2.domain.com">
> debug: get_cluster_node_uuid: Result section <nvpair
> id="nodes-node2.domain.com-postgres_msg-data-status"
> name="postgres_msg-data-status" value="STREAMING|SYNC"/>
> debug: get_cluster_node_uuid: Result section <nvpair
> id="nodes-node2.domain.com-standby" name="standby" value="off"/>
> debug: get_cluster_node_uuid: Result section
> </instance_attributes>
> debug: get_cluster_node_uuid: Result section </node>
> debug: get_cluster_node_uuid: Result section <node uname="
> node1.domain.com" id="node1.domain.com">
> debug: get_cluster_node_uuid: Result section
> <instance_attributes id="nodes-node1.domain.com">
> debug: get_cluster_node_uuid: Result section <nvpair
> id="nodes-node1.domain.com-postgres_msg-data-status"
> name="postgres_msg-data-status" value="LATEST"/>
> debug: get_cluster_node_uuid: Result section <nvpair
> id="nodes-node1.domain.com-standby" name="standby" value="off"/>
> debug: get_cluster_node_uuid: Result section
> </instance_attributes>
> debug: get_cluster_node_uuid: Result section </node>
> debug: get_cluster_node_uuid: Result section </nodes>
> info: query_node_uuid: Mapped node1.domain.com to node1.domain.com
> trace: pick_ipc_buffer: Using max message size of 51200
> info: attrd_update_delegate: Connecting to cluster... 5 retries
> remaining
> debug: qb_rb_open_2: shm size:51213; real_size:53248;
> rb->word_size:13312
> debug: qb_rb_open_2: shm size:51213; real_size:53248;
> rb->word_size:13312
> debug: qb_rb_open_2: shm size:51213; real_size:53248;
> rb->word_size:13312
> trace: crm_ipc_send: Sending from client: attrd request id: 3
> bytes: 168 timeout:5000 msg...
> trace: internal_ipc_get_reply: client attrd waiting on reply to msg
> id 3
> trace: crm_ipc_send: Recieved response 3, size=88, rc=88, text:
> <ack function="attrd_ipc_dispatch" line="129"/>
> debug: attrd_update_delegate: Sent update: (null)=(null) for
> node1.domain.com
> info: main: Update (null)=<none> sent via attrd
> debug: cib_native_signoff: Signing out of the CIB Service
> trace: mainloop_del_fd: Removing client cib_rw[0x1fd79c0]
> trace: mainloop_gio_destroy: Destroying client cib_rw[0x1fd79c0]
> trace: crm_ipc_close: Disconnecting cib_rw IPC connection 0x1fdb020
> (0x1fdb1a0.(nil))
> debug: qb_ipcc_disconnect: qb_ipcc_disconnect()
> trace: qb_rb_close: ENTERING qb_rb_close()
> debug: qb_rb_close: Closing ringbuffer:
> /dev/shm/qb-cib_rw-request-8347-9344-14-header
> trace: qb_rb_close: ENTERING qb_rb_close()
> debug: qb_rb_close: Closing ringbuffer:
> /dev/shm/qb-cib_rw-response-8347-9344-14-header
> trace: qb_rb_close: ENTERING qb_rb_close()
> debug: qb_rb_close: Closing ringbuffer:
> /dev/shm/qb-cib_rw-event-8347-9344-14-header
> trace: cib_native_destroy: destroying 0x1fd7910
> trace: crm_ipc_destroy: Destroying IPC connection to cib_rw:
> 0x1fdb020
> trace: mainloop_gio_destroy: Destroyed client cib_rw[0x1fd79c0]
> trace: crm_exit: cleaning up libxml
> info: crm_xml_cleanup: Cleaning up memory from libxml2
> trace: crm_exit: exit 0
>
> I hope it helps.
>
> 2015-05-20 6:34 GMT+02:00 Andrew Beekhof <andrew at beekhof.net>:
>
>>
>> > On 4 May 2015, at 6:43 pm, Alexandre <alxgomz at gmail.com> wrote:
>> >
>> > Hi,
>> >
>> > I have a pacemaker / corosync / cman cluster running on Red Hat 6.6.
>> > Although the cluster is working as expected, I have some traces of old
>> > failures (from several months ago) that I can't get rid of.
>> > Basically I have set cluster-recheck-interval="300" and
>> > failure-timeout="600" (in rsc_defaults), as shown below:
>> >
>> > property $id="cib-bootstrap-options" \
>> > dc-version="1.1.10-14.el6-368c726" \
>> > cluster-infrastructure="cman" \
>> > expected-quorum-votes="2" \
>> > no-quorum-policy="ignore" \
>> > stonith-enabled="false" \
>> > last-lrm-refresh="1429702408" \
>> > maintenance-mode="false" \
>> > cluster-recheck-interval="300"
>> > rsc_defaults $id="rsc-options" \
>> > failure-timeout="600"
>> >
>> > So I would expect old failures to have been purged from the CIB long
>> > ago, but I actually still see the following when issuing crm_mon -frA1.
>>
>> I think automatic deletion didn't arrive until later.
>>
>> >
>> > Migration summary:
>> > * Node host1:
>> > etc_ml_drbd: migration-threshold=1000000 fail-count=244
>> last-failure='Sat Feb 14 17:04:05 2015'
>> > spool_postfix_drbd_msg: migration-threshold=1000000 fail-count=244
>> last-failure='Sat Feb 14 17:04:05 2015'
>> > lib_ml_drbd: migration-threshold=1000000 fail-count=244
>> last-failure='Sat Feb 14 17:04:05 2015'
>> > lib_imap_drbd: migration-threshold=1000000 fail-count=244
>> last-failure='Sat Feb 14 17:04:05 2015'
>> > spool_imap_drbd: migration-threshold=1000000 fail-count=11654
>> last-failure='Sat Feb 14 17:04:05 2015'
>> > spool_ml_drbd: migration-threshold=1000000 fail-count=244
>> last-failure='Sat Feb 14 17:04:05 2015'
>> > documents_drbd: migration-threshold=1000000 fail-count=248
>> last-failure='Sat Feb 14 17:58:55 2015'
>> > * Node host2:
>> > documents_drbd: migration-threshold=1000000 fail-count=548
>> last-failure='Sat Feb 14 16:26:33 2015'
>> >
>> > I have tried crm_failcount -D on the resources and also tried a
>> > cleanup... but it's still there!
>>
>> Oh? Can you re-run with -VVVVVV and show us the result?
>>
>> > How can I get rid of those records (so my monitoring tools stop
>> > complaining)?
>> >
>> > Regards.
>
>
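One detail in the trace above that may be relevant: the line
"attrd_update_delegate: Sent update: (null)=(null)" suggests the delete
request left crm_failcount without any attribute name attached. For
reference, a hedged sketch of more explicit cleanup variants, using a
resource/node pair from the crm_mon output quoted above (documents_drbd on
host1; option names are taken from current man pages and may differ on
1.1.10):

    # Delete the fail-count for one resource on one node
    crm_failcount -D -r documents_drbd -N host1

    # Delete the transient attribute directly (run this on the node itself)
    attrd_updater -D -n fail-count-documents_drbd

    # Or clear the resource's operation history and let it be re-probed
    crm_resource --cleanup --resource documents_drbd --node host1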