[ClusterLabs] 'crm node standby' command failing with "Error performing operation: Communication error on send . Return code is 70"
Ken Gaillot
kgaillot at redhat.com
Mon Sep 24 09:57:06 EDT 2018
On Fri, 2018-09-21 at 13:34 +0530, Prasad Nagaraj wrote:
> Hi -
>
> Yesterday, I noticed that when I am trying to execute 'crm node
> standby' command on one of my cluster nodes, it was failing with
>
> "Error performing operation: Communication error on send . Return
> code is 70"
>
> My corosync logs had these entries during that time:
>
> Sep 20 22:14:54 [4454] vm5c336912f1 crmd: notice:
> throttle_handle_load: High CPU load detected: 1.850000
> Sep 20 22:14:57 [4449] vm5c336912f1 cib: info:
> cib_process_ping: Reporting our current digest to vmb546073338:
> 8fe67fcfcd20515c246c225a124a8902 for 0.481.2 (0x2742230 0)
> Sep 20 22:15:09 [4449] vm5c336912f1 cib: info:
> cib_process_request: Forwarding cib_modify operation for section
> nodes to master (origin=local/crm_attribute/4)
> Sep 20 22:15:24 [4454] vm5c336912f1 crmd: notice:
> throttle_handle_load: High CPU load detected: 1.640000
> Sep 20 22:15:54 [4454] vm5c336912f1 crmd: info:
> throttle_handle_load: Moderate CPU load detected: 0.990000
> Sep 20 22:15:54 [4454] vm5c336912f1 crmd: info:
> throttle_send_command: New throttle mode: 0010 (was 0100)
> Sep 20 22:16:24 [4454] vm5c336912f1 crmd: info:
> throttle_send_command: New throttle mode: 0001 (was 0010)
> Sep 20 22:16:54 [4454] vm5c336912f1 crmd: info:
> throttle_send_command: New throttle mode: 0000 (was 0001)
> Sep 20 22:17:09 [4449] vm5c336912f1 cib: info:
> cib_process_request: Forwarding cib_modify operation for section
> nodes to master (origin=local/crm_attribute/4)
> Sep 20 22:19:10 [4449] vm5c336912f1 cib: info:
> cib_process_request: Forwarding cib_modify operation for section
> nodes to master (origin=local/crm_attribute/4)
> Sep 20 22:23:08 [4449] vm5c336912f1 cib: info:
> cib_perform_op: Diff: --- 0.481.2 2
> Sep 20 22:23:08 [4449] vm5c336912f1 cib: info:
> cib_perform_op: Diff: +++ 0.482.0
> 9bacc862b8713430c81ea91694942a41
> Sep 20 22:23:08 [4449] vm5c336912f1 cib: info:
> cib_perform_op: + /cib: @epoch=482, @num_updates=0
>
>
> Is the above behavior due to pacemaker thinking that the cluster is
> highly loaded and trying to throttle the execution of commands? What
> is the best way to resolve or work around such problems? We do have
> high I/O load on our cluster, which hosts a MySQL database.
Throttling is a natural way to handle occasional high load and is not a
problem in itself. I wouldn't expect a load of 1.85 to make a big
difference, so I wouldn't worry about that unless other load-related
problems emerge.
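For intuition: crmd classifies load by comparing the load average,
adjusted for the number of cores, against multiples of the
load-threshold cluster property (default 80%). A self-contained sketch
of that idea follows; the 2.0/1.6/1.2 factors here are illustrative
(the real factors live in crmd's throttle code and can vary by
version), but it shows why 1.85 may or may not be alarming depending
on core count:

```shell
# Sketch of crmd-style load classification (NOT pacemaker's exact
# algorithm; factors are illustrative, default threshold 0.80 mirrors
# the load-threshold property's 80% default).
classify_load() {
    # $1 = 1-minute load average, $2 = CPU cores, $3 = threshold fraction
    awk -v load="$1" -v cores="$2" -v thresh="${3:-0.80}" 'BEGIN {
        per_core = load / cores
        if      (per_core >= 2.0 * thresh) print "high"
        else if (per_core >= 1.6 * thresh) print "moderate"
        else if (per_core >= 1.2 * thresh) print "low"
        else                               print "negligible"
    }'
}

classify_load 1.85 1   # the load from your log, on a single core
classify_load 1.85 4   # the same absolute load spread over four cores
```

The same absolute load that reads as "High" on a one-core VM is
invisible on a larger machine, which is why the log messages alone
don't tell you much without knowing the node's core count.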
The error message you reported sounds more like a networking issue than
a load issue. Are you seeing any network problems around that time?
Corosync retransmits or token timeouts in particular would be
significant.
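If the logs are all you have to go on, a quick scan for the usual
corosync trouble signs is a reasonable first step. A sketch; the
patterns below are examples of totem-layer messages, not an exhaustive
or version-exact list, and the log path will depend on your setup:

```shell
# Scan a corosync log for common signs of network trouble (a sketch;
# adjust the patterns and path to your corosync version and logging
# configuration).
# usage: scan_corosync_log /var/log/cluster/corosync.log
scan_corosync_log() {
    grep -E 'Retransmit List|[Tt]oken (was lost|has not been received)|A processor failed|Process pause detected' "$1"
}
```

Any hits around the time of the failed `crm node standby` would point
at the network rather than at throttling.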
>
> Also from the thread,
> https://lists.clusterlabs.org/pipermail/users/2017-May/005702.html
>
> it was asked:
> > There is not much detail about “load-threshold”.
> > Please can someone share steps or any commands to modify
> > “load-threshold”.
> Could someone advise whether this is the way to control the
> throttling of cluster operations, and how to set this parameter?
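To the load-threshold question: it is a standard cluster property
(default "80%"), so it is set like any other cluster option. A sketch,
assuming crmsh is available (the pcs equivalent is shown as a
comment):

```
# Check the current value (empty output means the 80% default applies)
crm_attribute --query --name load-threshold

# Raise the threshold so throttling kicks in later, e.g. at 90%
crm configure property load-threshold="90%"
# pcs equivalent: pcs property set load-threshold=90%
```

Note that raising it only delays throttling; it does not address the
underlying I/O load on the node.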
>
> Thanks in advance,
> Prasad
> _______________________________________________
> Users mailing list: Users at clusterlabs.org
> https://lists.clusterlabs.org/mailman/listinfo/users
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.
> pdf
> Bugs: http://bugs.clusterlabs.org
--
Ken Gaillot <kgaillot at redhat.com>