[ClusterLabs] Pacemaker remote - invalid message detected, endian mismatch
Radoslaw Garbacz
radoslaw.garbacz at xtremedatainc.com
Fri Sep 30 16:28:20 UTC 2016
Hi,
I posted a question about this error as a reply to another thread, but
that thread was old and there has been no answer, so I think it may have
been missed. I am sorry for repeating it.
Regarding the problem:
I have a cluster, and when it gets bigger (around 40 remote nodes), some
remote nodes go offline after a while. Their logs report message errors,
but the other logs give no indication of anything wrong.
Details:
- 40 EC2 m3.xlarge nodes: 1 corosync ring member, 39 remote nodes
- maybe irrelevant, but either the "cib" or the "pengine" process goes to
  ~100% CPU
- the problem does not appear immediately
- a smaller cluster (~20 remote nodes) does not show any problems
- pacemaker: 1.1.15-1.1f8e642.git.el6.x86_64
- corosync: 2.4.1-1.2.0da1.el6.x86_64
- libqb: 1.0.0-1.28.4dff.el6.x86_64
- CentOS 6
Logs:
[...]
Sep 27 17:18:31 [19626] ip-10-237-223-67 pacemaker_remoted: error:
crm_abort: crm_remote_header: Triggered assert at remote.c:119 :
endian == ENDIAN_LOCAL
Sep 27 17:18:31 [19626] ip-10-237-223-67 pacemaker_remoted: error:
crm_remote_header: Invalid message detected, endian mismatch:
badadbbd is neither 63646330 nor the swab'd 30636463
Sep 27 17:18:31 [19626] ip-10-237-223-67 pacemaker_remoted: error:
crm_abort: crm_remote_header: Triggered assert at remote.c:119 :
endian == ENDIAN_LOCAL
Sep 27 17:18:31 [19626] ip-10-237-223-67 pacemaker_remoted: error:
crm_remote_header: Invalid message detected, endian mismatch:
badadbbd is neither 63646330 nor the swab'd 30636463
Sep 27 17:18:31 [19626] ip-10-237-223-67 pacemaker_remoted: error:
crm_abort: crm_remote_header: Triggered assert at remote.c:119 :
endian == ENDIAN_LOCAL
Sep 27 17:18:31 [19626] ip-10-237-223-67 pacemaker_remoted: error:
crm_remote_header: Invalid message detected, endian mismatch:
badadbbd is neither 63646330 nor the swab'd 30636463
Sep 27 17:18:31 [19626] ip-10-237-223-67 pacemaker_remoted: info:
lrmd_remote_client_msg: Client disconnect detected in tls msg dispatcher.
Sep 27 17:18:31 [19626] ip-10-237-223-67 pacemaker_remoted: info:
ipc_proxy_remove_provider: ipc proxy connection for client
ca8df213-6da7-4c42-8cb3-b8bc0887f2ce pid 21815 destroyed because cluster
node disconnected.
Sep 27 17:18:31 [19626] ip-10-237-223-67 pacemaker_remoted: info:
cancel_recurring_action: Cancelling ocf operation
monitor_all_monitor_191000
Sep 27 17:18:31 [19626] ip-10-237-223-67 pacemaker_remoted: error:
crm_send_tls: Connection terminated rc = -53
Sep 27 17:18:31 [19626] ip-10-237-223-67 pacemaker_remoted: error:
crm_send_tls: Connection terminated rc = -10
Sep 27 17:18:31 [19626] ip-10-237-223-67 pacemaker_remoted: error:
crm_remote_send: Failed to send remote msg, rc = -10
Sep 27 17:18:31 [19626] ip-10-237-223-67 pacemaker_remoted: error:
lrmd_tls_send_msg: Failed to send remote lrmd tls msg, rc = -10
Sep 27 17:18:31 [19626] ip-10-237-223-67 pacemaker_remoted: warning:
send_client_notify: Notification of client
remote-lrmd-ip-10-237-223-67:3121/b6034d3a-e296-492f-b296-725735d17e22
failed
Sep 27 17:18:31 [19626] ip-10-237-223-67 pacemaker_remoted: notice:
lrmd_remote_client_destroy: LRMD client disconnecting remote client -
name: remote-lrmd-ip-10-237-223-67:3121 id:
b6034d3a-e296-492f-b296-725735d17e22
Sep 27 17:19:35 [19626] ip-10-237-223-67 pacemaker_remoted: error:
ipc_proxy_accept: No ipc providers available for uid 0 gid 0
Sep 27 17:19:35 [19626] ip-10-237-223-67 pacemaker_remoted: error:
handle_new_connection: Error in connection setup (19626-21815-14):
Remote I/O error (121)
Sep 27 17:19:50 [19626] ip-10-237-223-67 pacemaker_remoted: error:
ipc_proxy_accept: No ipc providers available for uid 0 gid 0
Sep 27 17:19:50 [19626] ip-10-237-223-67 pacemaker_remoted: error:
handle_new_connection: Error in connection setup (19626-21815-14):
Remote I/O error (121)
[...]
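
For readers unfamiliar with the "endian mismatch" message, here is a minimal
sketch of the kind of check the log text describes: a 32-bit marker in the
message header is compared against a local constant and against its
byte-swapped form. This is not Pacemaker's code. The two constants are taken
from the log lines above; the names ENDIAN_LOCAL, swab32 and
classify_endian_marker are assumptions chosen for illustration.

```python
import struct

# Value taken from the log output above; the name is hypothetical.
ENDIAN_LOCAL = 0x63646330


def swab32(value: int) -> int:
    """Reverse the byte order of a 32-bit unsigned integer."""
    return struct.unpack("<I", struct.pack(">I", value))[0]


def classify_endian_marker(marker: int) -> str:
    """Return 'local', 'swapped', or 'invalid' for a header's endian marker."""
    if marker == ENDIAN_LOCAL:
        return "local"      # sender uses the same byte order
    if marker == swab32(ENDIAN_LOCAL):
        return "swapped"    # sender uses the opposite byte order
    return "invalid"        # neither form: not a recognizable header


if __name__ == "__main__":
    for marker in (0x63646330, 0x30636463, 0xBADADBBD):
        print(f"{marker:08x}: {classify_endian_marker(marker)}")
```

In this sketch the value badadbbd from the log is classified as invalid
because it matches neither constant. The sketch only illustrates the
comparison; it does not explain why such a value arrived.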
--
Best Regards,
Radoslaw Garbacz
XtremeData Incorporation