[ClusterLabs] Pacemaker remote - invalid message detected, endian mismatch

Radoslaw Garbacz radoslaw.garbacz at xtremedatainc.com
Fri Sep 30 12:28:20 EDT 2016


Hi,

I posted a question about this error in another thread, but that thread
was old and the question got no answer, so it may have been missed.
Apologies for repeating it here.

Regarding the problem:
I have a cluster, and when it grows to around 40 remote nodes, some remote
nodes go offline after a while. Their logs report message errors (see
below); the other logs give no indication that anything is wrong.

Details:
- 40 EC2 m3.xlarge nodes: 1 corosync ring member, 39 remote
- maybe irrelevant, but either the "cib" or the "pengine" process goes to
  ~100% CPU
- it does not happen immediately
- smaller cluster (~20 remote nodes) does not show any problems
- pacemaker: 1.1.15-1.1f8e642.git.el6.x86_64
- corosync: 2.4.1-1.2.0da1.el6.x86_64
- libqb: 1.0.0-1.28.4dff.el6.x86_64
- CentOS 6

Logs:

[...]
Sep 27 17:18:31 [19626] ip-10-237-223-67 pacemaker_remoted:    error: crm_abort:        crm_remote_header: Triggered assert at remote.c:119 : endian == ENDIAN_LOCAL
Sep 27 17:18:31 [19626] ip-10-237-223-67 pacemaker_remoted:    error: crm_remote_header:        Invalid message detected, endian mismatch: badadbbd is neither 63646330 nor the swab'd 30636463
Sep 27 17:18:31 [19626] ip-10-237-223-67 pacemaker_remoted:    error: crm_abort:        crm_remote_header: Triggered assert at remote.c:119 : endian == ENDIAN_LOCAL
Sep 27 17:18:31 [19626] ip-10-237-223-67 pacemaker_remoted:    error: crm_remote_header:        Invalid message detected, endian mismatch: badadbbd is neither 63646330 nor the swab'd 30636463
Sep 27 17:18:31 [19626] ip-10-237-223-67 pacemaker_remoted:    error: crm_abort:        crm_remote_header: Triggered assert at remote.c:119 : endian == ENDIAN_LOCAL
Sep 27 17:18:31 [19626] ip-10-237-223-67 pacemaker_remoted:    error: crm_remote_header:        Invalid message detected, endian mismatch: badadbbd is neither 63646330 nor the swab'd 30636463
Sep 27 17:18:31 [19626] ip-10-237-223-67 pacemaker_remoted:     info: lrmd_remote_client_msg:   Client disconnect detected in tls msg dispatcher.
Sep 27 17:18:31 [19626] ip-10-237-223-67 pacemaker_remoted:     info: ipc_proxy_remove_provider:        ipc proxy connection for client ca8df213-6da7-4c42-8cb3-b8bc0887f2ce pid 21815 destroyed because cluster node disconnected.
Sep 27 17:18:31 [19626] ip-10-237-223-67 pacemaker_remoted:     info: cancel_recurring_action:  Cancelling ocf operation monitor_all_monitor_191000
Sep 27 17:18:31 [19626] ip-10-237-223-67 pacemaker_remoted:    error: crm_send_tls:     Connection terminated rc = -53
Sep 27 17:18:31 [19626] ip-10-237-223-67 pacemaker_remoted:    error: crm_send_tls:     Connection terminated rc = -10
Sep 27 17:18:31 [19626] ip-10-237-223-67 pacemaker_remoted:    error: crm_remote_send:  Failed to send remote msg, rc = -10
Sep 27 17:18:31 [19626] ip-10-237-223-67 pacemaker_remoted:    error: lrmd_tls_send_msg:        Failed to send remote lrmd tls msg, rc = -10
Sep 27 17:18:31 [19626] ip-10-237-223-67 pacemaker_remoted:  warning: send_client_notify:       Notification of client remote-lrmd-ip-10-237-223-67:3121/b6034d3a-e296-492f-b296-725735d17e22 failed
Sep 27 17:18:31 [19626] ip-10-237-223-67 pacemaker_remoted:   notice: lrmd_remote_client_destroy:       LRMD client disconnecting remote client - name: remote-lrmd-ip-10-237-223-67:3121 id: b6034d3a-e296-492f-b296-725735d17e22
Sep 27 17:19:35 [19626] ip-10-237-223-67 pacemaker_remoted:    error: ipc_proxy_accept: No ipc providers available for uid 0 gid 0
Sep 27 17:19:35 [19626] ip-10-237-223-67 pacemaker_remoted:    error: handle_new_connection:    Error in connection setup (19626-21815-14): Remote I/O error (121)
Sep 27 17:19:50 [19626] ip-10-237-223-67 pacemaker_remoted:    error: ipc_proxy_accept: No ipc providers available for uid 0 gid 0
Sep 27 17:19:50 [19626] ip-10-237-223-67 pacemaker_remoted:    error: handle_new_connection:    Error in connection setup (19626-21815-14): Remote I/O error (121)
[...]
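
To make the three hex values in the "endian mismatch" message easier to
compare, here is a small Python sketch. It is not Pacemaker code, and the
helper names swab32 and as_ascii are made up for illustration. It only
checks that the third value is the byte-swap of the second, and shows what
the bytes of each value look like as ASCII. I am assuming that badadbbd is
the marker pacemaker_remoted expects and that 63646330 is what it read from
the connection; I have not verified that against remote.c.

```python
import struct

def swab32(value):
    """Reverse the byte order of a 32-bit unsigned integer."""
    return struct.unpack("<I", struct.pack(">I", value))[0]

def as_ascii(value, byteorder):
    """Show the four bytes of a 32-bit value as text in the given byte order."""
    return value.to_bytes(4, byteorder).decode("ascii", errors="replace")

# Values copied from the log line; which one is "expected" is an assumption.
assumed_marker = 0xBADADBBD
assumed_received = 0x63646330
logged_swabbed = 0x30636463

print("swab32(received) matches the logged value:",
      swab32(assumed_received) == logged_swabbed)
print("received matches the marker in either byte order:",
      assumed_marker in (assumed_received, swab32(assumed_received)))
print("received bytes, big-endian order:   ", as_ascii(assumed_received, "big"))
print("received bytes, little-endian order:", as_ascii(assumed_received, "little"))
```

The second value decodes to printable ASCII in both byte orders, so this
may be less a real endian problem than the reader looking at bytes that are
not the start of a message header. That is only a guess on my part.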



-- 
Best Regards,

Radoslaw Garbacz
XtremeData Incorporation