[ClusterLabs] Upgrade corosync problem

Jan Pokorný jpokorny at redhat.com
Wed Jun 27 02:34:18 EDT 2018

On 26/06/18 17:56 +0200, Salvatore D'angelo wrote:
> I did another test. I modified docker container in order to be able to run strace.
> Running strace corosync-quorumtool -ps I got the following:

> [snipped]
> connect(5, {sa_family=AF_LOCAL, sun_path=@"cfg"}, 110) = 0
> setsockopt(5, SOL_SOCKET, SO_PASSCRED, [1], 4) = 0
> sendto(5, "\377\377\377\377\0\0\0\0\30\0\0\0\0\0\0\0\0\0\20\0\0\0\0\0", 24, MSG_NOSIGNAL, NULL, 0) = 24
> setsockopt(5, SOL_SOCKET, SO_PASSCRED, [0], 4) = 0
> recvfrom(5, 0x7ffd73bd7ac0, 12328, 16640, 0, 0) = -1 EAGAIN (Resource temporarily unavailable)
> poll([{fd=5, events=POLLIN}], 1, 4294967295) = 1 ([{fd=5, revents=POLLIN}])
> recvfrom(5, "\377\377\377\377\0\0\0\0(0\0\0\0\0\0\0\365\377\377\377\0\0\0\0\0\0\0\0\0\0\0\0"..., 12328, MSG_WAITALL|MSG_NOSIGNAL, NULL, NULL) = 12328
> shutdown(5, SHUT_RDWR)                  = 0
> close(5)                                = 0
> write(2, "Cannot initialise CFG service\n", 30) = 30
> [snipped]

This just shows, from the client's side, the effect of the
server-side error already detailed: the client communicates with
the server just fine, but as soon as the server hits the mmap-based
problem, it bails out in the observed way, leaving the client
unsatisfied.
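
For illustration, the leading bytes of the reply in the recvfrom()
line above can be unpacked as below.  This is only a sketch: the
layout (three little-endian int32 fields named id, size and error,
each padded to 8 bytes) is my assumption about the header, and
decode_reply_header is a made-up helper name, not part of any
library.

```python
import struct

# Leading bytes of the server's reply, copied from the recvfrom() line
# of the strace excerpt (strace prints them as octal escapes).
REPLY_PREFIX = (
    b"\377\377\377\377\0\0\0\0"   # first field
    b"(0\0\0\0\0\0\0"             # second field
    b"\365\377\377\377\0\0\0\0"   # third field
)

# ASSUMPTION: three little-endian int32 fields (id, size, error),
# each followed by 4 bytes of padding.
HEADER = struct.Struct("<i4xi4xi4x")


def decode_reply_header(raw):
    """Hypothetical helper: unpack the assumed 24-byte header."""
    msg_id, size, error = HEADER.unpack(raw[:HEADER.size])
    return {"id": msg_id, "size": size, "error": error}


if __name__ == "__main__":
    print(decode_reply_header(REPLY_PREFIX))
```

Under that assumed layout the second field comes out as 12328, the
same as the byte count recvfrom() returned, and the third field is
negative (-11), which fits a server that answered but refused to
finish setting up the connection.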

Note one thing: abstract Unix sockets are used for this
communication (observe the first line in the strace output
excerpt above), and if you happen to run the container via
a docker command with --network=host, you may also be affected by
issues arising from abstract sockets not being isolated but rather
sharing the same namespace.  At least that was the case some years
back, and it is what prompted a switch in the underlying libqb
library to use strictly file-backed sockets, where the isolation
semantics match the intuition:


+ the way to enable it (presumably only for container environments;
note that there's no straightforward per-process granularity):

(scroll down to "IPC sockets (Linux only)")

You may test that if you are using said --network=host switch.
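
To make the difference between the two kinds of sockets concrete,
here is a minimal, Linux-only sketch in Python.  It is not how
libqb or corosync set up their sockets; the function name and the
socket names are invented for the example.  An abstract address
starts with a NUL byte and never shows up in the filesystem, while
a file-backed socket is an ordinary path that is only visible to
whoever can see that directory.

```python
import os
import socket
import tempfile


def bind_both_kinds(tag="demo"):
    """Hypothetical demo: bind one abstract and one file-backed socket."""
    name = "{}-{}".format(tag, os.getpid())
    abstract = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    backed = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, name)
        try:
            # Leading NUL byte selects the abstract namespace (Linux only).
            abstract.bind("\0" + name)
            # A plain path creates a socket file in the filesystem.
            backed.bind(path)
            addr = abstract.getsockname()
            if isinstance(addr, str):
                addr = addr.encode()
            return {
                "abstract_addr": addr,
                "backed_path": path,
                "backed_on_disk": os.path.exists(path),
            }
        finally:
            abstract.close()
            backed.close()


if __name__ == "__main__":
    print(bind_both_kinds())
```

The abstract name lives in the network namespace, so two containers
sharing the host's network namespace share those names as well,
whereas the file-backed path follows the usual filesystem
visibility rules.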

> I tried to understand what happen behind the scene but it is not easy for me.
> Hoping someone on this list can help.

Containers are tricky, just as Ansible (as shown earlier on the list)
can be, when encumbered with false beliefs and/or misunderstandings.
Virtual machines may serve better wrt. insights for the later bare
metal deployments.

Jan (Poki)