<html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;" class="">Hi,<div class=""><br class=""></div><div class="">Thanks for the reply and the detailed explanation. I am not using the --network=host option.</div><div class="">I have a Docker image based on Ubuntu 14.04 on which I install only this additional software:</div><div class=""><br class=""></div><div class=""><div class=""><b class=""><span class="Apple-tab-span" style="white-space:pre"> </span>RUN apt-get update && apt-get install -y wget git xz-utils openssh-server \</b></div><div class=""><b class=""><span class="Apple-tab-span" style="white-space:pre"> </span>systemd-services make gcc pkg-config psmisc fuse libpython2.7 libopenipmi0 \</b></div><div class=""><b class=""><span class="Apple-tab-span" style="white-space:pre"> </span>libdbus-glib-1-2 libsnmp30 libtimedate-perl libpcap0.8</b></div><div class=""><br class=""></div><div class="">I also configure ssh with key pairs to make communication easy. 
The containers are created with these simple commands:</div><div class=""><br class=""></div><div class=""><div class=""><b class=""><span class="Apple-tab-span" style="white-space:pre"> </span>docker create -it --cap-add=MKNOD --cap-add SYS_ADMIN --device /dev/loop0 --device /dev/fuse --net ${PUBLIC_NETWORK_NAME} --publish ${PG1_SSH_PORT}:22 --ip ${PG1_PUBLIC_IP} --name ${PG1_PRIVATE_NAME} --hostname ${PG1_PRIVATE_NAME} -v ${MOUNT_FOLDER}:/Users ngha /bin/bash</b></div><div class=""><br class=""></div><div class=""><b class=""><span class="Apple-tab-span" style="white-space:pre"> </span>docker create -it --cap-add=MKNOD --cap-add SYS_ADMIN --device /dev/loop1 --device /dev/fuse --net ${PUBLIC_NETWORK_NAME} --publish ${PG2_SSH_PORT}:22 --ip ${PG2_PUBLIC_IP} --name ${PG2_PRIVATE_NAME} --hostname ${PG2_PRIVATE_NAME} -v ${MOUNT_FOLDER}:/Users ngha /bin/bash</b></div><div class=""><br class=""></div><div class=""><b class=""><span class="Apple-tab-span" style="white-space:pre"> </span>docker create -it --cap-add=MKNOD --cap-add SYS_ADMIN --device /dev/loop2 --device /dev/fuse --net ${PUBLIC_NETWORK_NAME} --publish ${PG3_SSH_PORT}:22 --ip ${PG3_PUBLIC_IP} --name ${PG3_PRIVATE_NAME} --hostname ${PG3_PRIVATE_NAME} -v ${MOUNT_FOLDER}:/Users ngha /bin/bash</b></div></div><div class=""><br class=""></div><div class="">/dev/fuse is used to configure glusterfs on two other nodes, and /dev/loopX is there just to better simulate my bare-metal environment.</div><div class=""><br class=""></div><div class="">One thing I do not understand: I compared corosync 2.3.5 (the old version that worked fine) with 2.4.4 to find the differences, but I haven’t found any change in the code that relates to this issue. The quorum tool.c and cfg.c are almost the same. 
Probably the issue is somewhere else.</div><div class=""><br class=""></div><div><br class=""><blockquote type="cite" class=""><div class="">On 27 Jun 2018, at 08:34, Jan Pokorný <<a href="mailto:jpokorny@redhat.com" class="">jpokorny@redhat.com</a>> wrote:</div><br class="Apple-interchange-newline"><div class=""><div class="">On 26/06/18 17:56 +0200, Salvatore D'angelo wrote:<br class=""><blockquote type="cite" class="">I did another test. I modified docker container in order to be able to run strace.<br class="">Running strace corosync-quorumtool -ps I got the following:<br class=""></blockquote><br class=""><blockquote type="cite" class="">[snipped]<br class="">connect(5, {sa_family=AF_LOCAL, sun_path=@"cfg"}, 110) = 0<br class="">setsockopt(5, SOL_SOCKET, SO_PASSCRED, [1], 4) = 0<br class="">sendto(5, "\377\377\377\377\0\0\0\0\30\0\0\0\0\0\0\0\0\0\20\0\0\0\0\0", 24, MSG_NOSIGNAL, NULL, 0) = 24<br class="">setsockopt(5, SOL_SOCKET, SO_PASSCRED, [0], 4) = 0<br class="">recvfrom(5, 0x7ffd73bd7ac0, 12328, 16640, 0, 0) = -1 EAGAIN (Resource temporarily unavailable)<br class="">poll([{fd=5, events=POLLIN}], 1, 4294967295) = 1 ([{fd=5, revents=POLLIN}])<br class="">recvfrom(5, "\377\377\377\377\0\0\0\0(0\0\0\0\0\0\0\365\377\377\377\0\0\0\0\0\0\0\0\0\0\0\0"..., 12328, MSG_WAITALL|MSG_NOSIGNAL, NULL, NULL) = 12328<br class="">shutdown(5, SHUT_RDWR) = 0<br class="">close(5) = 0<br class="">write(2, "Cannot initialise CFG service\n", 30Cannot initialise CFG service) = 30<br class="">[snipped]<br class=""></blockquote><br class="">This just demonstrated the effect of already detailed server-side<br class="">error in the client, which communicates with the server just fine,<br class="">but as soon as the server hits the mmap-based problem, it bails<br class="">out the observed way, leaving client unsatisfied.<br class=""><br class="">Note one thing, abstract Unix sockets are being used for the<br class="">communication like this (observe the first line in the strace<br 
class="">output excerpt above), and if you happen to run container via<br class="">a docker command with --network=host, you may also be affected with<br class="">issues arising from abstract sockets not being isolated but rather<br class="">sharing the same namespace. At least that was the case some years<br class="">back and what asked for a switch in underlying libqb library to<br class="">use strictly the file-backed sockets, where the isolation<br class="">semantics matches the intuition:<br class=""><br class=""><a href="https://lists.clusterlabs.org/pipermail/users/2017-May/013003.html" class="">https://lists.clusterlabs.org/pipermail/users/2017-May/013003.html</a><br class=""><br class="">+ way to enable (presumably only for container environments, note<br class="">that there's no per process straightforward granularity):<br class=""><br class="">https://clusterlabs.github.io/libqb/1.0.2/doxygen/qb_ipc_overview.html<br class="">(scroll down to "IPC sockets (Linux only)")<br class=""><br class="">You may test that if you are using said --network=host switch.<br class=""><br class=""><blockquote type="cite" class="">I tried to understand what happen behind the scene but it is not easy for me.<br class="">Hoping someone on this list can help.<br class=""></blockquote><br class="">Containers are tricky, just as Ansible (as shown earlier on the list)<br class="">can be, when encumbered with false believes and/or misunderstandings.<br class="">Virtual machines may serve better wrt. 
insights for the later bare<br class="">metal deployments.<br class=""><br class="">-- <br class="">Jan (Poki)<br class="">_______________________________________________<br class="">Users mailing list: Users@clusterlabs.org<br class="">https://lists.clusterlabs.org/mailman/listinfo/users<br class=""><br class="">Project Home: http://www.clusterlabs.org<br class="">Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf<br class="">Bugs: http://bugs.clusterlabs.org<br class=""></div></div></blockquote></div><br class=""></div></body></html>