[ClusterLabs] Upgrade corosync problem
Salvatore D'angelo
sasadangelo at gmail.com
Tue Jun 26 04:40:11 EDT 2018
Hi,
Yes,
I am reproducing only the part required for the test. I think the original system has a larger shm; the problem is that I do not know exactly how to change it.
I tried the following steps, but I have the impression I didn't perform the right ones:
1. remove everything under /tmp
2. Added the following line to /etc/fstab
tmpfs /tmp tmpfs defaults,nodev,nosuid,mode=1777,size=128M 0 0
3. mount /tmp
4. df -h
Filesystem Size Used Avail Use% Mounted on
overlay 63G 11G 49G 19% /
tmpfs 64M 4.0K 64M 1% /dev
tmpfs 1000M 0 1000M 0% /sys/fs/cgroup
osxfs 466G 158G 305G 35% /Users
/dev/sda1 63G 11G 49G 19% /etc/hosts
shm 64M 11M 54M 16% /dev/shm
tmpfs 1000M 0 1000M 0% /sys/firmware
tmpfs 128M 0 128M 0% /tmp
The errors are exactly the same.
I have the impression that I changed the wrong mount. Probably I have to change this one instead:
shm 64M 11M 54M 16% /dev/shm
but I do not know how to do that. Any suggestion?
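From what I could gather, /dev/shm is a separate tmpfs from /tmp, so the /etc/fstab change above would not affect it. My best guess is something like the following, but I have not verified it in my containers (the Docker flag is what I found in its documentation; 64M is Docker's default shm size, which matches what I see):

```shell
# Resize /dev/shm on a running system (needs root; /dev/shm is its own
# tmpfs, separate from /tmp, so the fstab entry above does not touch it)
mount -o remount,size=256M /dev/shm
df -h /dev/shm    # verify the new size

# For Docker containers, the shm size is fixed at container creation
# time (default 64M), so the container would need to be recreated with:
#   docker run --shm-size=256m ...
```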
> On 26 Jun 2018, at 09:48, Christine Caulfield <ccaulfie at redhat.com> wrote:
>
> On 25/06/18 20:41, Salvatore D'angelo wrote:
>> Hi,
>>
>> Let me add here one important detail. I use Docker for my test with 5 containers deployed on my Mac.
>> Basically the team that worked on this project installed the cluster on SoftLayer bare metal.
>> The PostgreSQL cluster was hard to test, and if a misconfiguration occurred, recreating the cluster from scratch was not easy.
>> Testing it was cumbersome, considering that we access the machines through a complex system that is hard to describe here.
>> For this reason I ported the cluster to Docker for test purposes. I am not interested in having it run for months; I just need a proof of concept.
>>
>> When the migration works I'll port everything to bare metal, where resources are abundant.
>>
>> Now I have enough RAM and disk space on my Mac, so if you tell me what an acceptable size would be for several days of running, that is fine with me.
>> It would also be fine to have commands to clean up the shm when required.
>> I know I can find them on Google, but I would appreciate any pointers. I have the OS knowledge to do this, but I would like to avoid days of guesswork and trial and error if possible.
>
>
> I would recommend at least 128MB of space on /dev/shm, 256MB if you can
> spare it. My 'standard' system uses 75MB under normal running allowing
> for one command-line query to run.
>
> If I read this right, you're reproducing a bare-metal system in
> containers now? So the original systems will have a default /dev/shm
> size which is probably much larger than your containers'?
>
> I'm just checking here that we don't have a regression in memory usage
> as Poki suggested.
>
> Chrissie
>
>>> On 25 Jun 2018, at 21:18, Jan Pokorný <jpokorny at redhat.com> wrote:
>>>
>>> On 25/06/18 19:06 +0200, Salvatore D'angelo wrote:
>>>> Thanks for reply. I scratched my cluster and created it again and
>>>> then migrated as before. This time I uninstalled pacemaker,
>>>> corosync, crmsh and resource agents with make uninstall
>>>>
>>>> then I installed new packages. The problem is the same, when
>>>> I launch:
>>>> corosync-quorumtool -ps
>>>>
>>>> I got: Cannot initialize QUORUM service
>>>>
>>>> Here the log with debug enabled:
>>>>
>>>>
>>>> [18019] pg3 corosync error  [QB    ] couldn't create circular mmap on /dev/shm/qb-cfg-event-18020-18028-23-data
>>>> [18019] pg3 corosync error  [QB    ] qb_rb_open:cfg-event-18020-18028-23: Resource temporarily unavailable (11)
>>>> [18019] pg3 corosync debug  [QB    ] Free'ing ringbuffer: /dev/shm/qb-cfg-request-18020-18028-23-header
>>>> [18019] pg3 corosync debug  [QB    ] Free'ing ringbuffer: /dev/shm/qb-cfg-response-18020-18028-23-header
>>>> [18019] pg3 corosync error  [QB    ] shm connection FAILED: Resource temporarily unavailable (11)
>>>> [18019] pg3 corosync error  [QB    ] Error in connection setup (18020-18028-23): Resource temporarily unavailable (11)
>>>>
>>>> I tried to check /dev/shm and I am not sure these are the right
>>>> commands, however:
>>>>
>>>> df -h /dev/shm
>>>> Filesystem Size Used Avail Use% Mounted on
>>>> shm 64M 16M 49M 24% /dev/shm
>>>>
>>>> ls /dev/shm
>>>> qb-cmap-request-18020-18036-25-data qb-corosync-blackbox-data qb-quorum-request-18020-18095-32-data
>>>> qb-cmap-request-18020-18036-25-header qb-corosync-blackbox-header qb-quorum-request-18020-18095-32-header
>>>>
>>>> Is 64 MB enough for /dev/shm? If not, why did it work with the
>>>> previous corosync release?
>>>
>>> For a start, can you try configuring corosync with
>>> --enable-small-memory-footprint switch?
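>>>
>>> For completeness, that would be roughly the following, assuming
>>> a source tree built the usual autotools way (adjust configure
>>> options to match how the packages were originally built):

```shell
# Rebuild corosync with the reduced-footprint option (a sketch; any
# other configure flags used for the original build should be kept)
./configure --enable-small-memory-footprint
make
sudo make install
```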
>>>
>>> Hard to say why the space provisioned to /dev/shm is the direct
>>> opposite of generous (per today's standards), but may be the result
>>> of automatic HW adaptation, and if RAM is so scarce in your case,
>>> the above build-time toggle might help.
>>>
>>> If not, then exponentially increasing size of /dev/shm space is
>>> likely your best bet (I don't recommend fiddling with mlockall()
>>> and similar measures in corosync).
>>>
>>> Of course, feel free to raise a regression if you have a reproducible
>>> comparison between two corosync (plus possibly different libraries
>>> like libqb) versions, one that works and one that won't, in
>>> reproducible conditions (like this small /dev/shm, VM image, etc.).
>>>
>>> --
>>> Jan (Poki)
>>> _______________________________________________
>>> Users mailing list: Users at clusterlabs.org
>>> https://lists.clusterlabs.org/mailman/listinfo/users
>>>
>>> Project Home: http://www.clusterlabs.org
>>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>>> Bugs: http://bugs.clusterlabs.org
>>
>>
>