[ClusterLabs] Upgrade corosync problem
Salvatore D'angelo
sasadangelo at gmail.com
Mon Jun 25 15:41:24 EDT 2018
Hi,
Let me add one important detail here. I use Docker for my tests, with 5 containers deployed on my Mac.
The team that worked on this project installed the cluster on SoftLayer bare metal.
The PostgreSQL cluster was hard to test, and if a misconfiguration occurred, recreating the cluster from scratch was not easy.
Testing was cumbersome, considering that we access the machines through a complex system that is hard to describe here.
For this reason I ported the cluster to Docker for test purposes. I am not interested in having it work for months; I just need a proof of concept.
When the migration works, I’ll port everything to bare metal, where resources are abundant.
I have enough RAM and disk space on my Mac, so if you tell me what an acceptable /dev/shm size would be for several days of running, that is fine with me.
It would also be fine to have commands to clean the shm when required.
I know I can find them on Google, but I would appreciate it if you could suggest them. I have the OS knowledge to do that, but I would like to avoid days of guesswork and trial and error if possible.
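
For reference, a minimal sketch of the kind of commands asked about above: checking how full /dev/shm is, listing the libqb ring-buffer files (the qb-* entries shown in the quoted ls output), and removing leftovers. The function names and the directory argument are hypothetical helpers for illustration, not part of corosync or libqb, and the sketch assumes a find that supports -maxdepth (GNU, BSD, and BusyBox do). Removing qb-* files is only sensible while corosync is stopped, which is an assumption the cleanup helper tries to check.

```shell
#!/bin/sh
# Hypothetical helpers; names and defaults are illustrative only.

# Print size, usage, and free space of the shared-memory mount.
shm_usage() {
    df -h "${1:-/dev/shm}"
}

# List libqb ring-buffer files (qb-*) directly under the given directory.
list_qb_files() {
    find "${1:-/dev/shm}" -maxdepth 1 -type f -name 'qb-*' | sort
}

# Remove leftover qb-* files from the given directory.
# Assumption: this is only safe while corosync is stopped, so refuse
# to run if a corosync process is found (check skipped without pgrep).
clean_qb_files() {
    if command -v pgrep >/dev/null 2>&1 && pgrep -x corosync >/dev/null 2>&1; then
        echo "corosync is running; stop it before cleaning" >&2
        return 1
    fi
    find "${1:-/dev/shm}" -maxdepth 1 -type f -name 'qb-*' -exec rm -f {} +
}
```

Inside a container this could be used as `shm_usage`, then `list_qb_files`, then `clean_qb_files` after stopping the cluster stack. The 64M shown by df is Docker's default for /dev/shm; it can be raised when the container is created with `docker run --shm-size=256m ...`, where 256m is an arbitrary example value rather than a recommended size.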
> On 25 Jun 2018, at 21:18, Jan Pokorný <jpokorny at redhat.com> wrote:
>
> On 25/06/18 19:06 +0200, Salvatore D'angelo wrote:
>> Thanks for the reply. I scratched my cluster and created it again and
>> then migrated as before. This time I uninstalled pacemaker,
>> corosync, crmsh and resource agents with make uninstall
>>
>> then I installed new packages. The problem is the same, when
>> I launch:
>> corosync-quorumtool -ps
>>
>> I got: Cannot initialize QUORUM service
>>
>> Here the log with debug enabled:
>>
>>
>> [18019] pg3 corosyncerror [QB ] couldn't create circular mmap on /dev/shm/qb-cfg-event-18020-18028-23-data
>> [18019] pg3 corosyncerror [QB ] qb_rb_open:cfg-event-18020-18028-23: Resource temporarily unavailable (11)
>> [18019] pg3 corosyncdebug [QB ] Free'ing ringbuffer: /dev/shm/qb-cfg-request-18020-18028-23-header
>> [18019] pg3 corosyncdebug [QB ] Free'ing ringbuffer: /dev/shm/qb-cfg-response-18020-18028-23-header
>> [18019] pg3 corosyncerror [QB ] shm connection FAILED: Resource temporarily unavailable (11)
>> [18019] pg3 corosyncerror [QB ] Error in connection setup (18020-18028-23): Resource temporarily unavailable (11)
>>
>> I tried to check /dev/shm and I am not sure these are the right
>> commands, however:
>>
>> df -h /dev/shm
>> Filesystem Size Used Avail Use% Mounted on
>> shm 64M 16M 49M 24% /dev/shm
>>
>> ls /dev/shm
>> qb-cmap-request-18020-18036-25-data qb-corosync-blackbox-data qb-quorum-request-18020-18095-32-data
>> qb-cmap-request-18020-18036-25-header qb-corosync-blackbox-header qb-quorum-request-18020-18095-32-header
>>
>> Is 64 MB enough for /dev/shm? If not, why did it work with the previous
>> corosync release?
>
> For a start, can you try configuring corosync with
> --enable-small-memory-footprint switch?
>
> Hard to say why the space provisioned to /dev/shm is the direct
> opposite of generous (per today's standards), but may be the result
> of automatic HW adaptation, and if RAM is so scarce in your case,
> the above build-time toggle might help.
>
> If not, then exponentially increasing the size of /dev/shm space is
> likely your best bet (I don't recommend fiddling with mlockall()
> and similar measures in corosync).
>
> Of course, feel free to raise a regression if you have a reproducible
> comparison between two corosync (plus possibly different libraries
> like libqb) versions, one that works and one that won't, in
> reproducible conditions (like this small /dev/shm, VM image, etc.).
>
> --
> Jan (Poki)
> _______________________________________________
> Users mailing list: Users at clusterlabs.org
> https://lists.clusterlabs.org/mailman/listinfo/users
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org