[ClusterLabs] Antw: Re: Using different folder for /var/lib/pacemaker and usage of /dev/shm files
Ken Gaillot
kgaillot at redhat.com
Tue May 17 17:23:06 UTC 2016
On 05/17/2016 12:02 PM, Nikhil Utane wrote:
> OK. Will do that.
>
> Actually I gave the /dev/shm usage when the cluster wasn't up.
> When it is up, I see it occupies close to 300 MB (it's also the DC).
Hmmm, there should be no usage if the cluster is stopped. Any shared
memory the cluster uses is backed by files whose names start with "qb-",
so anything else belongs to some other program.
If all executables using libqb (including corosync and pacemaker) are
stopped, it's safe to remove any /dev/shm/qb-* files that remain. That
should be rare, probably only after a core dump or such.
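A cautious check along those lines might look like this (a sketch, not an
official tool; corosync and pacemakerd are the usual daemon names, but
verify them on your distribution before deleting anything):

```shell
# Sketch: report leftover libqb shared-memory files, but only when no
# libqb-using daemon is running. The directory argument defaults to
# /dev/shm; the process names below are assumptions based on the usual
# daemon names.
list_stale_qb_files() {
    dir="${1:-/dev/shm}"
    if pgrep -x corosync >/dev/null 2>&1 || pgrep -x pacemakerd >/dev/null 2>&1; then
        echo "cluster daemons still running; qb-* files may be in use"
        return 1
    fi
    # Once all libqb users are stopped, these files are safe to remove:
    ls "$dir"/qb-* 2>/dev/null || echo "no stale qb-* files in $dir"
}
```

If the listing looks right, the same guard can precede an `rm -f "$dir"/qb-*`.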
> tmpfs 500.0M 329.4M 170.6M 66% /dev/shm
>
> On another node the same is 115 MB.
>
> Anyways, I'll monitor the usage to know what size is needed.
>
> Thank you Ken and Ulrich.
>
> On Tue, May 17, 2016 at 8:23 PM, Ken Gaillot <kgaillot at redhat.com> wrote:
>
> On 05/17/2016 04:07 AM, Nikhil Utane wrote:
> > What I would like to understand is how much total shared memory
> > (approximately) would Pacemaker need so that accordingly I can define
> > the partition size. Currently it is 300 MB in our system. I recently ran
> > into insufficient shared memory issue because of improper clean-up. So
> > would like to understand how much Pacemaker would need for a 6-node
> > cluster so that accordingly I can increase it.
>
> I have no idea :-)
>
> I don't think there's any way to pre-calculate it. The libqb library is
> the part of the software stack that actually manages the shared memory,
> but it's used by everything -- corosync (including its cpg and
> votequorum components) and each pacemaker daemon.
>
> The size depends directly on the amount of communication activity in the
> cluster, which is only indirectly related to the number of nodes and
> resources, the size of the CIB, and so on. A cluster whose nodes
> join/leave frequently and whose resources move around a lot will use
> more shared memory than a cluster of the same size that's quiet. Cluster
> options such as cluster-recheck-interval also matter.
>
> Practically, I think all you can do is simulate expected cluster
> configurations and loads, and see what it comes out to be.
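One way to capture that measurement while running such a simulation (a
sketch; it assumes libqb's default qb-* file naming under a /dev/shm mount):

```shell
# Total the sizes of all libqb shared-memory buffers, in KiB.
qb_shm_usage() {
    dir="${1:-/dev/shm}"
    # du -ck prints a per-file list followed by a "total" line; the awk END
    # block keeps only that last line's size (0 if there were no files).
    du -ck "$dir"/qb-* 2>/dev/null | awk 'END { print ($1 == "" ? 0 : $1) " KiB" }'
}
```

Sampling this periodically while nodes join/leave and resources move gives a
peak figure to size the tmpfs against.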
>
> > # df -kh
> > tmpfs 300.0M 27.5M 272.5M 9% /dev/shm
> >
> > Thanks
> > Nikhil
> >
> > On Tue, May 17, 2016 at 12:09 PM, Ulrich Windl
> > <Ulrich.Windl at rz.uni-regensburg.de> wrote:
> >
> > Hi!
> >
> > One of the main problems I identified with POSIX shared memory
> > (/dev/shm) in Linux is that writes to the shared memory don't
> > update the i-node's modification time, so you cannot tell from an
> > "ls -rtl" which segments are still active and which are not. You can
> > only see the creation time.
> >
> > Maybe there should be a tool that identifies and cleans up obsolete
> > shared memory.
> > I don't understand the part talking about the size of /dev/shm: it's
> > shared memory. See "kernel.shmmax" and "kernel.shmall" in your sysctl
> > settings (/etc/sysctl.conf).
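One caveat on sizing worth noting here: kernel.shmmax and kernel.shmall limit
System V shared memory, whereas the capacity of the /dev/shm tmpfs itself is
a mount option. A resize sketch (the 512M figure is purely illustrative, not
a recommendation):

```
# Enlarge the tmpfs at runtime (requires root; size is illustrative):
mount -o remount,size=512M /dev/shm

# Or persistently, via the corresponding /etc/fstab entry:
tmpfs  /dev/shm  tmpfs  defaults,size=512M  0  0
```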
> >
> > Regards,
> > Ulrich
> >
> > >>> Nikhil Utane <nikhil.subscribed at gmail.com> wrote on 16.05.2016 at
> > 14:31 in message
> > <CAGNWmJVSye5PJgkdbFAi5AzO+Qq-j=2fS1c+0rGnqS994vV48w at mail.gmail.com>:
> > > Thanks Ken.
> > >
> > > Could you also respond on the second question?
> > >
> > >> Also, in /dev/shm I see that it created around 300+ files of
> > around
> > >> 250 MB.
> > >>
> > >> For e.g.
> > >> -rw-rw---- 1 hacluste hacluste 8232 May 6 13:03
> > >> qb-cib_rw-response-25035-25038-10-header
> > >> -rw-rw---- 1 hacluste hacluste 540672 May 6 13:03
> > >> qb-cib_rw-response-25035-25038-10-data
> > >> -rw------- 1 hacluste hacluste 8232 May 6 13:03
> > >> qb-cib_rw-response-25035-25036-12-header
> > >> -rw------- 1 hacluste hacluste 540672 May 6 13:03
> > >> qb-cib_rw-response-25035-25036-12-data
> > >> And many more..
> > >>
> > >> We have limited space in /dev/shm and all these files are
> > filling it
> > >> up. Are these all needed? Any way to limit? Do we need to do any
> > >> clean-up if pacemaker termination was not graceful? What's the
> > > recommended size for this folder for Pacemaker? Our cluster will have
> > > maximum 6 nodes.
> > >
> > > -Regards
> > > Nikhil
> > >
> > > On Sat, May 14, 2016 at 3:11 AM, Ken Gaillot <kgaillot at redhat.com> wrote:
> > >
> > >> On 05/08/2016 11:19 PM, Nikhil Utane wrote:
> > >> > Moving these questions to a different thread.
> > >> >
> > >> > Hi,
> > >> >
> > >> > We have limited storage capacity in our system for
> > different folders.
> > >> > How can I configure to use a different folder for
> > /var/lib/pacemaker?
> > >>
> > >> ./configure --localstatedir=/wherever (defaults to /var or
> > ${prefix}/var)
> > >>
> > >> That will change everything that normally is placed or
> looked for
> > under
> > >> /var (/var/lib/pacemaker, /var/lib/heartbeat, /var/run, etc.).
> > >>
> > >> Note that while ./configure lets you change the location of
> nearly
> > >> everything, /usr/lib/ocf/resource.d is an exception,
> because it is
> > >> specified in the OCF standard.
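As a concrete build-time sketch of that option (the target path below is
purely illustrative):

```
# Relocate Pacemaker's variable state at build time:
./configure --localstatedir=/opt/cluster/var
make
make install
# State then lives under /opt/cluster/var/lib/pacemaker
# instead of /var/lib/pacemaker.
```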
> > >>
> > >> >
> > >> >
> > >> > Also, in /dev/shm I see that it created around 300+ files
> > of around
> > >> > 250 MB.
> > >> >
> > >> > For e.g.
> > >> > -rw-rw---- 1 hacluste hacluste 8232 May 6 13:03
> > >> > qb-cib_rw-response-25035-25038-10-header
> > >> > -rw-rw---- 1 hacluste hacluste 540672 May 6 13:03
> > >> > qb-cib_rw-response-25035-25038-10-data
> > >> > -rw------- 1 hacluste hacluste 8232 May 6 13:03
> > >> > qb-cib_rw-response-25035-25036-12-header
> > >> > -rw------- 1 hacluste hacluste 540672 May 6 13:03
> > >> > qb-cib_rw-response-25035-25036-12-data
> > >> > And many more..
> > >> >
> > >> > We have limited space in /dev/shm and all these files are
> > filling it
> > >> > up. Are these all needed? Any way to limit? Do we need to
> > do any
> > >> > clean-up if pacemaker termination was not graceful?
> > >> >
> > >> > -Thanks
> > >> > Nikhil