[ClusterLabs] Antw: Re: Antw: [EXT] unexpected fenced node and promotion of the new master PAF ‑ postgres
Ulrich Windl
Ulrich.Windl at rz.uni-regensburg.de
Tue Oct 12 02:42:49 EDT 2021
>>> Jehan-Guillaume de Rorthais <jgdr at dalibo.com> wrote on 11.10.2021 at
11:57 in message <20211011115737.7cc99e69 at firost>:
> Hi,
>
> I kept your full answer in the quoted history to keep the list informed.
>
> My answer is down below.
>
> On Mon, 11 Oct 2021 11:33:12 +0200
> damiano giuliani <damianogiuliani87 at gmail.com> wrote:
>
>> Hey guys, sorry for being late, I was busy during the weekend.
>>
>> Here I am:
>>
>>
>> > Did you see the swap activity (in/out, not just swap occupation) happen at
>> > the same time the member was lost on corosync side?
>> > Did you check corosync or some of its libs were indeed in swap?
>> >
>> No, and I don't know how to do it. I just noticed the swap occupation, which
>> suggested to me (and my colleague) to find out whether it could cause trouble.
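For what it's worth, swap in/out activity (as opposed to mere occupation) can be
watched live, and per-process swap usage can be read from /proc. Something along
these lines should do (a rough sketch, not from the original exchange):

  # si/so columns show pages swapped in/out per second
  vmstat 5

  # how much of the corosync process currently sits in swap
  grep VmSwap /proc/$(pidof corosync)/status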
>>
>> > First, corosync now sits on a lot of memory because of knet. Did you try to
>> > switch back to udpu, which uses way less memory?
>>
>>
>> No, I haven't moved to udpu; I can't stop the processes at all.
>>
>> "Could not lock memory of service to avoid page faults"
>>
>>
>> grep -rn 'Could not lock memory of service to avoid page faults' /var/log/*
>> returns nothing
Maybe the expression is too specific (try just "lock memory"), or syslog is in
the journal only (journalctl -b | grep "lock memory").
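If the startup message really never made it into the logs, you can also check
directly whether corosync managed to lock its memory; a quick sketch (assuming
corosync is running and logs to the systemd journal):

  journalctl -u corosync -b | grep -i "lock memory"

  # if mlockall() succeeded, VmLck should be roughly as large as VmRSS
  grep -e VmLck -e VmRSS /proc/$(pidof corosync)/status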
>
> This message should appear on corosync startup. Make sure the logs hadn't been
> rotated to a blackhole in the meantime...
>
>> > On my side, mlock is unlimited in the ulimit settings. Check the values
>> > in /proc/$(coro PID)/limits (be careful with the ulimit command, check the
>> > proc itself).
>>
>>
>> cat /proc/101350/limits
>> Limit                     Soft Limit           Hard Limit           Units
>> Max cpu time              unlimited            unlimited            seconds
>> Max file size             unlimited            unlimited            bytes
>> Max data size             unlimited            unlimited            bytes
>> Max stack size            8388608              unlimited            bytes
>> Max core file size        0                    unlimited            bytes
>> Max resident set          unlimited            unlimited            bytes
>> Max processes             770868               770868               processes
>> Max open files            1024                 4096                 files
>> Max locked memory         unlimited            unlimited            bytes
>> Max address space         unlimited            unlimited            bytes
>> Max file locks            unlimited            unlimited            locks
>> Max pending signals       770868               770868               signals
>> Max msgqueue size         819200               819200               bytes
>> Max nice priority         0                    0
>> Max realtime priority     0                    0
>> Max realtime timeout      unlimited            unlimited            us
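The output above already shows "Max locked memory: unlimited", which is what you
want. Just for reference, on systemd systems that limit can be pinned for the
corosync service with a drop-in (the file name below is only an example):

  mkdir -p /etc/systemd/system/corosync.service.d
  printf '[Service]\nLimitMEMLOCK=infinity\n' \
      > /etc/systemd/system/corosync.service.d/99-memlock.conf
  systemctl daemon-reload
  # corosync only picks this up on restart, so plan a maintenance window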
>>
>> > Ah... That's the first thing I change.
>> > In SLES, that is defaulted to 10s and so far I have never seen an
>> > environment that is stable enough for the default 1s timeout.
>>
>>
>> Old versions have a 10s default.
>> You are not going to fix the problem this way; a 1s timeout for a bonded
>> network and overkill hardware is an enormous amount of time.
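For reference, the token timeout lives in the totem section of
/etc/corosync/corosync.conf; raising it to the 10s mentioned above would look
roughly like this (the value is only the one discussed, not a recommendation):

  totem {
      # ... keep the existing cluster_name, transport, etc.
      token: 10000    # milliseconds; the upstream default is 1000
  }

On recent corosync versions the change can be applied with corosync-cfgtool -R,
otherwise restart corosync node by node during a maintenance window.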
>>
>> hostnamectl | grep Kernel
>>   Kernel: Linux 3.10.0-1160.6.1.el7.x86_64
>> [root@ltaoperdbs03 ~]# cat /etc/os-release
>> NAME="CentOS Linux"
>> VERSION="7 (Core)"
>>
>> > Indeed. But it's a trade-off between swapping process memory or freeing
>> > memory by removing data from cache. For database servers, it is advised to
>> > use a lower value for swappiness anyway, around 5-10, as a swapped process
>> > means longer queries, longer data in caches, piling sessions, etc.
>>
>>
>> Totally agree, for a DB server swappiness has to be 5-10.
>>
>> > Kernel?
>> > What are your settings for vm.dirty_* ?
>>
>>
>>
>> hostnamectl | grep Kernel
>>   Kernel: Linux 3.10.0-1160.6.1.el7.x86_64
>> [root@ltaoperdbs03 ~]# cat /etc/os-release
>> NAME="CentOS Linux"
>> VERSION="7 (Core)"
>>
>>
>> sysctl -a | grep dirty
>> vm.dirty_background_bytes = 0
>> vm.dirty_background_ratio = 10
>
> Considering your 256GB of physical memory, this means you can dirty up to 25GB
> of pages in cache before the kernel starts to write them to storage.
>
> You might want to trigger these lighter background syncs well before hitting
> this limit.
>
>> vm.dirty_bytes = 0
>> vm.dirty_expire_centisecs = 3000
>> vm.dirty_ratio = 20
>
> This is 20% of your 256GB of physical memory. After this limit, writes have to
> go to disks directly. Considering the time to write to SSD compared to memory,
> and the amount of data to sync in the background as well (52GB), this could be
> very painful.
However (unless doing really large commits) databases should flush buffers
rather frequently, so I doubt database operations would fill the dirty buffers
up to that limit.
"watch cat /proc/meminfo" could be your friend.
>
>> vm.dirty_writeback_centisecs = 500
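If you want the background writeback to start earlier, the ratio knobs can be
replaced with absolute byte values (the kernel zeroes the corresponding *_ratio
when a *_bytes value is set). The numbers below are purely illustrative and
should be sized for the actual storage:

  # e.g. /etc/sysctl.d/90-dirty.conf (file name is only an example)
  vm.dirty_background_bytes = 268435456   # 256MB before background writeback kicks in
  vm.dirty_bytes = 1073741824             # 1GB before writers are forced to flush

Apply with "sysctl --system" or at the next reboot.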
>>
>>
>> > Do you have a proof that swap was the problem?
>>
>>
>> Not at all, but after switching swappiness to 10 the cluster hasn't suddenly
>> swapped anymore for a month.
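Good to hear. Just make sure the value survives a reboot, e.g. (the file name is
only an example):

  sysctl -w vm.swappiness=10
  echo "vm.swappiness = 10" > /etc/sysctl.d/90-swappiness.conf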