[ClusterLabs] Antw: [EXT] unexpected fenced node and promotion of the new master PAF - postgres

Jehan-Guillaume de Rorthais jgdr at dalibo.com
Sat Oct 9 15:28:43 EDT 2021


On Sat, 9 Oct 2021 09:55:28 +0300
Andrei Borzenkov <arvidjaar at gmail.com> wrote:

> On 08.10.2021 16:00, damiano giuliani wrote:
> > ...
> > the servers are all resource overkill, with 80 CPUs and 256 GB of RAM,
> > even though the DB ingests millions of records per day; the network is
> > bonded 10 Gb/s, with SSD disks.

I don't remember if we discussed this: have you ever experienced IO bursts
during, e.g., batch jobs or high R/W concurrency? I'm thinking about IO/cache
pressure where the system can stall at some point...
We once experienced really bad IO behavior under pressure because of a bad
interaction between MegaRAID and kernel 4.x (4.15 maybe?)...
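
If you want to catch such stalls in the act, one option (on kernels with PSI,
4.20+) is to watch the IO pressure stall counters during a batch run; a
minimal sketch:

  # PSI (kernel >= 4.20, CONFIG_PSI=y); the "full" line means all
  # non-idle tasks were stalled on IO at the same time
  cat /proc/pressure/io
  # coarser, works everywhere: watch the "wa" (iowait) and "b"
  # (uninterruptible sleep) columns
  vmstat 1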

What are your settings for vm.dirty_*?
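
For reference, something like this dumps the relevant knobs (the names are
standard; the values are distro defaults unless someone tuned them):

  sysctl vm.dirty_ratio vm.dirty_background_ratio \
         vm.dirty_bytes vm.dirty_background_bytes \
         vm.dirty_expire_centisecs vm.dirty_writeback_centisecs

With 256 GB of RAM, the usual dirty_ratio default of 20% allows roughly
50 GB of dirty pages before writers get throttled, which alone can explain
multi-second stalls when it all flushes at once.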

> > ...
> > So it turns out that a little bit of swap was used, and I suspect the
> > corosync process was swapped to disk, creating lag where the 1s default
> > corosync timeout was not enough.
> 
> But you do not know whether corosync was swapped out at all. So it is
> just a guess.

Exactly. Moreover, corosync mlocks its own memory. As I wrote in an earlier
answer, Damiano should probably check his Corosync logs for errors on
**startup**.
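
Something like this would settle both questions quickly (the exact log
wording varies with the Corosync version, so grep loosely):

  # is corosync sitting in swap right now?
  grep VmSwap /proc/$(pidof corosync)/status
  # did it complain at startup that it couldn't lock its memory?
  journalctl -u corosync | grep -i mlock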

> > So it is: swap doesn't log anything, and moving a process's allocated
> > RAM to swap takes more time than the 1s default timeout (probably many
> > times more). I fixed it by changing the swappiness of each server to 10
> > (at minimum), preventing the corosync process from swapping.
> 
> The swappiness kernel parameter does not really prevent swap from being
> used.

Indeed. But it's a trade-off between swapping process memory and freeing
memory by evicting data from the page cache. For database servers, it is
advisable to use a lower swappiness value anyway, around 5-10, as a swapped
process means longer queries, data sitting longer in caches, piling-up
sessions, etc.
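
For the record, a sketch of the usual way to apply that persistently (the
file name is arbitrary, the value to taste):

  # runtime change
  sysctl -w vm.swappiness=10
  # across reboots
  echo 'vm.swappiness = 10' > /etc/sysctl.d/90-swappiness.conf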

But I still doubt corosync could be swapped, unless it complained on startup
that it couldn't mlock its memory.

> What is your kernel version? On several consecutive kernel versions I
> observed the following effect: once swap started being used at all, the
> system experienced periodic stalls of several seconds. It felt like a
> frozen system. It did not matter how much swap was allocated -
> several megabytes was already enough.
> 
> As far as I understand, the problem was not really the time to swap
> out/in, but the time the kernel spent traversing page tables to make a
> decision. I think it started with kernel 5.3 (or maybe 5.2), and I have
> not seen it any more since, I believe, kernel 5.7.

Interesting.

++

