[ClusterLabs] Master Postgresql in Cluster has high CPU load with idling processes

Abdul Qoyyuum Haji Abdul Kadir abdul.qoyyuum at gmail.com
Thu Aug 15 04:37:28 EDT 2019


Hi,

Postgres Version: 9.6
OS: RHEL 6.9

We have a Postgres cluster of 4 nodes in active-passive mode.

The cluster consists of four virtual machines: A, B, C and D. A is the master and replicates to B and C; C in turn replicates to D. B, C and D are slaves. A and B are on the Prod server, and C and D are on the DR server.

Nagios reported a high CPU load on A. I ran a top command and got the following:

top - 16:28:47 up 52 days,  6:16,  1 user,  load average: 11.84, 10.46, 8.30
Tasks: 382 total,  12 running, 370 sleeping,   0 stopped,   0 zombie
Cpu(s): 58.7%us, 40.9%sy,  0.2%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.1%si,  0.1%st
Mem:   8160248k total,  8081924k used,    78324k free,    15140k buffers
Swap:  1048572k total,   248464k used,   800108k free,  6976424k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 3467 root      20   0  162m 4420 3120 R 35.3  0.1  16:26.69 pg_basebackup
32753 postgres  20   0  329m 139m 138m R 26.7  1.8 804:59.36 postgres
32705 postgres  20   0  329m 139m 138m R 26.4  1.8 779:24.82 postgres
32742 postgres  20   0  329m 139m 138m S 25.8  1.8 878:46.92 postgres
32743 postgres  20   0  329m 139m 138m R 25.5  1.8 803:09.02 postgres
32744 postgres  20   0  329m 139m 138m R 25.2  1.8 826:17.30 postgres
32749 postgres  20   0  329m 139m 138m R 25.2  1.8 789:19.71 postgres
32751 postgres  20   0  329m 139m 138m S 24.9  1.8 906:57.35 postgres
32757 postgres  20   0  329m 139m 138m R 24.3  1.8 870:15.83 postgres
  300 postgres  20   0  329m 139m 138m R 23.9  1.8 803:15.69 postgres
32766 postgres  20   0  329m 139m 138m S 23.9  1.8 756:14.30 postgres
32759 postgres  20   0  329m 139m 138m R 19.6  1.8 810:19.53 postgres
29746 root      20   0  363m 255m 3956 S 16.0  3.2   1:59.70 puppet
12381 root      20   0  363m 251m  640 R  8.9  3.2   0:00.29 puppet
 2282 hacluste  20   0 98620 7764 6224 S  4.0  0.1   2932:00 cib
 8000 cas-qoyy  20   0 17376 1608 1028 R  2.8  0.0   0:02.80 top
 1415 root      20   0  135m  35m 2104 S  1.2  0.4 661:59.01 ruby
 1153 root      39  19  107m 5372 1280 S  0.9  0.1   0:07.37 rkhunter
 3477 postgres  20   0  328m 7108 6184 S  0.6  0.1   0:38.18 postgres
    7 root      RT   0     0    0    0 S  0.3  0.0   6:45.34 migration/1
 1139 root      16  -4 29764  700  612 S  0.3  0.0 229:44.85 auditd
 2048 root      RT   0  607m  90m  59m S  0.3  1.1  48:08.57 corosync
 2276 root      20   0 84656 2284 2244 S  0.3  0.0  12:41.27 pacemakerd
12380 root      39  19  107m 4424  328 S  0.3  0.1   0:00.01 rkhunter
31843 root      20   0  427m 2040 1184 S  0.3  0.0   2:40.27 rsyslogd
32188 postgres  20   0  327m 5704 5372 S  0.3  0.1  15:57.13 postgres
    1 root      20   0 21452 1116  920 S  0.0  0.0   8:17.28 init
    2 root      20   0     0    0    0 S  0.0  0.0   0:12.17 kthreadd
    3 root      RT   0     0    0    0 S  0.0  0.0   7:04.41 migration/0
    4 root      20   0     0    0    0 S  0.0  0.0   0:50.84 ksoftirqd/0
    5 root      RT   0     0    0    0 S  0.0  0.0   0:00.00 stopper/0
    6 root      RT   0     0    0    0 S  0.0  0.0   0:16.97 watchdog/0
    8 root      RT   0     0    0    0 S  0.0  0.0   0:00.00 stopper/1
    9 root      20   0     0    0    0 S  0.0  0.0   0:38.79 ksoftirqd/1
   10 root      RT   0     0    0    0 S  0.0  0.0   0:20.39 watchdog/1
   11 root      RT   0     0    0    0 S  0.0  0.0   6:41.52 migration/2
   12 root      RT   0     0    0    0 S  0.0  0.0   0:00.00 stopper/2
   13 root      20   0     0    0    0 S  0.0  0.0   0:37.38 ksoftirqd/2
   14 root      RT   0     0    0    0 S  0.0  0.0   0:22.16 watchdog/2
   15 root      RT   0     0    0    0 S  0.0  0.0   6:44.95 migration/3
   16 root      RT   0     0    0    0 S  0.0  0.0   0:00.00 stopper/3
   17 root      20   0     0    0    0 S  0.0  0.0   0:37.82 ksoftirqd/3
   18 root      RT   0     0    0    0 S  0.0  0.0   0:22.38 watchdog/3
   19 root      20   0     0    0    0 S  0.0  0.0  10:29.62 events/0
   20 root      20   0     0    0    0 S  0.0  0.0   8:41.09 events/1
   21 root      20   0     0    0    0 S  0.0  0.0   9:27.40 events/2
   22 root      20   0     0    0    0 S  0.0  0.0  11:39.86 events/3
   23 root      20   0     0    0    0 S  0.0  0.0   0:00.00 events/0
   24 root      20   0     0    0    0 S  0.0  0.0   0:00.00 events/1
   25 root      20   0     0    0    0 S  0.0  0.0   0:00.00 events/2
   26 root      20   0     0    0    0 S  0.0  0.0   0:00.00 events/3
   27 root      20   0     0    0    0 S  0.0  0.0   0:00.00 events_long/0
   28 root      20   0     0    0    0 S  0.0  0.0   0:00.00 events_long/1
   29 root      20   0     0    0    0 S  0.0  0.0   0:00.00 events_long/2
   30 root      20   0     0    0    0 S  0.0  0.0   0:00.00 events_long/3
   31 root      20   0     0    0    0 S  0.0  0.0   0:00.00 events_power_ef
   32 root      20   0     0    0    0 S  0.0  0.0   0:00.00 events_power_ef
   33 root      20   0     0    0    0 S  0.0  0.0   0:00.00 events_power_ef
   34 root      20   0     0    0    0 S  0.0  0.0   0:00.00 events_power_ef
   35 root      20   0     0    0    0 S  0.0  0.0   0:00.00 cgroup
   36 root      20   0     0    0    0 S  0.0  0.0   0:09.28 khelper
   37 root      20   0     0    0    0 S  0.0  0.0   0:00.00 netns
   38 root      20   0     0    0    0 S  0.0  0.0   0:00.00 async/mgr
   39 root      20   0     0    0    0 S  0.0  0.0   0:00.00 pm
   40 root      20   0     0    0    0 S  0.0  0.0   0:00.00 xenwatch
   41 root      20   0     0    0    0 S  0.0  0.0   0:51.00 xenbus
   42 root      20   0     0    0    0 S  0.0  0.0   0:30.13 sync_supers
   43 root      20   0     0    0    0 S  0.0  0.0   0:02.40 bdi-default
   44 root      20   0     0    0    0 S  0.0  0.0   0:00.00 kintegrityd/0
   45 root      20   0     0    0    0 S  0.0  0.0   0:00.00 kintegrityd/1
   46 root      20   0     0    0    0 S  0.0  0.0   0:00.00 kintegrityd/2
   47 root      20   0     0    0    0 S  0.0  0.0   0:00.00 kintegrityd/3
   48 root      20   0     0    0    0 S  0.0  0.0   0:54.85 kblockd/0
   49 root      20   0     0    0    0 S  0.0  0.0   0:22.47 kblockd/1
   50 root      20   0     0    0    0 S  0.0  0.0   0:21.88 kblockd/2
   51 root      20   0     0    0    0 S  0.0  0.0   0:21.29 kblockd/3
   52 root      20   0     0    0    0 S  0.0  0.0   0:00.00 ata_aux
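To see what the busy postgres backends from the top output are actually doing, it should be possible to match their PIDs against pg_stat_activity. A sketch for 9.6, assuming superuser access so the query text is visible:

  # list non-idle backends with what they are running and for how long
  psql -U postgres -d postgres -c "
      SELECT pid, state, wait_event_type, wait_event,
             now() - query_start AS runtime,
             left(query, 60) AS query
      FROM pg_stat_activity
      WHERE pid <> pg_backend_pid()
      ORDER BY runtime DESC NULLS LAST;"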

The log messages don't show much, but I noticed that there are a lot of idle postgres processes on this master VM. I'm unsure whether killing these idle processes would be wise, or whether they are still in use by any of the slaves or by the application.
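For the idle connections specifically, rather than killing them at the OS level, something like the following should show where they come from and, if needed, close only the ones that have been idle for a long time. This is only a sketch; the one-hour threshold is an arbitrary assumption:

  # how many idle backends per client address and user
  psql -U postgres -d postgres -c "
      SELECT client_addr, usename, count(*)
      FROM pg_stat_activity
      WHERE state = 'idle'
      GROUP BY client_addr, usename;"

  # close only long-idle backends; pg_terminate_backend() is the supported way,
  # never kill -9 a backend from the shell, as that forces a full server restart
  psql -U postgres -d postgres -c "
      SELECT pg_terminate_backend(pid)
      FROM pg_stat_activity
      WHERE state = 'idle'
        AND state_change < now() - interval '1 hour'
        AND pid <> pg_backend_pid();"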

