[ClusterLabs] Regression in Filesystem RA
Christian Balzer
chibi at gol.com
Thu Oct 12 02:30:30 EDT 2017
Hello,
2nd post in 10 years, let's see if this one gets an answer unlike the first
one...
One of the main use cases for pacemaker here is DRBD replicated
active/active mailbox servers (dovecot/exim) on Debian machines.
We've been doing this for a long time, as evidenced by the oldest pair
still running Wheezy with heartbeat and pacemaker 1.1.7.
The majority of cluster pairs is on Jessie with corosync and backported
pacemaker 1.1.16.
Yesterday we had a hiccup, resulting in half the machines losing
their upstream router for 50 seconds which in turn caused the pingd RA to
trigger a fail-over of the DRBD RA and associated resource group
(filesystem/IP) to the other node.
The old cluster performed flawlessly; the newer clusters all wound up with
the DRBD and FS resources being BLOCKED, as the processes holding the
filesystem open didn't get killed fast enough.
Comparing the 2 RAs (no versioning T_T) reveals a large change in the
"signal_processes" routine.
So with the old Filesystem RA using fuser we get something like this and
thousands of processes killed per second:
---
Oct 11 15:06:35 mbx07 lrmd: [4731]: info: RA output: (res_Filesystem_mb07:stop:stdout) 3478 3593 3597 3618 3654 3705 3708 3716 3736 3781 3792 3804 3963 3964 3972 3974 3978 3980 3981 3982 3985 3987 3991 3996 4002 4008 4013 4030
Oct 11 15:06:35 mbx07 lrmd: [4731]: info: RA output: (res_Filesystem_mb07:stop:stderr) cmccmccmccmcmcmcmcmccmccmcmcmcmcmcmcmcmcmcmcmcmccmcm
Oct 11 15:06:35 mbx07 lrmd: [4731]: info: RA output: (res_Filesystem_mb07:stop:stdout) 4032 4058 4086 4107 4199 4230 4320 4336 4362 4420 4429 4432 4435 4450 4468 4470 4471 4498 4510 4519 4584 4592 4604 4607 4632 4638 4640 4649 4676 4722 4765
---
Whereas the new RA (newer isn't better), which goes around killing processes
individually with beautiful logging, was a total fail, managing only about
4 processes killed per second...
---
Oct 11 15:06:46 mbx10 Filesystem(res_Filesystem_mb10)[288712]: INFO: sending signal TERM to: mail 4226 4909 0 09:43 ? S 0:00 dovecot/imap
Oct 11 15:06:46 mbx10 Filesystem(res_Filesystem_mb10)[288712]: INFO: sending signal TERM to: mail 4229 4909 0 09:43 ? S 0:00 dovecot/imap [idling]
Oct 11 15:06:46 mbx10 Filesystem(res_Filesystem_mb10)[288712]: INFO: sending signal TERM to: mail 4238 4909 0 09:43 ? S 0:00 dovecot/imap
Oct 11 15:06:46 mbx10 Filesystem(res_Filesystem_mb10)[288712]: INFO: sending signal TERM to: mail 4239 4909 0 09:43 ? S 0:00 dovecot/imap
---
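To illustrate the difference between the two approaches, here is a rough
Python sketch. It is not the RA's code (the RA is a shell script), and the
function names are made up for the example. It assumes the slowdown comes from
doing extra per-process work, here a ps(1) lookup for the log line, before each
signal is sent; I have not confirmed that this is exactly what the new
"signal_processes" does.

```python
import os
import signal
import subprocess
import sys


def spawn_sleepers(count):
    """Start `count` idle child processes standing in for dovecot/imap workers."""
    cmd = [sys.executable, "-c", "import time; time.sleep(60)"]
    return [subprocess.Popen(cmd) for _ in range(count)]


def signal_batch(pids, sig=signal.SIGTERM):
    """Signal every PID in one tight loop, with no per-process lookup."""
    sent = 0
    for pid in pids:
        try:
            os.kill(pid, sig)
            sent += 1
        except ProcessLookupError:
            pass  # process already exited
    return sent


def signal_individually(pids, sig=signal.SIGTERM):
    """Look each PID up with ps(1), log it, then signal it, one at a time."""
    sent = 0
    for pid in pids:
        # One extra fork/exec per PID: the assumed bottleneck in this sketch.
        info = subprocess.run(
            ["ps", "-o", "pid=,args=", "-p", str(pid)],
            capture_output=True, text=True,
        ).stdout.strip()
        print(f"sending signal {signal.Signals(sig).name} to: {info}")
        try:
            os.kill(pid, sig)
            sent += 1
        except ProcessLookupError:
            pass  # process already exited
    return sent
```

Timing both functions against a few thousand sleepers from spawn_sleepers()
should show how the cost of one external command per PID adds up, though the
absolute numbers will depend on the machine and its load.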
So my questions are:
1. Am I the only one with more than a handful of processes per FS who
can't afford to wait hours for the new routine to finish?
2. Can we have the old FUSER (kill) mode back?
Regards,
Christian
--
Christian Balzer Network/Systems Engineer
chibi at gol.com Rakuten Communications