[ClusterLabs] Timeout stopping corosync-qdevice service
jfriesse at redhat.com
Thu May 2 02:45:46 EDT 2019
> On 30.04.2019 9:51, Jan Friesse wrote:
>>> Now, corosync-qdevice gets SIGTERM as "signal to terminate", but it
>>> installs SIGTERM handler that does not exit and only closes some socket.
>>> Maybe this should trigger termination of main loop, but somehow it does
>> Yep, this is exactly how qdevice daemon shutdown works. The signal
>> handler just closes a socket (which should be signal safe), poll in the
>> main loop then does its job, and so the main loop is terminated.
> That is a bug in corosync 2.4.4, which is still used in TW. The stop
> action uses pidof; I have two corosync-qdevice processes, so
> corosync-qdevice never gets the signal in the first place.
Oh, that explains it.
> ++ pidof corosync-qdevice
> + kill -TERM '1812 1811'
> Current git was changed to use a PID file (although for different
> reasons), so the bug should now be fixed as a side effect.
It's probably time for a 2.4.5 release.
Anyway, thanks a lot for digging into the problem and finding the solution!
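
For anyone hitting the same thing: the quoted trace (`kill -TERM '1812 1811'`) shows kill getting both PIDs as one single argument, which is not a valid PID, so nothing is signalled. Below is a minimal sketch that reproduces this. It assumes two throwaway `sleep` processes as stand-ins for the two corosync-qdevice processes, and the variable names are made up for illustration. It is not the actual init script code.

```shell
#!/bin/sh
# Two throwaway background processes stand in for the two
# corosync-qdevice processes (hypothetical stand-ins, not the real daemon).
sleep 300 &
pid1=$!
sleep 300 &
pid2=$!

# Mimics what pidof prints when two processes match: one line,
# PIDs separated by a space.
pids="$pid2 $pid1"

# Quoted: kill receives ONE argument "PID PID", which is not a valid PID,
# so kill fails and no signal is delivered.
if kill -TERM "$pids" 2>/dev/null; then
    quoted_result="delivered"
else
    quoted_result="failed"
fi

# Both stand-ins should still be alive at this point.
if kill -0 "$pid1" 2>/dev/null && kill -0 "$pid2" 2>/dev/null; then
    still_running="yes"
else
    still_running="no"
fi

# Unquoted: word splitting yields two arguments, so both get SIGTERM.
kill -TERM $pids
# Reap the stand-ins; their exit status (killed by signal) is not an error here.
wait "$pid1" "$pid2" 2>/dev/null || true

echo "quoted kill: $quoted_result, still running afterwards: $still_running"
```

Dropping the quotes only hides the problem, because every matching process (including ones in containers, when run on the host) would then get the signal.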
> commit 1965225e3e2728beb1f77bed2e8f14edb72fe586 (tag: v2.93.0)
> Author: Jan Friesse <jfriesse at redhat.com>
> Date: Wed Nov 14 17:52:11 2018 +0100
> init: Fix init scripts to work with containers
> Previously init scripts were not using pid file so pidof was used. This
> is usually not a problem, but when containers are used it may result to
> killing improper instance when issued on host.
> Solution is to always use pidfile.
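
A rough sketch of what "always use pidfile" can look like in a stop action is below. This is only an illustration under my own assumptions and not the code from the commit above: the function name `stop_by_pidfile`, its arguments and the default timeout are all hypothetical.

```shell
#!/bin/sh
# stop_by_pidfile PIDFILE [TIMEOUT_SECONDS]
# Hypothetical helper, NOT the actual corosync init script code.
# Note: plain POSIX sh has no "local", so these variables are global.
stop_by_pidfile() {
    pidfile=$1
    timeout=${2:-10}

    # No readable PID file: treat the service as already stopped.
    [ -r "$pidfile" ] || return 0

    # Accept exactly one PID; refuse anything that is not a plain number
    # (this also rejects a "1812 1811" style list).
    pid=$(cat "$pidfile")
    case $pid in
        ''|*[!0-9]*) return 1 ;;
    esac

    # Signal only that one process. If it is already gone, clean up.
    kill -TERM "$pid" 2>/dev/null || { rm -f "$pidfile"; return 0; }

    # Wait for the process to exit, up to the timeout.
    while [ "$timeout" -gt 0 ]; do
        kill -0 "$pid" 2>/dev/null || { rm -f "$pidfile"; return 0; }
        sleep 1
        timeout=$((timeout - 1))
    done

    # Still running after the timeout: report failure to the caller.
    return 1
}
```

Because only the PID recorded by that one instance is signalled, a second process with the same name (on the host or in a container) is left alone.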