[ClusterLabs] Timeout stopping corosync-qdevice service
Jan Friesse
jfriesse at redhat.com
Thu May 2 02:45:46 EDT 2019
Andrei,
> 30.04.2019 9:51, Jan Friesse wrote:
>>
>>> Now, corosync-qdevice gets SIGTERM as "signal to terminate", but it
>>> installs a SIGTERM handler that does not exit and only closes some socket.
>>> Maybe this should trigger termination of the main loop, but somehow it does
>>> not.
>>
>> Yep, this is exactly how qdevice daemon shutdown works. The signal handler
>> just closes a socket (which should be signal-safe), and poll in the main loop
>> does its job, so the main loop terminates.
>>
>
> That is a bug in corosync 2.4.4, which is still used in TW. The stop action
> uses pidof, and I have two corosync-qdevice processes, so corosync-qdevice
> never gets the signal in the first place.
Oh, that explains it.
>
>
> ++ pidof corosync-qdevice
> + kill -TERM '1812 1811'
>
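The trace above can be reproduced without corosync. The following is a minimal sketch, assuming (from the single-quoted argument in the trace) that the PID list reaches kill as one word. Two background sleep processes stand in for the two corosync-qdevice processes, and the variable names are illustrative only, not taken from the init script.

```shell
#!/bin/sh
# Two background sleeps stand in for the two corosync-qdevice processes.
sleep 300 &
pid1=$!
sleep 300 &
pid2=$!
pids="$pid1 $pid2"   # same shape as the pidof output in the trace

# Quoted: kill receives the single argument "PID1 PID2" and rejects it,
# so no signal is delivered to either process.
if kill -TERM "$pids" 2>/dev/null; then
    quoted_result="signalled"
else
    quoted_result="rejected"
fi

# Both processes are still alive, so a stop action would wait until it times out.
if kill -0 "$pid1" 2>/dev/null && kill -0 "$pid2" 2>/dev/null; then
    still_running="yes"
else
    still_running="no"
fi

# Clean up: the unquoted expansion is split into two separate PIDs.
kill -TERM $pids
wait "$pid1" "$pid2" 2>/dev/null
echo "quoted kill: $quoted_result, still running afterwards: $still_running"
```

With a single process the quoted argument is a valid PID, which is presumably why the problem only shows up when two instances are running.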
> Current git was changed to use a PID file (although for different
> reasons), so the bug should now be fixed as a side effect.
It's probably time for a 2.4.5 release.
Anyway, thanks a lot for digging into the problem and finding the solution!
Regards,
Honza
>
> commit 1965225e3e2728beb1f77bed2e8f14edb72fe586 (tag: v2.93.0)
> Author: Jan Friesse <jfriesse at redhat.com>
> Date: Wed Nov 14 17:52:11 2018 +0100
>
> init: Fix init scripts to work with containers
>
> Previously init scripts were not using pid file so pidof was used. This
> is usually not a problem, but when containers are used it may result to
> killing improper instance when issued on host.
>
> Solution is to always use pidfile.
>
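The pidfile approach described in the commit message can be sketched roughly as below. This is not the actual corosync init script: the pidfile path, the stop_by_pidfile function, and the background sleep standing in for the daemon are all assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical pidfile location; a real init script would use the daemon's own path.
pidfile="${TMPDIR:-/tmp}/qdevice-sketch.$$.pid"

# Stand-in daemon: a background sleep whose PID is recorded at start.
sleep 300 &
echo "$!" > "$pidfile"

# Hypothetical helper: signal exactly the PID recorded in the given pidfile.
stop_by_pidfile() {
    # Fail if the file is missing, unreadable, or empty.
    [ -r "$1" ] || return 1
    read -r pid < "$1" || return 1
    [ -n "$pid" ] || return 1
    # One PID, one argument: no dependence on how many instances pidof would find.
    kill -TERM "$pid" 2>/dev/null || return 1
    rm -f "$1"
}

if stop_by_pidfile "$pidfile"; then
    stop_result="stopped"
else
    stop_result="failed"
fi
wait 2>/dev/null
echo "stop via pidfile: $stop_result"
```

Because the PID comes from a file written by the instance being managed, a second process with the same name (on the host or in a container) does not affect which process is signalled.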