[ClusterLabs] Timeout stopping corosync-qdevice service

Andrei Borzenkov arvidjaar at gmail.com
Tue Apr 30 13:39:11 EDT 2019


30.04.2019 9:51, Jan Friesse wrote:
> 
>> Now, corosync-qdevice gets SIGTERM as the "signal to terminate", but it
>> installs a SIGTERM handler that does not exit and only closes a socket.
>> Maybe this should trigger termination of the main loop, but somehow it
>> does not.
> 
> Yep, this is exactly how qdevice daemon shutdown works. The signal handler
> just closes a socket (which should be signal safe), and poll in the main
> loop then does its job, so the main loop terminates.
> 

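That is a bug in corosync 2.4.4, which is still used in TW. The stop action
uses pidof, and I have two corosync-qdevice processes, so pidof returns both
PIDs and kill receives them as a single quoted argument. corosync-qdevice
therefore never gets the signal in the first place.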


++ pidof corosync-qdevice
+ kill -TERM '1812 1811'
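
To illustrate why the trace above delivers no signal, here is a minimal
sketch that does not touch corosync at all. Two background sleep processes
stand in for the two corosync-qdevice processes, and the variable names are
made up for this example. It only shows the difference between passing both
PIDs as one quoted word and letting the shell split them.

```shell
#!/bin/sh
# Two background sleepers stand in for the two corosync-qdevice processes.
sleep 300 &
pid1=$!
sleep 300 &
pid2=$!
pids="$pid1 $pid2"    # what pidof would print: both PIDs on one line

# Quoted: kill receives the single argument "PID1 PID2", which is not a
# valid PID, so it fails and no signal is delivered.
if kill -TERM "$pids" 2>/dev/null; then
    quoted_result=ok
else
    quoted_result=failed
fi

# Both processes should still be running at this point.
if kill -0 "$pid1" 2>/dev/null && kill -0 "$pid2" 2>/dev/null; then
    alive_after_quoted=yes
else
    alive_after_quoted=no
fi

# Unquoted: word splitting turns the string into two separate PID arguments.
kill -TERM $pids
wait "$pid1" "$pid2" 2>/dev/null || true

if kill -0 "$pid1" 2>/dev/null || kill -0 "$pid2" 2>/dev/null; then
    alive_after_split=yes
else
    alive_after_split=no
fi

echo "quoted kill: $quoted_result, still alive afterwards: $alive_after_quoted"
echo "unquoted kill: still alive afterwards: $alive_after_split"
```

Splitting the pidof output would make the signal arrive, but it would also
hit every process with that name, which is not necessarily what a stop
action should do.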

Current git was changed to use a PID file (although for different reasons),
so the bug should now be fixed there as a side effect.

commit 1965225e3e2728beb1f77bed2e8f14edb72fe586 (tag: v2.93.0)
Author: Jan Friesse <jfriesse at redhat.com>
Date:   Wed Nov 14 17:52:11 2018 +0100

    init: Fix init scripts to work with containers

    Previously init scripts were not using pid file so pidof was used. This
    is usually not a problem, but when containers are used it may result to
    killing improper instance when issued on host.

    Solution is to always use pidfile.
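
For illustration, a pidfile-based stop could look roughly like the sketch
below. This is not the code from the commit: stop_by_pidfile is a
hypothetical helper, the pidfile is a temporary file, and sleep processes
again stand in for the daemon. The point is only that exactly one recorded
PID is signalled, no matter how many processes share the same name.

```shell
#!/bin/sh
# Hypothetical helper, not the actual corosync init script code.
stop_by_pidfile() {
    file=$1
    [ -r "$file" ] || return 1
    pid=$(cat "$file")
    # Refuse anything that is not a single decimal PID.
    case $pid in
        ''|*[!0-9]*) return 1 ;;
    esac
    kill -TERM "$pid"
}

# Two processes with the same name; only one is recorded in the pidfile.
sleep 300 &
target=$!
sleep 300 &
other=$!

pidfile=$(mktemp)
echo "$target" > "$pidfile"

stop_by_pidfile "$pidfile"
wait "$target" 2>/dev/null || true

if kill -0 "$target" 2>/dev/null; then target_alive=yes; else target_alive=no; fi
if kill -0 "$other" 2>/dev/null; then other_alive=yes; else other_alive=no; fi
echo "target alive: $target_alive, other alive: $other_alive"

# Clean up the process that was deliberately left running.
kill -TERM "$other"
wait "$other" 2>/dev/null || true
rm -f "$pidfile"
```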


