[ClusterLabs] Timeout stopping corosync-qdevice service

Jan Friesse jfriesse at redhat.com
Tue Apr 30 02:51:49 EDT 2019


Andrei,

> 29.04.2019 14:32, Jan Friesse wrote:
>> Andrei,
>>
>>> I set up qdevice in openSUSE Tumbleweed and while it works as expected, I
>>
>> Is it corosync-qdevice or corosync-qnetd daemon?
>>
> 
> corosync-qdevice
> 
>>> cannot stop it - it always results in timeout and service finally gets
>>> killed by systemd.
>>>
>>> Is it a known issue? TW has quite up-to-date versions; it usually
>>
>> Nope
>>
>>> follows upstream GIT pretty closely.
>>
>> Anything in logs?
>>
> 
> Nothing except corosync-qdevice not being stopped by SIGTERM:
> 
> Apr 29 20:16:57 ha2 systemd[1]: Stopping Corosync Qdevice daemon...
> Apr 29 20:16:57 ha2 corosync-qdevice[3419]: Signaling Corosync Qdevice
> daemon (corosync-qdevice) to terminate: [  OK  ]
> Apr 29 20:18:27 ha2 systemd[1]: corosync-qdevice.service: Stopping timed
> out. Terminating.
> Apr 29 20:18:27 ha2 corosync-qdevice[3085]: Lost connection with
> heuristics worker
> Apr 29 20:18:27 ha2 systemd[1]: corosync-qdevice.service: Control
> process exited, code=killed, status=15/TERM
> Apr 29 20:18:27 ha2 corosync-qdevice[3419]: Waiting for corosync-qdevice
> services to unload:...................................>
> Apr 29 20:18:27 ha2 corosync-qdevice[3085]: Heuristics worker waitpid
> failed (10): No child processes
> Apr 29 20:18:27 ha2 systemd[1]: corosync-qdevice.service: Failed with
> result 'timeout'.
> Apr 29 20:18:27 ha2 systemd[1]: Stopped Corosync Qdevice daemon.
> 

Hmm, nothing obvious. Would you mind trying to increase the log level to
debug/trace? (Just add "debug: trace" to corosync.conf and reload the
config or restart corosync; corosync-qdevice will use this value.)
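
For reference, that setting lives in the logging section. A minimal
fragment, assuming the rest of your existing corosync.conf stays as it
is (only the debug line is the point here):

```
logging {
    # "trace" is the most verbose debug level
    debug: trace
}
```

Reloading with "corosync-cfgtool -R" should be enough; if in doubt,
restart corosync and corosync-qdevice instead.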

> Now, corosync-qdevice gets SIGTERM as "signal to terminate", but it
> installs SIGTERM handler that does not exit and only closes some socket.
> Maybe this should trigger termination of the main loop, but somehow it
> does not.

Yep, this is exactly how qdevice daemon shutdown works. The signal
handler just closes a socket (which should be signal-safe), and poll in
the main loop does its job, so the main loop is terminated.
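
To illustrate the pattern (this is only a rough Python sketch, not the
actual qdevice source, which is C): the SIGTERM handler closes one end
of a socket pair and the poll loop notices EOF on the other end and
returns. The names main_loop, demo, wake_r and wake_w are made up for
the example, and it assumes a Unix system where select.poll is
available.

```python
import os
import select
import signal
import socket


def main_loop(wake_r, timeout_ms=200, max_ticks=25):
    """Poll wake_r until its peer is closed; return why the loop ended."""
    poller = select.poll()
    poller.register(wake_r, select.POLLIN)
    for _ in range(max_ticks):
        for _fd, events in poller.poll(timeout_ms):
            # A closed peer shows up as readable/hung up, and recv() returns b"".
            if events & (select.POLLIN | select.POLLHUP) and wake_r.recv(1) == b"":
                return "terminated"
    # Bounded only so the sketch cannot hang; a real daemon would keep looping.
    return "timed out"


def demo():
    """Install the handler, send SIGTERM to ourselves, run the loop."""
    wake_r, wake_w = socket.socketpair()

    def on_sigterm(signum, frame):
        # No exit here: the handler only closes one end of the socket pair.
        wake_w.close()

    previous = signal.signal(signal.SIGTERM, on_sigterm)
    try:
        os.kill(os.getpid(), signal.SIGTERM)  # stands in for "systemctl stop"
        return main_loop(wake_r)
    finally:
        # Restore whatever handler was installed before the demo.
        signal.signal(signal.SIGTERM, previous or signal.SIG_DFL)
        wake_r.close()
        wake_w.close()


if __name__ == "__main__":
    print(demo())
```

If the loop never sees the close (for example because it is blocked on
something else), the process keeps running after SIGTERM, which would
match the timeout you see from systemd.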

Regards,
   Honza

> 
> corosync-qnetd stops normally BTW.
> 


