[ClusterLabs] Timeout stopping corosync-qdevice service

Andrei Borzenkov arvidjaar at gmail.com
Tue Apr 30 00:01:33 EDT 2019


On 29.04.2019 14:32, Jan Friesse wrote:
> Andrei,
> 
>> I set up qdevice in openSUSE Tumbleweed and while it works as expected I
> 
> Is it corosync-qdevice or corosync-qnetd daemon?
> 

corosync-qdevice

>> cannot stop it - it always results in timeout and service finally gets
>> killed by systemd.
>>
>> Is it a known issue? TW is having quite up-to-date versions, it usually
> 
> Nope
> 
>> follows upstream GIT pretty closely.
> 
> Anything in logs?
> 

Nothing except corosync-qdevice not being stopped by SIGTERM:

Apr 29 20:16:57 ha2 systemd[1]: Stopping Corosync Qdevice daemon...
Apr 29 20:16:57 ha2 corosync-qdevice[3419]: Signaling Corosync Qdevice daemon (corosync-qdevice) to terminate: [  OK  ]
Apr 29 20:18:27 ha2 systemd[1]: corosync-qdevice.service: Stopping timed out. Terminating.
Apr 29 20:18:27 ha2 corosync-qdevice[3085]: Lost connection with heuristics worker
Apr 29 20:18:27 ha2 systemd[1]: corosync-qdevice.service: Control process exited, code=killed, status=15/TERM
Apr 29 20:18:27 ha2 corosync-qdevice[3419]: Waiting for corosync-qdevice services to unload:...................................>
Apr 29 20:18:27 ha2 corosync-qdevice[3085]: Heuristics worker waitpid failed (10): No child processes
Apr 29 20:18:27 ha2 systemd[1]: corosync-qdevice.service: Failed with result 'timeout'.
Apr 29 20:18:27 ha2 systemd[1]: Stopped Corosync Qdevice daemon.

Now, corosync-qdevice gets SIGTERM as the "signal to terminate", but it
installs a SIGTERM handler that does not exit and only closes some
socket. Maybe this should trigger termination of the main loop, but
somehow it does not.
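
To illustrate the pattern I mean, here is a minimal sketch in Python. It
is not taken from the corosync-qdevice source; the names
install_sigterm_handler, main_loop, wake_w and wake_r are made up for
this example. It assumes the handler closes one end of a socket pair and
the main loop polls the other end, so the close is what ends the loop.

```python
import os
import select
import signal
import socket


def install_sigterm_handler():
    """Install a SIGTERM handler that only closes a socket; return the peer."""
    # Hypothetical socket pair: the handler owns wake_w, the loop watches wake_r.
    wake_w, wake_r = socket.socketpair()

    def on_sigterm(signum, frame):
        # The handler does not exit. Closing this end makes the peer
        # report end-of-file, which the main loop can observe via poll().
        wake_w.close()

    signal.signal(signal.SIGTERM, on_sigterm)
    return wake_r


def main_loop(wake_r, timeout_s):
    """Poll until the wake-up socket fires or the timeout expires."""
    poller = select.poll()
    poller.register(wake_r, select.POLLIN | select.POLLHUP)
    while True:
        events = poller.poll(int(timeout_s * 1000))
        if not events:
            # Nothing woke the loop: this is the "stop timed out" case.
            return "timeout"
        if any(fd == wake_r.fileno() for fd, _ in events):
            # The handler's close() was noticed, so the loop terminates.
            return "terminated"


if __name__ == "__main__":
    r = install_sigterm_handler()
    os.kill(os.getpid(), signal.SIGTERM)
    print(main_loop(r, 1.0))
```

If the main loop did not have the socket registered with its poller (or
the handler closed a different socket), SIGTERM would be consumed by the
handler and the loop would keep running until something else kills it,
which would match what I see here.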

corosync-qnetd stops normally BTW.

