[ClusterLabs] corosync-qdevice[3772]: Heuristics worker waitpid failed (10): No child processes

Andrei Borzenkov arvidjaar at gmail.com
Sun May 5 01:36:02 EDT 2019


While testing corosync-qdevice I repeatedly got the above message. The
reason seems to be the startup sequence in corosync-qdevice. Consider:


● corosync-qdevice.service - Corosync Qdevice daemon
   Loaded: loaded (/etc/systemd/system/corosync-qdevice.service; disabled; vendor preset: disabled)
   Active: active (running) since Sun 2019-05-05 08:22:03 MSK; 2s ago
     Docs: man:corosync-qdevice
  Process: 3770 ExecStart=/usr/sbin/corosync-qdevice $COROSYNC_QDEVICE_OPTIONS (code=exited, status=0/SUCCESS)
 Main PID: 3772 (corosync-qdevic)
    Tasks: 2 (limit: 553)
   Memory: 2.1M
   CGroup: /system.slice/corosync-qdevice.service
           ├─3771 /usr/sbin/corosync-qdevice
           └─3772 /usr/sbin/corosync-qdevice

...
May 05 08:11:41 ha2 corosync-qdevice[3772]: Heuristics worker waitpid failed (10): No child processes
May 05 08:11:41 ha2 systemd[1]: Stopping Corosync Qdevice daemon...

The startup sequence of corosync-qdevice is:

1. PID 3770 forks off the heuristics worker (PID 3771) in
qdevice_heuristics_init(). The parent of PID 3771 is PID 3770.
2. PID 3770 calls utils_tty_detach() to daemonize. PID 3770 forks off a
child (PID 3772) and exits. At this point both PID 3771 and PID 3772 are
reparented to PID 1, so 3772 can NOT receive the status of 3771:
waitpid() only works on the caller's own children, and 3771 was never a
child of 3772.
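
The sequence above can be replayed outside corosync-qdevice. The following
is only a minimal sketch under my own assumptions: the function name
sibling_waitpid_errno() and the pipe used to report the result are
hypothetical and are not taken from the corosync-qdevice sources. It forks a
"launcher" (standing in for PID 3770), which forks a worker and then a
"daemon" in the same order, and returns the errno the daemon gets from
waitpid() on the worker.

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper, not corosync-qdevice code.
 * Returns the errno seen by the "daemon" when it calls waitpid() on the
 * worker, 0 if waitpid() did not fail, or -1 if the demo could not be set up.
 */
int sibling_waitpid_errno(void)
{
    int report[2];
    int err = -1;
    pid_t launcher;

    if (pipe(report) == -1)
        return -1;

    launcher = fork();                 /* plays the role of PID 3770 */
    if (launcher == -1) {
        close(report[0]);
        close(report[1]);
        return -1;
    }

    if (launcher == 0) {
        pid_t worker, daemon;

        close(report[0]);
        worker = fork();               /* step 1: worker, like PID 3771 */
        if (worker == -1)
            _exit(1);
        if (worker == 0) {
            close(report[1]);
            pause();                   /* idle until killed */
            _exit(0);
        }

        daemon = fork();               /* step 2: daemonize, like PID 3772 */
        if (daemon == 0) {
            int seen = 0;

            /* The worker is a sibling of this process, never its child. */
            if (waitpid(worker, NULL, WNOHANG) == -1)
                seen = errno;
            kill(worker, SIGKILL);     /* clean up the demo worker */
            _exit(write(report[1], &seen, sizeof seen)
                  == (ssize_t)sizeof seen ? 0 : 1);
        }
        if (daemon == -1)
            kill(worker, SIGKILL);
        _exit(daemon == -1);           /* launcher exits, orphaning both */
    }

    /* Original caller: collect the daemon's report and reap the launcher. */
    close(report[1]);
    if (read(report[0], &err, sizeof err) != (ssize_t)sizeof err)
        err = -1;
    close(report[0]);
    while (waitpid(launcher, NULL, 0) == -1 && errno == EINTR)
        ;
    return err;
}
```

The value returned should be ECHILD, which is errno 10 on Linux and matches
the "(10): No child processes" in the log. Note that the failure does not
depend on whether the launcher has already exited: the worker is simply not
a child of the process that waits for it.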

Backgrounding is the default behavior. Under systemd it can trivially be
turned off (-f) and the service defined as Type=simple. As there is no
consumer of corosync-qdevice, it does not matter: nothing needs to wait
for it. Here is an example service which seems to work for me:

[Unit]
Description=Corosync Qdevice daemon
Documentation=man:corosync-qdevice
ConditionKernelCommandLine=!nocluster
Wants=corosync.service
After=corosync.service

[Service]
EnvironmentFile=-/etc/sysconfig/corosync-qdevice
ExecStart=/usr/sbin/corosync-qdevice -f $COROSYNC_QDEVICE_OPTIONS
Type=simple
RuntimeDirectory=corosync-qdevice
RuntimeDirectoryMode=0770
KillMode=mixed

[Install]
WantedBy=multi-user.target



with the result

● corosync-qdevice.service - Corosync Qdevice daemon
   Loaded: loaded (/etc/systemd/system/corosync-qdevice.service; disabled; vendor preset: disabled)
   Active: active (running) since Sun 2019-05-05 08:28:51 MSK; 13s ago
     Docs: man:corosync-qdevice
 Main PID: 4075 (corosync-qdevic)
    Tasks: 2 (limit: 553)
   Memory: 2.0M
   CGroup: /system.slice/corosync-qdevice.service
           ├─4075 /usr/sbin/corosync-qdevice -f
           └─4076 /usr/sbin/corosync-qdevice -f

and after stopping the service

● corosync-qdevice.service - Corosync Qdevice daemon
   Loaded: loaded (/etc/systemd/system/corosync-qdevice.service; disabled; vendor preset: disabled)
   Active: inactive (dead)
     Docs: man:corosync-qdevice

May 05 08:27:04 ha2 systemd[1]: corosync-qdevice.service: Succeeded.
May 05 08:27:51 ha2 systemd[1]: Started Corosync Qdevice daemon.
May 05 08:28:14 ha2 systemd[1]: Stopping Corosync Qdevice daemon...
May 05 08:28:14 ha2 systemd[1]: corosync-qdevice.service: Succeeded.
May 05 08:28:14 ha2 systemd[1]: Stopped Corosync Qdevice daemon.
May 05 08:28:51 ha2 systemd[1]: Started Corosync Qdevice daemon.
May 05 08:29:19 ha2 systemd[1]: Stopping Corosync Qdevice daemon...
May 05 08:29:19 ha2 systemd[1]: corosync-qdevice.service: Succeeded.
May 05 08:29:19 ha2 systemd[1]: Stopped Corosync Qdevice daemon.
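
For contrast, the foreground case (where the process that forks the worker
stays alive and is the one that waits for it) can be sketched as below.
Again this is only an illustration under my own assumptions:
parent_waitpid_exit_code() and the exit code 7 are made up and have nothing
to do with the real heuristics worker.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper, not corosync-qdevice code.
 * Forks a worker and waits for it from the same process that forked it.
 * Returns the worker's exit code, or -1 on any failure.
 */
int parent_waitpid_exit_code(void)
{
    int status;
    pid_t worker = fork();

    if (worker == -1)
        return -1;
    if (worker == 0)
        _exit(7);                      /* arbitrary exit code for the demo */

    /* The caller is the worker's parent, so waitpid() can collect it. */
    while (waitpid(worker, &status, 0) == -1) {
        if (errno != EINTR)
            return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Here waitpid() succeeds because no intermediate fork-and-exit sits between
the process that created the worker and the process that waits for it.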



