<div dir="ltr"><div>Hi, Ulrich. I presume you're talking about the log_finished line, which looks like this in pacemaker.log:</div><div><br></div><div>Feb 05 01:48:47.192 fastvm-rhel-8-0-23 pacemaker-execd [15446] (log_finished@execd_commands.c:214) info: dummy start (call 23, PID 18743) exited with status 0 (execution time 11ms, queue time 0ms)</div><div><br></div><div>Is that correct?<br></div><div><br></div><div>I got curious and took a look. It looks like this comes down to the action_complete() function:<br></div><div> - <a href="https://github.com/ClusterLabs/pacemaker/blob/Pacemaker-2.0.5/daemons/execd/execd_commands.c#L896-L1050">https://github.com/ClusterLabs/pacemaker/blob/Pacemaker-2.0.5/daemons/execd/execd_commands.c#L896-L1050</a></div><div><br></div><div>This function calls cmd_finalize() at the end, which calls a chain of functions ending in log_finished(). log_finished() logs the "start ... exited with status" line shown above.</div><div><br></div><div>However, systemd resources have to be handled a bit differently. We have to schedule a monitor operation to run after the start operation, in order to check whether the start was truly a success.<br></div><div>~~~</div><div> if (pcmk__str_eq(rclass, PCMK_RESOURCE_CLASS_SYSTEMD, pcmk__str_casei)) {<br> if ((cmd->exec_rc == PCMK_OCF_OK)<br> && pcmk__strcase_any_of(cmd->action, "start", "stop", NULL)) {<br> /* systemd returns from start and stop actions after the action<br> * begins, not after it completes. We have to jump through a few<br> * hoops so that we don't report 'complete' to the rest of pacemaker<br> * until it's actually done.<br> */<br> goagain = true;<br> cmd->real_action = cmd->action;<br> cmd->action = strdup("monitor");</div><div>...</div><div> if (goagain) {</div><div> ...</div><div> schedule_lrmd_cmd(rsc, cmd);</div><div><br></div><div> /* Don't finalize cmd, we're not done with it yet */<br> return;<br></div><div>~~~</div><div><br></div><div>So for the start operation, it never reaches the cmd_finalize() call at the end, until the follow-up monitor runs. The follow-up monitor operation does end up calling cmd_finalize() at the end of action_complete. But the "log_finished" message is logged at debug level for monitor operations. So you won't see it unless debugging is enabled.</div><div><br></div><div>Does this make sense?</div><div><br></div><div>Example:</div><div>~~~</div><div>Feb 05 02:06:51.123 fastvm-rhel-8-0-23 pacemaker-execd [19354] (log_execute) info: executing - rsc:nfs-daemon action:start call_id:20<br>Feb 05 02:06:51.123 fastvm-rhel-8-0-23 pacemaker-execd [19354] (systemd_unit_exec) debug: Performing asynchronous start op on systemd unit nfs-server.service named 'nfs-daemon'<br>Feb 05 02:06:51.124 fastvm-rhel-8-0-23 pacemaker-execd [19354] (systemd_unit_exec_with_unit) debug: Calling StartUnit for nfs-daemon: /org/freedesktop/systemd1/unit/nfs_2dserver_2eservice<br>Feb 05 02:06:51.517 fastvm-rhel-8-0-23 pacemaker-execd [19354] (action_complete) debug: nfs-daemon start may still be in progress: re-scheduling (elapsed=394ms, remaining=99606ms, start_delay=2000ms)<br>Feb 05 02:06:53.518 fastvm-rhel-8-0-23 pacemaker-execd [19354] (log_execute) debug: executing - rsc:nfs-daemon action:monitor call_id:20<br>Feb 05 02:06:53.518 fastvm-rhel-8-0-23 pacemaker-execd [19354] (systemd_unit_exec) debug: Performing asynchronous status op on systemd unit nfs-server.service named 'nfs-daemon'<br>Feb 05 02:06:53.521 fastvm-rhel-8-0-23 pacemaker-execd [19354] (action_complete) debug: nfs-daemon systemd start is now complete (elapsed=2397ms, remaining=97603ms): ok (0)<br>Feb 05 02:06:53.521 fastvm-rhel-8-0-23 pacemaker-execd [19354] (log_finished) debug: nfs-daemon monitor (call 20) exited with status 0 (execution time 2397ms, queue time 0ms)</div><div>~~~<br></div><div><br></div><div><br></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Thu, Feb 4, 2021 at 11:25 PM Ulrich Windl <<a href="mailto:Ulrich.Windl@rz.uni-regensburg.de">Ulrich.Windl@rz.uni-regensburg.de</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Hi!<br>
<br>
While analyzing cluster problems I noticed this:<br>
Normal resources executed via OCF RAs create two log entries by pacemaker-execd: One when starting the resource and another when the resource completed starting.<br>
However for systemd units I only get a start message. Is that intentional? Does that mean systemd starts are asynchronous in general (i.e.: The process returns before start in complete)?<br>
(Still I get a completed message from pacemaker-controld)<br>
<br>
Example:<br>
Feb 04 15:41:25 h19 pacemaker-execd[7793]: notice: executing - rsc:prm_virtlockd action:start call_id:95<br>
Feb 04 15:41:27 h19 pacemaker-execd[7793]: notice: executing - rsc:prm_libvirtd action:start call_id:97<br>
<br>
So one could guess that lirtlockd and libvirtd are staring concurrently, but the did not because of this sequence:<br>
Feb 04 15:41:27 h19 pacemaker-controld[7796]: notice: Result of start operation for prm_virtlockd on h19: ok<br>
Feb 04 15:41:27 h19 pacemaker-execd[7793]: notice: executing - rsc:prm_libvirtd action:start call_id:97<br>
<br>
Regards,<br>
Ulrich<br>
<br>
<br>
_______________________________________________<br>
Manage your subscription:<br>
<a href="https://lists.clusterlabs.org/mailman/listinfo/users" rel="noreferrer" target="_blank">https://lists.clusterlabs.org/mailman/listinfo/users</a><br>
<br>
ClusterLabs home: <a href="https://www.clusterlabs.org/" rel="noreferrer" target="_blank">https://www.clusterlabs.org/</a><br>
<br>
</blockquote></div><br clear="all"><br>-- <br><div dir="ltr" class="gmail_signature"><div dir="ltr"><div><div dir="ltr"><div><div dir="ltr"><div><div dir="ltr"><div><div dir="ltr"><div><div dir="ltr"><div><div dir="ltr"><div><div>Regards,<br><br></div>Reid Wahl, RHCA<br></div><div>Senior Software Maintenance Engineer, Red Hat<br></div>CEE - Platform Support Delivery - ClusterHA</div></div></div></div></div></div></div></div></div></div></div></div></div></div>