<div dir="ltr"><div dir="ltr"><br></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Fri, Mar 5, 2021 at 10:13 AM Ken Gaillot <<a href="mailto:kgaillot@redhat.com">kgaillot@redhat.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">On Fri, 2021-03-05 at 11:39 +0100, Ulrich Windl wrote:<br>

> Hi!<br>

> <br>

> I'm unsure what actually causes a problem I see (a resource was<br>

> "detected running" when it actually was not), but I'm sure some probe<br>

> started on cluster node start cannot provide a useful result until<br>

> some other resource has been started. AFAIK there is no way to make a<br>

> probe obey ordering or colocation constraints, so the only work-around <br>

> seems to be a delay. However I'm unsure whether probes can actually<br>

> be delayed.<br>

> <br>

> Ideas?<br>

<br>

Ordered probes are a thorny problem that we've never been able to come<br>

up with a general solution for. We do order certain probes where we<br>

have enough information to know it's safe. The problem is that it is<br>

very easy to introduce ordering loops.<br>

<br>

I don't remember if there are any workarounds.<br></blockquote><div><br></div><div>Maybe as a workaround:<br></div><div>  - Add an ocf:pacemaker:attribute resource that is ordered after and colocated with rsc1.</div><div>  - Then configure a location rule for rsc2 with resource-discovery=never and score=-INFINITY, using the expression (in pseudocode) "attribute is not set to active value".</div><div><br></div><div>I haven't tested this, but it might cause rsc2's probe to wait until rsc1 is active.</div><div><br></div><div>And of course, use the usual constraints/rules to ensure rsc2's probe only runs on rsc1's node.<br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
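</blockquote><div>A rough, untested sketch of the steps above in pcs 0.10-style syntax. The name rsc1-active is made up for illustration, and the location rule syntax differs between pcs versions, so check pcs(8) on your system before trying it:</div><div><pre>
# Attribute resource: sets node attribute rsc1-active to 1 while started, 0 when stopped
pcs resource create rsc1-active ocf:pacemaker:attribute name=rsc1-active active_value=1 inactive_value=0

# "after-and-with" rsc1
pcs constraint order start rsc1 then start rsc1-active
pcs constraint colocation add rsc1-active with rsc1 INFINITY

# Intended to ban rsc2, and skip probing it, wherever the attribute is not at its active value
pcs constraint location rsc2 rule resource-discovery=never score=-INFINITY not_defined rsc1-active or rsc1-active ne 1
</pre></div><div>Whether rsc2's probe actually gets scheduled once the attribute flips to 1 is exactly the part I haven't tested.</div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">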

> Despite that, I wonder whether some probe/monitor return code like<br>

> OCF_NOT_READY would make sense if the operation detects that it<br>

> cannot return a current status (so both "running" and "stopped" would<br>

> be as inadequate as "starting" and "stopping" would be (despite<br>

> the fact that the latter two do not exist)).<br></blockquote><div><br></div><div>This seems logically reasonable, independent of any implementation complexity and considerations of what we would do with that return code.<br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

> Regards,<br>

> Ulrich<br>

-- <br>

Ken Gaillot <<a href="mailto:kgaillot@redhat.com" target="_blank">kgaillot@redhat.com</a>><br>

<br>


</blockquote></div><br clear="all"><br>-- <br><div dir="ltr" class="gmail_signature"><div dir="ltr"><div><div dir="ltr"><div><div dir="ltr"><div><div dir="ltr"><div><div dir="ltr"><div><div dir="ltr"><div><div dir="ltr"><div><div>Regards,<br><br></div>Reid Wahl, RHCA<br></div><div>Senior Software Maintenance Engineer, Red Hat<br></div>CEE - Platform Support Delivery - ClusterHA</div></div></div></div></div></div></div></div></div></div></div></div></div></div></div>