<div dir="ltr">Thanks Honza. I have raised these on both upstream projects.<div>I will leave it up to the implementer how best this can be done, considering the technical limitations you mentioned.</div><div><div><br></div><div><a href="https://github.com/corosync/corosync-qdevice/issues/13">https://github.com/corosync/corosync-qdevice/issues/13</a> </div><div><a href="https://github.com/ClusterLabs/booth/issues/99">https://github.com/ClusterLabs/booth/issues/99</a></div><div><br></div><div>Thanks,</div><div>Rohit <br></div></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Thu, Aug 13, 2020 at 1:03 PM Jan Friesse <<a href="mailto:jfriesse@redhat.com">jfriesse@redhat.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Hi Rohit,<br>

<br>

> Hi Honza,<br>

> Thanks for your reply. Please find the attached image below:<br>

> <br>

> [image: image.png]<br>

> <br>

> Yes, I am talking about pacemaker alerts only.<br>

> <br>

> Please find my suggestions/requirements below:<br>

> <br>

> *Booth:*<br>

> 1. Node5 booth-arbitrator should be able to give an event when any of the<br>

> booth nodes joins or leaves. booth-ip can be passed in the event.<br>

<br>

This is not how booth works. The ticket leader (a site booth, never the <br>

arbitrator) runs the election and gets replies from the other <br>

sites/arbitrator. A follower runs an election when the leader hasn't done <br>

so within the configured timeout.<br>

<br>

What I mean is that there is no "membership" in the sense that, for <br>

example, corosync has it.<br>

<br>

The best we could get is a rough estimate based on election <br>

requests/replies.<br>

<br>

> 2. Event when booth-arbitrator is up successfully and has started<br>

> monitoring the booth nodes.<br>

<br>

This is basically the start of the service. I think it's doable with a small <br>

change to the unit file (something like <br>

<a href="https://northernlightlabs.se/2014-07-05/systemd-status-mail-on-unit-failure.html" rel="noreferrer" target="_blank">https://northernlightlabs.se/2014-07-05/systemd-status-mail-on-unit-failure.html</a>)<br>

<br>
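A minimal sketch of that idea, assuming a drop-in for the arbitrator unit that runs a small helper through systemd's ExecStartPost= directive. <br>
The helper, the unit name used in the example and the JSON event fields are all hypothetical; nothing like this ships with booth today.<br>

```python
import json
import socket
import time


def build_event(unit, state, host=None, now=None):
    """Return a JSON string describing a service state change.

    All field names are hypothetical; they are not defined by booth
    or by pacemaker alerts.
    """
    event = {
        "unit": unit,
        "state": state,
        "host": host if host is not None else socket.gethostname(),
        "timestamp": int(now if now is not None else time.time()),
    }
    return json.dumps(event, sort_keys=True)


# Example: what a helper started from ExecStartPost= might emit.
# The unit name is an assumption made for illustration only.
print(build_event("booth-arbitrator.service", "started",
                  host="node5", now=1597300000))
```

Printing is only a placeholder here; a real helper would forward the event to whatever alarm system is in use.<br>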

> 2. Geo site booth should be able to give event when its booth peers<br>

> joins/leaves. For example, Geo site1 gives an event when node5<br>

> booth-arbitrator joins/leaves OR site2 booth joins/leaves.  booth-ip can be<br>

> passed in event.<br>

> 3. On ticket movements (revoke/grant), every booth node(Site1/2 and node5)<br>

> should give events.<br>

<br>

That would be doable<br>

<br>

> <br>

> Note: pacemaker alerts work in a cluster. Since the arbitrator is a<br>

> non-cluster node, I am not sure how exactly it will work there. But this is a<br>

> good-to-have feature.<br>

> <br>

> *Qnetd/Qdevice:*<br>

> This is similar to above.<br>

> 1. Node5 qnetd should be able to raise an event when any of the cluster<br>

> nodes joins/leaves the quorum.<br>

<br>

Doable<br>

<br>

> 2. Event when qnetd is up successfully and has started monitoring the<br>

> cluster nodes<br>

<br>

Qnetd itself does not monitor qdevice nodes (it doesn't have a list of <br>

nodes). It monitors a node's status only after the node joins (so it would be possible <br>

to trigger an event on leave). So that may be enough.<br>

<br>
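A minimal sketch of that behaviour, assuming a hypothetical tracker that only learns about a node when it connects. <br>
The class, method and event names are made up for illustration and are not qnetd's actual internals.<br>

```python
class NodeTracker:
    """Tracks nodes only after they connect; there is no preconfigured list."""

    def __init__(self, on_event):
        self._connected = set()
        self._on_event = on_event  # callback taking (kind, node_id)

    def node_connected(self, node_id):
        # First contact is the only way the tracker learns about a node.
        if node_id not in self._connected:
            self._connected.add(node_id)
            self._on_event("join", node_id)

    def node_disconnected(self, node_id):
        # A leave can only be reported for a node that joined earlier.
        if node_id in self._connected:
            self._connected.remove(node_id)
            self._on_event("leave", node_id)


events = []
tracker = NodeTracker(lambda kind, node_id: events.append((kind, node_id)))
tracker.node_connected(1)
tracker.node_connected(2)
tracker.node_disconnected(1)
tracker.node_disconnected(3)  # never joined, so no event is raised
print(events)
```

Because nothing is known before the first connection, a node that never connects cannot produce any event at all in this model.<br>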

> 3. A cluster node should be able to give an event when any of the quorum nodes<br>

> leaves/joins.<br>

<br>

Do you mean qdevice should be able to trigger an event when connected to qnetd?<br>

<br>

> <br>

> If you see on high level, then these are kind of node/resource events wrt<br>

> booth and qnetd/qdevice.<br>

<br>

Yeah<br>

<br>

> <br>

> As of today wrt booth/qnetd, I don't see any provision where any of the<br>

> nodes gives any event when its peer leaves/joins. This makes it difficult<br>

> to know whether geo sites nodes can see booth-arbitrator or not. This is<br>

<br>

Got it. That's exactly what would be really problematic to implement, <br>

because there is no "membership" in booth. It would, however, be possible to <br>

implement a message when a ticket is granted/rejected that includes a list of <br>

the other booths' replies and what their votes were.<br>

<br>
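A minimal sketch of what such a message could carry, assuming a hypothetical event record. <br>
The field names, vote values and addresses below are illustrative only and do not come from booth.<br>

```python
from dataclasses import dataclass, field


@dataclass
class TicketEvent:
    """Hypothetical record emitted when an election for a ticket finishes."""
    ticket: str
    outcome: str  # "granted" or "rejected"
    votes: dict = field(default_factory=dict)  # peer address -> vote string

    def summary(self):
        # Only peers that actually replied appear in the list.
        replied = ", ".join(
            f"{peer}={vote}" for peer, vote in sorted(self.votes.items())
        )
        return f"ticket {self.ticket} {self.outcome}; replies: {replied or 'none'}"


event = TicketEvent(
    ticket="ticketA",
    outcome="granted",
    votes={"10.0.0.5": "yes", "10.0.0.2": "yes"},
)
print(event.summary())
```

A peer missing from the reply list only means it did not answer this election, so it gives a rough estimate of reachability rather than real membership.<br>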

> true the other way around also where booth-arbitrator cannot see geo booth<br>

> sites.<br>

> I am not sure how others are doing it in today's deployment, but I see need<br>

> of monitoring of every other booth/qnet node. So that on basis of event,<br>

> appropriate alarms can be raised and action can be taken accordingly.<br>

> <br>

> Please let me know if you agree on the usecases. I'll raise feature-request<br>

<br>

I can agree on the use cases, but (especially with booth) there are technical <br>

problems in realizing them.<br>

<br>

> on the pacemaker upstream project accordingly.<br>

<br>

Please use the booth (<a href="https://github.com/ClusterLabs/booth" rel="noreferrer" target="_blank">https://github.com/ClusterLabs/booth</a>) and qdevice <br>

(<a href="https://github.com/corosync/corosync-qdevice" rel="noreferrer" target="_blank">https://github.com/corosync/corosync-qdevice</a>) upstreams rather than <br>

pacemaker, because these requests really have nothing to do with pcmk.<br>

<br>

Regards,<br>

   honza<br>

<br>

> <br>

> Thanks,<br>

> Rohit<br>

> <br>

> On Wed, Aug 12, 2020 at 8:58 PM Jan Friesse <<a href="mailto:jfriesse@redhat.com" target="_blank">jfriesse@redhat.com</a>> wrote:<br>

> <br>

>> Hi Rohit,<br>

>><br>

>> Rohit Saini wrote:<br>

>>> Hi Team,<br>

>>><br>

>>> Question-1:<br>

>>> Similar to pcs alerts, do we have something similar for qdevice/qnetd?<br>

>> This<br>

>><br>

>> You mean pacemaker alerts right?<br>

>><br>

>>> is to detect asynchronously if any of the member is<br>

>> unreachable/joined/left<br>

>>> and if that member is qdevice or qnetd.<br>

>><br>

>> Nope but actually shouldn't be that hard to implement. What exactly<br>

>> would you like to see there?<br>

>><br>

>>><br>

>>> Question-2:<br>

>>> Same above question for booth nodes and arbitrator. Is there any way to<br>

>>> receive events from booth daemon?<br>

>><br>

>> Not directly (again, shouldn't be that hard to implement). But pacemaker<br>

>> alerts should be triggered when service changes state because of ticket<br>

>> grant/reject, isn't it?<br>

>><br>

>>><br>

>>> My main objective is to see if these daemons give events related to<br>

>>> their internal state transitions  and raise some alarms accordingly. For<br>

>>> example, boothd arbitrator is unreachable, ticket moved from x to y, etc.<br>

>><br>

>> I don't think "boothd arbitrator is unreachable" alert is really doable.<br>

>> Ticket moved from x to y would be probably two alerts - 1. ticket<br>

>> rejected on X and 2. granted on Y.<br>

>><br>

>> Would you mind elaborating a bit more on the events you would like to see<br>

>> and potentially open issue for upstream project (or, if you have a RH<br>

>> subscription try to contact GSS, so I get more time to work on this issue).<br>

>><br>

>> Regards,<br>

>>     Honza<br>

>><br>

>>><br>

>>> Thanks,<br>

>>> Rohit<br>

>>><br>

>>><br>

>>><br>


>>><br>

>><br>

>><br>

> <br>

<br>

</blockquote></div>