<div dir="ltr">Thanks, Honza. I have raised these on both upstream projects.<div>I will leave it up to the implementer how best this can be done, considering the technical limitations you mentioned.</div><div><div><br></div><div><a href="https://github.com/corosync/corosync-qdevice/issues/13">https://github.com/corosync/corosync-qdevice/issues/13</a> </div><div><a href="https://github.com/ClusterLabs/booth/issues/99">https://github.com/ClusterLabs/booth/issues/99</a></div><div><br></div><div>Thanks,</div><div>Rohit <br></div></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Thu, Aug 13, 2020 at 1:03 PM Jan Friesse <<a href="mailto:jfriesse@redhat.com">jfriesse@redhat.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Hi Rohit,<br>
<br>
> Hi Honza,<br>
> Thanks for your reply. Please find the attached image below:<br>
> <br>
> [image: image.png]<br>
> <br>
> Yes, I am talking about pacemaker alerts only.<br>
> <br>
> Please find my suggestions/requirements below:<br>
> <br>
> *Booth:*<br>
> 1. The Node5 booth-arbitrator should be able to raise an event when any of the<br>
> booth nodes joins or leaves. The booth-ip can be passed in the event.<br>
<br>
This is not how booth works. The ticket leader (always a site booth, never <br>
the arbitrator) runs the election and gets replies from the other <br>
sites/arbitrator. A follower runs an election when the leader hasn't done <br>
so within the configured timeout.<br>
<br>
What I want to say is that there is no "membership" in the sense that, <br>
for example, corosync has one.<br>
<br>
The best we could get is a rough estimate based on election <br>
requests/replies.<br>
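<br>
As a purely illustrative sketch (this is not booth code; the class, its fields and the staleness threshold are all hypothetical), such an estimate could be kept by remembering when each peer last answered an election request:<br>
<pre>
```python
import time
from dataclasses import dataclass, field


@dataclass
class PeerEstimate:
    """Rough, hypothetical view of which booth peers answered recently."""

    stale_after: float  # seconds without a reply before a peer counts as silent
    last_reply: dict = field(default_factory=dict)  # peer address -> timestamp

    def record_reply(self, peer, now=None):
        # Called whenever an election reply from `peer` is seen.
        self.last_reply[peer] = time.monotonic() if now is None else now

    def silent_peers(self, now=None):
        # Peers whose last reply is older than the threshold. This is only an
        # estimate: it says nothing about peers that have never replied.
        now = time.monotonic() if now is None else now
        return sorted(p for p, t in self.last_reply.items()
                      if now - t > self.stale_after)
```
</pre>
Because replies only arrive around elections, the result can lag well behind reality, which is why this is an estimate and not a membership.<br>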
<br>
> 2. An event when the booth-arbitrator has come up successfully and has started<br>
> monitoring the booth nodes.<br>
<br>
This is basically the start of the service. I think it's doable with a small <br>
change in the unit file (something like <br>
<a href="https://northernlightlabs.se/2014-07-05/systemd-status-mail-on-unit-failure.html" rel="noreferrer" target="_blank">https://northernlightlabs.se/2014-07-05/systemd-status-mail-on-unit-failure.html</a>).<br>
<br>
> 3. The Geo site booth should be able to raise an event when its booth peers<br>
> join/leave. For example, Geo site1 raises an event when the node5<br>
> booth-arbitrator joins/leaves OR the site2 booth joins/leaves. The booth-ip can be<br>
> passed in the event.<br>
> 4. On ticket movements (revoke/grant), every booth node (Site1/2 and node5)<br>
> should raise events.<br>
<br>
That would be doable<br>
<br>
> <br>
> Note: pacemaker alerts work in a cluster. Since the arbitrator is a<br>
> non-cluster node, I am not sure how exactly it will work there. But this is a<br>
> good-to-have feature.<br>
> <br>
> *Qnetd/Qdevice:*<br>
> This is similar to above.<br>
> 1. Node5 qnetd should be able to raise an event when any of the cluster<br>
> nodes joins/leaves the quorum.<br>
<br>
Doable<br>
<br>
> 2. An event when qnetd has come up successfully and has started monitoring the<br>
> cluster nodes.<br>
<br>
Qnetd itself does not monitor qdevice nodes (it doesn't have a list of <br>
nodes). It monitors a node's status only after the node joins (so it would <br>
be possible to trigger an event on leave). That may be enough.<br>
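<br>
To illustrate the point (a minimal sketch, not qnetd code; the class, the callback and the event names are hypothetical), a server that learns about nodes only when they connect can report a leave only for a node it has already seen:<br>
<pre>
```python
class ConnectionTracker:
    """Hypothetical tracker: knows a node only after it has connected."""

    def __init__(self, on_event):
        self.on_event = on_event  # callback(event_name, node_id)
        self.connected = set()

    def node_connected(self, node_id):
        # First contact from a node is the only way it becomes known.
        if node_id not in self.connected:
            self.connected.add(node_id)
            self.on_event("join", node_id)

    def node_disconnected(self, node_id):
        # A leave can only be reported for a node that joined earlier;
        # a node that never connected is simply unknown.
        if node_id in self.connected:
            self.connected.remove(node_id)
            self.on_event("leave", node_id)
```
</pre>
A node that is down from the very beginning never produces an event here, because there is no configured node list to compare against.<br>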
<br>
> 3. A cluster node should be able to raise an event when any of the quorum nodes<br>
> leaves/joins.<br>
<br>
You mean qdevice should be able to trigger an event when connected to qnetd?<br>
<br>
> <br>
> If you look at this at a high level, these are a kind of node/resource event wrt<br>
> booth and qnetd/qdevice.<br>
<br>
Yeah<br>
<br>
> <br>
> As of today, wrt booth/qnetd, I don't see any provision where any of the<br>
> nodes raises an event when its peer leaves/joins. This makes it difficult<br>
> to know whether the geo site nodes can see the booth-arbitrator or not. This is<br>
<br>
Got it. That's exactly what would be really problematic to implement, <br>
because there is no "membership" in booth. It would, however, be possible to <br>
implement a message when a ticket is granted/rejected that includes a list of <br>
the other booths' replies and what their votes were.<br>
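<br>
Just to make the idea concrete (a sketch only; nothing like this exists in booth today, and every field name and vote value below is an assumption), such a message could carry the ticket, the outcome and the votes that were received:<br>
<pre>
```python
import json
from dataclasses import asdict, dataclass


@dataclass
class TicketEvent:
    """Hypothetical payload for a ticket grant/reject notification."""

    ticket: str   # ticket name
    site: str     # booth instance reporting the event
    outcome: str  # "granted" or "rejected"
    votes: dict   # replying peer -> the vote it sent


def format_event(event):
    # Serialize with sorted keys so the output is stable for log parsing.
    return json.dumps(asdict(event), sort_keys=True)
```
</pre>
A peer that is missing from the votes simply did not reply in that election, which is the closest thing to an "unreachable" hint that could be offered.<br>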
<br>
> true the other way around as well, where the booth-arbitrator cannot see the geo booth<br>
> sites.<br>
> I am not sure how others are doing it in today's deployments, but I see a need<br>
> to monitor every other booth/qnetd node, so that on the basis of an event,<br>
> appropriate alarms can be raised and action can be taken accordingly.<br>
> <br>
> Please let me know if you agree on the use cases. I'll raise a feature request<br>
<br>
I can agree on the use cases, but (especially with booth) there are technical <br>
problems in realizing them.<br>
<br>
> on the pacemaker upstream project accordingly.<br>
<br>
Please use booth (<a href="https://github.com/ClusterLabs/booth" rel="noreferrer" target="_blank">https://github.com/ClusterLabs/booth</a>) and qdevice <br>
(<a href="https://github.com/corosync/corosync-qdevice" rel="noreferrer" target="_blank">https://github.com/corosync/corosync-qdevice</a>) upstream rather than <br>
pacemaker, because these requests really have nothing to do with pcmk.<br>
<br>
Regards,<br>
honza<br>
<br>
> <br>
> Thanks,<br>
> Rohit<br>
> <br>
> On Wed, Aug 12, 2020 at 8:58 PM Jan Friesse <<a href="mailto:jfriesse@redhat.com" target="_blank">jfriesse@redhat.com</a>> wrote:<br>
> <br>
>> Hi Rohit,<br>
>><br>
>> Rohit Saini wrote:<br>
>>> Hi Team,<br>
>>><br>
>>> Question-1:<br>
>>> Similar to pcs alerts, do we have something similar for qdevice/qnetd?<br>
>><br>
>> You mean pacemaker alerts, right?<br>
>><br>
>>> This is to detect asynchronously if any of the members is<br>
>>> unreachable/joined/left and if that member is qdevice or qnetd.<br>
>><br>
>> Nope, but it actually shouldn't be that hard to implement. What exactly<br>
>> would you like to see there?<br>
>><br>
>>><br>
>>> Question-2:<br>
>>> The same question as above for booth nodes and the arbitrator. Is there any way to<br>
>>> receive events from the booth daemon?<br>
>><br>
>> Not directly (again, it shouldn't be that hard to implement). But pacemaker<br>
>> alerts should be triggered when a service changes state because of a ticket<br>
>> grant/reject, shouldn't they?<br>
>><br>
>>><br>
>>> My main objective is to see if these daemons give events related to<br>
>>> their internal state transitions and raise some alarms accordingly. For<br>
>>> example, boothd arbitrator is unreachable, ticket moved from x to y, etc.<br>
>><br>
>> I don't think a "boothd arbitrator is unreachable" alert is really doable.<br>
>> "Ticket moved from x to y" would probably be two alerts: 1. ticket<br>
>> rejected on X and 2. granted on Y.<br>
>><br>
>> Would you mind elaborating a bit more on the events you would like to see<br>
>> and potentially opening an issue for the upstream project (or, if you have an RH<br>
>> subscription, try contacting GSS, so I get more time to work on this issue)?<br>
>><br>
>> Regards,<br>
>> Honza<br>
>><br>
>>><br>
>>> Thanks,<br>
>>> Rohit<br>
>>><br>
>>><br>
>>><br>
>>><br>
>><br>
>><br>
> <br>
<br>
</blockquote></div>