[ClusterLabs] Alerts for qdevice/qnetd/booth

Jan Friesse jfriesse at redhat.com
Thu Aug 13 03:33:46 EDT 2020


Hi Rohit,

> Hi Honza,
> Thanks for your reply. Please find the attached image below:
> 
> [image: image.png]
> 
> Yes, I am talking about pacemaker alerts only.
> 
> Please find my suggestions/requirements below:
> 
> *Booth:*
> 1. Node5 booth-arbitrator should be able to give an event when any of the
> booth nodes joins or leaves. The booth-ip can be passed in the event.

This is not how booth works. The ticket leader (always a site booth, never 
the arbitrator) runs the election and gets replies from the other sites 
and the arbitrator. A follower starts an election only when the leader 
has not done so within the configured timeout.

What I mean is that there is no "membership" in the sense that, for 
example, corosync has.

The best we could get is a rough estimate based on election 
requests/replies.
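
To make the "rough estimate" idea concrete, here is a minimal sketch, 
not booth code. It assumes we could hook into the points where election 
requests/replies arrive; the PeerEstimator class, its method names and 
the stale_after threshold are all hypothetical.

```python
import time
from typing import Dict, List, Optional


class PeerEstimator:
    """Tracks when each booth peer last took part in an election exchange."""

    def __init__(self, peers: List[str], stale_after: float) -> None:
        # stale_after is a hypothetical threshold in seconds, not a booth option.
        self.stale_after = stale_after
        self.last_seen: Dict[str, Optional[float]] = {p: None for p in peers}

    def record_reply(self, peer: str, now: Optional[float] = None) -> None:
        """Remember that an election request/reply from `peer` was seen."""
        self.last_seen[peer] = time.monotonic() if now is None else now

    def probably_reachable(self, peer: str, now: Optional[float] = None) -> bool:
        """True if `peer` was heard from within the stale_after window."""
        seen = self.last_seen.get(peer)
        if seen is None:
            # Never heard from this peer, so we cannot claim it is reachable.
            return False
        current = time.monotonic() if now is None else now
        return (current - seen) <= self.stale_after
```

Note that this is only as fresh as the last election, so a peer that 
dropped out between elections would still look reachable for a while.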

> 2. Event when booth-arbitrator is up successfully and has started
> monitoring the booth nodes.

This is basically the start of the service. I think it's doable with a 
small change in the unit file (something like 
https://northernlightlabs.se/2014-07-05/systemd-status-mail-on-unit-failure.html)

> 2. Geo site booth should be able to give event when its booth peers
> joins/leaves. For example, Geo site1 gives an event when node5
> booth-arbitrator joins/leaves OR site2 booth joins/leaves.  booth-ip can be
> passed in event.
> 3. On ticket movements (revoke/grant), every booth node(Site1/2 and node5)
> should give events.

That would be doable

> 
> Note: pacemaker alerts work in a cluster. Since the arbitrator is a
> non-cluster node, I am not sure how exactly it will work there. But this is
> a good-to-have feature.
> 
> *Qnetd/Qdevice:*
> This is similar to above.
> 1. Node5 qnetd should be able to raise an event when any of the cluster
> node joins/leaves the quorum.

Doable

> 2. Event when qnetd is up successfully and has started monitoring the
> cluster nodes

Qnetd itself does not monitor qdevice nodes (it doesn't have a list of 
nodes). It monitors a node's status after the node joins (so it would be 
possible to trigger an event on leave). So that may be enough.
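
A minimal sketch of what that could look like, assuming hooks at the 
points where a qdevice client connects and disconnects. This is not 
qnetd code; ConnectionTracker and the emit callback are hypothetical 
names used only for illustration.

```python
from typing import Callable, Set


class ConnectionTracker:
    """Emits join/leave events for nodes that were seen connecting."""

    def __init__(self, emit: Callable[[str, str], None]) -> None:
        # emit(kind, node) is a hypothetical callback, e.g. one that runs a script.
        self.emit = emit
        self.connected: Set[str] = set()

    def on_connect(self, node: str) -> None:
        """Record a newly connected node and report it as a join."""
        if node not in self.connected:
            self.connected.add(node)
            self.emit("join", node)

    def on_disconnect(self, node: str) -> None:
        """Report a leave, but only for a node that had joined before."""
        if node in self.connected:
            self.connected.remove(node)
            self.emit("leave", node)
```

Because there is no configured node list, a node that never connected 
can never produce a "leave" event here.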

> 3. Cluster node should be able to give event when any of the quorum node
> leaves/joins.

You mean qdevice should be able to trigger an event when it is connected 
to qnetd?

> 
> If you see on high level, then these are kind of node/resource events wrt
> booth and qnetd/qdevice.

Yeah

> 
> As of today wrt booth/qnetd, I don't see any provision where any of the
> nodes gives any event when its peer leaves/joins. This makes it difficult
> to know whether geo sites nodes can see booth-arbitrator or not. This is

Got it. That's exactly what would be really problematic to implement, 
because there is no "membership" in booth. It would, however, be possible 
to implement a message sent when a ticket is granted/rejected, carrying a 
list of the other booths' replies and what their votes were.
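
As a rough sketch of such a message (not an existing booth format), it 
could look like the following. The TicketEvent class, its fields and the 
"yes"/"no" vote values are hypothetical, chosen only to show the idea.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class TicketEvent:
    """Hypothetical event emitted after a ticket decision."""

    ticket: str
    outcome: str  # "granted" or "rejected"
    # Maps each peer that replied to its vote ("yes"/"no", assumed values).
    votes: Dict[str, str] = field(default_factory=dict)

    def silent_peers(self, expected: List[str]) -> List[str]:
        """Peers from `expected` that sent no reply in this election."""
        return [p for p in expected if p not in self.votes]

    def summary(self) -> str:
        """One-line form that could be handed to an alert script."""
        votes = ",".join(f"{p}:{v}" for p, v in sorted(self.votes.items()))
        return f"ticket={self.ticket} outcome={self.outcome} votes={votes}"
```

A consumer could then use silent_peers() to guess which booths did not 
answer, which is about as close to "peer left" as booth can get.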

> true the other way around also where booth-arbitrator cannot see geo booth
> sites.
> I am not sure how others are doing it in today's deployment, but I see need
> of monitoring of every other booth/qnet node. So that on basis of event,
> appropriate alarms can be raised and action can be taken accordingly.
> 
> Please let me know if you agree on the usecases. I'll raise feature-request

I can agree with the use cases, but (especially with booth) there are 
technical problems in realizing them.

> on the pacemaker upstream project accordingly.

Please use the booth (https://github.com/ClusterLabs/booth) and qdevice 
(https://github.com/corosync/corosync-qdevice) upstream projects rather 
than pacemaker, because these requests really have nothing to do with pcmk.

Regards,
   honza

> 
> Thanks,
> Rohit
> 
> On Wed, Aug 12, 2020 at 8:58 PM Jan Friesse <jfriesse at redhat.com> wrote:
> 
>> Hi Rohit,
>>
>> Rohit Saini napsal(a):
>>> Hi Team,
>>>
>>> Question-1:
>>> Similar to pcs alerts, do we have something similar for qdevice/qnetd?
>>
>> You mean pacemaker alerts right?
>>
>>> This is to detect asynchronously if any of the member is
>>> unreachable/joined/left and if that member is qdevice or qnetd.
>>
>> Nope but actually shouldn't be that hard to implement. What exactly
>> would you like to see there?
>>
>>>
>>> Question-2:
>>> Same above question for booth nodes and arbitrator. Is there any way to
>>> receive events from booth daemon?
>>
>> Not directly (again, shouldn't be that hard to implement). But pacemaker
>> alerts should be triggered when service changes state because of ticket
>> grant/reject, isn't it?
>>
>>>
>>> My main objective is to see if these daemons give events related to
>>> their internal state transitions  and raise some alarms accordingly. For
>>> example, boothd arbitrator is unreachable, ticket moved from x to y, etc.
>>
>> I don't think "boothd arbitrator is unreachable" alert is really doable.
>> Ticket moved from x to y would be probably two alerts - 1. ticket
>> rejected on X and 2. granted on Y.
>>
>> Would you mind elaborating a bit more on the events you would like to see
>> and potentially opening an issue for the upstream project (or, if you have
>> a RH subscription, try contacting GSS, so I get more time to work on this
>> issue)?
>>
>> Regards,
>>     Honza
>>
>>>
>>> Thanks,
>>> Rohit