[ClusterLabs] Alerts for qdevice/qnetd/booth
Rohit Saini
rohitsaini111.forum at gmail.com
Fri Aug 14 00:22:48 EDT 2020
Thanks Honza. I have raised these on both upstream projects.
I will leave it up to the implementers how best to do this, considering the
technical limitations you mentioned.
https://github.com/corosync/corosync-qdevice/issues/13
https://github.com/ClusterLabs/booth/issues/99
Thanks,
Rohit
On Thu, Aug 13, 2020 at 1:03 PM Jan Friesse <jfriesse at redhat.com> wrote:
> Hi Rohit,
>
> > Hi Honza,
> > Thanks for your reply. Please find the attached image below:
> >
> > [image: image.png]
> >
> > Yes, I am talking about pacemaker alerts only.
> >
> > Please find my suggestions/requirements below:
> >
> > *Booth:*
> > 1. Node5 booth-arbitrator should be able to raise an event when any of the
> > booth nodes joins or leaves. The booth-ip can be passed in the event.
>
> This is not how booth works. The ticket leader (so a site booth, never the
> arbitrator) executes the election and gets replies from the other
> sites/arbitrator. A follower executes an election when the leader hasn't
> done so within the configured timeout.
>
> What I want to say is that there is no "membership" in the (for
> example) corosync sense.
>
> The best we could get is a rough estimate based on election
> requests/replies.
>
> > 2. Event when booth-arbitrator is up successfully and has started
> > monitoring the booth nodes.
>
> This is basically the start of the service. I think it's doable with a small
> change in the unit file (something like
> https://northernlightlabs.se/2014-07-05/systemd-status-mail-on-unit-failure.html).
>
> > 2. Geo site booth should be able to raise an event when its booth peers
> > join/leave. For example, Geo site1 raises an event when node5
> > booth-arbitrator joins/leaves OR site2 booth joins/leaves. The booth-ip can be
> > passed in the event.
> > 3. On ticket movements (revoke/grant), every booth node (Site1/2 and node5)
> > should raise events.
>
> That would be doable
>
> >
> > Note: pacemaker alerts work in a cluster. Since the arbitrator is a
> > non-cluster node, I am not sure how exactly it would work there. But this is
> > a good-to-have feature.
> >
> > *Qnetd/Qdevice:*
> > This is similar to above.
> > 1. Node5 qnetd should be able to raise an event when any of the cluster
> > nodes joins/leaves the quorum.
>
> Doable
>
> > 2. Event when qnetd is up successfully and has started monitoring the
> > cluster nodes
>
> Qnetd itself is not monitoring qdevice nodes (it doesn't have a list of
> nodes). It monitors a node's status after the node joins (= it would be possible
> to trigger an event on leave). So that may be enough.
>
> > 3. A cluster node should be able to raise an event when any of the quorum nodes
> > leaves/joins.
>
> You mean qdevice should be able to trigger an event when connected to qnetd?
>
> >
> > At a high level, these are node/resource events with respect to
> > booth and qnetd/qdevice.
>
> Yeah
>
> >
> > As of today, with respect to booth/qnetd, I don't see any provision for a
> > node to raise an event when its peer leaves/joins. This makes it difficult
> > to know whether the geo site nodes can see booth-arbitrator or not. This is
>
> Got it. That's exactly what would be really problematic to implement,
> because there is no "membership" in booth. It would, however, be possible to
> implement a message when a ticket is granted/rejected, with a list of the
> other booths' replies and what their votes were.
>
> > true the other way around as well, where booth-arbitrator cannot see the geo booth
> > sites.
> > I am not sure how others are doing it in today's deployments, but I see a need
> > for monitoring every other booth/qnet node, so that on the basis of events,
> > appropriate alarms can be raised and action can be taken accordingly.
> >
> > Please let me know if you agree on the use cases. I'll raise a feature request
>
> I can agree on the use cases, but (especially with booth) there are technical
> problems in realizing them.
>
> > on the pacemaker upstream project accordingly.
>
> Please use booth (https://github.com/ClusterLabs/booth) and qdevice
> (https://github.com/corosync/corosync-qdevice) upstream rather than
> pacemaker, because these requests really have nothing to do with pcmk.
>
> Regards,
> honza
>
> >
> > Thanks,
> > Rohit
> >
> > On Wed, Aug 12, 2020 at 8:58 PM Jan Friesse <jfriesse at redhat.com> wrote:
> >
> >> Hi Rohit,
> >>
> >> Rohit Saini wrote:
> >>> Hi Team,
> >>>
> >>> Question-1:
> >>> Similar to pcs alerts, do we have something similar for qdevice/qnetd? This
> >>
> >> You mean pacemaker alerts, right?
> >>
> >>> is to detect asynchronously if any of the members is unreachable/joined/left
> >>> and whether that member is qdevice or qnetd.
> >>
> >> Nope but actually shouldn't be that hard to implement. What exactly
> >> would you like to see there?
> >>
> >>>
> >>> Question-2:
> >>> The same question as above for booth nodes and the arbitrator. Is there any way to
> >>> receive events from the booth daemon?
> >>
> >> Not directly (again, shouldn't be that hard to implement). But pacemaker
> >> alerts should be triggered when a service changes state because of a ticket
> >> grant/reject, shouldn't they?
> >>
> >>>
> >>> My main objective is to see if these daemons give events related to
> >>> their internal state transitions and raise some alarms accordingly. For
> >>> example, boothd arbitrator is unreachable, ticket moved from x to y, etc.
> >>
> >> I don't think "boothd arbitrator is unreachable" alert is really doable.
> >> Ticket moved from x to y would be probably two alerts - 1. ticket
> >> rejected on X and 2. granted on Y.
> >>
> >> Would you mind elaborating a bit more on the events you would like to see
> >> and potentially opening an issue for the upstream project (or, if you have an RH
> >> subscription, try contacting GSS, so I get more time to work on this issue)?
> >>
> >> Regards,
> >> Honza
> >>
> >>>
> >>> Thanks,
> >>> Rohit
> >>>
> >>>
> >>>
> >>
> >>
> >
>
>
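For illustration only, the "rough estimate based on election requests/replies" mentioned in the quoted reply could look something like the sketch below. This is not booth code and does not use any booth API: the `PeerEstimator` class, its method names, and the timeout value are all hypothetical. It only shows the idea that, without real membership, a booth instance could at best remember when each peer last answered an election and flag peers that have been silent for longer than a timeout.

```python
import time
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class PeerEstimator:
    """Hypothetical helper: guesses peer reachability from election replies only."""

    timeout: float  # seconds of silence after which a peer is presumed gone
    last_reply: Dict[str, float] = field(default_factory=dict)

    def record_reply(self, peer_ip: str, now: Optional[float] = None) -> None:
        # Remember when this peer last answered an election request.
        self.last_reply[peer_ip] = time.monotonic() if now is None else now

    def presumed_unreachable(self, now: Optional[float] = None) -> List[str]:
        # Peers whose last reply is older than the timeout. This is an estimate,
        # not membership: a peer that never replied is not tracked at all.
        current = time.monotonic() if now is None else now
        return sorted(
            ip for ip, seen in self.last_reply.items()
            if current - seen > self.timeout
        )


if __name__ == "__main__":
    est = PeerEstimator(timeout=10.0)
    est.record_reply("192.0.2.1", now=0.0)
    est.record_reply("192.0.2.2", now=8.0)
    print(est.presumed_unreachable(now=12.0))
```

Because elections only happen when a ticket is being granted or renewed, an estimate like this would lag behind reality and could not tell "peer down" from "no election happened recently", which matches the limitation described in the thread.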