[ClusterLabs] Alerts for qdevice/qnetd/booth

Jan Friesse jfriesse at redhat.com
Mon Aug 17 03:38:49 EDT 2020


> Thanks Honza. I have raised these on both upstream projects.

Thanks

> I will leave it up to the implementer how best to do this, considering the
> technical limitations you mentioned.
> 
> https://github.com/corosync/corosync-qdevice/issues/13
> https://github.com/ClusterLabs/booth/issues/99
> 
> Thanks,
> Rohit
> 
> On Thu, Aug 13, 2020 at 1:03 PM Jan Friesse <jfriesse at redhat.com> wrote:
> 
>> Hi Rohit,
>>
>>> Hi Honza,
>>> Thanks for your reply. Please find the attached image below:
>>>
>>> [image: image.png]
>>>
>>> Yes, I am talking about pacemaker alerts only.
>>>
>>> Please find my suggestions/requirements below:
>>>
>>> *Booth:*
>>> 1. Node5 booth-arbitrator should be able to raise an event when any of the
>>> booth nodes joins or leaves. The booth-ip can be passed in the event.
>>
>> This is not how booth works. The ticket leader (so a site booth, never the
>> arbitrator) executes the election and gets replies from the other
>> sites/arbitrator. A follower executes an election when the leader hasn't
>> done so within the configured timeout.
>>
>> What I want to say is that there is no "membership" in the corosync
>> sense.
>>
>> The best we could get is a rough estimate based on election
>> requests/replies.
>>
>>> 2. Event when booth-arbitrator is up successfully and has started
>>> monitoring the booth nodes.
>>
>> This is basically the start of the service. I think it's doable with a small
>> change in the unit file (something like
>>
>> https://northernlightlabs.se/2014-07-05/systemd-status-mail-on-unit-failure.html
>> )
>>
>>> 3. Geo site booth should be able to raise an event when its booth peers
>>> join/leave. For example, Geo site1 raises an event when node5
>>> booth-arbitrator joins/leaves OR site2 booth joins/leaves. The booth-ip can
>>> be passed in the event.
>>> 4. On ticket movements (revoke/grant), every booth node (Site1/2 and node5)
>>> should raise events.
>>
>> That would be doable
>>
>>>
>>> Note: pacemaker alerts work in a cluster. Since the arbitrator is a
>>> non-cluster node, I am not sure how exactly it will work there. But this is
>>> a good-to-have feature.
>>>
>>> *Qnetd/Qdevice:*
>>> This is similar to above.
>>> 1. Node5 qnetd should be able to raise an event when any of the cluster
>>> nodes joins/leaves the quorum.
>>
>> Doable
>>
>>> 2. Event when qnetd is up successfully and has started monitoring the
>>> cluster nodes
>>
>> Qnetd itself is not monitoring qdevice nodes (it doesn't have a list of
>> nodes). It monitors node status after a node joins (= it would be possible
>> to trigger an event on leave). So that may be enough.
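
A minimal Python sketch of the point above: with no list of expected nodes, a tracker only learns about a node when it connects, so it can report a leave only for a node it has already seen join. The names `JoinLeaveTracker`, `on_connect` and `on_disconnect` are hypothetical and are not part of corosync-qnetd.

```python
from typing import List, Set, Tuple


class JoinLeaveTracker:
    """Hypothetical tracker that knows only the nodes that have connected."""

    def __init__(self) -> None:
        self._connected: Set[str] = set()
        self.events: List[Tuple[str, str]] = []

    def on_connect(self, node_id: str) -> None:
        # First time we hear about this node: record a join event.
        if node_id not in self._connected:
            self._connected.add(node_id)
            self.events.append(("join", node_id))

    def on_disconnect(self, node_id: str) -> None:
        # A leave can only be reported for a node that joined earlier.
        if node_id in self._connected:
            self._connected.remove(node_id)
            self.events.append(("leave", node_id))
```

A node that never connected produces no event at all, which is why a "node X is missing" alert cannot come out of this model.
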
>>
>>> 3. A cluster node should be able to raise an event when any of the quorum
>>> nodes leaves/joins.
>>
>> You mean qdevice should be able to trigger an event when connected to qnetd?
>>
>>>
>>> At a high level, these are a kind of node/resource event with respect to
>>> booth and qnetd/qdevice.
>>
>> Yeah
>>
>>>
>>> As of today, with respect to booth/qnetd, I don't see any provision where
>>> any of the nodes raises an event when its peer leaves/joins. This makes it
>>> difficult to know whether geo site nodes can see booth-arbitrator or not. This is
>>
>> Got it. That's exactly what would be really problematic to implement,
>> because there is no "membership" in booth. It would, however, be possible to
>> implement a message when a ticket is granted/rejected that includes a list of
>> the other booths' replies and what their votes were.
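
As a rough illustration of such a message, here is a Python sketch of a grant/reject record that carries the votes of the booths that replied. The type `TicketDecision`, its fields and the helper `peers_without_reply` are assumptions made for this example; booth does not define this format.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class TicketDecision:
    """Hypothetical grant/reject message for one ticket."""
    ticket: str
    outcome: str           # "granted" or "rejected"
    votes: Dict[str, str]  # address of each booth that replied -> its vote


def peers_without_reply(decision: TicketDecision,
                        configured: Iterable[str]) -> List[str]:
    # Peers from the local configuration that sent no reply in this election.
    # This is only a rough hint about reachability, not a membership view.
    return sorted(p for p in configured if p not in decision.votes)
```

A consumer could compare the replies against its own configured peer list to get the rough estimate discussed earlier in the thread.
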
>>
>>> true the other way around also, where booth-arbitrator cannot see geo booth
>>> sites.
>>> I am not sure how others are doing it in today's deployments, but I see a need
>>> to monitor every other booth/qnetd node, so that on the basis of events,
>>> appropriate alarms can be raised and action can be taken accordingly.
>>>
>>> Please let me know if you agree on the use cases. I'll raise a feature request
>>
>> I can agree on the use cases, but (especially with booth) there are technical
>> problems in realizing them.
>>
>>> on the pacemaker upstream project accordingly.
>>
>> Please use the booth (https://github.com/ClusterLabs/booth) and qdevice
>> (https://github.com/corosync/corosync-qdevice) upstream projects rather than
>> pacemaker, because these requests really have nothing to do with pcmk.
>>
>> Regards,
>>     honza
>>
>>>
>>> Thanks,
>>> Rohit
>>>
>>> On Wed, Aug 12, 2020 at 8:58 PM Jan Friesse <jfriesse at redhat.com> wrote:
>>>
>>>> Hi Rohit,
>>>>
>>>> Rohit Saini wrote:
>>>>> Hi Team,
>>>>>
>>>>> Question-1:
>>>>> Similar to pcs alerts, do we have something similar for qdevice/qnetd? This
>>>>
>>>> You mean pacemaker alerts, right?
>>>>
>>>>> is to detect asynchronously if any of the members is unreachable/joined/left
>>>>> and if that member is qdevice or qnetd.
>>>>
>>>> Nope, but it actually shouldn't be that hard to implement. What exactly
>>>> would you like to see there?
>>>>
>>>>>
>>>>> Question-2:
>>>>> Same question as above for booth nodes and arbitrator. Is there any way to
>>>>> receive events from the booth daemon?
>>>>
>>>> Not directly (again, it shouldn't be that hard to implement). But pacemaker
>>>> alerts should be triggered when a service changes state because of a ticket
>>>> grant/reject, shouldn't they?
>>>>
>>>>>
>>>>> My main objective is to see if these daemons give events related to
>>>>> their internal state transitions and raise some alarms accordingly. For
>>>>> example, boothd arbitrator is unreachable, ticket moved from x to y, etc.
>>>>
>>>> I don't think a "boothd arbitrator is unreachable" alert is really doable.
>>>> Ticket moved from x to y would probably be two alerts: 1. ticket
>>>> rejected on X and 2. granted on Y.
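
To make the "two alerts" idea concrete, here is a small Python sketch that expands one ticket move into a rejected alert on the old site and a granted alert on the new one. `TicketAlert` and `alerts_for_move` are hypothetical names for illustration only; neither booth nor pacemaker provides them.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class TicketAlert:
    """Hypothetical alert record for a single ticket state change."""
    ticket: str
    site: str
    kind: str  # "rejected" or "granted"


def alerts_for_move(ticket: str, old_site: str, new_site: str) -> List[TicketAlert]:
    # A move from old_site to new_site is reported as two separate alerts,
    # in the order described above: rejected first, then granted.
    return [
        TicketAlert(ticket, old_site, "rejected"),
        TicketAlert(ticket, new_site, "granted"),
    ]
```
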
>>>>
>>>> Would you mind elaborating a bit more on the events you would like to see
>>>> and potentially opening an issue for the upstream project (or, if you have a RH
>>>> subscription, try to contact GSS, so I get more time to work on this issue).
>>>>
>>>> Regards,
>>>>      Honza
>>>>
>>>>>
>>>>> Thanks,
>>>>> Rohit