[ClusterLabs] Antw: Re: Notification agent and Notification recipients

Sriram sriram.ec at gmail.com
Tue Aug 15 07:51:22 UTC 2017


Thanks for clarifying.

Regards,
Sriram.

On Mon, Aug 14, 2017 at 7:34 PM, Klaus Wenninger <kwenning at redhat.com>
wrote:

> On 08/14/2017 03:19 PM, Sriram wrote:
>
> Yes, I had pre-created the script file with the required permissions.
>
> [root@node1 alerts]# ls -l /usr/share/pacemaker/alert_file.sh
> -rwxr-xr-x. 1 root root 4140 Aug 14 01:51 /usr/share/pacemaker/alert_file.sh
> [root@node2 alerts]# ls -l /usr/share/pacemaker/alert_file.sh
> -rwxr-xr-x. 1 root root 4139 Aug 14 01:51 /usr/share/pacemaker/alert_file.sh
> [root@node3 alerts]# ls -l /usr/share/pacemaker/alert_file.sh
> -rwxr-xr-x. 1 root root 4139 Aug 14 01:51 /usr/share/pacemaker/alert_file.sh
>
> Later I observed that the user "hacluster" was not able to create the log
> file at /usr/share/pacemaker/alert_file.log.
> I am sorry, I should have checked the logs before posting the query.
> After I changed the path to /tmp/alert_file.log, the file gets created
> now.
> Thanks for pointing it out.
>
> I have one more point to clarify.
>
> If the resource is running on node2:
> [root@node2 tmp]# pcs resource
>  TRR    (ocf::heartbeat:TimingRedundancyRA):    Started node2
>
> Then I executed the command below to put it in standby:
> [root@node2 tmp]# pcs node standby node2
>
> The resource shifted to node3 because of its higher location constraint
> score.
> [root@node2 tmp]# pcs resource
>  TRR    (ocf::heartbeat:TimingRedundancyRA):    Started node3
>
>
> The log file got created on node2 (where the resource stopped) and
> node3 (where the resource started).
>
> Node1 was not notified about the resource shift; I mean, no log file was
> created there.
> That's because alerts are designed to notify external agents about
> cluster events, not to provide internal notifications between nodes.
>
> Is my understanding correct?
>
>
> Quite simple: the crmd on node1 just didn't have anything to do with
> shifting the resource from node2 to node3. There is no additional
> information passed between the nodes just to create a full set of
> notifications on every node. If you want a full log (or whatever your
> alert-agent is doing) in one place, that would be up to your alert-agent.
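>
> As a minimal sketch of that idea (a hypothetical script, assuming syslog
> is forwarded to a central log host), an alert-agent could log locally and
> also hand the event to syslog:
>
> #!/bin/sh
> # Hypothetical alert-agent sketch. Pacemaker exports the CRM_alert_*
> # environment variables before invoking the agent.
> msg="kind=${CRM_alert_kind} node=${CRM_alert_node} rsc=${CRM_alert_rsc}"
> msg="$msg task=${CRM_alert_task} desc=${CRM_alert_desc}"
> # Append to the configured recipient (a file path in this setup) ...
> echo "$(date) $msg" >> "${CRM_alert_recipient}"
> # ... and send it to syslog, which can be forwarded to a central host.
> logger -t pacemaker_alert "$msg"
> exit 0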
>
>
> Regards,
> Klaus
>
>
> Regards,
> Sriram.
>
>
>
> On Mon, Aug 14, 2017 at 5:42 PM, Klaus Wenninger <kwenning at redhat.com>
> wrote:
>
>> On 08/14/2017 12:32 PM, Sriram wrote:
>>
>> Hi Ken,
>>
>> I tried using alerts as well, but they don't seem to be working.
>>
>> Please check the configuration below:
>> [root@node1 alerts]# pcs config show
>> Cluster Name:
>> Corosync Nodes:
>> Pacemaker Nodes:
>>  node1 node2 node3
>>
>> Resources:
>>  Resource: TRR (class=ocf provider=heartbeat type=TimingRedundancyRA)
>>   Operations: start interval=0s timeout=60s (TRR-start-interval-0s)
>>               stop interval=0s timeout=20s (TRR-stop-interval-0s)
>>               monitor interval=10 timeout=20 (TRR-monitor-interval-10)
>>
>> Stonith Devices:
>> Fencing Levels:
>>
>> Location Constraints:
>>   Resource: TRR
>>     Enabled on: node1 (score:100) (id:location-TRR-node1-100)
>>     Enabled on: node2 (score:200) (id:location-TRR-node2-200)
>>     Enabled on: node3 (score:300) (id:location-TRR-node3-300)
>> Ordering Constraints:
>> Colocation Constraints:
>> Ticket Constraints:
>>
>> Alerts:
>>  Alert: alert_file (path=/usr/share/pacemaker/alert_file.sh)
>>   Options: debug_exec_order=false
>>   Meta options: timeout=15s
>>   Recipients:
>>    Recipient: recipient_alert_file_id (value=/usr/share/pacemaker/alert_file.log)
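>>
>> For reference, an alert configuration like the above would typically be
>> created with commands along these lines (exact syntax varies a bit
>> between pcs versions):
>>
>> pcs alert create id=alert_file path=/usr/share/pacemaker/alert_file.sh meta timeout=15s
>> pcs alert recipient add alert_file value=/usr/share/pacemaker/alert_file.log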
>>
>>
>> Did you pre-create the file with proper rights? Be aware that the
>> alert-agent
>> is called as user hacluster.
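>>
>> For example (a sketch; adjust the path to match your recipient value),
>> you could pre-create it on every node:
>>
>> touch /usr/share/pacemaker/alert_file.log
>> chown hacluster:haclient /usr/share/pacemaker/alert_file.log
>> chmod 0660 /usr/share/pacemaker/alert_file.log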
>>
>>
>> Resources Defaults:
>>  resource-stickiness: INFINITY
>> Operations Defaults:
>>  No defaults set
>>
>> Cluster Properties:
>>  cluster-infrastructure: corosync
>>  dc-version: 1.1.15-11.el7_3.4-e174ec8
>>  default-action-timeout: 240
>>  have-watchdog: false
>>  no-quorum-policy: ignore
>>  placement-strategy: balanced
>>  stonith-enabled: false
>>  symmetric-cluster: false
>>
>> Quorum:
>>   Options:
>>
>>
>> /usr/share/pacemaker/alert_file.sh does not get called whenever I
>> trigger a failover scenario.
>> Please let me know if I'm missing anything.
>>
>>
>> Do you get any logs - like for startup of resources - or nothing at all?
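>>
>> For instance, you could check whether the agent was invoked at all with
>> something like the following (the log file location depends on your
>> distribution):
>>
>> grep -i alert /var/log/pacemaker.log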
>>
>> Regards,
>> Klaus
>>
>>
>>
>>
>> Regards,
>> Sriram.
>>
>> On Tue, Aug 8, 2017 at 8:29 PM, Ken Gaillot <kgaillot at redhat.com> wrote:
>>
>>> On Tue, 2017-08-08 at 17:40 +0530, Sriram wrote:
>>> > Hi Ulrich,
>>> >
>>> >
>>> > Please see inline.
>>> >
>>> > On Tue, Aug 8, 2017 at 2:01 PM, Ulrich Windl
>>> > <Ulrich.Windl at rz.uni-regensburg.de> wrote:
>>> >         >>> Sriram <sriram.ec at gmail.com> wrote on 08.08.2017 at
>>> >         09:30 in message
>>> >         <CAMvdjurcQc6t=ZfGr=cRL25Xq0Je9h9F_TvZXyxVAn3n
>>> >         +Dvcgw at mail.gmail.com>:
>>> >         > Hi Ken & Jan,
>>> >         >
>>> >         > In the cluster we have, there is only one resource running.
>>> >         > It's an opt-in cluster with resource-stickiness set to
>>> >         > INFINITY.
>>> >         >
>>> >         > Just to clarify my question, let's take a scenario where
>>> >         > there are four nodes N1, N2, N3, N4:
>>> >         > a. N1 comes up first, starts the cluster.
>>> >
>>> >         The cluster will start once it has a quorum.
>>> >
>>> >         > b. N1 checks that there is no resource running, so it will
>>> >         > add the resource (R) with some location constraint (let's
>>> >         > say score 100).
>>> >         > c. So resource (R) runs on N1 now.
>>> >         > d. N2 comes up next, checks that resource (R) is already
>>> >         > running on N1, so it will update the location constraint
>>> >         > (let's say score 200).
>>> >         > e. N3 comes up next, checks that resource (R) is already
>>> >         > running on N1, so it will update the location constraint
>>> >         > (let's say score 300).
>>> >
>>> >         See my remark on quorum above.
>>> >
>>> > Yes, you are right; I forgot to mention it.
>>> >
>>> >
>>> >         > f. N4 comes up next, checks that resource (R) is already
>>> >         > running on N1, so it will update the location constraint
>>> >         > (let's say score 400).
>>> >         > g. If N1 goes down for some reason, resource (R) shifts to
>>> >         > N4 (as its score is higher than any other node's).
>>> >         >
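>>> >         > (For illustration, the constraint update in step d would be
>>> >         > something like
>>> >         >
>>> >         >     pcs constraint location R prefers N2=200
>>> >         >
>>> >         > using the resource and node names from this scenario.)
>>> >         >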
>>> >         > In this case, is it possible to notify the nodes N2 and N3
>>> >         > that the newly elected active node is N4?
>>> >
>>> >         What type of notification, and what would the node do with it?
>>> >         Any node in the cluster always has up-to-date configuration
>>> >         information, so it knows the status of the other nodes as well.
>>> >
>>> >
>>> > I agree that the node always has up-to-date configuration information,
>>> > but an application or a thread needs to poll for that information. Is
>>> > there any way for the notifications to be received through some
>>> > action function in the RA?
>>>
>>> Ah, I misunderstood your situation; I thought you had a cloned resource.
>>>
>>> For that, the alerts feature (available in Pacemaker 1.1.15 and later)
>>> might be useful:
>>>
>>> http://clusterlabs.org/doc/en-US/Pacemaker/1.1-pcs/html-single/Pacemaker_Explained/index.html#idm139900098676896
>>>
>>>
>>> >
>>> >
>>> > Regards,
>>> > Sriram.
>>> >
>>> >         >
>>> >         > I went through clone notifications and master/slave; it looks
>>> >         > like they require either identical (anonymous), unique, or
>>> >         > stateful resources to be running on all the nodes of the
>>> >         > cluster, whereas in our case there is only one resource
>>> >         > running in the whole cluster.
>>> >
>>> >         Maybe the main reason for not having notifications is that if
>>> >         a node fails hard, it won't be able to send out much status
>>> >         information to the other nodes.
>>> >
>>> >         Regards,
>>> >         Ulrich
>>> >
>>> >         >
>>> >         > Regards,
>>> >         > Sriram.
>>> >         >
>>> >         >
>>> >         >
>>> >         >
>>> >         > On Mon, Aug 7, 2017 at 11:28 AM, Sriram
>>> >         <sriram.ec at gmail.com> wrote:
>>> >         >
>>> >         >>
>>> >         >> Thanks Ken, Jan. Will look into the clone notifications.
>>> >         >>
>>> >         >> Regards,
>>> >         >> Sriram.
>>> >         >>
>>> >         >> On Sat, Aug 5, 2017 at 1:25 AM, Ken Gaillot
>>> >         <kgaillot at redhat.com> wrote:
>>> >         >>
>>> >         >>> On Thu, 2017-08-03 at 12:31 +0530, Sriram wrote:
>>> >         >>> >
>>> >         >>> > Hi Team,
>>> >         >>> >
>>> >         >>> >
>>> >         >>> > We have a four-node cluster (1 active : 3 standby) in
>>> >         >>> > our lab for a particular service. If the active node
>>> >         >>> > goes down, one of the three standby nodes becomes active.
>>> >         >>> > Now there will be (1 active : 2 standby : 1 offline).
>>> >         >>> >
>>> >         >>> >
>>> >         >>> > Is there any way for this newly elected node to send a
>>> >         >>> > notification to the remaining 2 standby nodes about its
>>> >         >>> > new status?
>>> >         >>>
>>> >         >>> Hi Sriram,
>>> >         >>>
>>> >         >>> This depends on how your service is configured in the
>>> >         >>> cluster.
>>> >         >>>
>>> >         >>> If you have a clone or master/slave resource, then clone
>>> >         >>> notifications are probably what you want (not alerts, which
>>> >         >>> is the path you were going down -- alerts are designed to,
>>> >         >>> e.g., email a system administrator after an important
>>> >         >>> event).
>>> >         >>>
>>> >         >>> For details about clone notifications, see:
>>> >         >>>
>>> >         >>>
>>> >         >>> http://clusterlabs.org/doc/en-US/Pacemaker/1.1-pcs/html-single/Pacemaker_Explained/index.html#_clone_resource_agent_requirements
>>> >         >>>
>>> >         >>> The RA must support the "notify" action, which will be
>>> >         >>> called when a clone instance is started or stopped. See the
>>> >         >>> similar section later for master/slave resources for
>>> >         >>> additional information. See the mysql or pgsql resource
>>> >         >>> agents for examples of notify implementations.
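>>> >         >>>
>>> >         >>> As a rough sketch (not taken from mysql or pgsql; the
>>> >         >>> OCF_RESKEY_CRM_meta_notify_* variables are the standard
>>> >         >>> environment variables Pacemaker sets for clone
>>> >         >>> notifications, and ocf_log/OCF_SUCCESS come from the OCF
>>> >         >>> shell functions), a notify action could look like:
>>> >         >>>
>>> >         >>> notify() {
>>> >         >>>     # notify_type is "pre" or "post"; notify_operation is
>>> >         >>>     # e.g. "start" or "stop"
>>> >         >>>     local ntype="${OCF_RESKEY_CRM_meta_notify_type}"
>>> >         >>>     local nop="${OCF_RESKEY_CRM_meta_notify_operation}"
>>> >         >>>     if [ "$ntype" = "post" ] && [ "$nop" = "start" ]; then
>>> >         >>>         ocf_log info "started on: ${OCF_RESKEY_CRM_meta_notify_start_uname}"
>>> >         >>>     fi
>>> >         >>>     return $OCF_SUCCESS
>>> >         >>> }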
>>> >         >>>
>>> >         >>> > I was exploring the "notification agent" and
>>> >         >>> > "notification recipient" features, but that doesn't seem
>>> >         >>> > to work. /etc/sysconfig/notify.sh doesn't get invoked
>>> >         >>> > even on the newly elected active node.
>>> >         >>>
>>> >         >>> Yep, that's something different altogether -- it's only
>>> >         >>> enabled on RHEL systems, and solely for backward
>>> >         >>> compatibility with an early implementation of the alerts
>>> >         >>> interface. The new alerts interface is more flexible, but
>>> >         >>> it's not designed to send information between cluster
>>> >         >>> nodes -- it's designed to send information to something
>>> >         >>> external to the cluster, such as a human, an SNMP server,
>>> >         >>> or a monitoring system.
>>> >         >>>
>>> >         >>>
>>> >         >>> > Cluster Properties:
>>> >         >>> >  cluster-infrastructure: corosync
>>> >         >>> >  dc-version: 1.1.17-e2e6cdce80
>>> >         >>> >  default-action-timeout: 240
>>> >         >>> >  have-watchdog: false
>>> >         >>> >  no-quorum-policy: ignore
>>> >         >>> >  notification-agent: /etc/sysconfig/notify.sh
>>> >         >>> >  notification-recipient: /var/log/notify.log
>>> >         >>> >  placement-strategy: balanced
>>> >         >>> >  stonith-enabled: false
>>> >         >>> >  symmetric-cluster: false
>>> >         >>> >
>>> >         >>> >
>>> >         >>> >
>>> >         >>> >
>>> >         >>> > I'm using the following versions of Pacemaker and
>>> >         >>> > Corosync:
>>> >         >>> >
>>> >         >>> >
>>> >         >>> > /usr/sbin # ./pacemakerd --version
>>> >         >>> > Pacemaker 1.1.17
>>> >         >>> > Written by Andrew Beekhof
>>> >         >>> > /usr/sbin # ./corosync -v
>>> >         >>> > Corosync Cluster Engine, version '2.3.5'
>>> >         >>> > Copyright (c) 2006-2009 Red Hat, Inc.
>>> >         >>> >
>>> >         >>> >
>>> >         >>> > Can you please suggest if I'm doing anything wrong, or if
>>> >         >>> > there are any other mechanisms to achieve this?
>>> >         >>> >
>>> >         >>> >
>>> >         >>> > Regards,
>>> >         >>> > Sriram.
>>>
>>> --
>>> Ken Gaillot <kgaillot at redhat.com>
>>>
>>>
>>>
>>>
>>>
>>> _______________________________________________
>>> Users mailing list: Users at clusterlabs.org
>>> http://lists.clusterlabs.org/mailman/listinfo/users
>>>
>>> Project Home: http://www.clusterlabs.org
>>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>>> Bugs: http://bugs.clusterlabs.org
>>>
>>
>>
>>
>> _______________________________________________
>> Users mailing list: Users at clusterlabs.org
>> http://lists.clusterlabs.org/mailman/listinfo/users
>>
>> Project Home: http://www.clusterlabs.org
>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>> Bugs: http://bugs.clusterlabs.org
>>
>>
>>
>
>

