[ClusterLabs] Antw: Re: Notification agent and Notification recipients
Klaus Wenninger
kwenning at redhat.com
Mon Aug 14 16:04:10 CEST 2017
On 08/14/2017 03:19 PM, Sriram wrote:
> Yes, I had pre-created the script file with the required permissions.
>
> [root@*node1* alerts]# ls -l /usr/share/pacemaker/alert_file.sh
> -rwxr-xr-x. 1 root root 4140 Aug 14 01:51
> /usr/share/pacemaker/alert_file.sh
> [root@*node2* alerts]# ls -l /usr/share/pacemaker/alert_file.sh
> -rwxr-xr-x. 1 root root 4139 Aug 14 01:51
> /usr/share/pacemaker/alert_file.sh
> [root@*node3* alerts]# ls -l /usr/share/pacemaker/alert_file.sh
> -rwxr-xr-x. 1 root root 4139 Aug 14 01:51
> /usr/share/pacemaker/alert_file.sh
>
> Later I observed that the user "hacluster" was not able to create
> the log file /usr/share/pacemaker/alert_file.log.
> I am sorry, I should have noticed this in the log before posting
> the query. After I changed the path to /tmp/alert_file.log, the
> file is created now.
> Thanks for pointing it out.
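
[Editor's note: as discussed below, the alert agent runs as user hacluster, so the recipient file must be writable by that user. A hypothetical pre-creation step for each node; the group name haclient is the usual default but may differ on your distribution:]

```shell
# Hypothetical pre-creation of the recipient log file on every node, so
# that the alert agent (which Pacemaker runs as user hacluster) can
# write to it. Requires root; "haclient" is the usual default group.
touch /tmp/alert_file.log
chown hacluster:haclient /tmp/alert_file.log
chmod 0640 /tmp/alert_file.log
```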
>
> I have one more question:
>
> if the resource is running in node2,
> [root at node2 tmp]# pcs resource
> TRR (ocf::heartbeat:TimingRedundancyRA): Started node2
>
> And I executed the command below to put node2 in standby.
> [root at node2 tmp] # pcs node standby node2
>
> Resource shifted to node3, because of higher location constraint.
> [root at node2 tmp]# pcs resource
> TRR (ocf::heartbeat:TimingRedundancyRA): Started node3.
>
>
> I got the log file created under node2(resource stopped) and
> node3(resource started).
>
> Node1 was not notified about the resource shift, i.e. no log file
> was created there.
> It's because alerts are designed to notify external agents about
> cluster events, not to provide notifications internal to the cluster.
>
> Is my understanding correct?
Quite simple: the crmd on node1 just didn't have anything to do with
shifting the resource from node2 -> node3. No additional information is
passed between the nodes just to create a full set of notifications on
every node. If you want a full log (or whatever your alert-agent is
doing) in one place, that would be up to your alert-agent.
Regards,
Klaus
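
[Editor's note: a minimal sketch of what such an aggregating alert agent could look like, assuming the documented CRM_alert_* environment variables Pacemaker exports to alert agents; the log format here is hypothetical:]

```shell
#!/bin/sh
# Minimal alert-agent sketch (hypothetical). Pacemaker runs this as user
# hacluster and passes event details in CRM_alert_* environment
# variables; the recipient value configured in the CIB arrives as
# CRM_alert_recipient.
log_alert() {
    logfile="${CRM_alert_recipient:-/tmp/alert_file.log}"
    # Append one line per event; forwarding this line to a central host
    # (e.g. via syslog) is what would give you a full log in one place.
    printf '%s %s: %s\n' \
        "${CRM_alert_timestamp:-$(date)}" \
        "${CRM_alert_kind:-unknown}" \
        "${CRM_alert_desc:-}" >> "$logfile"
}
log_alert
```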
>
> Regards,
> Sriram.
>
>
>
> On Mon, Aug 14, 2017 at 5:42 PM, Klaus Wenninger <kwenning at redhat.com> wrote:
>
> On 08/14/2017 12:32 PM, Sriram wrote:
>> Hi Ken,
>>
>> I used the alerts as well, seems to be not working.
>>
>> Please check the below configuration
>> [root at node1 alerts]# pcs config show
>> Cluster Name:
>> Corosync Nodes:
>> Pacemaker Nodes:
>> node1 node2 node3
>>
>> Resources:
>> Resource: TRR (class=ocf provider=heartbeat type=TimingRedundancyRA)
>> Operations: start interval=0s timeout=60s (TRR-start-interval-0s)
>> stop interval=0s timeout=20s (TRR-stop-interval-0s)
>> monitor interval=10 timeout=20
>> (TRR-monitor-interval-10)
>>
>> Stonith Devices:
>> Fencing Levels:
>>
>> Location Constraints:
>> Resource: TRR
>> Enabled on: node1 (score:100) (id:location-TRR-node1-100)
>> Enabled on: node2 (score:200) (id:location-TRR-node2-200)
>> Enabled on: node3 (score:300) (id:location-TRR-node3-300)
>> Ordering Constraints:
>> Colocation Constraints:
>> Ticket Constraints:
>>
>> Alerts:
>> Alert: alert_file (path=/usr/share/pacemaker/alert_file.sh)
>> Options: debug_exec_order=false
>> Meta options: timeout=15s
>> Recipients:
>> Recipient: recipient_alert_file_id
>> (value=/usr/share/pacemaker/alert_file.log)
>
> Did you pre-create the file with proper rights? Be aware that the
> alert-agent
> is called as user hacluster.
>
>>
>> Resources Defaults:
>> resource-stickiness: INFINITY
>> Operations Defaults:
>> No defaults set
>>
>> Cluster Properties:
>> cluster-infrastructure: corosync
>> dc-version: 1.1.15-11.el7_3.4-e174ec8
>> default-action-timeout: 240
>> have-watchdog: false
>> no-quorum-policy: ignore
>> placement-strategy: balanced
>> stonith-enabled: false
>> symmetric-cluster: false
>>
>> Quorum:
>> Options:
>>
>>
>> /usr/share/pacemaker/alert_file.sh does not get called when I
>> trigger a failover scenario.
>> Please let me know if I'm missing anything.
>
> Do you get any logs - like for startup of resources - or nothing
> at all?
>
> Regards,
> Klaus
>
>
>>
>>
>> Regards,
>> Sriram.
>>
>> On Tue, Aug 8, 2017 at 8:29 PM, Ken Gaillot <kgaillot at redhat.com> wrote:
>>
>> On Tue, 2017-08-08 at 17:40 +0530, Sriram wrote:
>> > Hi Ulrich,
>> >
>> >
>> > Please see inline.
>> >
>> > On Tue, Aug 8, 2017 at 2:01 PM, Ulrich Windl
>> > <Ulrich.Windl at rz.uni-regensburg.de> wrote:
>> > >>> Sriram <sriram.ec at gmail.com> wrote on 08.08.2017 at
>> > 09:30 in message
>> > <CAMvdjurcQc6t=ZfGr=cRL25Xq0Je9h9F_TvZXyxVAn3n+Dvcgw at mail.gmail.com>:
>> > > Hi Ken & Jan,
>> > >
>> > > In the cluster we have, there is only one resource running.
>> > > It's an opt-in cluster with resource-stickiness set to INFINITY.
>> > >
>> > > Just to clarify my question, lets take a scenario
>> where
>> > there are four
>> > > nodes N1, N2, N3, N4
>> > > a. N1 comes up first, starts the cluster.
>> >
>> > The cluster will start once it has a quorum.
>> >
>> > > b. N1 Checks that there is no resource running,
>> so it will
>> > add the
>> > > resource(R) with the some location
>> constraint(lets say score
>> > 100)
>> > > c. So Resource(R) runs in N1 now.
>> > > d. N2 comes up next, checks that resource(R) is
>> already
>> > running in N1, so
>> > > it will update the location constraint(lets say
>> score 200)
>> > > e. N3 comes up next, checks that resource(R) is
>> already
>> > running in N1, so
>> > > it will update the location constraint(lets say
>> score 300)
>> >
>> > See my remark on quorum above.
>> >
>> > Yes you are right, I forgot to mention it.
>> >
>> >
>> > > f. N4 comes up next, checks that resource(R) is
>> already
>> > running in N1, so
>> > > it will update the location constraint(lets say
>> score 400)
>> > > g. For the some reason, if N1 goes down,
>> resource(R) shifts
>> > to N4(as its
>> > > score is higher than anyone).
>> > >
>> > > In this case is it possible to notify the nodes
>> N2, N3 that
>> > newly elected
>> > > active node is N4 ?
>> >
>> > What type of notification, and what would the node
>> do with it?
>> > Any node in the cluster always has up to date
>> configuration
>> > information. So it knows the status of the other
>> nodes also.
>> >
>> >
>> > I agree that the node always has up-to-date configuration
>> > information, but an application or a thread needs to poll for
>> > that information. Is there any way the notifications can be
>> > received through some action function in the RA?
>>
>> Ah, I misunderstood your situation, I thought you had a
>> cloned resource.
>>
>> For that, the alerts feature (available in Pacemaker 1.1.15
>> and later)
>> might be useful:
>>
>> http://clusterlabs.org/doc/en-US/Pacemaker/1.1-pcs/html-single/Pacemaker_Explained/index.html#idm139900098676896
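
[Editor's note: for reference, the alert configuration shown in this thread could be created with pcs commands roughly like the following; the exact sub-command syntax varies between pcs versions, so treat this as a sketch and check `pcs alert help` locally:]

```shell
# Hypothetical pcs commands (syntax approximately as in pcs 0.9.x)
# matching the alert configuration shown elsewhere in this thread.
pcs alert create path=/usr/share/pacemaker/alert_file.sh id=alert_file
pcs alert recipient add alert_file value=/tmp/alert_file.log
pcs alert show
```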
>>
>>
>> >
>> >
>> > Regards,
>> > Sriram.
>> >
>> > >
>> > > I went through clone notifications and master/slave; it looks
>> > > like they require identical (anonymous), unique, or stateful
>> > > resources to be running on all the nodes of the cluster,
>> > > whereas in our case there is only one resource running in the
>> > > whole cluster.
>> >
>> > Maybe the main reason for not having notifications
>> is that if
>> > a node fails hard, it won't be able to send out
>> much status
>> > information to the other nodes.
>> >
>> > Regards,
>> > Ulrich
>> >
>> > >
>> > > Regards,
>> > > Sriram.
>> > >
>> > >
>> > >
>> > >
>> > > On Mon, Aug 7, 2017 at 11:28 AM, Sriram <sriram.ec at gmail.com> wrote:
>> > >
>> > >>
>> > >> Thanks Ken, Jan. Will look into the clone
>> notifications.
>> > >>
>> > >> Regards,
>> > >> Sriram.
>> > >>
>> > >> On Sat, Aug 5, 2017 at 1:25 AM, Ken Gaillot <kgaillot at redhat.com> wrote:
>> > >>
>> > >>> On Thu, 2017-08-03 at 12:31 +0530, Sriram wrote:
>> > >>> >
>> > >>> > Hi Team,
>> > >>> >
>> > >>> >
>> > >>> > We have a four node cluster (1 active : 3
>> standby) in
>> > our lab for a
>> > >>> > particular service. If the active node goes
>> down, one of
>> > the three
>> > >>> > standby node becomes active. Now there will
>> be (1
>> > active : 2
>> > >>> > standby : 1 offline).
>> > >>> >
>> > >>> >
>> > >>> > Is there any way where this newly elected
>> node sends
>> > notification to
>> > >>> > the remaining 2 standby nodes about its new
>> status ?
>> > >>>
>> > >>> Hi Sriram,
>> > >>>
>> > >>> This depends on how your service is configured
>> in the
>> > cluster.
>> > >>>
>> > >>> If you have a clone or master/slave resource,
>> then clone
>> > notifications
>> > >>> is probably what you want (not alerts, which is
>> the path
>> > you were going
>> > >>> down -- alerts are designed to e.g. email a system
>> > administrator after
>> > >>> an important event).
>> > >>>
>> > >>> For details about clone notifications, see:
>> > >>>
>> > >>>
>> >
>> > >>> http://clusterlabs.org/doc/en-US/Pacemaker/1.1-pcs/html-single/Pacemaker_Explained/index.html#_clone_resource_agent_requirements
>> > >>>
>> > >>> The RA must support the "notify" action, which
>> will be
>> > called when a
>> > >>> clone instance is started or stopped. See the
>> similar
>> > section later for
>> > >>> master/slave resources for additional
>> information. See the
>> > mysql or
>> > >>> pgsql resource agents for examples of notify
>> > implementations.
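
[Editor's note: as a rough illustration of the notify action described above, a clone RA's handler might look like the sketch below. It assumes the OCF_RESKEY_CRM_meta_notify_* variables Pacemaker sets when calling a clone's notify action; the dispatch structure is modeled loosely on agents like mysql and pgsql, not copied from them:]

```shell
#!/bin/sh
# Hypothetical sketch of an OCF "notify" action handler, assuming the
# OCF_RESKEY_CRM_meta_notify_* variables Pacemaker sets for clone
# notifications. Real agents dispatch on notify type/operation much
# like this and then react to peer starts/stops.
notify() {
    op="${OCF_RESKEY_CRM_meta_notify_type:-?}-${OCF_RESKEY_CRM_meta_notify_operation:-?}"
    case "$op" in
        post-start)
            # Space-separated list of nodes where instances just started.
            echo "instance(s) started on: ${OCF_RESKEY_CRM_meta_notify_start_uname}" ;;
        post-stop)
            echo "instance(s) stopped on: ${OCF_RESKEY_CRM_meta_notify_stop_uname}" ;;
        *)
            echo "notify $op: nothing to do" ;;
    esac
    return 0  # OCF_SUCCESS
}
```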
>> > >>>
>> > >>> > I was exploring "notification agent" and
>> "notification
>> > recipient"
>> > >>> > features, but that doesn't seem to
>> > work. /etc/sysconfig/notify.sh
>> > >>> > doesn't get invoked even in the newly elected
>> active
>> > node.
>> > >>>
>> > >>> Yep, that's something different altogether --
>> it's only
>> > enabled on RHEL
>> > >>> systems, and solely for backward compatibility
>> with an
>> > early
>> > >>> implementation of the alerts interface. The new
>> alerts
>> > interface is more
>> > >>> flexible, but it's not designed to send information
>> > between cluster
>> > >>> nodes -- it's designed to send information to
>> something
>> > external to the
>> > >>> cluster, such as a human, or an SNMP server, or a
>> > monitoring system.
>> > >>>
>> > >>>
>> > >>> > Cluster Properties:
>> > >>> > cluster-infrastructure: corosync
>> > >>> > dc-version: 1.1.17-e2e6cdce80
>> > >>> > default-action-timeout: 240
>> > >>> > have-watchdog: false
>> > >>> > no-quorum-policy: ignore
>> > >>> > notification-agent: /etc/sysconfig/notify.sh
>> > >>> > notification-recipient: /var/log/notify.log
>> > >>> > placement-strategy: balanced
>> > >>> > stonith-enabled: false
>> > >>> > symmetric-cluster: false
>> > >>> >
>> > >>> >
>> > >>> >
>> > >>> >
>> > >>> > I'm using the following versions of Pacemaker and Corosync.
>> > >>> >
>> > >>> >
>> > >>> > /usr/sbin # ./pacemakerd --version
>> > >>> > Pacemaker 1.1.17
>> > >>> > Written by Andrew Beekhof
>> > >>> > /usr/sbin # ./corosync -v
>> > >>> > Corosync Cluster Engine, version '2.3.5'
>> > >>> > Copyright (c) 2006-2009 Red Hat, Inc.
>> > >>> >
>> > >>> >
>> > >>> > Can you please suggest if I'm doing anything wrong, or if
>> > >>> > there are any other mechanisms to achieve this?
>> > >>> >
>> > >>> >
>> > >>> > Regards,
>> > >>> > Sriram.
>>
>> --
>> Ken Gaillot <kgaillot at redhat.com>
>>
>>
>>
>>
>>
>> _______________________________________________
>> Users mailing list: Users at clusterlabs.org
>> http://lists.clusterlabs.org/mailman/listinfo/users
>>
>> Project Home: http://www.clusterlabs.org
>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>> Bugs: http://bugs.clusterlabs.org
>>
>>
>>
>>
>
>