[ClusterLabs] Re: Notification agent and Notification recipients

Klaus Wenninger kwenning at redhat.com
Mon Aug 14 12:12:28 UTC 2017


On 08/14/2017 12:32 PM, Sriram wrote:
> Hi Ken,
>
> I tried the alerts as well, but they don't seem to be working.
>
> Please check the below configuration
> [root at node1 alerts]# pcs config show
> Cluster Name:
> Corosync Nodes:
> Pacemaker Nodes:
>  node1 node2 node3
>
> Resources:
>  Resource: TRR (class=ocf provider=heartbeat type=TimingRedundancyRA)
>   Operations: start interval=0s timeout=60s (TRR-start-interval-0s)
>               stop interval=0s timeout=20s (TRR-stop-interval-0s)
>               monitor interval=10 timeout=20 (TRR-monitor-interval-10)
>
> Stonith Devices:
> Fencing Levels:
>
> Location Constraints:
>   Resource: TRR
>     Enabled on: node1 (score:100) (id:location-TRR-node1-100)
>     Enabled on: node2 (score:200) (id:location-TRR-node2-200)
>     Enabled on: node3 (score:300) (id:location-TRR-node3-300)
> Ordering Constraints:
> Colocation Constraints:
> Ticket Constraints:
>
> Alerts:
>  Alert: alert_file (path=/usr/share/pacemaker/alert_file.sh)
>   Options: debug_exec_order=false
>   Meta options: timeout=15s
>   Recipients:
>    Recipient: recipient_alert_file_id (value=/usr/share/pacemaker/alert_file.log)

Did you pre-create the file with the proper permissions? Be aware that
the alert agent is called as user hacluster.
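
For example, something along these lines should be enough (a minimal
sketch; hacluster:haclient are the usual user/group for the Pacemaker
daemons, but check what your installation uses):

    # touch /usr/share/pacemaker/alert_file.log
    # chown hacluster:haclient /usr/share/pacemaker/alert_file.log
    # chmod 0640 /usr/share/pacemaker/alert_file.log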

>
> Resources Defaults:
>  resource-stickiness: INFINITY
> Operations Defaults:
>  No defaults set
>
> Cluster Properties:
>  cluster-infrastructure: corosync
>  dc-version: 1.1.15-11.el7_3.4-e174ec8
>  default-action-timeout: 240
>  have-watchdog: false
>  no-quorum-policy: ignore
>  placement-strategy: balanced
>  stonith-enabled: false
>  symmetric-cluster: false
>
> Quorum:
>   Options:
>
>
> /usr/share/pacemaker/alert_file.sh does not get called when I
> trigger a failover scenario.
> Please let me know if I'm missing anything.

Do you get any logs - e.g. for resource startup - or nothing at all?
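
If nothing shows up, you could also grep pacemaker's log for alert
invocations and try running the agent by hand as hacluster, to rule
out permission problems. Roughly (a sketch - the log location varies
by distribution, and the CRM_alert_* variables shown are just common
ones the sample agents read):

    # grep -i alert /var/log/pacemaker.log
    # su -s /bin/sh -c 'CRM_alert_kind=node CRM_alert_node=node1 \
          CRM_alert_recipient=/usr/share/pacemaker/alert_file.log \
          /usr/share/pacemaker/alert_file.sh' hacluster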

Regards,
Klaus

>
>
> Regards,
> Sriram.
>
> On Tue, Aug 8, 2017 at 8:29 PM, Ken Gaillot <kgaillot at redhat.com> wrote:
>
>     On Tue, 2017-08-08 at 17:40 +0530, Sriram wrote:
>     > Hi Ulrich,
>     >
>     >
>     > Please see inline.
>     >
>     > On Tue, Aug 8, 2017 at 2:01 PM, Ulrich Windl
>     > <Ulrich.Windl at rz.uni-regensburg.de> wrote:
>     >         >>> Sriram <sriram.ec at gmail.com> wrote on 08.08.2017
>     >         at 09:30 in message
>     >         <CAMvdjurcQc6t=ZfGr=cRL25Xq0Je9h9F_TvZXyxVAn3n+Dvcgw at mail.gmail.com>:
>     >         > Hi Ken & Jan,
>     >         >
>     >         > In the cluster we have, there is only one resource
>     >         > running. It's an opt-in cluster with
>     >         > resource-stickiness set to INFINITY.
>     >         >
>     >         > Just to clarify my question, let's take a scenario
>     >         > where there are four nodes N1, N2, N3, N4:
>     >         > a. N1 comes up first, starts the cluster.
>     >
>     >         The cluster will start once it has a quorum.
>     >
>     >         > b. N1 checks that there is no resource running, so it
>     >         > will add the resource (R) with some location constraint
>     >         > (let's say score 100).
>     >         > c. So resource R runs on N1 now.
>     >         > d. N2 comes up next, checks that resource R is already
>     >         > running on N1, so it will update the location
>     >         > constraint (let's say score 200).
>     >         > e. N3 comes up next, checks that resource R is already
>     >         > running on N1, so it will update the location
>     >         > constraint (let's say score 300).
>     >
>     >         See my remark on quorum above.
>     >
>     > Yes, you are right - I forgot to mention it.
>     >
>     >
>     >         > f. N4 comes up next, checks that resource R is already
>     >         > running on N1, so it will update the location
>     >         > constraint (let's say score 400).
>     >         > g. If N1 goes down for some reason, resource R shifts
>     >         > to N4 (as its score is higher than anyone else's).
>     >         >
>     >         > In this case, is it possible to notify nodes N2 and N3
>     >         > that the newly elected active node is N4?
>     >
>     >         What type of notification, and what would the node do
>     >         with it? Any node in the cluster always has up-to-date
>     >         configuration information, so it also knows the status
>     >         of the other nodes.
>     >
>     >
>     > I agree that the node always has up-to-date configuration
>     > information, but an application or a thread needs to poll for
>     > that information. Is there any way for notifications to be
>     > received through some action function in the RA?
>
>     Ah, I misunderstood your situation - I thought you had a cloned
>     resource.
>
>     For that, the alerts feature (available in Pacemaker 1.1.15 and later)
>     might be useful:
>
>     http://clusterlabs.org/doc/en-US/Pacemaker/1.1-pcs/html-single/Pacemaker_Explained/index.html#idm139900098676896
>
>
>     >
>     >
>     > Regards,
>     > Sriram.
>     >
>     >         >
>     >         > I went through clone notifications and master/slave;
>     >         > it looks like they require identical (anonymous),
>     >         > unique, or stateful resources to be running on all the
>     >         > nodes of the cluster, whereas in our case there is only
>     >         > one resource running in the whole cluster.
>     >
>     >         Maybe the main reason for not having notifications is
>     >         that if a node fails hard, it won't be able to send out
>     >         much status information to the other nodes.
>     >
>     >         Regards,
>     >         Ulrich
>     >
>     >         >
>     >         > Regards,
>     >         > Sriram.
>     >         >
>     >         >
>     >         >
>     >         >
>     >         > On Mon, Aug 7, 2017 at 11:28 AM, Sriram
>     >         <sriram.ec at gmail.com> wrote:
>     >         >
>     >         >>
>     >         >> Thanks Ken, Jan. Will look into the clone notifications.
>     >         >>
>     >         >> Regards,
>     >         >> Sriram.
>     >         >>
>     >         >> On Sat, Aug 5, 2017 at 1:25 AM, Ken Gaillot
>     >         <kgaillot at redhat.com> wrote:
>     >         >>
>     >         >>> On Thu, 2017-08-03 at 12:31 +0530, Sriram wrote:
>     >         >>> >
>     >         >>> > Hi Team,
>     >         >>> >
>     >         >>> >
>     >         >>> > We have a four-node cluster (1 active : 3 standby)
>     >         >>> > in our lab for a particular service. If the active
>     >         >>> > node goes down, one of the three standby nodes
>     >         >>> > becomes active. Now there will be (1 active :
>     >         >>> > 2 standby : 1 offline).
>     >         >>> >
>     >         >>> >
>     >         >>> > Is there any way where this newly elected node
>     >         >>> > sends a notification to the remaining 2 standby
>     >         >>> > nodes about its new status?
>     >         >>>
>     >         >>> Hi Sriram,
>     >         >>>
>     >         >>> This depends on how your service is configured in
>     >         >>> the cluster.
>     >         >>>
>     >         >>> If you have a clone or master/slave resource, then
>     >         >>> clone notifications are probably what you want (not
>     >         >>> alerts, which is the path you were going down --
>     >         >>> alerts are designed to e.g. email a system
>     >         >>> administrator after an important event).
>     >         >>>
>     >         >>> For details about clone notifications, see:
>     >         >>>
>     >         >>> http://clusterlabs.org/doc/en-US/Pacemaker/1.1-pcs/html-single/Pacemaker_Explained/index.html#_clone_resource_agent_requirements
>     >         >>>
>     >         >>> The RA must support the "notify" action, which will
>     >         >>> be called when a clone instance is started or
>     >         >>> stopped. See the similar section later for
>     >         >>> master/slave resources for additional information.
>     >         >>> See the mysql or pgsql resource agents for examples
>     >         >>> of notify implementations.
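
For reference, a minimal notify handler in a shell RA might look
roughly like the sketch below; the OCF_RESKEY_CRM_meta_notify_*
environment variables are the standard ones Pacemaker sets for notify
calls, but see the documentation above for the full list.

    notify() {
        # Pacemaker describes the event via environment variables
        n_type="${OCF_RESKEY_CRM_meta_notify_type}"        # pre or post
        n_op="${OCF_RESKEY_CRM_meta_notify_operation}"     # start, stop, ...
        if [ "$n_type" = "post" ] && [ "$n_op" = "start" ]; then
            : # e.g. re-read the list of active peers here
        fi
        return "$OCF_SUCCESS"    # defined by ocf-shellfuncs
    }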
>     >         >>>
>     >         >>> > I was exploring the "notification agent" and
>     >         >>> > "notification recipient" features, but they don't
>     >         >>> > seem to work. /etc/sysconfig/notify.sh doesn't get
>     >         >>> > invoked even on the newly elected active node.
>     >         >>>
>     >         >>> Yep, that's something different altogether -- it's
>     >         >>> only enabled on RHEL systems, and solely for backward
>     >         >>> compatibility with an early implementation of the
>     >         >>> alerts interface. The new alerts interface is more
>     >         >>> flexible, but it's not designed to send information
>     >         >>> between cluster nodes -- it's designed to send
>     >         >>> information to something external to the cluster,
>     >         >>> such as a human, or an SNMP server, or a monitoring
>     >         >>> system.
>     >         >>>
>     >         >>>
>     >         >>> > Cluster Properties:
>     >         >>> >  cluster-infrastructure: corosync
>     >         >>> >  dc-version: 1.1.17-e2e6cdce80
>     >         >>> >  default-action-timeout: 240
>     >         >>> >  have-watchdog: false
>     >         >>> >  no-quorum-policy: ignore
>     >         >>> >  notification-agent: /etc/sysconfig/notify.sh
>     >         >>> >  notification-recipient: /var/log/notify.log
>     >         >>> >  placement-strategy: balanced
>     >         >>> >  stonith-enabled: false
>     >         >>> >  symmetric-cluster: false
>     >         >>> >
>     >         >>> >
>     >         >>> >
>     >         >>> >
>     >         >>> > I'm using the following versions of pacemaker and
>     >         >>> > corosync:
>     >         >>> >
>     >         >>> >
>     >         >>> > /usr/sbin # ./pacemakerd --version
>     >         >>> > Pacemaker 1.1.17
>     >         >>> > Written by Andrew Beekhof
>     >         >>> > /usr/sbin # ./corosync -v
>     >         >>> > Corosync Cluster Engine, version '2.3.5'
>     >         >>> > Copyright (c) 2006-2009 Red Hat, Inc.
>     >         >>> >
>     >         >>> >
>     >         >>> > Can you please suggest if I'm doing anything
>     >         >>> > wrong, or if there are any other mechanisms to
>     >         >>> > achieve this?
>     >         >>> >
>     >         >>> >
>     >         >>> > Regards,
>     >         >>> > Sriram.
>
>     --
>     Ken Gaillot <kgaillot at redhat.com>
>
> _______________________________________________
> Users mailing list: Users at clusterlabs.org
> http://lists.clusterlabs.org/mailman/listinfo/users
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org
