[ClusterLabs] Fwd: Not getting Fencing monitor alerts

Wed Oct 17 10:20:29 EDT 2018

On Wed, 2018-10-17 at 12:18 +0530, Rohit Saini wrote:
> Hi Klaus,
> Please see answers below for your queries:
> 
> Do you have any evidence that monitoring is happening when
> "resources are unreachable" ( == fence_virtd is reachable?) like logs?
> [Rohit] Yes, monitoring is happening. I have already tested this. I'm
> getting pcs alerts accurately when monitoring goes down or up.

I'm not sure I understand the question, but monitor alerts are only
sent out when the monitor status changes. If there are 10 successful
monitors in a row and then a failure, there will be one alert for the
first successful monitor and then one alert for the failure.
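
To illustrate the change-only behaviour described above, here is a
minimal shell sketch. It is a toy model, not Pacemaker code: the function
name alerts_for and the reduction of each result to "ok" or "failed" are
assumptions made for illustration. It walks a sequence of monitor exit
codes and prints an alert only for the first result and for each later
change of state.

```shell
#!/bin/sh
# Illustrative only: models "alert on status change", not Pacemaker internals.
# Usage: alerts_for RC...   (each RC is a monitor exit code, 0 = success)
alerts_for() {
    prev=""
    n=0
    for rc in "$@"; do
        n=$((n + 1))
        # Simplification: collapse every non-zero code into "failed".
        if [ "$rc" -eq 0 ]; then state="ok"; else state="failed"; fi
        # Emit an alert only when the state differs from the previous one.
        if [ "$state" != "$prev" ]; then
            echo "alert: monitor #$n $state"
            prev="$state"
        fi
    done
}

# Ten successful monitors followed by one failure: two alerts in total.
alerts_for 0 0 0 0 0 0 0 0 0 0 1
```

In this model the nine repeated successes between the first monitor and
the failure produce no output at all.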

> I would guess that there is no monitoring unless the fencing resource
> is counted as started successfully.
> [Rohit] If that's the case, then I am never going to get pcs alerts.
> The only way is to check the status of the resources via fence_xvm or
> fence_ilo4 to know whether they are reachable. Do you agree with me?

Again I'm not too clear on the question, but monitors are only run on
started services, unless you configure a separate monitor with
target-role=Stopped. If a start fails, an alert will be sent for the
failure; if the start succeeds, an alert will be sent for the success,
a monitor will be started, and alerts will be sent for any change in
monitor status.
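
As a rough sketch of how an alert script could tell these cases apart,
the function below reads environment variables that Pacemaker passes to
alert agents (CRM_alert_kind, CRM_alert_rsc, CRM_alert_task,
CRM_alert_rc, CRM_alert_target_rc, CRM_alert_node) and prints one line
per start or monitor event. The function name describe_alert and the
output wording are made up for illustration; this is not the pw_alert.sh
from the original post.

```shell
#!/bin/sh
# Hypothetical helper for an alert script; the output format is an assumption.
describe_alert() {
    # Only resource alerts carry the task and return code used here.
    [ "${CRM_alert_kind:-}" = "resource" ] || return 0

    # Treat the operation as successful when rc matches the expected rc
    # (defaulting the expected rc to 0 if it is not provided).
    if [ "${CRM_alert_rc:-}" = "${CRM_alert_target_rc:-0}" ]; then
        result="succeeded"
    else
        result="failed (rc=${CRM_alert_rc:-unknown})"
    fi

    # Report only the two operations discussed above.
    case "${CRM_alert_task:-}" in
        start|monitor)
            echo "${CRM_alert_rsc:-?} ${CRM_alert_task} on ${CRM_alert_node:-?} ${result}"
            ;;
    esac
}

describe_alert
```

With such a handler, a fencing resource whose start fails would show up
as a "start ... failed" line, and no monitor lines would follow until a
start succeeds.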

> 
> Thanks,
> Rohit 
> 
> On Tue, Oct 16, 2018 at 2:03 PM Klaus Wenninger <kwenning at redhat.com>
> wrote:
> > On 10/16/2018 07:43 AM, Rohit Saini wrote:
> > > Gentle Reminder!!
> > > 
> > > ---------- Forwarded message ---------
> > > From: Rohit Saini <rohitsaini111.forum at gmail.com>
> > > Date: Tue, Oct 9, 2018 at 2:51 PM
> > > Subject: Not getting Fencing monitor alerts
> > > To: <users at clusterlabs.org>
> > > 
> > > 
> > > Hi,
> > > I am facing an issue getting pcs alerts for fencing resources.
> > > 
> > > Scenario:
> > > 1. Configure the pcs alerts
> > > 2. Add stonith resources (resources are unreachable)
> > > 3. No monitoring alerts received.
> > > 
> > > Note:
> > > If reachable stonith resources are added successfully, then I
> > > get pcs alerts when the monitor link goes down and up.
> >  
> > Do you have any evidence that monitoring is happening when
> > "resources are unreachable" ( == fence_virtd is reachable?) like logs?
> > I would guess that there is no monitoring unless the fencing resource
> > is counted as started successfully.
> > 
> > Regards,
> > Klaus
> > >     ------PCS Alert configuration------
> > >     pcs alert create id=${PCS_ALERT_ID} path=/var/lib/pacemaker/pw_alert.sh
> > >     pcs alert recipient add ${PCS_ALERT_ID} value=/var/lib/pacemaker/pw_alert.sh
> > >
> > >     ------Starting Stonith------
> > >     my_fence_name="fence-xvm-$my_hostname"
> > >     pcs stonith show $my_fence_name
> > >     if [ $? -ne 0 ]; then
> > >         # monitor on-fail is "ignore", which means "Pretend the
> > >         # resource did not fail". Only an alarm will be generated
> > >         # if the monitoring link goes down.
> > >         pcs stonith create $my_fence_name fence_xvm \
> > >         multicast_address=$my_mcast_addr port=$my_hostport \
> > >         pcmk_host_list=$my_hostname action=$actionvalue delay=$my_fence_delay \
> > >         op start interval="100s" on-fail="restart" \
> > >         op monitor interval="5s" on-fail="ignore"
> > >         pcs constraint colocation add $my_fence_name with master unicloud-master INFINITY
> > >         pcs constraint order start $my_fence_name then promote unicloud-master
> > >         pcs stonith update $my_fence_name meta failure-timeout=3s
> > >     fi
> > >     peer_fence_name="fence-xvm-$peer_hostname"
> > >     pcs stonith show $peer_fence_name
> > >     if [ $? -ne 0 ]; then
> > >         pcs stonith create $peer_fence_name fence_xvm \
> > >         multicast_address=$peer_mcast_addr port=$peer_hostport \
> > >         pcmk_host_list=$peer_hostname action=$actionvalue delay=$peer_fence_delay \
> > >         op start interval="100s" on-fail="restart" \
> > >         op monitor interval="5s" on-fail="ignore"
> > >         pcs constraint colocation add $peer_fence_name with master unicloud-master INFINITY
> > >         pcs constraint order start $peer_fence_name then promote unicloud-master
> > >         pcs stonith update $peer_fence_name meta failure-timeout=3s
> > >     fi
> > >
> > >     pcs property set stonith-enabled=true
> > > 
> > > 
> > > Thanks,
> > > Rohit
> > > 
> > > 
> > > _______________________________________________
> > > Users mailing list: Users at clusterlabs.org
> > > https://lists.clusterlabs.org/mailman/listinfo/users
> > > 
> > > Project Home: http://www.clusterlabs.org
> > > Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> > > Bugs: http://bugs.clusterlabs.org
> >  
> > 
> 
-- 
Ken Gaillot <kgaillot at redhat.com>