[Pacemaker] Stonith Device APC AP7900

Dejan Muhamedagic dejanmm at fastmail.fm
Thu Oct 28 11:16:17 EDT 2010


Hi,

On Thu, Oct 28, 2010 at 08:35:30AM -0600, Rick Cone wrote:
> Dejan,
> 
> Sorry for the confusion. I just assumed a script or a new version would
> be issued, etc.
> 
> I do apologize for this though.

No problems.

> As fate would have it, I did just happen to
> see an issue (while testing something else) concerning the AP7900 using
> apcmastersnmp, just yesterday.  You may want to let me know if what I'm
> doing is non-standard or improper:
> 
> I'm using DRBD/Heartbeat/Pacemaker, and my 2 HA systems have dual power
> supplies.  On the AP7900 I have 1 of the systems plugged into outlets 1 and
> 2, and the other system is plugged into outlets 3 and 4.  Both outlets 1 and
> 2 have the exact same outlet name, and that is the name of system 1 (uname
> -n), and outlets 3 and 4 have the exact same outlet name, and that is the
> name of system 2 (uname -n):
> 
> Outlet 1 name: spserv1m
> Outlet 2 name: spserv1m
> Outlet 3 name: spserv1s
> Outlet 4 name: spserv1s
> 
> The AP7900 allows identical outlet names.
> 
> What I noticed during testing was that when Heartbeat/Pacemaker attempted
> to stonith system 1, it only shut off outlet 2, and not both outlets 1 and 2.
> I had not seen this before.  I have certainly seen it switch both of a
> system's outlets in testing, and also when running the stonith command
> manually from the command line (early on).

Yes, according to the code, the apcmastersnmp plugin supports up
to 8 outlets per node. It's strange that it used to work and no
longer does. If there are no messages in the logs, I'd suggest
turning debug on to see what's going on and under which
circumstances it fails to find all the outlets.

You can do more testing using the stonith command with option -d
and checking status (-lS). I'm not sure, but that may also
output some outlet information. I guess that doing too many
resets is not very hardware-friendly, so status checks are the
gentler way to test.

> I'm not sure if this is a problem, or rather an improper or problematic
> approach.  I am considering having just 1 outlet having the system name and
> using a splitter or extension cord with multiple outlets for each system.
> 
> Just for reference, here is my Pacemaker crm stonith configuration piece:
> 
> primitive res_stonith stonith:apcmastersnmp \
>         params ipaddr="192.1.1.109" port="161" community="sps" \
>         op start interval="0" timeout="60s" \
>         op monitor interval="60s" timeout="60s" \

This monitor is scheduled way too often. Better use something
like 2h or so.

>         op stop interval="0" timeout="60s"
> clone rc_res_stonith res_stonith \
>         meta target-role="Started"

You can also use a single instance setup, i.e. without clones.
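
Putting the two suggestions above together, a minimal sketch of the
configuration could look like this. It assumes the same ipaddr, port
and community values as in the quoted configuration; the 2h monitor
interval is only an example value, not a tested recommendation:

```
# Sketch only: monitor interval raised to 2h (example value), clone removed.
primitive res_stonith stonith:apcmastersnmp \
        params ipaddr="192.1.1.109" port="161" community="sps" \
        op start interval="0" timeout="60s" \
        op monitor interval="2h" timeout="60s" \
        op stop interval="0" timeout="60s"
```

Without the clone, the cluster runs a single instance of res_stonith.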

Thanks,

Dejan

> Thanks,
> Rick
> 
> -----Original Message-----
> From: Dejan Muhamedagic [mailto:dejanmm at fastmail.fm] 
> Sent: Thursday, October 28, 2010 2:37 AM
> To: 'The Pacemaker cluster resource manager'
> Subject: Re: [Pacemaker] Stonith Device APC AP7900
> 
> On Wed, Oct 27, 2010 at 10:12:03AM -0600, Rick Cone wrote:
> > The AP7900 does seem to be working properly and tested with the
> > apcmastersnmp.  I guess you can add it to the list.  What is the change
> > after it is added to the list?
> 
> Not sure if I understand your question. The only difference would
> be that the warning message is not issued any more.
> 
> Thanks,
> 
> Dejan
> 
> > 
> > -----Original Message-----
> > From: Dejan Muhamedagic [mailto:dejanmm at fastmail.fm] 
> > Sent: Wednesday, October 27, 2010 9:55 AM
> > To: rcone at securepaymentsystems.com; The Pacemaker cluster resource manager
> > Subject: Re: [Pacemaker] Stonith Device APC AP7900
> > 
> > Hi,
> > 
> > > On Wed, Oct 27, 2010 at 09:11:03AM -0600, Rick Cone wrote:
> > > > I use the APC AP7900 stonith device with apcmastersnmp.  It seems to
> > > > work well, but I see this message in the logs:
> > > >
> > > > Oct 24 04:02:32 spserv1m stonithd: [2885]: WARN: apcmastersnmp_status: module not tested with this hardware 'AP7900'.
> > > >
> > > Should I be worried about this?
> > 
> > Probably not. The author of the plugin made a list of devices
> > that were tested and issues a warning for a device not in the
> > list.
> > 
> > > >  Is there any way to resolve this or to get
> > > > rid of the message?
> > 
> > If you tested the device thoroughly, then we can add it to the
> > list.
> > 
> > Thanks,
> > 
> > Dejan
> > 
> > > 
> > >  
> > > 
> > > Thanks,
> > > 
> > >  
> > > 
> > > Rick Cone
> > > 
> > >  
> > > 
> > >  
> > > 
> > >  
> > > 
> > > 
> > > 
> > > 
> > > 
> > 
> > > _______________________________________________
> > > Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
> > > http://oss.clusterlabs.org/mailman/listinfo/pacemaker
> > > 
> > > Project Home: http://www.clusterlabs.org
> > > Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> > > Bugs:
> > http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker
> > 



