[Pacemaker] Validate strategy for RA on DRBD standby node

Dejan Muhamedagic dejanmm at fastmail.fm
Thu Feb 24 15:08:32 UTC 2011


Hi,

On Thu, Feb 24, 2011 at 08:06:52AM -0500, David McCurley wrote:
> Pacemaker and list newbie here :)
> 
> I'm writing a resource adapter in python for the newer release of OpenLDAP but I need some pointers on a strategy for the validate function in a certain case.  (In python because the more advanced shell scripting hurts my head :).  Here is the situation:

That would be the first Python RA. BTW, a slapd RA (implemented
in shell) was posted recently; I should review it, but haven't
done so yet. That RA does not support multi-state resources,
which I think would be essential. Did you plan to do that? In any
case, I'd suggest that you check that code too and then see
whether you need to do your own implementation.
The thread starts here:

http://marc.info/?l=linux-ha-dev&m=129666245428850&w=2

> The config file for OpenLDAP is stored in /etc/ldap/slapd.d/cn=config.ldif.  This is on a DRBD active-passive system and the /etc/ldap directory is actually a symlink to the DRBD controlled share /vcoreshare/etc/ldap.  The real config file is at /vcoreshare/etc/ldap/slapd.d/cn=config.ldif.

What about the old-style configuration (slapd.conf)? I assume
that there are still quite a few installations/distributions
using it.

> 
> So I'm trying to be very judicious with every function and validation, checking file permissions, etc.  But the problem is that /etc/ldap/slapd.d/cn=config.ldif is only present on the active DRBD node.  My validate function checks that the file is readable by the user/group that slapd is to run as.  Now, as soon as I start ldap in the cluster, it starts fine, but validate fails on the standby node (because the DRBD volume isn't mounted) and crm_mon shows a failed action:

On probes (monitor with interval 0), the parts of validation
which concern the state of the local node rather than the
resource configuration should return OCF_NOT_RUNNING instead of
an error. This is exactly such a case. No worries: if the next
action is start, validation is invoked again. Probes are issued
by pacemaker to establish whether the resource is running, and
normally it is expected not to be running (for instance on node
startup).
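
A rough sketch of how that could look in a Python RA, in case it
helps. The function names and signatures (is_probe, validate) are
made up for illustration and are not taken from your code. I am
assuming the action name is the first command-line argument and
that the operation interval arrives in the environment as
OCF_RESKEY_CRM_meta_interval; treating an unset interval as 0 is
also an assumption. The numeric codes are the standard OCF ones.
The user/group check is left out for brevity, and os.access()
tests the calling user rather than the slapd user, so take this
as a starting point only:

```python
import os

# Standard OCF exit codes used in this sketch
OCF_SUCCESS = 0
OCF_ERR_INSTALLED = 5
OCF_NOT_RUNNING = 7


def is_probe(action, env=None):
    """Return True for a probe, i.e. a monitor with interval 0.

    Assumption: an unset interval is treated as 0.
    """
    env = os.environ if env is None else env
    interval = env.get("OCF_RESKEY_CRM_meta_interval", "0")
    return action == "monitor" and interval == "0"


def validate(action, config_path, binary_path, env=None):
    """Hypothetical validate step that tolerates a missing config on probes."""
    # A missing or non-executable binary is a real installation
    # problem, no matter which node we are on.
    if not (os.path.isfile(binary_path) and os.access(binary_path, os.X_OK)):
        return OCF_ERR_INSTALLED

    # The config may live on shared storage that is not mounted on
    # this node. On a probe that only means "not running here".
    if not os.access(config_path, os.R_OK):
        if is_probe(action, env):
            return OCF_NOT_RUNNING
        return OCF_ERR_INSTALLED

    return OCF_SUCCESS
```

With something like this, the probe on the standby node would
report "not running" (rc=7) rather than "not installed" (rc=5),
while a start on a node without the config file would still fail
validation.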

HTH,

Dejan

> ----------------------------------------------
> ============
> Last updated: Wed Feb 23 07:35:19 2011
> Stack: openais
> Current DC: vcoresrv1 - partition with quorum
> Version: 1.0.8-042548a451fce8400660f6031f4da6f0223dd5dd
> 2 Nodes configured, 2 expected votes
> 5 Resources configured.
> ============
> 
> Online: [ vcoresrv1 vcoresrv2 ]
> 
> fs_vcoreshare   (ocf::heartbeat:Filesystem):    Started vcoresrv1
>  Master/Slave Set: ms_drbd_vcoreshare
>      Masters: [ vcoresrv1 ]
>      Slaves: [ vcoresrv2 ]
> clusterip       (ocf::heartbeat:IPaddr2):       Started vcoresrv1
> clusteripsourcing       (ocf::heartbeat:IPsrcaddr):     Started vcoresrv1
> 
> Failed actions:
>     ldap_monitor_0 (node=vcoresrv2, call=130, rc=5, status=complete): not installed
> ---------------------------------------------
> 
> Is there a way for my RA to know that it is being called on the active node instead of the passive node?  Or more generally, what would anyone recommend here?  I really didn't want to write the resource adapter so it would be specific to our setup (e.g. checking to make sure the DRBD mount is readable before looking for the config files).  Maybe Pacemaker passes in some extra env variable that can be used?
> 
> I'm reluctant to post the code for the RA here in the list because it is 450 lines.  But, here is the logic for the validate function:
> 
> if the appropriate slapd user and group do not exist:
>    return OCF_ERR_INSTALLED
> if the ldap config file doesn't exist or isn't readable by the slapd user:
>    return OCF_ERR_INSTALLED
> if the ldap binary doesn't exist or isn't executable:
>    return OCF_ERR_INSTALLED
> return OCF_SUCCESS
> 
> Or maybe I'm overdoing it in my tests or have misinterpreted the "OCF Resource Agent Developer's Guide"?
> 
> Any advice or guidance / clarification appreciated.
> 
> Thanks,
> 
> Mac
> 
