[Pacemaker] crm_master triggering assert section != NULL

Florian Haas florian at hastexo.com
Wed Oct 12 12:59:31 EDT 2011


Hi again,

On 2011-10-12 18:23, Yves Trudeau wrote:
> Hi,
>    following my previous post to the wrong list, forwarded to the
> Pacemaker list by Florian, here is my complete cluster configuration:
> 
> http://pastebin.com/zDj0MF1Z

Yikes. I just had a look at that resource agent (operating under the
assumption that the version at
http://bazaar.launchpad.net/~y-trudeau/percona-prm/alpha/view/head:/percona-prm/MySQL_replication
is still current), and it looks like you guys never once looked at the
OCF RA dev guide. What's the reason for rolling your own when you could
have contributed to the existing mysql RA that does support replication?
Or fixed it, in case you think it's broken or buggy?

Let's please go over this in London. I dare say that if this RA
actually works, then it works pretty much only by accident. Again, all
of this is assuming we're talking about the version that's in Launchpad;
you may have produced your CIB dump on a box that uses an updated version.

> Just to recall the original message:
> 
>     I started to have issues with crm_master with Pacemaker 1.0.11.  I
> think I traced it down to the following problem.  I know crm_master is
> supposed to be called from within the resource script, but calling it
> manually helps to illustrate the problem.
> 
> root@testvirtbox1:~# /usr/sbin/crm_master -l reboot -v 1000 -r p_MySQL_replication:0
> root@testvirtbox1:~# /usr/local/sbin/crm_master -r 'p_MySQL_Replication:0' -G
>    name=master-p_MySQL_Replication:0 value=(null)
> Error performing operation: cib object missing
> 
> and in daemon.log:
> 
> Oct 11 12:17:41 testvirtbox1 crm_attribute: [21986]: info: Invoked:
> crm_attribute -N testvirtbox1 -n master-p_MySQL_Replication:0 -G
> Oct 11 12:17:41 testvirtbox1 crm_attribute: [21986]: ERROR: crm_abort:
> read_attr: Triggered assert at cib_attrs.c:297 : section != NULL

Er, sorry. I don't think anyone on this list is going to be willing to
troubleshoot a problem that involves a resource agent that doesn't use
notify where it should, invokes "crm resource start" for a different
resource from within the RA, asks the cluster manager about its role
when it should be telling it, etc. Let's not waste developer time. So
please, either show us an updated version of the RA that's fixed, or
let's talk about this in London in person, week after next.
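
To illustrate what "telling" rather than "asking" could look like, here
is a minimal sketch of a monitor action for a master/slave agent. It is
not taken from any existing RA. local_role and FAKE_ROLE are
hypothetical stand-ins for whatever check the agent runs against the
service itself, the score values are arbitrary, and CRM_MASTER is made
overridable only so the sketch can be exercised without a cluster. The
return codes are the ones the OCF RA dev guide defines.

```shell
#!/bin/sh
# Sketch only: the agent inspects the service, then reports its role
# to the cluster through the return code and its promotion preference
# through crm_master. It never queries the cluster for its own role.

OCF_SUCCESS=0           # running (as slave, for a master/slave resource)
OCF_NOT_RUNNING=7       # cleanly stopped
OCF_RUNNING_MASTER=8    # running as master

# Overridable so the sketch can be tested without a cluster (assumption).
: "${CRM_MASTER:=crm_master}"

# Hypothetical helper: prints "master", "slave" or "stopped".
# A real agent would look at the service here, not at the CIB.
local_role() {
    echo "${FAKE_ROLE:-stopped}"
}

ra_monitor() {
    case "$(local_role)" in
        master)
            # High preference: this node should stay master.
            $CRM_MASTER -l reboot -v 1000
            return $OCF_RUNNING_MASTER
            ;;
        slave)
            # Lower preference: eligible for promotion.
            $CRM_MASTER -l reboot -v 100
            return $OCF_SUCCESS
            ;;
        *)
            # Not running: withdraw any promotion preference.
            $CRM_MASTER -l reboot -D
            return $OCF_NOT_RUNNING
            ;;
    esac
}
```

When the cluster invokes the agent, crm_master picks up the resource
instance from the environment, which is why no -r option appears in the
sketch.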

I am not contesting that you may have actually found a Pacemaker
problem, but in order to be sure we'd have to start from a setup that's
_expected_ to work. Can you reproduce this issue with a different
Master/Slave RA, say the "Stateful" agent?
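
A rough sketch of such a test setup with the crm shell follows. The
resource names p_stateful and ms_stateful are made up for illustration,
and the intervals and meta attributes are only one plausible choice, so
adjust them to your cluster. The last command repeats the manual query
from your report against the new resource.

```shell
# Sketch only: a minimal master/slave resource using the Stateful agent.
crm configure primitive p_stateful ocf:pacemaker:Stateful \
    op monitor interval=10s role=Master \
    op monitor interval=11s role=Slave
crm configure ms ms_stateful p_stateful \
    meta master-max=1 clone-max=2 notify=true

# Once the resource is running, query the master score by hand.
crm_master -r p_stateful:0 -G
```

If the same assert shows up in daemon.log here, the problem is unlikely
to be specific to your RA.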

Cheers,
Florian

-- 
Need help with High Availability?
http://www.hastexo.com/now
