[Pacemaker] crm_master triggering assert section != NULL
Lars Ellenberg
lars.ellenberg at linbit.com
Wed Oct 12 19:21:46 EDT 2011
On Wed, Oct 12, 2011 at 05:09:45PM -0400, Yves Trudeau wrote:
> Hi Florian,
>
> On 11-10-12 04:09 PM, Florian Haas wrote:
> >On 2011-10-12 21:46, Yves Trudeau wrote:
> >>Hi Florian,
> >> sure, let me state the requirements. If those requirements can be
> >>met, Pacemaker will be used much more widely to manage MySQL replication.
> >>Right now, although at Percona I deal with many large MySQL deployments,
> >>none are using the current agent. Another tool, MMM, is currently used,
> >>but it is orphaned and suffers from many pretty fundamental
> >>flaws (while implementing about the same logic as below).
> >>
> >>Consider a pool of N identical MySQL servers. In that case we need:
> >>- N replication resources (it could be the MySQL RA)
> >>- N Reader_vip
> >>- 1 Writer_vip
> >>
> >>Reader vips are used by the application to run queries that do not
> >>modify data, usually accessed in round-robin fashion. When the
> >>application needs to write something, it uses the writer_vip. That's
> >>how read/write splitting is implemented in many, many places.
> >>
> >>So, for the agent, here are the requirements:
> >>
> >>- No need to manage MySQL itself
> >>
> >>The resource we are interested in is replication, MySQL itself is at
> >>another level. If the RA is to manage MySQL, it must not interfere.
> >>
> >>- the writer_vip must be assigned only to the master, after it is promoted
> >>
> >>This is easy with colocation.
> >Agreed.
> >
> >>- After the promotion of a new master, all slaves should be allowed to
> >>complete the application of their relay logs prior to any change master
> >>
> >>The current RA does not do that but it should be fairly easy to implement.
> >That's a use case for a pre-promote and post-promote notification. Like
> >the mysql RA currently does.
> >
> >>- After its promotion and before allowing writes to it, a master should
> >>publish its current master file and position. I am using resource
> >>parameters in the CIB for these (I am wondering if transient attributes
> >>could be used instead)
> >They could, and you should. Like the mysql RA currently does.
> >
>
> The RA I downloaded, following the instructions on the wiki, which state
> it is the latest source:
>
> wget -O resource-agents.tar.bz2
> http://hg.linux-ha.org/agents/archive/tip.tar.bz2
That repository has moved to GitHub.
I'll try to make that more obvious on the website,
but that won't help with "direct download" hg archive links.
http://github.com/ClusterLabs/resource-agents/blob/master/heartbeat/mysql
raw download:
http://raw.github.com/ClusterLabs/resource-agents/master/heartbeat/mysql
Also see this pull request:
https://github.com/ClusterLabs/resource-agents/pull/28
> has the following code to change the master:
>
> ocf_run $MYSQL $MYSQL_OPTIONS_LOCAL $MYSQL_OPTIONS_REPL \
> -e "CHANGE MASTER TO MASTER_HOST='$master_host', \
> MASTER_USER='$OCF_RESKEY_replication_user', \
> MASTER_PASSWORD='$OCF_RESKEY_replication_passwd'"
>
> which does not include file and position.
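
A minimal sketch of what the missing piece could look like, assuming the new master has already published its binlog file name and position somewhere the slaves can read them. The helper name build_change_master_sql and its argument order are hypothetical; MASTER_LOG_FILE and MASTER_LOG_POS are the standard clauses of MySQL's CHANGE MASTER TO statement.

```shell
#!/bin/sh
# Hypothetical helper (not part of ocf:heartbeat:mysql): build a
# CHANGE MASTER statement that includes the binlog coordinates
# published by the new master.
build_change_master_sql() {
    master_host=$1
    repl_user=$2
    repl_passwd=$3
    log_file=$4
    log_pos=$5

    # MASTER_LOG_POS is an unquoted integer; reject anything else.
    case $log_pos in
        ''|*[!0-9]*)
            echo "invalid log position: '$log_pos'" >&2
            return 1
            ;;
    esac

    printf "CHANGE MASTER TO MASTER_HOST='%s', MASTER_USER='%s', MASTER_PASSWORD='%s', MASTER_LOG_FILE='%s', MASTER_LOG_POS=%s" \
        "$master_host" "$repl_user" "$repl_passwd" "$log_file" "$log_pos"
}
```

Inside an agent, the resulting string would be passed to the same ocf_run $MYSQL ... -e "..." call quoted above in place of the hard-coded statement. How the file and position reach the slave (resource parameter or transient attribute) is left open here.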
>
>
> >>- After the promotion of a new master, all slaves should be reconfigured
> >>to point to the new master host with correct file and position as
> >>published by the master when it was promoted
> >>
> >>The current RA does not set file and position.
> >"The current RA" being ocf:heartbeat:mysql?
> >
> >A cursory grep for "CRM_ATTR" in ocf:heartbeat:mysql indicates that it
> >does set those.
>
> grep CRM_ATTR returned nothing.
>
> yves@yves-desktop:/opt/pacemaker/Cluster-Resource-Agents-7a11934b142d/heartbeat$ grep -i CRM_ATTR mysql
> yves@yves-desktop:/opt/pacemaker/Cluster-Resource-Agents-7a11934b142d/heartbeat$
>
> and that is the latest from Mercurial...
>
> >>Under any non-trivial
> >>load this will fail. The current RA is not designed to store the
> >>information. The new RA uses the information stored in the CIB along
> >>with post-promote notification.
> >Is this point moot considering my previous statement?
> >
> >>- each slave and the master may have one or more reader_vips provided
> >>that they are replicating correctly (no lag beyond a threshold,
> >>replication of course working). If all slaves fail, all reader_vips
> >>should be located on the master.
> >Use a cloned IPaddr2 as a non-anonymous clone, thereby managing an IP
> >range. Add a location constraint restricting the clone instance to run
> >on only those nodes where a specific node attribute is set. Or
> >conversely, forbid them from running on nodes where said attribute is
> >not set. Manage that attribute from your RA.
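
The node attribute approach described above can be sketched roughly as follows. This is an illustration only: the attribute name "readable", the function name reader_attr_value and the resource name cl_reader_vip are all invented for the example, and the lag value is assumed to come from Seconds_Behind_Master in SHOW SLAVE STATUS.

```shell
#!/bin/sh
# Hypothetical helper: decide the value of a node attribute (called
# "readable" here, an invented name) from the replication state.
# Prints 1 if the node may hold a reader VIP, 0 otherwise.
reader_attr_value() {
    slave_running=$1    # "yes" if both replication threads are running
    lag_seconds=$2      # e.g. Seconds_Behind_Master; may be "NULL"
    max_lag=$3          # lag threshold in seconds

    [ "$slave_running" = "yes" ] || { echo 0; return 0; }

    # A non-numeric lag (such as NULL) means replication is not healthy.
    case $lag_seconds in
        ''|*[!0-9]*) echo 0; return 0 ;;
    esac

    if [ "$lag_seconds" -le "$max_lag" ]; then echo 1; else echo 0; fi
}

# In the agent's monitor action the value could then be published in the
# style ocf:pacemaker:ping uses (assumption: check attrd_updater --help
# on your version for the exact option names):
#   attrd_updater -n readable -v "$(reader_attr_value yes 3 30)" -d 5s
#
# and the cloned IPaddr2 restricted with a crm shell constraint such as:
#   location reader_vip_needs_readable cl_reader_vip \
#       rule -inf: not_defined readable or readable eq 0
```

The function itself has no side effects, so the threshold logic can be tested without a cluster. Only the commented attrd_updater call touches cluster state.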
>
> That's clever, never thought about it.
>
> >>The current RA either kills MySQL or does nothing, it doesn't care about
> >>reader_vips. Killing MySQL on a busy server with 256GB of buffer pool
> >>is enough for someone to lose his job... The new RA adjusts location
> >>scores for the reader_vip resources dynamically.
> >Like I said, that's managing one resource from another, which is a total
> >nightmare. It's also not necessary, I dare say, given the approach I
> >outlined above.
> >
> I'll explore the node attribute approach, I like it.
>
> Is it possible to create an attribute that does not belong to a node
> but is cluster wide?
> >>- the RA should implement a protection against flapping in case a slave
> >>hovers around the replication lag threshold
> >You should get plenty of inspiration there from how the dampen parameter
> >is used in ocf:pacemaker:ping.
> >
> ok, I'll check
> >>The current RA does implement that but it is not required given the
> >>context. The new RA does implement flapping protection.
> >>
> >>- upon demote of a master, the RA _must_ attempt to kill all user
> >>(non-system) connections
> >>
> >>The current RA does not do that but it is easy to implement
> >Yeah, as I assume it would be in the other one.
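
One way the demote step could select connections to kill, as a sketch under stated assumptions: the function name kill_statements is hypothetical, and the input is assumed to be tab-separated "id<TAB>user" lines, the shape produced by running mysql -N -B with a query such as SELECT id, user FROM information_schema.processlist.

```shell
#!/bin/sh
# Hypothetical filter: read tab-separated "id<TAB>user" lines on stdin
# and print one KILL statement per connection whose user is not in the
# comma-separated keep list given as the first argument.
kill_statements() {
    awk -F '\t' -v keep="$1" '
        BEGIN {
            n = split(keep, names, ",")
            for (i = 1; i <= n; i++) skip[names[i]] = 1
        }
        # Only act on lines whose first field is a numeric connection id.
        $1 ~ /^[0-9]+$/ && !($2 in skip) { printf "KILL %s;\n", $1 }
    '
}
```

The keep list would need to name at least "system user" (the replication threads), the replication account, and whatever account the agent itself connects with, so that the demote does not kill its own session. The generated statements can then be fed back to the mysql client.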
> >
> >>- Slaves must be read-only
> >>
> >>That's fine, handled by the current RA.
> >Correct.
> >
> >>- Monitor should test MySQL and replication. If either is bad, vips
> >>should be moved away. Common errors should not trigger actions.
> >Like I said, should be feasible with the node attribute approach
> >outlined above. No reason to muck around with the resources directly.
> >
> >>That's handled by the current RA for most of it. The error handling
> >>could be added.
> >>
> >>- Slaves should update their master score according to the state of
> >>their replication.
> >>
> >>Handled by both RA
> >Right.
> >
> >>So, at the minimum, the RA needs to be able to store the master
> >>coordinate information, either in the resource parameters or in
> >>transient attributes and must be able to modify resources location
> >>scores. The script _was_ working before I got the cib issue, maybe it
> >>was purely accidental but it proves the concept. I was actually
> >>implementing/testing the relay_log completion stuff. I chose not to use
> >>the current agent because I didn't want to manage MySQL itself, just
> >>replication.
> >>
> >>I am wide open to argue any Pacemaker or RA architecture/design part but
> >>I don't want to argue the replication requirements, they are fundamental
> >>in my mind.
> >Yup, and I still believe that ocf:heartbeat:mysql either already
> >addresses those, or they could be addressed in a much cleaner fashion
> >than writing a new RA.
> >
> >Now, if the only remaining point is "but I want to write an agent that
> >can do _less_ than an existing one" (namely, manage only replication,
> >not the underlying daemon), then I guess I can't argue with that, but
> >I'd still believe that would be a suboptimal approach.
> Ohh... don't get me wrong, I am not the kind of guy that takes
> pride in having re-invented the flat tire. I want an opensource
> _solution_ I can offer to my customers. I think part of the problem
> here is that we are not talking about the same ocf:heartbeat:mysql
> RA. What is mainstream is what you can get with "apt-get install
> pacemaker" on 10.04 LTS for example. This is 1.0.8. I also tried
> 1.0.11 and still it is obviously not the same version. I got my
> "latest" agent version as explained in the clusterlabs FAQ page
> from:
>
> wget -O resource-agents.tar.bz2
> http://hg.linux-ha.org/agents/archive/tip.tar.bz2
>
> Where can I get the version you are using :)
>
> Regards,
>
> Yves
>
> >Cheers,
> >Florian
> >
--
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com
DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.