[Pacemaker] crm_master triggering assert section != NULL

Wed Oct 12 20:09:10 UTC 2011

On 2011-10-12 21:46, Yves Trudeau wrote:
> Hi Florian,
>   sure, let me state the requirements.  If those requirements can be
> met, Pacemaker will be used much more widely to manage MySQL replication.
> Right now, although at Percona I deal with many large MySQL deployments,
> none are using the current agent.   Another tool, MMM, is currently used,
> but it is orphaned and suffers from many pretty fundamental
> flaws (while implementing about the same logic as below).
> 
> Consider a pool of N identical MySQL servers.  In that case we need:
> - N replication resources (it could be the MySQL RA)
> - N Reader_vip
> - 1 Writer_vip
> 
> Reader vips are used by the application to run queries that do not
> modify data, usually accessed in round-robin fashion.  When the
> application needs to write something, it uses the writer_vip.  That's
> how read/write splitting is implemented in many, many places.
> 
> So, for the agent, here are the requirements:
> 
> - No need to manage MySQL itself
> 
> The resource we are interested in is replication, MySQL itself is at
> another level.  If the RA is to manage MySQL, it must not interfere.
> 
> - the writer_vip must be assigned only to the master, after it is promoted
> 
> This is easy with colocation

Agreed.

> 
> - After the promotion of a new master, all slaves should be allowed to
> complete the application of their relay logs prior to any change master
> 
> The current RA does not do that but it should be fairly easy to implement.

That's a use case for a pre-promote and post-promote notification. Like
the mysql RA currently does.

> 
> - After its promotion and before allowing writes to it, a master should
> publish its current master file and position.   I am using resource
> parameters in the CIB for these (I am wondering if transient attributes
> could be used instead)

They could, and you should. Like the mysql RA currently does.
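
For illustration, here is a minimal Python sketch of publishing the
coordinates as transient node attributes through crm_attribute. The
attribute names (master_log_file, master_log_pos) and the helper
function are hypothetical and are not claimed to match what
ocf:heartbeat:mysql uses. The sketch only builds the command lines and
executes nothing.

```python
def publish_commands(node, log_file, log_pos):
    """Build crm_attribute invocations that store the master coordinates
    as transient (lifetime "reboot") node attributes.

    The attribute names below are made up for this sketch.
    """
    attrs = {"master_log_file": log_file, "master_log_pos": str(log_pos)}
    return [
        ["crm_attribute", "--lifetime", "reboot", "--node", node,
         "--name", name, "--update", value]
        for name, value in attrs.items()
    ]


# Show what would be run on a freshly promoted master (example values).
for cmd in publish_commands("db1", "mysql-bin.000042", 107):
    print(" ".join(cmd))
```

Because the attributes are transient, they disappear when the node
leaves the cluster, so a stale position is not left behind in the
configuration.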

> - After the promotion of a new master, all slaves should be reconfigured
> to point to the new master host with correct file and position as
> published by the master when it was promoted
> 
> The current RA does not set file and position.

"The current RA" being ocf:heartbeat:mysql?

A cursory grep for "CRM_ATTR" in ocf:heartbeat:mysql indicates that it
does set those.

> Under any non-trivial
> load this will fail.  The current RA is not designed to store the
> information.  The new RA uses the information stored in the CIB along
> with post-promote notification.

Is this point moot considering my previous statement?

> - each slave and the master may have one or more reader_vip provided
> that they are replicating correctly (no lag beyond a threshold,
> replication of course working).  If all slaves fail, all reader_vip
> should be located on the master.

Use a cloned IPaddr2 as a non-anonymous clone, thereby managing an IP
range. Add a location constraint restricting the clone instance to run
on only those nodes where a specific node attribute is set. Or
conversely, forbid them from running on nodes where said attribute is
not set. Manage that attribute from your RA.
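
A rough Python sketch of the decision the RA's monitor action could
feed into that node attribute. The function name, its parameters and
the 1/0 meaning of the flag are assumptions made for illustration. The
actual update would go through a tool such as crm_attribute or
attrd_updater.

```python
def reader_flag(is_master, io_running, sql_running, lag_seconds, max_lag):
    """Return 1 if this node may hold reader VIPs, else 0.

    The master always qualifies, so if every slave is disqualified the
    location constraint leaves the master as the only eligible node.
    """
    if is_master:
        return 1
    if not (io_running and sql_running):
        # Replication threads are down: do not serve reads from here.
        return 0
    if lag_seconds is None:
        # Lag could not be determined: treat the slave as not usable.
        return 0
    return 1 if lag_seconds <= max_lag else 0


# A healthy slave, a lagging slave and the master (example values).
print(reader_flag(False, True, True, 5, 60))
print(reader_flag(False, True, True, 300, 60))
print(reader_flag(True, False, False, None, 60))
```

The RA only reports the node's state here. Placement of the VIP clone
instances stays with the policy engine and the location constraint.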

> The current RA either kills MySQL or does nothing, it doesn't care about
> reader_vips.  Killing MySQL on a busy server with 256GB of buffer pool
> is enough for someone to lose his job...  The new RA adjusts location
> scores for the reader_vip resources dynamically.

Like I said, that's managing one resource from another, which is a total
nightmare. It's also not necessary, I dare say, given the approach I
outlined above.

> - the RA should implement a protection against flapping in case a slave
> hovers around the replication lag threshold

You should get plenty of inspiration there from how the dampen parameter
is used in ocf:pacemaker:ping.
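
To make the idea concrete, here is a small Python sketch of dampening:
a changed value is only published once it has stayed stable for a
given number of seconds. This illustrates the concept only and is not
a description of how attrd implements it. The class and its names are
hypothetical.

```python
class Dampened:
    """Publish a new value only after it has been stable for `dampen` seconds."""

    def __init__(self, initial, dampen):
        self.published = initial
        self.dampen = dampen
        self._pending = initial
        self._since = None

    def update(self, value, now):
        """Record an observation at time `now`; return the published value."""
        if value == self.published:
            # Back to the published value: drop any pending change.
            self._pending = value
            self._since = None
        elif value != self._pending or self._since is None:
            # A new candidate value: start (or restart) the timer.
            self._pending = value
            self._since = now
        elif now - self._since >= self.dampen:
            # The candidate survived the whole dampening window.
            self.published = value
            self._since = None
        return self.published


# A slave hovering around the lag threshold (1 = usable, 0 = lagging).
flag = Dampened(initial=1, dampen=30)
for t, observed in [(0, 0), (10, 1), (20, 0), (40, 0), (50, 0)]:
    print(t, flag.update(observed, now=t))
```

With a 30-second window, the brief dip at t=0 never reaches the
cluster. Only the dip that starts at t=20 and is still present at t=50
is published.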

> The current RA does implement that but it is not required given the
> context.  The new RA does implement flapping protection.
> 
> - upon demote of a master, the RA _must_ attempt to kill all user
> (non-system) connections
> 
> The current RA does not do that but it is easy to implement

Yeah, as I assume it would be in the other one.

> - Slaves must be read-only
> 
> That's fine, handled by the current RA.

Correct.

> - Monitor should test MySQL and replication.  If either is bad, vips
> should be moved away.  Common errors should not trigger actions.

Like I said, should be feasible with the node attribute approach
outlined above. No reason to muck around with the resources directly.

> That's handled by the current RA for most of it.  The error handling
> could be added.
> 
> - Slaves should update their master score according to the state of
> their replication.
> 
> Handled by both RAs

Right.

> So, at the minimum, the RA needs to be able to store the master
> coordinate information, either in the resource parameters or in
> transient attributes and must be able to modify resources location
> scores.  The script _was_ working before I got the CIB issue, maybe it
> was purely accidental but it proves the concept.  I was actually
> implementing/testing the relay_log completion stuff.  I chose not to use
> the current agent because I didn't want to manage MySQL itself, just
> replication.
> 
> I am wide open to argue any Pacemaker or RA architecture/design part but
> I don't want to argue the replication requirements, they are fundamental
> in my mind.

Yup, and I still believe that ocf:heartbeat:mysql either already
addresses those, or they could be addressed in a much cleaner fashion
than writing a new RA.

Now, if the only remaining point is "but I want to write an agent that
can do _less_ than an existing one" (namely, manage only replication,
not the underlying daemon), then I guess I can't argue with that, but
I'd still consider that a suboptimal approach.

Cheers,
Florian
