[Pacemaker] crm_master triggering assert section != NULL

Wed Oct 12 19:46:46 UTC 2011

Hi Florian,
   sure, let me state the requirements.  If those requirements can be 
met, Pacemaker will be much more widely used to manage MySQL replication.  
Right now, although at Percona I deal with many large MySQL deployments, 
none are using the current agent.  Another tool, MMM, is currently used, 
but it is orphaned and suffers from many pretty fundamental 
flaws (while implementing about the same logic as below).

Consider a pool of N identical MySQL servers.  In that case we need:
- N replication resources (it could be the MySQL RA)
- N Reader_vip
- 1 Writer_vip

Reader vips are used by the application to run queries that do not 
modify data, usually accessed in round-robin fashion.  When the 
application needs to write something, it uses the writer_vip.  That's 
how read/write splitting is implemented in many, many places.
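
To make the application side concrete, here is a minimal Python sketch of 
that splitting.  The class name, the addresses and the rule "SELECT and 
SHOW are reads" are my assumptions for illustration only; real 
applications decide this in their own way.

```python
from itertools import cycle


class SplitRouter:
    """Hypothetical router: writes go to the writer VIP, reads rotate."""

    def __init__(self, writer_vip, reader_vips):
        if not reader_vips:
            raise ValueError("at least one reader VIP is required")
        self.writer_vip = writer_vip
        # cycle() yields the reader VIPs in round-robin order, forever.
        self._readers = cycle(list(reader_vips))

    def endpoint_for(self, sql):
        # Assumption: only SELECT and SHOW count as read-only statements.
        words = sql.split()
        verb = words[0].upper() if words else ""
        if verb in ("SELECT", "SHOW"):
            return next(self._readers)
        return self.writer_vip
```

Usage would be along the lines of 
router.endpoint_for("SELECT ...") for reads and 
router.endpoint_for("UPDATE ...") for writes.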

So, for the agent, here are the requirements:

- No need to manage MySQL itself

The resource we are interested in is replication; MySQL itself is at 
another level.  If the RA is to manage MySQL, it must not interfere.

- the writer_vip must be assigned only to the master, after it is promoted

This is easy with colocation.

- After the promotion of a new master, all slaves should be allowed to 
complete the application of their relay logs prior to any change master

The current RA does not do that but it should be fairly easy to implement.
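
As a rough sketch of the check, assuming the RA can obtain one row of 
SHOW SLAVE STATUS as a dictionary: the relay log is fully applied when 
the SQL thread has executed up to the point the I/O thread had read.  
The function names and the polling helper are hypothetical; the column 
names are the ones SHOW SLAVE STATUS reports.

```python
import time


def relay_log_applied(status):
    """True when the SQL thread has caught up with what the I/O thread read."""
    return (
        status["Master_Log_File"] == status["Relay_Master_Log_File"]
        and int(status["Read_Master_Log_Pos"]) == int(status["Exec_Master_Log_Pos"])
    )


def wait_for_relay_log(fetch_status, max_polls=60, interval=1.0, sleep=time.sleep):
    """Poll fetch_status() until the relay log is applied or polls run out.

    fetch_status is a caller-supplied function returning the status row.
    """
    for _ in range(max_polls):
        if relay_log_applied(fetch_status()):
            return True
        sleep(interval)
    return False
```

Only once this returns True would the slave be repointed with a change 
master.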

- After its promotion and before allowing writes to it, a master should 
publish its current master file and position.   I am using resource 
parameters in the CIB for these (I am wondering if transient attributes 
could be used instead)

- After the promotion of a new master, all slaves should be reconfigured 
to point to the new master host with correct file and position as 
published by the master when it was promoted

The current RA does not set file and position.  Under any non-trivial 
load this will fail.  The current RA is not designed to store the 
information.  The new RA uses the information stored in the CIB along 
with post-promote notification.
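
To illustrate the two halves (publish on the master, consume on the 
slaves), here is a small Python sketch.  It assumes the new master's 
SHOW MASTER STATUS row is available as a dictionary and that the 
replication user and password are already configured on the slaves.  The 
function names are hypothetical and how the coordinates travel through 
the CIB is left out.

```python
import re

# Conservative whitelist for host and binlog file names put into SQL text.
_SAFE = re.compile(r"[A-Za-z0-9._-]+")


def published_coordinates(master_status):
    """Extract binlog file and position from one SHOW MASTER STATUS row."""
    return {"file": master_status["File"], "pos": int(master_status["Position"])}


def repoint_statements(new_master_host, coords):
    """Statements a slave would run to follow the newly promoted master."""
    for value in (new_master_host, coords["file"]):
        if not _SAFE.fullmatch(value):
            raise ValueError("unexpected characters in %r" % value)
    return [
        "STOP SLAVE",
        "CHANGE MASTER TO MASTER_HOST='%s', MASTER_LOG_FILE='%s', MASTER_LOG_POS=%d"
        % (new_master_host, coords["file"], coords["pos"]),
        "START SLAVE",
    ]
```

The point is that file and position are always given explicitly, so the 
slave starts exactly where the master was when it was promoted.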

- each slave and the master may have one or more reader_vip provided 
that they are replicating correctly (no lag beyond a threshold, 
replication of course working).  If all slaves fail, all reader_vip 
should be located on the master.

The current RA either kills MySQL or does nothing; it doesn't care about 
reader_vips.  Killing MySQL on a busy server with 256GB of buffer pool 
is enough for someone to lose his job...  The new RA adjusts location 
scores for the reader_vip resources dynamically.
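
The placement rule can be sketched as below.  This is not the code of 
the new RA, only an illustration under assumptions: the input layout 
(a role and a lag in seconds per node, with None meaning replication is 
broken) and the function name are made up, and the two score values are 
simply "allowed" and Pacemaker's -INFINITY for "banned".

```python
def reader_vip_scores(nodes, max_lag):
    """Map each node to a location score for the reader_vip resources.

    nodes: {name: {"role": "master" or "slave", "lag": seconds or None}}
    """
    scores = {}
    for name, state in nodes.items():
        if state["role"] == "master":
            # The master can always serve reads.
            eligible = True
        else:
            lag = state["lag"]
            eligible = lag is not None and lag <= max_lag
        scores[name] = "0" if eligible else "-INFINITY"
    return scores
```

When every slave is broken or lagging, the master is the only node 
without a -INFINITY score, so all reader_vip end up there.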

- the RA should implement a protection against flapping in case a slave 
hovers around the replication lag threshold

The current RA does implement that but it is not required given the 
context.  The new RA does implement flapping protection.
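
One common way to get this kind of protection is hysteresis with two 
thresholds: a slave loses its reader_vip above the high mark and only 
gets it back once lag falls to the low mark.  The sketch below shows 
that technique; it is an assumption for illustration, not a description 
of how either RA does it, and the class name is hypothetical.

```python
class LagGate:
    """Two-threshold (hysteresis) health gate for replication lag."""

    def __init__(self, high, low):
        if low > high:
            raise ValueError("low threshold must not exceed high threshold")
        self.high = high
        self.low = low
        self.healthy = True

    def update(self, lag):
        """Feed one lag sample in seconds (None = replication broken)."""
        if lag is None or lag > self.high:
            self.healthy = False
        elif lag <= self.low:
            self.healthy = True
        # Between low and high the previous state is kept, which is what
        # stops a slave hovering around one threshold from flapping.
        return self.healthy
```

With high=60 and low=30, a slave that oscillates between 55 and 65 
seconds of lag changes state once instead of on every monitor.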

- upon demote of a master, the RA _must_ attempt to kill all user 
(non-system) connections

The current RA does not do that but it is easy to implement.
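
A possible shape for it, assuming the rows of SHOW PROCESSLIST are 
available as dictionaries.  The function name and the keep_users 
parameter are hypothetical; which accounts count as "system" beyond 
MySQL's own "system user" and "event_scheduler" is a site decision.

```python
# Accounts MySQL itself uses for replication threads and the event scheduler.
SYSTEM_USERS = frozenset({"system user", "event_scheduler"})


def kill_statements(processlist, own_id, keep_users=()):
    """Build KILL statements for every user connection on a demoted master."""
    keep = SYSTEM_USERS | set(keep_users)
    statements = []
    for row in processlist:
        thread_id = int(row["Id"])
        if thread_id == int(own_id):
            continue  # never kill the connection doing the cleanup
        if row["User"] in keep:
            continue
        if row["Command"].startswith("Binlog Dump"):
            continue  # a slave pulling binary logs from this server
        statements.append("KILL %d" % thread_id)
    return statements
```

The RA would run the returned statements one by one and ignore errors 
for threads that have already gone away.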

- Slaves must be read-only

That's fine, handled by the current RA.

- Monitor should test MySQL and replication.  If either is bad, vips 
should be moved away.  Common errors should not trigger actions.

That's handled by the current RA for most of it.  The error handling 
could be added.

- Slaves should update their master score according to the state of 
their replication.

Handled by both RAs.

So, at the minimum, the RA needs to be able to store the master 
coordinate information, either in the resource parameters or in 
transient attributes, and must be able to modify resource location 
scores.  The script _was_ working before I got the CIB issue; maybe it 
was purely accidental, but it proves the concept.  I was actually 
implementing/testing the relay_log completion stuff.  I chose not to use 
the current agent because I didn't want to manage MySQL itself, just 
replication.

I am wide open to discussing any Pacemaker or RA architecture/design 
point, but I don't want to argue the replication requirements; they are 
fundamental in my mind.

Do not hesitate to ask if you have questions.

Regards,

Yves

On 11-10-12 01:53 PM, Florian Haas wrote:
> On 2011-10-12 19:36, Yves Trudeau wrote:
>> Hi Florian,
>>    I pushed the latest code to LP; the agent uses notifications now.
> Better.
>
>> Also,
>> most of the start/stop of resource have been removed.
> "Most of" is really not good enough here -- that thing still does all
> sorts of things modifying other resources, and I think we all agree that
> that's a big no-no. The monitor function is also still misguided.
>
>> In my opinion,
>> the existing agent would need a major rewrite to support the required
>> logic.
> I don't recall this RA being discussed on this list prior to today, or
> any of the authors getting involved in a discussion on the existing
> mysql RA. I may have missed something though; did I? If so, please point
> me to a link from the list archives and I'll be happy to educate myself
> on the discussion and whatever pros and cons were raised therein.
>
>> I think indeed it will be a good idea to sit and talk at PLUK
>> about it.
> Yes, let's do that.
>
>>   Maybe Pacemaker cannot be used but that would be sad.
> I strongly doubt that it can't.
>
> Cheers,
> Florian
>