[ClusterLabs Developers] MariaDB resource-agent - help with choosing a master

Tue Feb 14 15:51:46 EST 2017

Hi,

I'm working on implementing a MariaDB resource-agent based on the mysql one.
The idea is to take advantage of new features in MariaDB, especially 
semi-synchronous replication and GTID.

GTID (Global Transaction ID) means that every replicated transaction 
gets an ID from a counter that is unique within the replication cluster 
(separate replication clusters can have overlapping IDs).

Semi-synchronous replication means that the master waits for AT LEAST 
ONE slave to acknowledge receipt of a transaction before confirming the 
commit to the client. In theory no committed transaction can be lost to 
a single node failure, a big improvement compared to the normal async 
replication in MariaDB.

These two sets of technologies should allow for quite a straightforward 
set of semantics in the resource-agent.
On master failure, the node with the highest GTID must be the one that 
was replicating synchronously, and should be promoted to be the new 
master. The question is how to relay the information to crmd.
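
To make the comparison concrete, here is a minimal sketch in Python (for 
illustration only, not agent code) of picking the most advanced node. 
It assumes each node reports a single-domain MariaDB GTID of the form 
domain-server-sequence (e.g. 0-1-105) and that all nodes are in the same 
replication domain. The function names and the node names are 
hypothetical.

```python
from typing import Dict, Tuple


def parse_gtid(gtid: str) -> Tuple[int, int, int]:
    """Split a single-domain GTID 'domain-server-sequence' into integers."""
    domain, server, seq = (int(part) for part in gtid.strip().split("-"))
    return domain, server, seq


def most_advanced_node(positions: Dict[str, str]) -> str:
    """Return the node whose GTID has the highest sequence number.

    Assumption: every node reports the same replication domain, so the
    sequence numbers are directly comparable.
    """
    parsed = {node: parse_gtid(gtid) for node, gtid in positions.items()}
    domains = {p[0] for p in parsed.values()}
    if len(domains) != 1:
        raise ValueError("nodes report different replication domains")
    # Iterate in sorted order so ties are broken by node name.
    return max(sorted(parsed), key=lambda node: parsed[node][2])


if __name__ == "__main__":
    candidates = {"node1": "0-1-100", "node2": "0-1-105", "node3": "0-1-103"}
    print(most_advanced_node(candidates))
```

A multi-domain GTID position (a comma-separated list) would need a 
per-domain comparison, which this sketch does not attempt.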

My current working hypothesis is that I can publish the GTID as a 
crm-attribute both when starting the resource-agent and in a post-demote 
notify. During the subsequent monitor operation each resource-agent can 
then scan the crm-attributes from the other nodes and simply prioritise 
itself in relation to them (some relative scoring?).
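
As a rough sketch of what I mean by relative scoring, again in Python 
for illustration only: given the GTIDs read from the attributes, each 
node derives its own promotion score from how many distinct positions 
are ahead of it. The top score of 100, the step of 10 and the floor of 1 
are assumptions of mine, as are the function names. In the real agent 
the inputs would come from the node attributes and the result would be 
set as the node's master preference.

```python
from typing import Dict


def gtid_sequence(gtid: str) -> int:
    """Return the sequence part of a 'domain-server-sequence' GTID."""
    return int(gtid.strip().split("-")[2])


def promotion_score(me: str, gtids: Dict[str, str],
                    top: int = 100, step: int = 10) -> int:
    """Score this node relative to the GTIDs published by all nodes.

    The most advanced node gets `top`; each distinct position ahead of
    this node costs `step`. Nodes with equal GTIDs get equal scores.
    The floor of 1 is an assumption, chosen to keep the score positive.
    """
    my_seq = gtid_sequence(gtids[me])
    ahead = {gtid_sequence(g) for g in gtids.values()
             if gtid_sequence(g) > my_seq}
    return max(top - step * len(ahead), 1)


if __name__ == "__main__":
    published = {"node1": "0-1-100", "node2": "0-1-105", "node3": "0-1-103"}
    for node in sorted(published):
        print(node, promotion_score(node, published))
```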

This requires a few things though:

- If there is no master when the resource agent starts (i.e. the cluster 
is just starting), we need to wait for all nodes to come online before 
promoting any to master, so they can read the GTIDs from the attributes.
- There must be a monitor step after start and demote and before the 
promotion of any resource to master, and this must execute on all nodes 
so they can set their priority for promotion.
- The post-demote notifier must complete execution on a node before that 
node can start the monitor operation. I THINK it is OK if not all nodes 
have completed the post-demote notifier before the monitor operation 
starts. This can probably work by creating a sparse priority 
distribution, e.g. the first node to execute monitor sets a priority of 
100, the next one (with a lower GTID) sets 90, and a later one that 
falls in between sets 95, with the spacing based on the number of nodes 
etc.
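
The first requirement above (no promotion until every node has 
published its GTID) could be sketched like this, again in Python and 
purely as an illustration. The list of expected nodes, the function name 
and the score spacing are my assumptions. Returning None stands for 
"do not set any promotion preference yet".

```python
from typing import Dict, Iterable, Optional


def promotion_scores(expected_nodes: Iterable[str],
                     published: Dict[str, Optional[str]],
                     top: int = 100,
                     step: int = 10) -> Optional[Dict[str, int]]:
    """Return per-node scores, or None while any GTID is still missing.

    `published` maps node name to its GTID attribute, which is absent,
    None or empty for a node that has not reported yet.
    """
    nodes = list(expected_nodes)
    if any(not published.get(node) for node in nodes):
        return None  # someone has not reported: hold off on promotion
    # Sequence part of each 'domain-server-sequence' GTID.
    seq = {node: int(published[node].strip().split("-")[2]) for node in nodes}
    # Distinct positions, most advanced first; equal GTIDs share a score.
    ranking = sorted(set(seq.values()), reverse=True)
    return {node: max(top - step * ranking.index(s), 1)
            for node, s in seq.items()}


if __name__ == "__main__":
    nodes = ["node1", "node2", "node3"]
    print(promotion_scores(nodes, {"node1": "0-1-100", "node2": "0-1-105"}))
    print(promotion_scores(nodes, {"node1": "0-1-100", "node2": "0-1-105",
                                   "node3": "0-1-103"}))
```

Leaving a gap of 10 between scores keeps room for a late node to slot in 
between two existing scores, which is the sparse distribution idea from 
the last point.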

I hope this doesn't sound too tangled. I will try this out, but I can't 
find any clear documentation on the ordering and completion of start, 
notify, monitor and promote operations, or on master selection, so all 
pointers are very much welcome.

Completely alternative suggestions are also very much welcome.

Thanks for any and all assistance,
Nils