[Pacemaker] MySQL Master-Master replication with Corosync and Pacemaker

Peter Scott Peter at PSDT.com
Thu Jan 26 21:56:28 EST 2012

On 1/25/2012 11:29 PM, Florian Haas wrote:
> Well if you just want to restart Corosync by administrative
> intervention (i.e. in a planned, controlled fashion), then why not put
> the cluster in maintenance mode before you restart Corosync?
>
> Cheers,
> Florian
Good point.  My concern was about whether this behavior meant that there 
were other modes that did not behave the way we expected, and I have 
just found one.

Our requirement and expectation is that mysql will continue running on 
both nodes while either of them is up.  Here's a scenario that violates 
that, which I can reproduce but don't understand.

The two nodes are named mysql01 and mysql02, with mysql01 preferred.

Reboot mysql02.
Wait 10 seconds.
Reboot mysql01.
mysql02 comes up.
mysql02 starts mysqld from init.
Run crm_mon on mysql02.
Eventually it says mysql resource is running on mysql02 and mysql01 is 
offline.
mysql01 comes up.
mysql01 starts mysqld from init.
crm_mon says it is starting mysqld on mysql01.
(I think it actually stopped and restarted mysqld on mysql01 but haven't 
been able to verify yet.  Seems gratuitous, but I can live with that.)
mysqld is stopped on mysql02.  That IS a problem.
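
To pin down what I mean by "a problem", here is a rough sketch of the 
sequence above as a small Python check.  The names timeline and 
violations are made up for illustration and have nothing to do with 
pacemaker itself, and the intermediate states are my reading of the 
steps, not something I pulled from logs.  I am assuming the requirement 
means "mysqld runs on every node that is up".

```python
# Each entry: (what just happened, nodes that are up, nodes running mysqld).
# States are my reading of the steps above, not taken from cluster logs.
timeline = [
    ("mysql02 up, mysqld started from init",
     {"mysql02"}, {"mysql02"}),
    ("mysql01 up, mysqld started from init",
     {"mysql01", "mysql02"}, {"mysql01", "mysql02"}),
    ("pacemaker settles after mysql01 rejoins",
     {"mysql01", "mysql02"}, {"mysql01"}),
]


def violations(steps):
    """Return (event, nodes that are up but not running mysqld) per bad step."""
    bad = []
    for event, up, running in steps:
        missing = sorted(up - running)  # up, but mysqld is not running there
        if missing:
            bad.append((event, missing))
    return bad


for event, missing in violations(timeline):
    print(f"{event}: mysqld not running on {', '.join(missing)}")
```

Only the last step is flagged: both nodes are up, but mysqld is running 
on mysql01 alone.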

So when the preferred node came back up, it caused the service to switch 
there and stopped the resource on the other node.  That doesn't work for 
us because the mysql servers need to stay up, replicating to each 
other.  We could fix that by putting some sort of watchdog process on 
each machine that tells them to restart mysqld if it goes down, but I 
don't like the idea of fighting with corosync/pacemaker.  If they took 
down mysqld deliberately I assume this was on the basis of some 
operational model that isn't the one we need, and I should either 
reconfigure my ha setup or use something different rather than keep 
banging a square peg into a round hole.

Is the behavior described above, with pacemaker stopping mysqld on 
mysql02, expected?  If so, why?  That would help me wrap my brain 
around this.
