[ClusterLabs] Installed Galera, now HAProxy won't start
nevo_n at hotmail.com
Wed Mar 16 18:10:08 EDT 2016
Sorry, folks, for being a pest here, but I'm finding the learning curve on this clustering stuff to be pretty steep.
I'm following the docs to set up a three-node OpenStack controller cluster. I got Pacemaker running with two resources, the virtual IP and HAProxy, and I could move these resources to any of the three nodes. Success!
I then moved on to installing Galera.
The MariaDB engine started fine on 2 of the 3 nodes but refused to start on the third. After some digging and poking (and swearing), I found that HAProxy was listening on the virtual IP on the MySQL port, which prevented MariaDB from listening on that port. Makes sense. So I moved HAProxy to another node and started MariaDB on my third node, and now I have a three-node Galera cluster.
Now HAProxy won't start on any node. I imagine it's because MariaDB is already listening on the same IP:Port combination that HAProxy wants. (After all, HAProxy is supposed to proxy that IP:Port, right?) Unfortunately, I don't see anything useful in the HAProxy.log file, so I don't really know what's wrong.
So, thinking this through logically, it seems to me that the OpenStack docs were wrong in telling me to configure the MariaDB server to bind to all available addresses (http://docs.openstack.org/ha-guide/controller-ha-galera-config.html, scroll to "Database Configuration" and note that bind-address is 0.0.0.0). If MariaDB binds to every address on the node, including the virtual IP, then HAProxy can't bind to that address on the same port and therefore won't start. Right?
Am I thinking correctly here, or is something else wrong with my setup? In general, I've found that the OpenStack documents tend to be right, but in this case my understanding of the concepts involved makes me wonder.
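To illustrate the conflict I suspect, here is a minimal Python sketch (not taken from my actual setup). One socket plays the part of MariaDB with bind-address = 0.0.0.0, and a second plays the part of HAProxy trying to bind one specific address on the same port. The function names are made up, 127.0.0.1 stands in for the virtual IP, and the port is picked by the OS rather than being 3306.

```python
import errno
import socket


def try_bind(addr, port):
    """Try to open a TCP listener; return (socket, None) or (None, errno name)."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        s.bind((addr, port))
        s.listen(1)
        return s, None
    except OSError as e:
        s.close()
        return None, errno.errorcode.get(e.errno, str(e.errno))


def demo_conflict():
    """Bind the wildcard address first, then a specific address on the same port."""
    # Stand-in for MariaDB with bind-address = 0.0.0.0 (OS picks a free port).
    wildcard = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    wildcard.bind(("0.0.0.0", 0))
    wildcard.listen(1)
    port = wildcard.getsockname()[1]

    # Stand-in for HAProxy binding a single address (assumed: 127.0.0.1
    # plays the role of the virtual IP) on that same port.
    specific, err = try_bind("127.0.0.1", port)
    if specific is not None:
        specific.close()
    wildcard.close()
    return err


if __name__ == "__main__":
    print("second bind failed with:", demo_conflict())
```

If that is what is happening here, the second bind fails with an "address already in use" error, which would explain why whichever daemon starts second on a node refuses to come up.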
In any case, I'm having difficulty getting HAProxy and Galera running on the same nodes. My HAProxy config file is:
timeout http-request 10s
timeout queue 1m
timeout connect 10s
timeout client 1m
timeout server 1m
timeout check 10s
server controller1 10.0.0.11:3306 check port 9200 inter 2000 rise 2 fall 5
server controller2 10.0.0.12:3306 backup check port 9200 inter 2000 rise 2 fall 5
server controller3 10.0.0.13:3306 backup check port 9200 inter 2000 rise 2 fall 5
Does the server name under "listen galera_cluster" need to match the hostname of the node? What else could be causing these two daemons to not play nicely together?