[Pacemaker] MySQL HA with pacemaker and DRBD - How to monitor mysql service

Fri Mar 23 08:42:29 EDT 2012

Thank you for your responses.

I have fixed my migration-threshold problem with the lsb:mysqld resource
(crm_mon --failcounts now shows migration-threshold=2, so that part is OK),
but failover still doesn't happen when MySQL itself fails (it works fine
when a node fails or goes into standby).
So I tried the OCF resource agent instead. It works fine on my first node
but fails on my second node with "unknown error".

crm_mon --failcounts:

Failed actions:
    mysqld_monitor_0 (node=node2, call=6, rc=1, status=complete): unknown error
    mysqld_stop_0 (node=node2, call=7, rc=1, status=complete): unknown error
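
For reference, the rc values in the output above are OCF return codes. The
following is only a minimal shell sketch for decoding them and for running
the agent's monitor action by hand on node2 to see the underlying error
message. The agent path /usr/lib/ocf/resource.d/heartbeat/mysql and the
OCF_ROOT value are assumptions based on a default resource-agents install,
and both function names are made up for illustration.

```shell
#!/bin/sh
# Map a numeric OCF return code (shown by crm_mon as rc=N) to its symbolic name.
ocf_rc_name() {
    case "$1" in
        0) echo "OCF_SUCCESS" ;;
        1) echo "OCF_ERR_GENERIC" ;;
        2) echo "OCF_ERR_ARGS" ;;
        3) echo "OCF_ERR_UNIMPLEMENTED" ;;
        4) echo "OCF_ERR_PERM" ;;
        5) echo "OCF_ERR_INSTALLED" ;;
        6) echo "OCF_ERR_CONFIGURED" ;;
        7) echo "OCF_NOT_RUNNING" ;;
        8) echo "OCF_RUNNING_MASTER" ;;
        9) echo "OCF_FAILED_MASTER" ;;
        *) echo "UNKNOWN" ;;
    esac
}

# Hypothetical helper: run the mysql agent's monitor action outside the
# cluster with the same parameters as the primitive. The default agent path
# and OCF_ROOT are assumptions; adjust them to the local install.
run_mysql_monitor() {
    agent="${1:-/usr/lib/ocf/resource.d/heartbeat/mysql}"
    if [ ! -x "$agent" ]; then
        echo "agent not found: $agent" >&2
        return 5
    fi
    OCF_ROOT=/usr/lib/ocf \
    OCF_RESKEY_binary=/usr/bin/mysqld_safe \
    OCF_RESKEY_config=/etc/my.cnf \
    OCF_RESKEY_user=mysql \
    OCF_RESKEY_group=mysql \
    OCF_RESKEY_pid=/var/run/mysqld/mysqld.pid \
    OCF_RESKEY_datadir=/data/mysql/databases \
    OCF_RESKEY_socket=/var/lib/mysql/mysql.sock \
    "$agent" monitor
    rc=$?
    echo "monitor returned $rc ($(ocf_rc_name "$rc"))"
    return "$rc"
}

# Example (run as root on node2):
#   run_mysql_monitor
```

Here rc=1 maps to OCF_ERR_GENERIC. A probe (monitor_0) on a node where
MySQL is cleanly stopped would normally be expected to return 7
(OCF_NOT_RUNNING), so the message the agent prints when run by hand is
probably the most useful clue.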

I have exactly the same MySQL package versions and configuration on both
nodes (with proper permissions), and corosync, heartbeat and pacemaker are
at the same versions too:

corosynclib-1.2.7-1.1.el5
corosync-1.2.7-1.1.el5
pacemaker-libs-1.1.5-1.1.el5
pacemaker-1.1.5-1.1.el5
heartbeat-3.0.3-2.3.el5
heartbeat-libs-3.0.3-2.3.el5
heartbeat-debuginfo-3.0.2-2.el5

So I don't understand why it works on one node but not on the other.

Resource config:

primitive mysqld ocf:heartbeat:mysql \
        params binary="/usr/bin/mysqld_safe" config="/etc/my.cnf" \
        user="mysql" group="mysql" pid="/var/run/mysqld/mysqld.pid" \
        datadir="/data/mysql/databases" socket="/var/lib/mysql/mysql.sock" \
        op start interval="0" timeout="120" \
        op stop interval="0" timeout="120" \
        op monitor interval="30" timeout="30" depth="0" \
        target-role="Started"

I get the same result with the following (I have created a test database
and table, and granted a test user access to it):

primitive mysqld ocf:heartbeat:mysql \
        params binary="/usr/bin/mysqld_safe" config="/etc/my.cnf" \
        datadir="/data/mysql/databases" user="mysql" \
        pid="/var/run/mysqld/mysqld.pid" socket="/var/lib/mysql/mysql.sock" \
        test_passwd="test" test_table="Cluster.dbcheck" test_user="test" \
        op start interval="0" timeout="120" \
        op stop interval="0" timeout="120" \
        op monitor interval="30s" timeout="30s" OCF_CHECK_LEVEL="1" \
        meta migration-threshold="3" target-role="Started"
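
With OCF_CHECK_LEVEL set and test_table, test_user and test_passwd supplied,
the agent is expected to query the test table on each monitor. The exact
query the agent runs is not shown in this thread, so the following is only a
sketch that assumes a simple SELECT COUNT(*). It can be used to confirm by
hand that the test user can read Cluster.dbcheck through the configured
socket. The function name and the DRY_RUN switch are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: check by hand that a user can read the test table.
# With DRY_RUN=1 the command is only printed (password masked), not executed.
check_test_table() {
    user="$1"; pass="$2"; table="$3"; sock="$4"
    query="SELECT COUNT(*) FROM $table"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "mysql -u $user -p*** -S $sock -e \"$query\""
        return 0
    fi
    # -S selects the UNIX socket, -e runs a single statement and exits.
    mysql -u "$user" -p"$pass" -S "$sock" -e "$query"
}

# Example (run on the node where MySQL is currently started):
#   check_test_table test test Cluster.dbcheck /var/lib/mysql/mysql.sock
```

A non-zero exit status from the mysql client here would point at the grant
or the socket path rather than at the cluster configuration.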

Full config:

node node2 \
        attributes standby="off"
node node1 \
        attributes standby="off"
primitive Cluster-VIP ocf:heartbeat:IPaddr2 \
        params ip="x.x.x.x" broadcast="x.x.x.x" nic="eth0" \
        cidr_netmask="21" iflabel="VIP1" \
        op monitor interval="10s" timeout="20s" \
        meta is-managed="true"
primitive datavg ocf:heartbeat:LVM \
        params volgrpname="datavg" exclusive="true" \
        op start interval="0" timeout="30" \
        op stop interval="0" timeout="30"
primitive drbd_mysql ocf:linbit:drbd \
        params drbd_resource="drbd-mysql" \
        op monitor interval="15s"
primitive fs_mysql ocf:heartbeat:Filesystem \
        params device="/dev/datavg/data" directory="/data" fstype="ext3"
primitive mysqld ocf:heartbeat:mysql \
        params binary="/usr/bin/mysqld_safe" config="/etc/my.cnf" \
        user="mysql" group="mysql" pid="/var/run/mysqld/mysqld.pid" \
        datadir="/data/mysql/databases" socket="/var/lib/mysql/mysql.sock" \
        op start interval="0" timeout="120" \
        op stop interval="0" timeout="120" \
        op monitor interval="30" timeout="30" depth="0"
group mysql datavg fs_mysql Cluster-VIP mysqld
ms ms_drbd_mysql drbd_mysql \
        meta master-max="1" master-node-max="1" clone-max="2" \
        clone-node-max="1" notify="true"
location master-prefer-node-1 Cluster-VIP 25: node1
colocation mysql_on_drbd inf: mysql ms_drbd_mysql:Master
order mysql_after_drbd inf: ms_drbd_mysql:promote mysql:start
property $id="cib-bootstrap-options" \
        dc-version="1.1.5-1.1.el5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f" \
        cluster-infrastructure="openais" \
        expected-quorum-votes="2" \
        stonith-enabled="false" \
        no-quorum-policy="ignore" \
        last-lrm-refresh="1332504626"
rsc_defaults $id="rsc-options" \
        resource-stickiness="100"

2012/3/22 Andreas Kurz <andreas at hastexo.com>

> On 03/22/2012 03:23 PM, coma wrote:
> > Thank you for your responses,
> >
> > I have added the migration-threshold to my mysqld resource, but when I
> > kill or manually stop mysql on one node, there is no failover to the
> > second node.
> > Also, when I look at crm_mon --failcounts, I can see "mysqld:
> > migration-threshold=1000000 fail-count=1000000", so I don't understand
> > why migration-threshold is not equal to 2?
> >
> > Migration summary:
> > * Node node1:
> >    mysqld: migration-threshold=1000000 fail-count=1000000
> > * Node node2:
> >
> > Failed actions:
> >     mysqld_monitor_10000 (node=node1, call=90, rc=7, status=complete): not running
> >     mysqld_stop_0 (node=node1, call=93, rc=1, status=complete): unknown error
>
> The lsb init script you are using seems to be not LSB compliant ...
> looks like it returns an error on stopping an already stopped mysql.
>
>
> http://clusterlabs.org/doc/en-US/Pacemaker/1.1/html-single/Pacemaker_Explained/index.html#ap-lsb
>
> Fix the script ... or better use the ocf resource agent.
>
> Regards,
> Andreas
>
> --
> Need help with Pacemaker?
> http://www.hastexo.com/now
>
> >
> >
> >
> > configuration:
> >
> > node node1 \
> >         attributes standby="off"
> > node node2 \
> >         attributes standby="off"
> > primitive Cluster-VIP ocf:heartbeat:IPaddr2 \
> >         params ip="x.x.x.x" broadcast="x.x.x.x" nic="eth0" \
> >         cidr_netmask="21" iflabel="VIP1" \
> >         op monitor interval="10s" timeout="20s" \
> >         meta is-managed="true"
> > primitive datavg ocf:heartbeat:LVM \
> >         params volgrpname="datavg" exclusive="true" \
> >         op start interval="0" timeout="30" \
> >         op stop interval="0" timeout="30"
> > primitive drbd_mysql ocf:linbit:drbd \
> >         params drbd_resource="drbd-mysql" \
> >         op monitor interval="15s"
> > primitive fs_mysql ocf:heartbeat:Filesystem \
> >         params device="/dev/datavg/data" directory="/data" fstype="ext3"
> > primitive mysqld lsb:mysqld \
> >         op monitor interval="10s" timeout="30s" \
> >         op start interval="0" timeout="120" \
> >         op stop interval="0" timeout="120" \
> >         meta target-role="Started" migration-threshold="2" \
> >         failure-timeout="20s"
> > group mysql datavg fs_mysql Cluster-VIP mysqld
> > ms ms_drbd_mysql drbd_mysql \
> >         meta master-max="1" master-node-max="1" clone-max="2" \
> >         clone-node-max="1" notify="true"
> > location master-prefer-node-1 Cluster-VIP 25: node1
> > colocation mysql_on_drbd inf: mysql ms_drbd_mysql:Master
> > order mysql_after_drbd inf: ms_drbd_mysql:promote mysql:start
> > property $id="cib-bootstrap-options" \
> >         dc-version="1.1.5-1.1.el5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f" \
> >         cluster-infrastructure="openais" \
> >         expected-quorum-votes="2" \
> >         stonith-enabled="false" \
> >         no-quorum-policy="ignore" \
> >         last-lrm-refresh="1332425337"
> > rsc_defaults $id="rsc-options" \
> >         resource-stickiness="100"
> >
> >
> >
> > 2012/3/22 Andreas Kurz <andreas at hastexo.com>
> >
> >     On 03/22/2012 01:51 PM, coma wrote:
> >     > Ah yes, thank you, the status of the mysql service is now
> >     > monitored, but the failover is not performed?
> >
> >     As long as local restarts are successful there is no need for a
> >     failover ... there is migration-threshold to limit local restart
> >     tries.
> >
> >     Regards,
> >     Andreas
> >
> >     --
> >     Need help with Pacemaker?
> >     http://www.hastexo.com/now
> >
> >     >
> >     >
> >     >
> >     > 2012/3/22 emmanuel segura <emi2fast at gmail.com>
> >     >
> >     >     Sorry,
> >     >     I think you missed the op monitor operation in your primitive
> >     >     definition
> >     >
> >     >
> >     >
> >     >     On 22 March 2012 11:52, emmanuel segura <emi2fast at gmail.com>
> >     >     wrote:
> >     >
> >     >         I think you missed the op monitor operation in your
> >     >         primitive definition
> >     >
> >     >         On 22 March 2012 11:33, coma <coma.inf at gmail.com> wrote:
> >     >
> >     >             Hello,
> >     >
> >     >             I have a question about monitoring the mysql service in
> >     >             a MySQL HA cluster with pacemaker and DRBD.
> >     >             I have set up a configuration that allows failover
> >     >             between two nodes. It works fine when a node is offline
> >     >             (or in standby), but I want to know whether it is
> >     >             possible to monitor the mysql service and perform a
> >     >             failover if mysql is stopped or unavailable?
> >     >
> >     >             Thank you in advance for any response.
> >     >
> >     >             My crm configuration:
> >     >
> >     >             node node1 \
> >     >                     attributes standby="off"
> >     >             node node2 \
> >     >                     attributes standby="off"
> >     >             primitive Cluster-VIP ocf:heartbeat:IPaddr2 \
> >     >                     params ip="x.x.x.x" broadcast="x.x.x.x" nic="eth0" \
> >     >                     cidr_netmask="21" iflabel="VIP1" \
> >     >                     op monitor interval="10s" timeout="20s" \
> >     >                     meta is-managed="true"
> >     >             primitive datavg ocf:heartbeat:LVM \
> >     >                     params volgrpname="datavg" exclusive="true" \
> >     >                     op start interval="0" timeout="30" \
> >     >                     op stop interval="0" timeout="30"
> >     >             primitive drbd_mysql ocf:linbit:drbd \
> >     >                     params drbd_resource="drbd-mysql" \
> >     >                     op monitor interval="15s"
> >     >             primitive fs_mysql ocf:heartbeat:Filesystem \
> >     >                     params device="/dev/datavg/data" directory="/data" \
> >     >                     fstype="ext3"
> >     >             primitive mysqld lsb:mysqld
> >     >             group mysql datavg fs_mysql Cluster-VIP mysqld
> >     >             ms ms_drbd_mysql drbd_mysql \
> >     >                     meta master-max="1" master-node-max="1" \
> >     >                     clone-max="2" clone-node-max="1" notify="true"
> >     >             location master-prefer-node-1 Cluster-VIP 25: node1
> >     >             colocation mysql_on_drbd inf: mysql ms_drbd_mysql:Master
> >     >             order mysql_after_drbd inf: ms_drbd_mysql:promote mysql:start
> >     >             property $id="cib-bootstrap-options" \
> >     >                     dc-version="1.1.5-1.1.el5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f" \
> >     >                     cluster-infrastructure="openais" \
> >     >                     expected-quorum-votes="2" \
> >     >                     stonith-enabled="false" \
> >     >                     no-quorum-policy="ignore" \
> >     >                     last-lrm-refresh="1332254494"
> >     >             rsc_defaults $id="rsc-options" \
> >     >                     resource-stickiness="100"
> >     >
> >     >
> >     >             _______________________________________________
> >     >             Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
> >     >             http://oss.clusterlabs.org/mailman/listinfo/pacemaker
> >     >
> >     >             Project Home: http://www.clusterlabs.org
> >     >             Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> >     >             Bugs: http://bugs.clusterlabs.org
> >     >
> >     >
> >     >
> >     >
> >     >         --
> >     >         this is my life and I live it for as long as God wills
> >     >
> >     >
> >     >
> >     >
> >     >     --
> >     >     this is my life and I live it for as long as God wills
> >     >
> >     >
> >     >
> >     >
> >     >
> >
> >
> >
> >
> >
> >
> >