[Pacemaker] pacemaker won't start mysql in the second node
    Brian Cavanagh 
    brian at designedtoscale.com
       
    Wed Feb  2 15:50:54 UTC 2011
    
    
  
sudo bash
chmod 666 /var/log/mysql_safe.log
chmod 666 /var/log/mysql.log
crm configure edit
....
primitive mysql ocf:heartbeat:mysql \
    params binary="/usr/bin/mysqld_safe" config="/etc/mysql/my.cnf" \
    user="mysql" group="mysql" log="/var/log/mysql.log" \
    pid="/var/run/mysqld.pid" datadir="/var/lib/mysql" \
    socket="/var/run/mysqld/mysqld.sock"
...
That should fix your issues. If not, change your pid location to
/var/lib/mysql and stop using the socket.
Also, tail mysql_safe.log next time.
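
Since the symptom only shows up on one node, it may help to compare ownership and modes of the files the resource agent touches on both nodes. The following is only a rough sketch: the helper name report_perms is made up for illustration, it assumes GNU stat (as shipped with coreutils on Ubuntu), and the path list is taken from the primitive above.

```shell
#!/bin/sh
# Print "mode owner:group path" for each argument so the output
# can be diffed between the two nodes. Assumes GNU stat (coreutils).
# report_perms is a hypothetical helper name, not part of pacemaker.
report_perms() {
    for path in "$@"; do
        if [ -e "$path" ]; then
            stat -c '%a %U:%G %n' "$path"
        else
            # Make missing files visible instead of failing silently.
            echo "MISSING $path"
        fi
    done
}

# Paths taken from the mysql primitive and the chmod commands above.
report_perms /var/log/mysql.log /var/log/mysql_safe.log \
    /var/run/mysqld /var/lib/mysql
```

Run it as root on arsvr1 and arsvr2, save the output to a file on each node, and diff the two files. Any line that differs is a candidate for the permission error.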
On 2/2/11 10:05 AM, "Liang.Ma at asc-csa.gc.ca" <Liang.Ma at asc-csa.gc.ca>
wrote:
>Hi, 
>
>I did what Michael suggested (included below). With only
>ms_drbd_mysql and fs_mysql, failover to node 2 works without problems. After
>adding ip1, it still failed over to arsvr2 fine when I put node 1 (arsvr1) in
>standby. But when I added mysql to group MySQLDB, it behaved exactly as before:
>fs_mysql started and mounted on arsvr2, and even ip1 started without problems,
>but mysql failed to start. crm_mon shows the error
>
>Failed actions: mysql_start_0 (node=arsvr2, call=32, rc=4,
>status=complete): insufficient privileges
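
For reference, rc=4 in that line is the resource agent's exit code. In the OCF resource agent API, 4 is OCF_ERR_PERM, which crm_mon prints as "insufficient privileges". A minimal sketch for decoding such codes by hand; the function name ocf_rc_name is made up for illustration and is not a pacemaker command.

```shell
#!/bin/sh
# Map an OCF resource agent exit code to its symbolic name.
# ocf_rc_name is a hypothetical helper, not part of pacemaker.
ocf_rc_name() {
    case "$1" in
        0) echo "OCF_SUCCESS" ;;
        1) echo "OCF_ERR_GENERIC" ;;
        2) echo "OCF_ERR_ARGS" ;;
        3) echo "OCF_ERR_UNIMPLEMENTED" ;;
        4) echo "OCF_ERR_PERM" ;;
        5) echo "OCF_ERR_INSTALLED" ;;
        6) echo "OCF_ERR_CONFIGURED" ;;
        7) echo "OCF_NOT_RUNNING" ;;
        8) echo "OCF_RUNNING_MASTER" ;;
        9) echo "OCF_FAILED_MASTER" ;;
        *) echo "unknown ($1)" ;;
    esac
}

# The code reported for mysql_start_0 on arsvr2.
ocf_rc_name 4
```

So the agent itself ran and reported a permission problem, which suggests looking at what the mysql user can write to rather than at the cluster constraints.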
>
>The cluster log, however, didn't show anything about the mysql start.
>
>I also tried running the mysql resource script under
>/usr/lib/ocf/resource.d/heartbeat directly, and it started the mysql server
>without problems.
>
>My guess is that the problem is somewhere right before pacemaker calls
>resource mysql. It may be related to a permission or authentication problem
>with mysql, as Dejan commented, but I checked the permissions on
>/var/run/mysqld and /var/lib/mysql and they are the same on both nodes.
>Anything within /var/lib/mysql is shared through the drbd partition, so it
>should be identical, right?
>
>Does anyone have an idea which part of the mysql server settings may cause
>the problem? The my.cnf files under /etc/ on both nodes are identical. I found
>that the debian.cnf files differed in the password field after the upgrade,
>so I copied the one from arsvr1 to arsvr2.
>
>Thank you for any help.
>
>Here is the simplified crm configuration.
>
>
>node $id="bc6bf61d-6b5f-4307-85f3-bf7bb11531bb" arsvr2 \
>    attributes standby="off"
>node $id="bf0e7394-9684-42b9-893b-5a9a6ecddd7e" arsvr1 \
>    attributes standby="off"
>primitive drbd_mysql ocf:linbit:drbd \
>    params drbd_resource="r0" \
>    op monitor interval="15s"
>primitive fs_mysql ocf:heartbeat:Filesystem \
>    params device="/dev/drbd/by-res/r0" directory="/var/lib/mysql" fstype="ext4" \
>    op start interval="0" timeout="60" \
>    op stop interval="0" timeout="120" \
>    meta target-role="Started"
>primitive ip1 ocf:heartbeat:IPaddr2 \
>    params ip="10.10.10.193" nic="eth0" \
>    op monitor interval="5s" \
>    meta target-role="Started"
>primitive ip1arp ocf:heartbeat:SendArp \
>    params ip="10.10.10.193" nic="eth0" \
>    meta target-role="Stopped"
>primitive mysql ocf:heartbeat:mysql \
>    params binary="/usr/bin/mysqld_safe" config="/etc/mysql/my.cnf" \
>    user="mysql" group="mysql" log="/var/log/mysql.log" \
>    pid="/var/run/mysqld/mysqld.pid" datadir="/var/lib/mysql" \
>    socket="/var/run/mysqld/mysqld.sock" \
>    op monitor interval="30s" timeout="30s" \
>    op start interval="15" timeout="120" \
>    op stop interval="0" timeout="120" \
>    meta target-role="Started"
>group MySQLDB fs_mysql ip1 mysql \
>    meta target-role="Started"
>ms ms_drbd_mysql drbd_mysql \
>    meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true"
>colocation fs_on_drbd inf: fs_mysql ms_drbd_mysql:Master
>colocation mysql_on_drbd inf: MySQLDB fs_mysql
>order fs-mysql-after-drbd inf: ms_drbd_mysql:promote fs_mysql:start
>order ip1-after-fs-mysql inf: fs_mysql:start ip1:start
>order mysql-after-fs-mysql inf: fs_mysql:start mysql:start
>property $id="cib-bootstrap-options" \
>    dc-version="1.0.8-042548a451fce8400660f6031f4da6f0223dd5dd" \
>    cluster-infrastructure="Heartbeat" \
>    expected-quorum-votes="1" \
>    stonith-enabled="false" \
>    no-quorum-policy="ignore"
>rsc_defaults $id="rsc-options" \
>    resource-stickiness="100"
>
>Liang Ma
>Contractuel | Consultant | SED Systems Inc.
>Ground Systems Analyst
>Agence spatiale canadienne | Canadian Space Agency
>6767, Route de l'Aéroport, Longueuil (St-Hubert), QC, Canada, J3Y 8Y9
>Tél/Tel : (450) 926-5099 | Téléc/Fax: (450) 926-5083
>Courriel/E-mail : [liang.ma at space.gc.ca]
>Site web/Web site : [www.space.gc.ca ]
>
>
>
>Hi,
>
>first of all, your configuration is, well, unconventional. I'd put all the
>primitives together in one group and make the group colocated and
>ordered with respect to the DRBDs. Perhaps it'd be wise to make two groups.
>
>Googling through the archives of the list, I'd bet this error is caused
>by the crm trying to mount a secondary DRBD. This might happen through
>constraints that somehow end up forming a loop.
>
>Could you please start with a very simple setup like:
>
>primitive resDRBD ocf:linbit:drbd params drbd_resource="r0"
>primitive resFS ocf:heartbeat:Filesystem \
>  params device="/dev/drbd0" directory="/mnt" fstype="ext4"
>ms msDRBD resDRBD meta notify="true"
>colocation col_FS_DRBD inf: resFS:Started msDRBD:Master
>order ord_DRBD_FS inf: msDRBD:promote resFS:start
>
>If this works, try to add an IP address as a resource and make a group of
>both primitives:
>
>primitive resIP ocf:heartbeat:IPaddr2 \
>  params ip="10.10.10.193" nic="eth0" cidr_netmask="24"
>group groupMySQL resFS resIP
>
>Failover still working? What are the constraints now?
>Now add the MySQL database to the group:
>
>primitive mysql ocf:heartbeat:mysql \
>  params binary="/usr/bin/mysqld_safe" config="/etc/mysql/my.cnf" \
>    user="mysql" group="mysql" log="/var/log/mysql.log" \
>    pid="/var/run/mysqld/mysqld.pid" datadir="/var/lib/mysql" \
>    socket="/var/run/mysqld/mysqld.sock"
>
>Then edit group groupMySQL
>to add the MySQL server,
>
>and so on.
>
>
>Hope you are successful taking one step after the other.
>
>Greetings,
>--
>Dr. Michael Schwartzkopff
>Guardinistr. 63
>81375 München
>
>Tel: (0163) 172 50 98
>
>-----Original Message-----
>From: Ma, Liang 
>Sent: January 31, 2011 9:41 AM
>To: The Pacemaker cluster resource manager
>Subject: RE: [Pacemaker] pacemaker won't start mysql in the second node
>
>Hi,
>
>Thanks for your hints. I went through the cluster logs more carefully.
>Comparing the logs from the two nodes, the real difference comes after the
>line
>
>info: process_lrm_event: LRM operation fs_mysql_start_0
>
>On node arsvr1, after that line we got a confirmation on Action
>fs_mysql_start_0 as such
>
>info: match_graph_event: Action fs_mysql_start_0 (8) confirmed on arsvr1
>
>and then went on to Initiating action 9: start mysql_start_0 on arsvr1
>(local).
>
>On node arsvr2, however, we never see the confirmation of action
>fs_mysql_start_0, so mysql_start_0 is never called. The strange thing
>is that I can see the drbd partition of fs_mysql is properly mounted on
>arsvr2. Does anyone know what might keep arsvr2 from reaching the
>"Action fs_mysql_start_0 (8) confirmed" step?
>
>Thanks in advance.
>
>Here are the logs from the two nodes.
>
>Logs on Node 2:
>
>Jan 28 14:24:23 arsvr2 lrmd: [919]: info: rsc:fs_mysql:229: start
>Jan 28 14:24:23 arsvr2 Filesystem[1568]: [1596]: INFO: Running start for /dev/drbd/by-res/r0 on /var/lib/mysql
>Jan 28 14:24:23 arsvr2 lrmd: [919]: info: RA output: (fs_mysql:start:stderr) FATAL: Module scsi_hostadapter not found.
>Jan 28 14:24:23 arsvr2 Filesystem[1568]: [1606]: INFO: Starting filesystem check on /dev/drbd/by-res/r0
>Jan 28 14:24:23 arsvr2 lrmd: [919]: info: RA output: (fs_mysql:start:stdout) fsck from util-linux-ng 2.17.2
>Jan 28 14:24:23 arsvr2 lrmd: [919]: info: RA output: (fs_mysql:start:stdout) /dev/drbd0: clean, 178/3276800 files, 257999/13106791 blocks
>Jan 28 14:24:23 arsvr2 crmd: [922]: info: process_lrm_event: LRM operation fs_mysql_start_0 (call=229, rc=0, cib-update=251, confirmed=true) ok
>Jan 28 14:24:46 arsvr2 cib: [918]: info: cib_stats: Processed 149 operations (0.00us average, 0% utilization) in the last 10min
>
>Logs on Node 1:
>
>Jan 28 14:28:58 arsvr1 lrmd: [1065]: info: rsc:fs_mysql:867: start
>Jan 28 14:28:58 arsvr1 crmd: [1068]: info: te_rsc_command: Initiating action 31: monitor drbd_mysql:1_monitor_15000 on arsvr2
>Jan 28 14:28:58 arsvr1 Filesystem[516]: [544]: INFO: Running start for /dev/drbd/by-res/r0 on /var/lib/mysql
>Jan 28 14:28:58 arsvr1 lrmd: [1065]: info: RA output: (fs_mysql:start:stderr) FATAL: Module scsi_hostadapter not found.
>Jan 28 14:28:58 arsvr1 Filesystem[516]: [554]: INFO: Starting filesystem check on /dev/drbd/by-res/r0
>Jan 28 14:28:58 arsvr1 lrmd: [1065]: info: RA output: (fs_mysql:start:stdout) fsck from util-linux-ng 2.17.2
>Jan 28 14:28:58 arsvr1 lrmd: [1065]: info: RA output: (fs_mysql:start:stdout) /dev/drbd0: clean, 178/3276800 files, 257999/13106791 blocks
>Jan 28 14:28:58 arsvr1 crmd: [1068]: info: process_lrm_event: LRM operation fs_mysql_start_0 (call=867, rc=0, cib-update=1650, confirmed=true) ok
>Jan 28 14:28:58 arsvr1 crmd: [1068]: info: match_graph_event: Action fs_mysql_start_0 (8) confirmed on arsvr1 (rc=0)
>Jan 28 14:28:58 arsvr1 crmd: [1068]: info: te_rsc_command: Initiating action 9: start mysql_start_0 on arsvr1 (local)
>Jan 28 14:28:58 arsvr1 crmd: [1068]: info: do_lrm_rsc_op: Performing key=9:551:0:9c402121-906c-42de-a18a-68deb24208cb op=mysql_start_0 )
>Jan 28 14:28:58 arsvr1 lrmd: [1065]: info: rsc:mysql:868: start
>Jan 28 14:28:58 arsvr1 mysqld_safe: Starting mysqld daemon with databases from /var/lib/mysql
>Jan 28 14:28:59 arsvr1 crmd: [1068]: info: match_graph_event: Action drbd_mysql:1_monitor_15000 (31) confirmed on arsvr2 (rc=0)
>Jan 28 14:29:02 arsvr1 mysql[576]: [728]: INFO: MySQL started
>Jan 28 14:29:02 arsvr1 crmd: [1068]: info: process_lrm_event: LRM operation mysql_start_0 (call=868, rc=0, cib-update=1651, confirmed=true) ok
>Jan 28 14:29:02 arsvr1 crmd: [1068]: info: match_graph_event: Action mysql_start_0 (9) confirmed on arsvr1 (rc=0)
>
>
>Liang Ma
>
>
>
>
>-----Original Message-----
>From: Dejan Muhamedagic [mailto:dejanmm at fastmail.fm]
>Sent: January 28, 2011 11:09 AM
>To: The Pacemaker cluster resource manager
>Subject: Re: [Pacemaker] pacemaker won't start mysql in the second node
>
>On Fri, Jan 28, 2011 at 08:50:45AM -0500, Liang.Ma at asc-csa.gc.ca wrote:
>> Hi Dejan, thanks for your reply.
>> 
>> That's one of the problems. I don't see anything logged in
>> /var/log/mysql/error.log.
>
>I meant the cluster logs.
>
>> I checked the permissions of directories /var/run/mysqld and
>> /var/log/mysql. On both nodes they are the same:
>> 
>> drwxr-xr-x 2 mysql root 40 2011-01-27 13:50 /var/run/mysqld/
>> drwxr-s--- 2 mysql adm 4096 2011-01-27 11:34 /var/log/mysql
>> 
>> By the way, which user does pacemaker run as, root or someone else?
>
>pacemaker is a collection of programs. At any rate, the RAs run
>as root, but may su to another user (mysql) depending on the
>resource configuration.
>
>Thanks,
>
>Dejan
>
>> Liang Ma
>> 
>> 
>> 
>> 
>> -----Original Message-----
>> From: Dejan Muhamedagic [mailto:dejanmm at fastmail.fm]
>> Sent: January 28, 2011 8:26 AM
>> To: The Pacemaker cluster resource manager
>> Subject: Re: [Pacemaker] pacemaker won't start mysql in the second node
>> 
>> Hi,
>> 
>> On Thu, Jan 27, 2011 at 11:51:31AM -0500, Liang.Ma at asc-csa.gc.ca wrote:
>> > 
>> > 
>> > Hi There,
>> > 
>> > I have set up a pair of HA LAMP servers using heartbeat, pacemaker and
>> > drbd on Ubuntu 10.04 LTS. Everything worked fine until I upgraded
>> > mysql-server from 5.1.41-3ubuntu12.6 to 5.1.41-3ubuntu12.9. Now node 1
>> > (arsvr1) still works fine, but mysql on node 2 (arsvr2) won't start
>> > when I switch arsvr1 to standby. The error message shown by "crm
>> > status" is
>> > 
>> > Failed actions:
>> > mysql_start_0 (node=arsvr2, call=32, rc=4, status=complete):
>> > insufficient privileges
>> > 
>> > No errors logged in /var/log/mysql/error.log at all.
>> 
>> I think that you should check directory permissions. The log
>> file should give you a hint.
>> 
>> Thanks,
>> 
>> Dejan
>> 
>> 
>> > The drbd mysql partition is mounted properly. If I go to
>> > /usr/lib/ocf/resource.d/heartbeat and set the OCF_RESKEY parameters, I
>> > have no problem starting the mysql server with "./mysql start". But the
>> > resource mysql won't show up in crm status.
>> > 
>> > So it looks like pacemaker somehow fails to start resource mysql even
>> > before running the resource script.
>> > 
>> > Here is the configuration
>> > 
>> > node $id="bc6bf61d-6b5f-4307-85f3-bf7bb11531bb" arsvr2 \
>> >     attributes standby="off"
>> > node $id="bf0e7394-9684-42b9-893b-5a9a6ecddd7e" arsvr1 \
>> >     attributes standby="off"
>> > primitive apache2 lsb:apache2 \
>> >     op start interval="0" timeout="60" \
>> >     op stop interval="0" timeout="120" start-delay="15" \
>> >     meta target-role="Started"
>> > primitive drbd_mysql ocf:linbit:drbd \
>> >     params drbd_resource="r0" \
>> >     op monitor interval="15s"
>> > primitive drbd_webfs ocf:linbit:drbd \
>> >     params drbd_resource="r1" \
>> >     op monitor interval="15s" \
>> >     op start interval="0" timeout="240" \
>> >     op stop interval="0" timeout="100"
>> > primitive fs_mysql ocf:heartbeat:Filesystem \
>> >     params device="/dev/drbd/by-res/r0" directory="/var/lib/mysql" fstype="ext4" \
>> >     op start interval="0" timeout="60" \
>> >     op stop interval="0" timeout="120" \
>> >     meta target-role="Started"
>> > primitive fs_webfs ocf:heartbeat:Filesystem \
>> >     params device="/dev/drbd/by-res/r1" directory="/srv" fstype="ext4" \
>> >     op start interval="0" timeout="60" \
>> >     op stop interval="0" timeout="120" \
>> >     meta target-role="Started"
>> > primitive ip1 ocf:heartbeat:IPaddr2 \
>> >     params ip="10.10.10.193" nic="eth0" \
>> >     op monitor interval="5s"
>> > primitive ip1arp ocf:heartbeat:SendArp \
>> >     params ip="10.10.10.193" nic="eth0"
>> > primitive mysql ocf:heartbeat:mysql \
>> >     params binary="/usr/bin/mysqld_safe" config="/etc/mysql/my.cnf" \
>> >     user="mysql" group="mysql" log="/var/log/mysql.log" \
>> >     pid="/var/run/mysqld/mysqld.pid" datadir="/var/lib/mysql" \
>> >     socket="/var/run/mysqld/mysqld.sock" \
>> >     op monitor interval="30s" timeout="30s" \
>> >     op start interval="0" timeout="120" \
>> >     op stop interval="0" timeout="120" \
>> >     meta target-role="Started"
>> > group MySQLDB fs_mysql mysql \
>> >     meta target-role="Started"
>> > group WebServices ip1 ip1arp fs_webfs apache2 \
>> >     meta target-role="Started"
>> > ms ms_drbd_mysql drbd_mysql \
>> >     meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true"
>> > ms ms_drbd_webfs drbd_webfs \
>> >     meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true" target-role="Started"
>> > colocation apache2_with_ip inf: apache2 ip1
>> > colocation apache2_with_mysql inf: apache2 ms_drbd_mysql:Master
>> > colocation apache2_with_webfs inf: apache2 ms_drbd_webfs:Master
>> > colocation fs_on_drbd inf: fs_mysql ms_drbd_mysql:Master
>> > colocation ip_with_ip_arp inf: ip1 ip1arp
>> > colocation mysql_on_drbd inf: MySQLDB ms_drbd_mysql:Master
>> > colocation web_with_mysql inf: MySQLDB WebServices
>> > colocation webfs_on_drbd inf: fs_webfs ms_drbd_webfs:Master
>> > colocation webfs_with_fs inf: fs_webfs fs_mysql
>> > order apache2-after-arp inf: ip1arp:start apache2:start
>> > order arp-after-ip inf: ip1:start ip1arp:start
>> > order fs-mysql-after-drbd inf: ms_drbd_mysql:promote fs_mysql:start
>> > order fs-webfs-after-drbd inf: ms_drbd_webfs:promote fs_webfs:start
>> > order ip-after-mysql inf: mysql:start ip1:start
>> > order mysql-after-fs-mysql inf: fs_mysql:start mysql:start
>> > property $id="cib-bootstrap-options" \
>> >     dc-version="1.0.8-042548a451fce8400660f6031f4da6f0223dd5dd" \
>> >     cluster-infrastructure="Heartbeat" \
>> >     expected-quorum-votes="1" \
>> >     stonith-enabled="false" \
>> >     no-quorum-policy="ignore"
>> > rsc_defaults $id="rsc-options" \
>> >     resource-stickiness="100"
>> > 
>> > Any help please?
>> > 
>> > Thanks,
>> > 
>> > Liang Ma
>> > 
>> > 
>> > 
>> > 
>> > _______________________________________________
>> > Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
>> > http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>> > 
>> > Project Home: http://www.clusterlabs.org
>> > Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>> > Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker
>> 
    
    