[ClusterLabs] Problem with MariaDB cluster

Reid Wahl nwahl at redhat.com
Fri Jan 27 03:32:12 EST 2023


On Fri, Jan 27, 2023 at 12:23 AM Thomas CAS <tcas at ikoula.com> wrote:

> Hello Reid,
>
>
>
> Thank you so much for your answer and bug report.
>
> If it is a bug, I do not understand why the problem is present in
> production but not in my lab, which is identical?
>

That's a good question. I'm not sure, and I haven't worked much with this
resource agent. Does the lab show any interesting logs from the resource
agent during startup? I wonder if it's hitting the same "No MySQL master
present" issue but *not* the error that follows. That error comes from
mysql itself, not from the resource agent.

After the agent hits the "No MySQL master present" issue, it calls the
unset_master() function. You can take a look at the agent script (in
/usr/lib/ocf/resource.d/heartbeat/mysql by default) to see all the things
that unset_master() is doing, and try to determine what's behaving
differently in production vs. lab.
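One way to rule out drift between the two environments is to dump just that function from the agent script on each node and diff the output. The helper below is only an illustrative sketch (the name `extract_fn` and the approach are mine, not part of the agent); it assumes the script's functions close with a `}` at the start of a line, which is the usual convention in resource-agents scripts:

```shell
# Illustrative helper (not part of the resource agent): print the body of a
# named shell function from a script, so the copies on two nodes can be diffed.
extract_fn() {
    # $1 = path to script, $2 = function name
    awk -v fn="$2" '
        $0 ~ ("^" fn "\\(\\)") { show = 1 }   # function definition line
        show                   { print }
        show && /^}/           { exit }       # stop at the closing brace
    ' "$1"
}

# Usage on a cluster node (default agent path assumed):
# extract_fn /usr/lib/ocf/resource.d/heartbeat/mysql unset_master
```

Running it on both production and lab nodes and diffing the results would show whether the agents themselves differ.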


> What does the "$OCF_RESKEY_CRM_meta_notify_master_uname" variable do?
> (Which shell command is run through this variable?)
>

Pacemaker sets it during a resource's notify operation; it holds
information that's used only during a notify operation and is unset
otherwise. You can see some other similar meta variables in the
mysql_notify() function. Since mysql_notify() gets called during a notify
operation, those variables may be set there. They won't be set when
mysql_start() is called during a start operation.
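To make that start-vs-notify distinction concrete, here is a minimal sketch (the node names, messages, and the `decide` wrapper are made up for the demo; only the test itself mirrors the agent's check of the variable):

```shell
# Illustrative sketch only: mirrors the check the mysql agent performs on the
# notify-only variable. NODENAME and the messages are made up for this demo.
NODENAME=node2

decide() {
    # Same transformation the agent applies: strip spaces from the variable.
    master_host=$(echo "$OCF_RESKEY_CRM_meta_notify_master_uname" | tr -d ' ')
    if [ "$master_host" ] && [ "$master_host" != "$NODENAME" ]; then
        echo "replicate from $master_host"
    else
        echo "no master present - clearing replication state"
    fi
}

# During a start operation, Pacemaker leaves the variable unset:
unset OCF_RESKEY_CRM_meta_notify_master_uname
decide    # -> no master present - clearing replication state

# During a notify operation it is set, e.g. to the promoted node's name:
OCF_RESKEY_CRM_meta_notify_master_uname=node1
decide    # -> replicate from node1
```

This is why a start operation always falls into the unset_master() branch: the variable simply isn't in the environment at that point.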


>
>
> Best regards,
>
>
>
> Thomas Cas  |  Technicien du support infogérance
>
> PHONE : +33 3 51 25 23 26       WEB : www.ikoula.com/en
>
> IKOULA Data Center 34 rue Pont Assy - 51100 Reims - FRANCE
>
> Before printing this letter, think about the impact on the environment!
>
>
>
>
>
>
>
> *De :* Reid Wahl <nwahl at redhat.com>
> *Envoyé :* jeudi 26 janvier 2023 20:31
> *À :* Cluster Labs - All topics related to open-source clustering
> welcomed <users at clusterlabs.org>; Thomas CAS <tcas at ikoula.com>
> *Objet :* Re: [ClusterLabs] Problem with MariaDB cluster
>
>
>
>
>
>
>
>
> On Thu, Jan 26, 2023 at 7:39 AM Thomas CAS <tcas at ikoula.com> wrote:
>
> Hello,
>
>
>
> I'm having trouble with a MariaDB cluster (2 nodes, master-slave) on
> Debian 11.
>
> I don't know what to do anymore.
>
>
>
> *Environment:*
>
>
>
> Node1:
>
> OS: Debian 11
>
> Kernel: 5.10.0-21-amd64 #1 SMP Debian 5.10.162-1 (2023-01-21)
>
> Versions: resource-agents (4.7.0-1), pacemaker (2.0.5-2), corosync
> (3.1.2-2), mariadb (10.5.18-0+deb11u1)
>
>
>
> Node2:
>
> OS: Debian 11
>
> Kernel: 5.10.0-21-amd64 #1 SMP Debian 5.10.162-1 (2023-01-21)
>
> Versions: resource-agents (4.7.0-1), pacemaker (2.0.5-2), corosync
> (3.1.2-2), mariadb (10.5.18-0+deb11u1)
>
>
>
> crm configure show as attachment.
>
>
>
> *Problem: *
>
>
>
> When I restart Node2 (which is a slave), it rejoins the cluster
> correctly:
>
>
>
> $ crm status
>
> Cluster Summary:
>
>   * Stack: corosync
>
>   * Current DC: Node1 (version 2.0.5-ba59be7122) - partition with quorum
>
>   * Last updated: Thu Jan 26 12:04:57 2023
>
>   * Last change:  Thu Jan 26 11:39:58 2023 by root via cibadmin on Node2
>
>   * 2 nodes configured
>
>   * 3 resource instances configured
>
>
>
> Node List:
>
>   * Online: [ Node1 Node2 ]
>
>
>
> Full List of Resources:
>
>   * VIP (ocf::heartbeat:IPaddr2):        Started Node1
>
>   * Clone Set: MYSQLREPLICATOR [MYSQL] (promotable):
>
>     * Masters: [ Node1 ]
>
>     * Slaves: [ Node2 ]
>
>
>
> But it does not restore the replication information (SHOW SLAVE STATUS;
> returns nothing).
>
> In the Node2 logs, I can see these messages, which explain why replication
> is not taking place:
>
>
>
> Jan 25 16:29:38  mysql(MYSQL)[22862]:    INFO: No MySQL master present -
> clearing replication state
>
> Jan 25 16:29:39  mysql(MYSQL)[22862]:    WARNING: MySQL Slave IO threads
> currently not running.
>
> Jan 25 16:29:39  mysql(MYSQL)[22862]:    ERROR: MySQL Slave SQL threads
> currently not running.
>
> Jan 25 16:29:39  mysql(MYSQL)[22862]:    ERROR: See  for details
>
> Jan 25 16:29:39  mysql(MYSQL)[22862]:    ERROR: ERROR 1200 (HY000) at line
> 1: Misconfigured slave: MASTER_HOST was not set; Fix in config file or with
> CHANGE MASTER TO
>
>
>
> From what I see in the following file, Node2 does not seem to find the
> master name, so it clears its replication information:
>
>
>
> /usr/lib/ocf/resource.d/heartbeat/mysql
>
>
>
>         master_host=`echo $OCF_RESKEY_CRM_meta_notify_master_uname|tr -d " "`
>
>         if [ "$master_host" -a "$master_host" != ${NODENAME} ]; then
>             ocf_log info "Changing MySQL configuration to replicate from $master_host."
>             set_master
>             start_slave
>             if [ $? -ne 0 ]; then
>                 ocf_exit_reason "Failed to start slave"
>                 return $OCF_ERR_GENERIC
>             fi
>         else
>             ocf_log info "No MySQL master present - clearing replication state"
>             unset_master
>         fi
>
>
>
> As this is a production environment, I performed a bare-metal restore of
> these machines onto 2 lab machines, and there I have no problem…
>
> In production there is a lot of write activity, but the servers are far
> from saturated.
>
>
>
> Thank you in advance for all the help you can give me.
>
>
> Best regards,
>
>
>
> I'm sorry you've encountered this.
>
>
>
> I don't understand why the resource agent checks
> $OCF_RESKEY_CRM_meta_notify_master_uname during the start operation. That
> value gets set only during a notify operation. That looks like a bug in the
> resource agent.
>
>
>
> I've filed an issue against it here:
> https://github.com/ClusterLabs/resource-agents/issues/1839
>
>
>
>
>
>
>
>
>
> _______________________________________________
> Manage your subscription:
> https://lists.clusterlabs.org/mailman/listinfo/users
>
> ClusterLabs home: https://www.clusterlabs.org/
>
>
>
> --
>
> Regards,
>
> Reid Wahl (He/Him)
>
> Senior Software Engineer, Red Hat
>
> RHEL High Availability - Pacemaker
>


-- 
Regards,

Reid Wahl (He/Him)
Senior Software Engineer, Red Hat
RHEL High Availability - Pacemaker
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.clusterlabs.org/pipermail/users/attachments/20230127/6b8f8f69/attachment-0001.htm>


More information about the Users mailing list