[ClusterLabs] 2 node mariadb-cluster - constraint-problems ?
Andrei Borzenkov
arvidjaar at gmail.com
Wed May 12 13:04:06 EDT 2021
On 12.05.2021 17:34, fatcharly at gmx.de wrote:
> Hi Andrei, Hi everybody,
>
>
>> Sent: Wednesday, 12 May 2021 at 16:01
>> From: fatcharly at gmx.de
>> To: users at clusterlabs.org
>> Subject: Re: [ClusterLabs] 2 node mariadb-cluster - constraint-problems ?
>>
>> Hi Andrei, Hi everybody,
>>
>>
>>> You need to order fs_database after the promote operation; and as I just
>>> found, pacemaker also does not reverse it correctly and executes fs stop
>>> and drbd demote concurrently. So you need an additional order constraint
>>> to first stop fs, then demote drbd.
>>
>> Is there some good documentation about this? I don't know how to achieve an "after promote operation", or how to tell pcs to first unmount the filesystem mountpoint and then demote the DRBD device.
>>
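On the "after promote" part: a minimal sketch of how such a constraint might be written with pcs, assuming the resource names from your config and the `pcs constraint order [action] <resource> then [action] <resource>` syntax of pcs 0.10. The `run` wrapper, the `APPLY` variable and the function name are my own illustration, not part of pcs; by default the script only prints the commands so they can be reviewed first.

```shell
#!/bin/sh
# Dry-run helper (hypothetical, not part of pcs): print the command,
# and execute it only when APPLY=1 is set in the environment.
run() {
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    else
        echo "$*"
    fi
}

# Order each filesystem after the *promote* of the DRBD clone it sits on,
# rather than after its start. Resource names are taken from the posted config.
order_fs_after_promote() {
    run pcs constraint order promote database_drbd-clone then start fs_database
    run pcs constraint order promote drbd_logsfiles-clone then start fs_logfiles
}

order_fs_after_promote
```

Run it once as is to see the two commands, then with APPLY=1 on one cluster node; `pcs constraint --full` should list the new constraints with their ids afterwards.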
> ok, so I found something and used this:
>
> pcs constraint order stop fs_logfiles then demote drbd_logsfiles-clone
> pcs constraint order stop fs_database then demote database_drbd-clone
>
> and it works great. Thanks for the hint.
> But what I still don't understand is why the cluster demotes its active node for a short time when I bring a node back from standby (unstandby). Is it not possible to join the DRBD as secondary without demoting the primary for a short moment?
Try adding interleave=true to your clones.
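
A minimal sketch of that, assuming the two DRBD clone names from your config and that your pcs accepts `pcs resource meta <clone id> <meta options>` (pcs 0.10 does). The `run` wrapper, `APPLY` and the function name are only illustration, not part of pcs; by default nothing is changed and the commands are just printed.

```shell
#!/bin/sh
# Dry-run helper (hypothetical, not part of pcs): print the command,
# and execute it only when APPLY=1 is set in the environment.
run() {
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    else
        echo "$*"
    fi
}

# Set interleave=true on the two promotable DRBD clones that take part
# in clone-to-clone ordering. Clone names are taken from the posted config.
set_interleave() {
    for clone in database_drbd-clone drbd_logsfiles-clone; do
        run pcs resource meta "$clone" interleave=true
    done
}

set_interleave
```

With interleave=true, an ordering between two clones is evaluated per node, so an instance only waits for the instance of the other clone on the same node rather than for all instances. The result can be checked with `pcs resource config database_drbd-clone`.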
>
> Best regards and take care
>
> fatcharly
>
>
>
>> Sorry, but this is new to me.
>>
>> Best regards and take care
>>
>> fatcharly
>>
>>
>>
>>
>>> Sent: Tuesday, 11 May 2021 at 17:19
>>> From: "Andrei Borzenkov" <arvidjaar at gmail.com>
>>> To: users at clusterlabs.org
>>> Subject: Re: [ClusterLabs] 2 node mariadb-cluster - constraint-problems ?
>>>
>>> On 11.05.2021 17:43, fatcharly at gmx.de wrote:
>>>> Hi,
>>>>
>>>> I'm using a CentOS 8.3.2011 with a pacemaker-2.0.4-6.el8_3.1.x86_64 + corosync-3.0.3-4.el8.x86_64 and kmod-drbd90-9.0.25-2.el8_3.elrepo.x86_64.
>>>> The cluster consists of two nodes which are providing a ha-mariadb with the help of two drbd devices for the database and the logfiles. The corosync is working over two rings and both machines are virtual kvm-guests.
>>>>
>>>> Problem:
>>>> Node susanne is the active node and lisbon is changing from standby to active. susanne then tries to demote one DRBD device but fails to. The cluster keeps working properly, but the error stays.
>>>> This is what happens:
>>>>
>>>> Cluster Summary:
>>>> * Stack: corosync
>>>> * Current DC: lisbon (version 2.0.4-6.el8_3.1-2deceaa3ae) - partition with quorum
>>>> * Last updated: Tue May 11 16:15:54 2021
>>>> * Last change: Tue May 11 16:15:42 2021 by root via cibadmin on susanne
>>>> * 2 nodes configured
>>>> * 11 resource instances configured
>>>>
>>>> Node List:
>>>> * Online: [ lisbon susanne ]
>>>>
>>>> Active Resources:
>>>> * HA_IP (ocf::heartbeat:IPaddr2): Started susanne
>>>> * Clone Set: database_drbd-clone [database_drbd] (promotable):
>>>> * Masters: [ susanne ]
>>>> * Slaves: [ lisbon ]
>>>> * Clone Set: drbd_logsfiles-clone [drbd_logsfiles] (promotable):
>>>> * drbd_logsfiles (ocf::linbit:drbd): Demoting susanne
>>>> * fs_logfiles (ocf::heartbeat:Filesystem): Started susanne
>>>
>>> Presumably fs_logfiles is located on drbd_logsfiles, so how come it is
>>> active while drbd_logsfiles is being demoted? Then drbdadm fails to
>>> change status to secondary and the RA simply keeps looping until timeout.
>>>
>>>> * fs_database (ocf::heartbeat:Filesystem): Started susanne
>>>> * mysql-server (ocf::heartbeat:mysql): Started susanne
>>>> * Clone Set: ping_fw-clone [ping_fw]:
>>>> * Started: [ lisbon susanne ]
>>>>
>>>> -------------------------------------------------------------------------------------------
>>>> after a few seconds it switches over:
>>>>
>>>> Cluster Summary:
>>>> * Stack: corosync
>>>> * Current DC: lisbon (version 2.0.4-6.el8_3.1-2deceaa3ae) - partition with quorum
>>>> * Last updated: Tue May 11 16:17:59 2021
>>>> * Last change: Tue May 11 16:15:42 2021 by root via cibadmin on susanne
>>>> * 2 nodes configured
>>>> * 11 resource instances configured
>>>>
>>>> Node List:
>>>> * Online: [ lisbon susanne ]
>>>>
>>>> Active Resources:
>>>> * HA_IP (ocf::heartbeat:IPaddr2): Started susanne
>>>> * Clone Set: database_drbd-clone [database_drbd] (promotable):
>>>> * Masters: [ susanne ]
>>>> * Slaves: [ lisbon ]
>>>> * Clone Set: drbd_logsfiles-clone [drbd_logsfiles] (promotable):
>>>> * Masters: [ susanne ]
>>>> * Slaves: [ lisbon ]
>>>> * fs_logfiles (ocf::heartbeat:Filesystem): Started susanne
>>>> * fs_database (ocf::heartbeat:Filesystem): Started susanne
>>>> * mysql-server (ocf::heartbeat:mysql): Started susanne
>>>> * Resource Group: apache:
>>>> * httpd_srv (ocf::heartbeat:apache): Started susanne
>>>> * Clone Set: ping_fw-clone [ping_fw]:
>>>> * Started: [ lisbon susanne ]
>>>>
>>>> Failed Resource Actions:
>>>> * drbd_logsfiles_demote_0 on susanne 'error' (1): call=736, status='Timed Out', exitreason='', last-rc-change='2021-05-11 16:15:42 +02:00', queued=0ms, exec=90001ms
>>>> ----------------------------------------------------------------------------------------------
>>>>
>>>
>>> And what do you see in the logs?
>>>
>>>> I think it is a constraint-problem, but I can't find it.
>>>> This is my config:
>>>> [root@susanne pacemaker]# pcs config show
>>>> Cluster Name: mysql_cluster
>>>> Corosync Nodes:
>>>> susanne lisbon
>>>> Pacemaker Nodes:
>>>> lisbon susanne
>>>>
>>>> Resources:
>>>> Resource: HA_IP (class=ocf provider=heartbeat type=IPaddr2)
>>>> Attributes: cidr_netmask=24 ip=192.168.18.154
>>>> Operations: monitor interval=15s (HA_IP-monitor-interval-15s)
>>>> start interval=0s timeout=20s (HA_IP-start-interval-0s)
>>>> stop interval=0s timeout=20s (HA_IP-stop-interval-0s)
>>>> Clone: database_drbd-clone
>>>> Meta Attrs: clone-max=2 clone-node-max=1 notify=true promotable=true promoted-max=1 promoted-node-max=1
>>>> Resource: database_drbd (class=ocf provider=linbit type=drbd)
>>>> Attributes: drbd_resource=drbd1
>>>> Operations: demote interval=0s timeout=90 (database_drbd-demote-interval-0s)
>>>> monitor interval=20 role=Slave timeout=20 (database_drbd-monitor-interval-20)
>>>> monitor interval=10 role=Master timeout=20 (database_drbd-monitor-interval-10)
>>>> notify interval=0s timeout=90 (database_drbd-notify-interval-0s)
>>>> promote interval=0s timeout=90 (database_drbd-promote-interval-0s)
>>>> reload interval=0s timeout=30 (database_drbd-reload-interval-0s)
>>>> start interval=0s timeout=240 (database_drbd-start-interval-0s)
>>>> stop interval=0s timeout=100 (database_drbd-stop-interval-0s)
>>>> Clone: drbd_logsfiles-clone
>>>> Meta Attrs: clone-max=2 clone-node-max=1 notify=true promotable=true promoted-max=1 promoted-node-max=1
>>>> Resource: drbd_logsfiles (class=ocf provider=linbit type=drbd)
>>>> Attributes: drbd_resource=drbd2
>>>> Operations: demote interval=0s timeout=90 (drbd_logsfiles-demote-interval-0s)
>>>> monitor interval=20 role=Slave timeout=20 (drbd_logsfiles-monitor-interval-20)
>>>> monitor interval=10 role=Master timeout=20 (drbd_logsfiles-monitor-interval-10)
>>>> notify interval=0s timeout=90 (drbd_logsfiles-notify-interval-0s)
>>>> promote interval=0s timeout=90 (drbd_logsfiles-promote-interval-0s)
>>>> reload interval=0s timeout=30 (drbd_logsfiles-reload-interval-0s)
>>>> start interval=0s timeout=240 (drbd_logsfiles-start-interval-0s)
>>>> stop interval=0s timeout=100 (drbd_logsfiles-stop-interval-0s)
>>>> Resource: fs_logfiles (class=ocf provider=heartbeat type=Filesystem)
>>>> Attributes: device=/dev/drbd2 directory=/mnt/clusterfs2 fstype=ext4
>>>> Operations: monitor interval=20s timeout=40s (fs_logfiles-monitor-interval-20s)
>>>> start interval=0s timeout=60s (fs_logfiles-start-interval-0s)
>>>> stop interval=0s timeout=60s (fs_logfiles-stop-interval-0s)
>>>> Resource: fs_database (class=ocf provider=heartbeat type=Filesystem)
>>>> Attributes: device=/dev/drbd1 directory=/mnt/clusterfs1 fstype=ext4
>>>> Operations: monitor interval=20s timeout=40s (fs_database-monitor-interval-20s)
>>>> start interval=0s timeout=60s (fs_database-start-interval-0s)
>>>> stop interval=0s timeout=60s (fs_database-stop-interval-0s)
>>>> Resource: mysql-server (class=ocf provider=heartbeat type=mysql)
>>>> Attributes: additional_parameters=--bind-address=0.0.0.0 binary=/usr/bin/mysqld_safe config=/etc/my.cnf datadir=/mnt/clusterfs1/mysql pid=/var/lib/mysql/run/mariadb.pid socket=/var/lib/mysql/mysql.sock
>>>> Operations: demote interval=0s timeout=120s (mysql-server-demote-interval-0s)
>>>> monitor interval=20s timeout=30s (mysql-server-monitor-interval-20s)
>>>> notify interval=0s timeout=90s (mysql-server-notify-interval-0s)
>>>> promote interval=0s timeout=120s (mysql-server-promote-interval-0s)
>>>> start interval=0s timeout=60s (mysql-server-start-interval-0s)
>>>> stop interval=0s timeout=60s (mysql-server-stop-interval-0s)
>>>> Group: apache
>>>> Resource: httpd_srv (class=ocf provider=heartbeat type=apache)
>>>> Attributes: configfile=/etc/httpd/conf/httpd.conf statusurl=http://127.0.0.1/server-status
>>>> Operations: monitor interval=10s timeout=20s (httpd_srv-monitor-interval-10s)
>>>> start interval=0s timeout=40s (httpd_srv-start-interval-0s)
>>>> stop interval=0s timeout=60s (httpd_srv-stop-interval-0s)
>>>> Clone: ping_fw-clone
>>>> Resource: ping_fw (class=ocf provider=pacemaker type=ping)
>>>> Attributes: dampen=10s host_list=192.168.18.1 multiplier=1000
>>>> Operations: monitor interval=10s timeout=60s (ping_fw-monitor-interval-10s)
>>>> start interval=0s timeout=60s (ping_fw-start-interval-0s)
>>>> stop interval=0s timeout=20s (ping_fw-stop-interval-0s)
>>>>
>>>> Stonith Devices:
>>>> Fencing Levels:
>>>>
>>>> Location Constraints:
>>>> Resource: mysql-server
>>>> Constraint: location-mysql-server
>>>> Rule: boolean-op=or score=-INFINITY (id:location-mysql-server-rule)
>>>> Expression: pingd lt 1 (id:location-mysql-server-rule-expr)
>>>> Expression: not_defined pingd (id:location-mysql-server-rule-expr-1)
>>>> Ordering Constraints:
>>>> start mysql-server then start httpd_srv (kind:Mandatory) (id:order-mysql-server-httpd_srv-mandatory)
>>>> start database_drbd-clone then start drbd_logsfiles-clone (kind:Mandatory) (id:order-database_drbd-clone-drbd_logsfiles-clone-mandatory)
>>>> start drbd_logsfiles-clone then start fs_database (kind:Mandatory) (id:order-drbd_logsfiles-clone-fs_database-mandatory)
>>>
>>> You need to order fs_database after the promote operation; and as I just
>>> found, pacemaker also does not reverse it correctly and executes fs stop
>>> and drbd demote concurrently. So you need an additional order constraint
>>> to first stop fs, then demote drbd.
>>>
>>>> start fs_database then start fs_logfiles (kind:Mandatory) (id:order-fs_database-fs_logfiles-mandatory)
>>>> start fs_logfiles then start mysql-server (kind:Mandatory) (id:order-fs_logfiles-mysql-server-mandatory)
>>>> Colocation Constraints:
>>>> fs_logfiles with drbd_logsfiles-clone (score:INFINITY) (with-rsc-role:Master) (id:colocation-fs_logfiles-drbd_logsfiles-clone-INFINITY)
>>>> fs_database with database_drbd-clone (score:INFINITY) (with-rsc-role:Master) (id:colocation-fs_database-database_drbd-clone-INFINITY)
>>>> drbd_logsfiles-clone with database_drbd-clone (score:INFINITY) (rsc-role:Master) (with-rsc-role:Master) (id:colocation-drbd_logsfiles-clone-database_drbd-clone-INFINITY)
>>>> HA_IP with database_drbd-clone (score:INFINITY) (rsc-role:Started) (with-rsc-role:Master) (id:colocation-HA_IP-database_drbd-clone-INFINITY)
>>>> mysql-server with fs_database (score:INFINITY) (id:colocation-mysql-server-fs_database-INFINITY)
>>>> httpd_srv with mysql-server (score:INFINITY) (id:colocation-httpd_srv-mysql-server-INFINITY)
>>>> Ticket Constraints:
>>>>
>>>> Alerts:
>>>> No alerts defined
>>>>
>>>> Resources Defaults:
>>>> No defaults set
>>>> Operations Defaults:
>>>> No defaults set
>>>>
>>>> Cluster Properties:
>>>> cluster-infrastructure: corosync
>>>> cluster-name: mysql_cluster
>>>> dc-version: 2.0.4-6.el8_3.1-2deceaa3ae
>>>> have-watchdog: false
>>>> last-lrm-refresh: 1620742514
>>>> stonith-enabled: FALSE
>>>>
>>>> Tags:
>>>> No tags defined
>>>>
>>>> Quorum:
>>>> Options:
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> Any suggestions are welcome
>>>>
>>>> best regards stay safe, take care
>>>>
>>>> fatcharly
>>>>
>>>> _______________________________________________
>>>> Manage your subscription:
>>>> https://lists.clusterlabs.org/mailman/listinfo/users
>>>>
>>>> ClusterLabs home: https://www.clusterlabs.org/
>>>>
>>>