[ClusterLabs] 2 node mariadb-cluster - constraint-problems ?

Tomas Jelinek tojeline at redhat.com
Tue May 18 09:19:46 EDT 2021



On 18. 05. 21 at 14:55, fatcharly at gmx.de wrote:
> 
> 
>> Sent: Tuesday, 18 May 2021 at 14:49
>> From: fatcharly at gmx.de
>> To: users at clusterlabs.org
>> Subject: Re: [ClusterLabs] 2 node mariadb-cluster - constraint-problems ?
>>
>> Hi Andrei, hi everybody,
>>
>> ...
>>>> and it works great. Thanks for the hint.
>>>> But the thing I still don't understand is why the cluster demotes its active node for a short time when I re-enable a node from standby back to unstandby. Is it not possible to join the drbd as secondary without demoting the primary for a short moment?
>>>
>>> Try adding interleave=true to your clones.
>>
>> I tried this, but it gets me an error message. What is wrong?
>>
>>   pcs resource update database_drbd ocf:linbit:drbd drbd_resource=drbd1 promotable promoted-max=1 promoted-node-max=1 clone-max=2 clone-node-max=1 notify=true interleave=true
>>
>> Error: invalid resource options: 'clone-max', 'clone-node-max', 'interleave', 'notify', 'promoted-max', 'promoted-node-max', allowed options are: 'adjust_master_score', 'connect_only_after_promote', 'drbd_resource', 'drbdconf', 'fail_promote_early_if_peer_primary', 'ignore_missing_notifications', 'remove_master_score_if_peer_primary', 'require_drbd_module_version_ge', 'require_drbd_module_version_lt', 'stop_outdates_secondary', 'unfence_extra_args', 'unfence_if_all_uptodate', 'wfc_timeout', use --force to override
> 
> or is it simply:
> pcs resource update database_drbd-clone interleave=true ?

Hi fatcharly,

The error comes from the fact that the update command, as you used it, is 
trying to update instance attributes (that is, options which pacemaker 
passes to the resource agent). The agent doesn't define the options you 
named, so pcs prints an error.

You want to update meta attributes instead, which are options that 
pacemaker processes by itself. This is how you do it:

pcs resource meta database_drbd-clone interleave=true
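
If you also want to change the agent options (instance attributes), that
part still goes through the update command. A minimal sketch to contrast
the two, reusing the names from your commands above:

# instance attributes - passed to the ocf:linbit:drbd resource agent
pcs resource update database_drbd drbd_resource=drbd1

# meta attributes - processed by pacemaker itself, set on the clone
pcs resource meta database_drbd-clone interleave=true notify=true

Note that the meta command targets the clone (database_drbd-clone), not
the primitive resource (database_drbd) wrapped inside it.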


Regards,
Tomas

> 
>>
>> Any suggestions are welcome
>>
>> Stay safe and take care
>>
>> fatcharly
>>
>>
>>
>>
>>> Sent: Wednesday, 12 May 2021 at 19:04
>>> From: "Andrei Borzenkov" <arvidjaar at gmail.com>
>>> To: users at clusterlabs.org
>>> Subject: Re: [ClusterLabs] 2 node mariadb-cluster - constraint-problems ?
>>>
>>> On 12.05.2021 17:34, fatcharly at gmx.de wrote:
>>>> Hi Andrei, Hi everybody,
>>>>
>>>>
>>>>> Sent: Wednesday, 12 May 2021 at 16:01
>>>>> From: fatcharly at gmx.de
>>>>> To: users at clusterlabs.org
>>>>> Subject: Re: [ClusterLabs] 2 node mariadb-cluster - constraint-problems ?
>>>>>
>>>>> Hi Andrei, Hi everybody,
>>>>>
>>>>>
>>>>>> You need order fs_database after promote operation; and as I just found
>>>>>> pacemaker also does not reverse it correctly and executes fs stop and
>>>>>> drbd demote concurrently. So you need additional order constraint to
>>>>>> first stop fs then demote drbd.
>>>>>
>>>>> Is there some good documentation about this? I don't know how to achieve an "after promote" ordering, and how can I tell pcs to first unmount the filesystem mountpoint and then demote the drbd-device?
>>>>>
>>>> ok, so I found something and used this:
>>>>
>>>> pcs constraint order stop fs_logfiles then demote drbd_logsfiles-clone
>>>> pcs constraint order stop fs_database then demote database_drbd-clone
>>>>
>>>> and it works great. Thanks for the hint.
>>>> But the thing I still don't understand is why the cluster demotes its active node for a short time when I re-enable a node from standby back to unstandby. Is it not possible to join the drbd as secondary without demoting the primary for a short moment?
>>>
>>> Try adding interleave=true to your clones.
>>>
>>>>
>>>> Best regards and take care
>>>>
>>>> fatcharly
>>>>
>>>>
>>>>
>>>>> Sorry but this is new for me.
>>>>>
>>>>> Best regards and take care
>>>>>
>>>>> fatcharly
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>> Sent: Tuesday, 11 May 2021 at 17:19
>>>>>> From: "Andrei Borzenkov" <arvidjaar at gmail.com>
>>>>>> To: users at clusterlabs.org
>>>>>> Subject: Re: [ClusterLabs] 2 node mariadb-cluster - constraint-problems ?
>>>>>>
>>>>>> On 11.05.2021 17:43, fatcharly at gmx.de wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> I'm using CentOS 8.3.2011 with pacemaker-2.0.4-6.el8_3.1.x86_64 + corosync-3.0.3-4.el8.x86_64 and kmod-drbd90-9.0.25-2.el8_3.elrepo.x86_64.
>>>>>>> The cluster consists of two nodes which provide an HA mariadb with the help of two drbd devices, one for the database and one for the logfiles. Corosync is working over two rings, and both machines are virtual KVM guests.
>>>>>>>
>>>>>>> Problem:
>>>>>>> Node susanne is the active node and lisbon is changing from standby to active; susanne is trying to demote one drbd-device but is failing to. The cluster keeps working properly, but the error stays.
>>>>>>> This is what happens:
>>>>>>>
>>>>>>> Cluster Summary:
>>>>>>>    * Stack: corosync
>>>>>>>    * Current DC: lisbon (version 2.0.4-6.el8_3.1-2deceaa3ae) - partition with quorum
>>>>>>>    * Last updated: Tue May 11 16:15:54 2021
>>>>>>>    * Last change:  Tue May 11 16:15:42 2021 by root via cibadmin on susanne
>>>>>>>    * 2 nodes configured
>>>>>>>    * 11 resource instances configured
>>>>>>>
>>>>>>> Node List:
>>>>>>>    * Online: [ lisbon susanne ]
>>>>>>>
>>>>>>> Active Resources:
>>>>>>>    * HA_IP       (ocf::heartbeat:IPaddr2):        Started susanne
>>>>>>>    * Clone Set: database_drbd-clone [database_drbd] (promotable):
>>>>>>>      * Masters: [ susanne ]
>>>>>>>      * Slaves: [ lisbon ]
>>>>>>>    * Clone Set: drbd_logsfiles-clone [drbd_logsfiles] (promotable):
>>>>>>>      * drbd_logsfiles    (ocf::linbit:drbd):      Demoting susanne
>>>>>>>    * fs_logfiles (ocf::heartbeat:Filesystem):     Started susanne
>>>>>>
>>>>>> Presumably fs_logfiles is located on drbd_logsfiles, so how come it is
>>>>>> active while drbd_logsfiles is being demoted? Then drbdadm fails to
>>>>>> change the status to secondary and the RA simply loops forever until the timeout.
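>>>>>>
>>>>>> Illustratively (this is not from your logs), with the filesystem still
>>>>>> mounted DRBD refuses to drop to Secondary; the failure looks along
>>>>>> these lines:
>>>>>>
>>>>>> # drbdadm secondary drbd2
>>>>>> drbd2: State change failed: (-12) Device is held open by someone
>>>>>>
>>>>>> Only once /mnt/clusterfs2 is unmounted can the demote succeed.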
>>>>>>
>>>>>>>    * fs_database (ocf::heartbeat:Filesystem):     Started susanne
>>>>>>>    * mysql-server        (ocf::heartbeat:mysql):  Started susanne
>>>>>>>    * Clone Set: ping_fw-clone [ping_fw]:
>>>>>>>      * Started: [ lisbon susanne ]
>>>>>>>
>>>>>>> -------------------------------------------------------------------------------------------
>>>>>>> After a few seconds it switches over:
>>>>>>>
>>>>>>> Cluster Summary:
>>>>>>>    * Stack: corosync
>>>>>>>    * Current DC: lisbon (version 2.0.4-6.el8_3.1-2deceaa3ae) - partition with quorum
>>>>>>>    * Last updated: Tue May 11 16:17:59 2021
>>>>>>>    * Last change:  Tue May 11 16:15:42 2021 by root via cibadmin on susanne
>>>>>>>    * 2 nodes configured
>>>>>>>    * 11 resource instances configured
>>>>>>>
>>>>>>> Node List:
>>>>>>>    * Online: [ lisbon susanne ]
>>>>>>>
>>>>>>> Active Resources:
>>>>>>>    * HA_IP       (ocf::heartbeat:IPaddr2):        Started susanne
>>>>>>>    * Clone Set: database_drbd-clone [database_drbd] (promotable):
>>>>>>>      * Masters: [ susanne ]
>>>>>>>      * Slaves: [ lisbon ]
>>>>>>>    * Clone Set: drbd_logsfiles-clone [drbd_logsfiles] (promotable):
>>>>>>>      * Masters: [ susanne ]
>>>>>>>      * Slaves: [ lisbon ]
>>>>>>>    * fs_logfiles (ocf::heartbeat:Filesystem):     Started susanne
>>>>>>>    * fs_database (ocf::heartbeat:Filesystem):     Started susanne
>>>>>>>    * mysql-server        (ocf::heartbeat:mysql):  Started susanne
>>>>>>>    * Resource Group: apache:
>>>>>>>      * httpd_srv (ocf::heartbeat:apache):         Started susanne
>>>>>>>    * Clone Set: ping_fw-clone [ping_fw]:
>>>>>>>      * Started: [ lisbon susanne ]
>>>>>>>
>>>>>>> Failed Resource Actions:
>>>>>>>    * drbd_logsfiles_demote_0 on susanne 'error' (1): call=736, status='Timed Out', exitreason='', last-rc-change='2021-05-11 16:15:42 +02:00', queued=0ms, exec=90001ms
>>>>>>> ----------------------------------------------------------------------------------------------
>>>>>>>
>>>>>>
>>>>>> And what do you see in the logs?
>>>>>>
>>>>>>> I think it is a constraint-problem, but I can't find it.
>>>>>>> This is my config:
>>>>>>> [root at susanne pacemaker]# pcs config show
>>>>>>> Cluster Name: mysql_cluster
>>>>>>> Corosync Nodes:
>>>>>>>   susanne lisbon
>>>>>>> Pacemaker Nodes:
>>>>>>>   lisbon susanne
>>>>>>>
>>>>>>> Resources:
>>>>>>>   Resource: HA_IP (class=ocf provider=heartbeat type=IPaddr2)
>>>>>>>    Attributes: cidr_netmask=24 ip=192.168.18.154
>>>>>>>    Operations: monitor interval=15s (HA_IP-monitor-interval-15s)
>>>>>>>                start interval=0s timeout=20s (HA_IP-start-interval-0s)
>>>>>>>                stop interval=0s timeout=20s (HA_IP-stop-interval-0s)
>>>>>>>   Clone: database_drbd-clone
>>>>>>>    Meta Attrs: clone-max=2 clone-node-max=1 notify=true promotable=true promoted-max=1 promoted-node-max=1
>>>>>>>    Resource: database_drbd (class=ocf provider=linbit type=drbd)
>>>>>>>     Attributes: drbd_resource=drbd1
>>>>>>>     Operations: demote interval=0s timeout=90 (database_drbd-demote-interval-0s)
>>>>>>>                 monitor interval=20 role=Slave timeout=20 (database_drbd-monitor-interval-20)
>>>>>>>                 monitor interval=10 role=Master timeout=20 (database_drbd-monitor-interval-10)
>>>>>>>                 notify interval=0s timeout=90 (database_drbd-notify-interval-0s)
>>>>>>>                 promote interval=0s timeout=90 (database_drbd-promote-interval-0s)
>>>>>>>                 reload interval=0s timeout=30 (database_drbd-reload-interval-0s)
>>>>>>>                 start interval=0s timeout=240 (database_drbd-start-interval-0s)
>>>>>>>                 stop interval=0s timeout=100 (database_drbd-stop-interval-0s)
>>>>>>>   Clone: drbd_logsfiles-clone
>>>>>>>    Meta Attrs: clone-max=2 clone-node-max=1 notify=true promotable=true promoted-max=1 promoted-node-max=1
>>>>>>>    Resource: drbd_logsfiles (class=ocf provider=linbit type=drbd)
>>>>>>>     Attributes: drbd_resource=drbd2
>>>>>>>     Operations: demote interval=0s timeout=90 (drbd_logsfiles-demote-interval-0s)
>>>>>>>                 monitor interval=20 role=Slave timeout=20 (drbd_logsfiles-monitor-interval-20)
>>>>>>>                 monitor interval=10 role=Master timeout=20 (drbd_logsfiles-monitor-interval-10)
>>>>>>>                 notify interval=0s timeout=90 (drbd_logsfiles-notify-interval-0s)
>>>>>>>                 promote interval=0s timeout=90 (drbd_logsfiles-promote-interval-0s)
>>>>>>>                 reload interval=0s timeout=30 (drbd_logsfiles-reload-interval-0s)
>>>>>>>                 start interval=0s timeout=240 (drbd_logsfiles-start-interval-0s)
>>>>>>>                 stop interval=0s timeout=100 (drbd_logsfiles-stop-interval-0s)
>>>>>>>   Resource: fs_logfiles (class=ocf provider=heartbeat type=Filesystem)
>>>>>>>    Attributes: device=/dev/drbd2 directory=/mnt/clusterfs2 fstype=ext4
>>>>>>>    Operations: monitor interval=20s timeout=40s (fs_logfiles-monitor-interval-20s)
>>>>>>>                start interval=0s timeout=60s (fs_logfiles-start-interval-0s)
>>>>>>>                stop interval=0s timeout=60s (fs_logfiles-stop-interval-0s)
>>>>>>>   Resource: fs_database (class=ocf provider=heartbeat type=Filesystem)
>>>>>>>    Attributes: device=/dev/drbd1 directory=/mnt/clusterfs1 fstype=ext4
>>>>>>>    Operations: monitor interval=20s timeout=40s (fs_database-monitor-interval-20s)
>>>>>>>                start interval=0s timeout=60s (fs_database-start-interval-0s)
>>>>>>>                stop interval=0s timeout=60s (fs_database-stop-interval-0s)
>>>>>>>   Resource: mysql-server (class=ocf provider=heartbeat type=mysql)
>>>>>>>    Attributes: additional_parameters=--bind-address=0.0.0.0 binary=/usr/bin/mysqld_safe config=/etc/my.cnf datadir=/mnt/clusterfs1/mysql pid=/var/lib/mysql/run/mariadb.pid socket=/var/lib/mysql/mysql.sock
>>>>>>>    Operations: demote interval=0s timeout=120s (mysql-server-demote-interval-0s)
>>>>>>>                monitor interval=20s timeout=30s (mysql-server-monitor-interval-20s)
>>>>>>>                notify interval=0s timeout=90s (mysql-server-notify-interval-0s)
>>>>>>>                promote interval=0s timeout=120s (mysql-server-promote-interval-0s)
>>>>>>>                start interval=0s timeout=60s (mysql-server-start-interval-0s)
>>>>>>>                stop interval=0s timeout=60s (mysql-server-stop-interval-0s)
>>>>>>>   Group: apache
>>>>>>>    Resource: httpd_srv (class=ocf provider=heartbeat type=apache)
>>>>>>>     Attributes: configfile=/etc/httpd/conf/httpd.conf statusurl=http://127.0.0.1/server-status
>>>>>>>     Operations: monitor interval=10s timeout=20s (httpd_srv-monitor-interval-10s)
>>>>>>>                 start interval=0s timeout=40s (httpd_srv-start-interval-0s)
>>>>>>>                 stop interval=0s timeout=60s (httpd_srv-stop-interval-0s)
>>>>>>>   Clone: ping_fw-clone
>>>>>>>    Resource: ping_fw (class=ocf provider=pacemaker type=ping)
>>>>>>>     Attributes: dampen=10s host_list=192.168.18.1 multiplier=1000
>>>>>>>     Operations: monitor interval=10s timeout=60s (ping_fw-monitor-interval-10s)
>>>>>>>                 start interval=0s timeout=60s (ping_fw-start-interval-0s)
>>>>>>>                 stop interval=0s timeout=20s (ping_fw-stop-interval-0s)
>>>>>>>
>>>>>>> Stonith Devices:
>>>>>>> Fencing Levels:
>>>>>>>
>>>>>>> Location Constraints:
>>>>>>>    Resource: mysql-server
>>>>>>>      Constraint: location-mysql-server
>>>>>>>        Rule: boolean-op=or score=-INFINITY (id:location-mysql-server-rule)
>>>>>>>          Expression: pingd lt 1 (id:location-mysql-server-rule-expr)
>>>>>>>          Expression: not_defined pingd (id:location-mysql-server-rule-expr-1)
>>>>>>> Ordering Constraints:
>>>>>>>    start mysql-server then start httpd_srv (kind:Mandatory) (id:order-mysql-server-httpd_srv-mandatory)
>>>>>>>    start database_drbd-clone then start drbd_logsfiles-clone (kind:Mandatory) (id:order-database_drbd-clone-drbd_logsfiles-clone-mandatory)
>>>>>>>    start drbd_logsfiles-clone then start fs_database (kind:Mandatory) (id:order-drbd_logsfiles-clone-fs_database-mandatory)
>>>>>>
>>>>>> You need order fs_database after promote operation; and as I just found
>>>>>> pacemaker also does not reverse it correctly and executes fs stop and
>>>>>> drbd demote concurrently. So you need additional order constraint to
>>>>>> first stop fs then demote drbd.
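>>>>>>
>>>>>> For example, something along these lines (a sketch using your resource
>>>>>> names, untested):
>>>>>>
>>>>>> pcs constraint order promote database_drbd-clone then start fs_database
>>>>>> pcs constraint order stop fs_database then demote database_drbd-clone
>>>>>>
>>>>>> plus the analogous pair for drbd_logsfiles-clone / fs_logfiles.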
>>>>>>
>>>>>>>    start fs_database then start fs_logfiles (kind:Mandatory) (id:order-fs_database-fs_logfiles-mandatory)
>>>>>>>    start fs_logfiles then start mysql-server (kind:Mandatory) (id:order-fs_logfiles-mysql-server-mandatory)
>>>>>>> Colocation Constraints:
>>>>>>>    fs_logfiles with drbd_logsfiles-clone (score:INFINITY) (with-rsc-role:Master) (id:colocation-fs_logfiles-drbd_logsfiles-clone-INFINITY)
>>>>>>>    fs_database with database_drbd-clone (score:INFINITY) (with-rsc-role:Master) (id:colocation-fs_database-database_drbd-clone-INFINITY)
>>>>>>>    drbd_logsfiles-clone with database_drbd-clone (score:INFINITY) (rsc-role:Master) (with-rsc-role:Master) (id:colocation-drbd_logsfiles-clone-database_drbd-clone-INFINITY)
>>>>>>>    HA_IP with database_drbd-clone (score:INFINITY) (rsc-role:Started) (with-rsc-role:Master) (id:colocation-HA_IP-database_drbd-clone-INFINITY)
>>>>>>>    mysql-server with fs_database (score:INFINITY) (id:colocation-mysql-server-fs_database-INFINITY)
>>>>>>>    httpd_srv with mysql-server (score:INFINITY) (id:colocation-httpd_srv-mysql-server-INFINITY)
>>>>>>> Ticket Constraints:
>>>>>>>
>>>>>>> Alerts:
>>>>>>>   No alerts defined
>>>>>>>
>>>>>>> Resources Defaults:
>>>>>>>    No defaults set
>>>>>>> Operations Defaults:
>>>>>>>    No defaults set
>>>>>>>
>>>>>>> Cluster Properties:
>>>>>>>   cluster-infrastructure: corosync
>>>>>>>   cluster-name: mysql_cluster
>>>>>>>   dc-version: 2.0.4-6.el8_3.1-2deceaa3ae
>>>>>>>   have-watchdog: false
>>>>>>>   last-lrm-refresh: 1620742514
>>>>>>>   stonith-enabled: FALSE
>>>>>>>
>>>>>>> Tags:
>>>>>>>   No tags defined
>>>>>>>
>>>>>>> Quorum:
>>>>>>>    Options:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Any suggestions are welcome
>>>>>>>
>>>>>>> best regards, stay safe, take care
>>>>>>>
>>>>>>> fatcharly
>>>>>>>
> _______________________________________________
> Manage your subscription:
> https://lists.clusterlabs.org/mailman/listinfo/users
> 
> ClusterLabs home: https://www.clusterlabs.org/
> 



More information about the Users mailing list