[ClusterLabs] How can I prevent multiple starts of IPaddr2 in an environment using fence_mpath?

Ken Gaillot kgaillot at redhat.com
Fri Apr 6 14:12:03 UTC 2018


On Fri, 2018-04-06 at 04:30 +0000, 飯田 雄介 wrote:
> Hi, all
> I am testing the environment using fence_mpath with the following
> settings.
> 
> =======
>   Stack: corosync
>   Current DC: x3650f (version 1.1.17-1.el7-b36b869) - partition with quorum
>   Last updated: Fri Apr  6 13:16:20 2018
>   Last change: Thu Mar  1 18:38:02 2018 by root via cibadmin on x3650e
> 
>   2 nodes configured
>   13 resources configured
> 
>   Online: [ x3650e x3650f ]
> 
>   Full list of resources:
> 
>    fenceMpath-x3650e    (stonith:fence_mpath):  Started x3650e
>    fenceMpath-x3650f    (stonith:fence_mpath):  Started x3650f
>    Resource Group: grpPostgreSQLDB
>        prmFsPostgreSQLDB1       (ocf::heartbeat:Filesystem):    Started x3650e
>        prmFsPostgreSQLDB2       (ocf::heartbeat:Filesystem):    Started x3650e
>        prmFsPostgreSQLDB3       (ocf::heartbeat:Filesystem):    Started x3650e
>        prmApPostgreSQLDB        (ocf::heartbeat:pgsql): Started x3650e
>    Resource Group: grpPostgreSQLIP
>        prmIpPostgreSQLDB        (ocf::heartbeat:IPaddr2):       Started x3650e
>    Clone Set: clnDiskd1 [prmDiskd1]
>        Started: [ x3650e x3650f ]
>    Clone Set: clnDiskd2 [prmDiskd2]
>        Started: [ x3650e x3650f ]
>    Clone Set: clnPing [prmPing]
>        Started: [ x3650e x3650f ]
> =======
> 
> When a split-brain occurs in this environment, x3650f executes fencing
> and the resources are started on x3650f.
> 
> === view of x3650e ====
>   Stack: corosync
>   Current DC: x3650e (version 1.1.17-1.el7-b36b869) - partition WITHOUT quorum
>   Last updated: Fri Apr  6 13:16:36 2018
>   Last change: Thu Mar  1 18:38:02 2018 by root via cibadmin on x3650e
> 
>   2 nodes configured
>   13 resources configured
> 
>   Node x3650f: UNCLEAN (offline)
>   Online: [ x3650e ]
> 
>   Full list of resources:
> 
>    fenceMpath-x3650e    (stonith:fence_mpath):  Started x3650e
>    fenceMpath-x3650f    (stonith:fence_mpath):  Started[ x3650e x3650f ]
>    Resource Group: grpPostgreSQLDB
>        prmFsPostgreSQLDB1       (ocf::heartbeat:Filesystem):    Started x3650e
>        prmFsPostgreSQLDB2       (ocf::heartbeat:Filesystem):    Started x3650e
>        prmFsPostgreSQLDB3       (ocf::heartbeat:Filesystem):    Started x3650e
>        prmApPostgreSQLDB        (ocf::heartbeat:pgsql): Started x3650e
>    Resource Group: grpPostgreSQLIP
>        prmIpPostgreSQLDB        (ocf::heartbeat:IPaddr2):       Started x3650e
>    Clone Set: clnDiskd1 [prmDiskd1]
>        prmDiskd1        (ocf::pacemaker:diskd): Started x3650f (UNCLEAN)
>        Started: [ x3650e ]
>    Clone Set: clnDiskd2 [prmDiskd2]
>        prmDiskd2        (ocf::pacemaker:diskd): Started x3650f (UNCLEAN)
>        Started: [ x3650e ]
>    Clone Set: clnPing [prmPing]
>        prmPing  (ocf::pacemaker:ping):  Started x3650f (UNCLEAN)
>        Started: [ x3650e ]
> 
> === view of x3650f ====
>   Stack: corosync
>   Current DC: x3650f (version 1.1.17-1.el7-b36b869) - partition WITHOUT quorum
>   Last updated: Fri Apr  6 13:16:36 2018
>   Last change: Thu Mar  1 18:38:02 2018 by root via cibadmin on x3650e
> 
>   2 nodes configured
>   13 resources configured
> 
>   Online: [ x3650f ]
>   OFFLINE: [ x3650e ]
> 
>   Full list of resources:
> 
>    fenceMpath-x3650e    (stonith:fence_mpath):  Started x3650f
>    fenceMpath-x3650f    (stonith:fence_mpath):  Started x3650f
>    Resource Group: grpPostgreSQLDB
>        prmFsPostgreSQLDB1       (ocf::heartbeat:Filesystem):    Started x3650f
>        prmFsPostgreSQLDB2       (ocf::heartbeat:Filesystem):    Started x3650f
>        prmFsPostgreSQLDB3       (ocf::heartbeat:Filesystem):    Started x3650f
>        prmApPostgreSQLDB        (ocf::heartbeat:pgsql): Started x3650f
>    Resource Group: grpPostgreSQLIP
>        prmIpPostgreSQLDB        (ocf::heartbeat:IPaddr2):       Started x3650f
>    Clone Set: clnDiskd1 [prmDiskd1]
>        Started: [ x3650f ]
>        Stopped: [ x3650e ]
>    Clone Set: clnDiskd2 [prmDiskd2]
>        Started: [ x3650f ]
>        Stopped: [ x3650e ]
>    Clone Set: clnPing [prmPing]
>        Started: [ x3650f ]
>        Stopped: [ x3650e ]
> =======
> 
> However, IPaddr2 on x3650e does not stop until a pgsql monitor error
> occurs.
> During that time, IPaddr2 is temporarily running on both nodes.
> 
> === view of after pgsql monitor error ===
>   Stack: corosync
>   Current DC: x3650e (version 1.1.17-1.el7-b36b869) - partition WITHOUT quorum
>   Last updated: Fri Apr  6 13:16:56 2018
>   Last change: Thu Mar  1 18:38:02 2018 by root via cibadmin on x3650e
> 
>   2 nodes configured
>   13 resources configured
> 
>   Node x3650f: UNCLEAN (offline)
>   Online: [ x3650e ]
> 
>   Full list of resources:
> 
>    fenceMpath-x3650e    (stonith:fence_mpath):  Started x3650e
>    fenceMpath-x3650f    (stonith:fence_mpath):  Started[ x3650e x3650f ]
>    Resource Group: grpPostgreSQLDB
>        prmFsPostgreSQLDB1       (ocf::heartbeat:Filesystem):    Started x3650e
>        prmFsPostgreSQLDB2       (ocf::heartbeat:Filesystem):    Started x3650e
>        prmFsPostgreSQLDB3       (ocf::heartbeat:Filesystem):    Started x3650e
>        prmApPostgreSQLDB        (ocf::heartbeat:pgsql): Stopped
>    Resource Group: grpPostgreSQLIP
>        prmIpPostgreSQLDB        (ocf::heartbeat:IPaddr2):       Stopped
>    Clone Set: clnDiskd1 [prmDiskd1]
>        prmDiskd1        (ocf::pacemaker:diskd): Started x3650f (UNCLEAN)
>        Started: [ x3650e ]
>    Clone Set: clnDiskd2 [prmDiskd2]
>        prmDiskd2        (ocf::pacemaker:diskd): Started x3650f (UNCLEAN)
>        Started: [ x3650e ]
>    Clone Set: clnPing [prmPing]
>        prmPing  (ocf::pacemaker:ping):  Started x3650f (UNCLEAN)
>        Started: [ x3650e ]
> 
>   Node Attributes:
>   * Node x3650e:
>       + default_ping_set                        : 100
>       + diskcheck_status                        : normal
>       + diskcheck_status_internal               : normal
> 
>   Migration Summary:
>   * Node x3650e:
>      prmApPostgreSQLDB: migration-threshold=1 fail-count=1 last-failure='Fri Apr  6 13:16:39 2018'
> 
>   Failed Actions:
>   * prmApPostgreSQLDB_monitor_10000 on x3650e 'not running' (7): call=60, status=complete, exitreason='Configuration file /dbfp/pgdata/data/postgresql.conf doesn't exist',
>       last-rc-change='Fri Apr  6 13:16:39 2018', queued=0ms, exec=0ms
> ======
> 
> We regard this behavior as a problem.
> Is there a way to avoid it?
> 
> Regards, Yusuke

Hi Yusuke,

One possibility would be to implement network fabric fencing as well,
e.g. fence_snmp with an SNMP-capable network switch. You can make a
fencing topology level with both the storage and network devices.
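
As a rough sketch with pcs, assuming something like fence_ifmib as the
SNMP-based switch agent: the fenceSwitch-* devices, the switch address,
community and port names below are all placeholders you would adapt to
your hardware, while the fenceMpath-* names are the ones from your
status output:

  # Hypothetical switch-port fence device per node (agent and options
  # are examples only; fence_ifmib can administratively disable a port
  # via SNMP)
  pcs stonith create fenceSwitch-x3650e fence_ifmib \
      ipaddr=switch.example.com community=private port=Gi1/0/10 \
      pcmk_host_list=x3650e
  pcs stonith create fenceSwitch-x3650f fence_ifmib \
      ipaddr=switch.example.com community=private port=Gi1/0/11 \
      pcmk_host_list=x3650f

  # Put the storage and network devices in the same topology level, so
  # fencing a node is only considered successful once both its mpath
  # key has been removed and its switch port has been cut off
  pcs stonith level add 1 x3650e fenceMpath-x3650e,fenceSwitch-x3650e
  pcs stonith level add 1 x3650f fenceMpath-x3650f,fenceSwitch-x3650f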

The main drawback is that unfencing isn't automatic. After a fenced
node is ready to rejoin, you have to clear the block at the switch
yourself.
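
For example, if the port was disabled by an SNMP-based agent such as
fence_ifmib, re-enabling it by hand might look something like this
(again, the agent, switch address, community and port name are only
placeholders for your environment):

  # Turn the node's switch port back on once it is safe to rejoin
  fence_ifmib --ip=switch.example.com --community=private \
      --plug=Gi1/0/10 --action=on
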
-- 
Ken Gaillot <kgaillot at redhat.com>

