[ClusterLabs] Antw: Re: Why Won't Resources Move?

Thu Aug 2 12:14:39 UTC 2018

> Hi!
> 
> I'm not familiar with Red Hat, but is this normal?
> 
> > >   corosync: active/disabled
> > >   pacemaker: active/disabled
> 
> Regards,
> Ulrich

That's the default after a new install. I had not enabled them to start automatically yet. 
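
For anyone reading along: in the `Daemon Status` block, `active/disabled` means the service is running now but is not enabled to start at boot. Below is a minimal sketch of a helper that picks those entries out of `pcs status` output. The function name `disabled_daemons` is hypothetical (it is not part of pcs), and the enable commands in the trailing comment are what I would expect to use on EL7, not something taken from this thread.

```shell
# Hypothetical helper: read `pcs status` text on stdin and print the
# name of each daemon whose boot-time state is "disabled".
disabled_daemons() {
    awk '
        /^Daemon Status:/ { in_block = 1; next }   # start of the daemon list
        in_block && /\/disabled[ \t]*$/ {
            name = $1
            sub(/:$/, "", name)                    # "corosync:" -> "corosync"
            print name
        }
    '
}

# Usage on a cluster node (assumes pcs is installed):
#   pcs status | disabled_daemons
# To have the services start at boot, either of these should do it:
#   systemctl enable corosync pacemaker
#   pcs cluster enable --all
```

Whether to enable the services at boot is a design choice; some admins deliberately leave them disabled so that a node that was fenced or crashed does not rejoin the cluster unattended.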

> 
> >>> Eric Robinson <eric.robinson at psmnv.com> wrote on 02.08.2018 at
> >>> 03:44 in message
> <MWHPR03MB3296276B6B98EEBE8CF4262EFA2C0 at MWHPR03MB3296.namprd03.prod.outlook.com>
> 
> >>  -----Original Message-----
> >> From: Users [mailto:users-bounces at clusterlabs.org] On Behalf Of Ken Gaillot
> >> Sent: Wednesday, August 01, 2018 2:17 PM
> >> To: Cluster Labs - All topics related to open-source clustering
> >> welcomed <users at clusterlabs.org>
> >> Subject: Re: [ClusterLabs] Why Won't Resources Move?
> >>
> >> On Wed, 2018-08-01 at 03:49 +0000, Eric Robinson wrote:
> >> > I have what seems to be a healthy cluster, but I can’t get
> >> > resources to move.
> >> >
> >> > Here’s what’s installed…
> >> >
> >> > [root at 001db01a cluster]# yum list installed|egrep "pacem|coro"
> >> > corosync.x86_64                  2.4.3-2.el7_5.1 @updates
> >> > corosynclib.x86_64               2.4.3-2.el7_5.1 @updates
> >> > pacemaker.x86_64                 1.1.18-11.el7_5.3 @updates
> >> > pacemaker-cli.x86_64             1.1.18-11.el7_5.3 @updates
> >> > pacemaker-cluster-libs.x86_64    1.1.18-11.el7_5.3 @updates
> >> > pacemaker-libs.x86_64            1.1.18-11.el7_5.3 @updates
> >> >
> >> > Cluster status looks good…
> >> >
> >> > [root at 001db01b cluster]# pcs status
> >> > Cluster name: 001db01ab
> >> > Stack: corosync
> >> > Current DC: 001db01b (version 1.1.18-11.el7_5.3-2b07d5c5a9) - partition with quorum
> >> > Last updated: Wed Aug  1 03:44:47 2018
> >> > Last change: Wed Aug  1 03:22:18 2018 by root via cibadmin on 001db01a
> >> >
> >> > 2 nodes configured
> >> > 11 resources configured
> >> >
> >> > Online: [ 001db01a 001db01b ]
> >> >
> >> > Full list of resources:
> >> >
> >> > p_vip_clust01  (ocf::heartbeat:IPaddr2):       Started 001db01b
> >> > p_azip_clust01 (ocf::heartbeat:AZaddr2):       Started 001db01b
> >> > Master/Slave Set: ms_drbd0 [p_drbd0]
> >> >      Masters: [ 001db01b ]
> >> >      Slaves: [ 001db01a ]
> >> > Master/Slave Set: ms_drbd1 [p_drbd1]
> >> >      Masters: [ 001db01b ]
> >> >      Slaves: [ 001db01a ]
> >> > p_fs_clust01   (ocf::heartbeat:Filesystem):    Started 001db01b
> >> > p_fs_clust02   (ocf::heartbeat:Filesystem):    Started 001db01b
> >> > p_vip_clust02  (ocf::heartbeat:IPaddr2):       Started 001db01b
> >> > p_azip_clust02 (ocf::heartbeat:AZaddr2):       Started 001db01b
> >> > p_mysql_001    (lsb:mysql_001):        Started 001db01b
> >> >
> >> > Daemon Status:
> >> >   corosync: active/disabled
> >> >   pacemaker: active/disabled
> >> >   pcsd: active/enabled
> >> >
> >> > Constraints look like this…
> >> >
> >> > [root at 001db01b cluster]# pcs constraint
> >> > Location Constraints:
> >> > Ordering Constraints:
> >> >   promote ms_drbd0 then start p_fs_clust01 (kind:Mandatory)
> >> >   promote ms_drbd1 then start p_fs_clust02 (kind:Mandatory)
> >> >   start p_fs_clust01 then start p_vip_clust01 (kind:Mandatory)
> >> >   start p_vip_clust01 then start p_azip_clust01 (kind:Mandatory)
> >> >   start p_fs_clust02 then start p_vip_clust02 (kind:Mandatory)
> >> >   start p_vip_clust02 then start p_azip_clust02 (kind:Mandatory)
> >> >   start p_vip_clust01 then start p_mysql_001 (kind:Mandatory)
> >> > Colocation Constraints:
> >> >   p_azip_clust01 with p_vip_clust01 (score:INFINITY)
> >> >   p_fs_clust01 with ms_drbd0 (score:INFINITY) (with-rsc-role:Master)
> >> >   p_fs_clust02 with ms_drbd1 (score:INFINITY) (with-rsc-role:Master)
> >> >   p_vip_clust01 with p_fs_clust01 (score:INFINITY)
> >> >   p_vip_clust02 with p_fs_clust02 (score:INFINITY)
> >> >   p_azip_clust02 with p_vip_clust02 (score:INFINITY)
> >> >   p_mysql_001 with p_vip_clust01 (score:INFINITY)
> >> > Ticket Constraints:
> >> >
> >> > But when I issue a move command, nothing at all happens.
> >> >
> >> > I see this in the log on one node…
> >> >
> >> > Aug 01 03:21:57 [16550] 001db01b        cib:     info: cib_perform_op:  ++ /cib/configuration/constraints:  <rsc_location id="cli-prefer-ms_drbd0" rsc="ms_drbd0" role="Started" node="001db01a" score="INFINITY"/>
> >> > Aug 01 03:21:57 [16550] 001db01b        cib:     info: cib_process_request:     Completed cib_modify operation for section constraints: OK (rc=0, origin=001db01a/crm_resource/4, version=0.138.0)
> >> > Aug 01 03:21:57 [16555] 001db01b       crmd:     info: abort_transition_graph:  Transition aborted by rsc_location.cli-prefer-ms_drbd0 'create': Configuration change | cib=0.138.0 source=te_update_diff:456 path=/cib/configuration/constraints complete=true
> >> >
> >> > And I see this in the log on the other node…
> >> >
> >> > notice: p_drbd1_monitor_60000:69196:stderr [ Error signing on to
> >> > the CIB service: Transport endpoint is not connected ]
> >>
> >> The message likely came from the resource agent calling crm_attribute to set
> >> a node attribute. That message usually means the cluster isn't running on that
> >> node, so it's highly suspect. The cib might have crashed, which should be in the
> >> log as well. I'd look into that first.
> >
> >
> > I rebooted the server and afterwards I'm still getting tons of these...
> >
> > Aug  2 01:43:40 001db01a drbd(p_drbd1)[18628]: ERROR: ha02_mysql: Called /usr/sbin/crm_master -Q -l reboot -v 10000
> > Aug  2 01:43:40 001db01a drbd(p_drbd0)[18627]: ERROR: ha01_mysql: Called /usr/sbin/crm_master -Q -l reboot -v 10000
> > Aug  2 01:43:40 001db01a drbd(p_drbd0)[18627]: ERROR: ha01_mysql: Exit code 107
> > Aug  2 01:43:40 001db01a drbd(p_drbd1)[18628]: ERROR: ha02_mysql: Exit code 107
> > Aug  2 01:43:40 001db01a drbd(p_drbd0)[18627]: ERROR: ha01_mysql: Command output:
> > Aug  2 01:43:40 001db01a drbd(p_drbd1)[18628]: ERROR: ha02_mysql: Command output:
> > Aug  2 01:43:40 001db01a lrmd[2025]:  notice: p_drbd0_monitor_60000:18627:stderr [ Error signing on to the CIB service: Transport endpoint is not connected ]
> > Aug  2 01:43:40 001db01a lrmd[2025]:  notice: p_drbd1_monitor_60000:18628:stderr [ Error signing on to the CIB service: Transport endpoint is not connected ]
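
A note on the log above: exit code 107 from crm_master lines up with Linux errno 107 (ENOTCONN), whose message text is "Transport endpoint is not connected", so the ERROR lines and the lrmd notices appear to describe the same failed connection to the CIB. As a rough triage aid, the sketch below tallies those sign-on errors per monitor operation from log text on stdin. The function name `cib_signon_errors` and the log path in the usage comment are assumptions for illustration, not anything from this thread.

```shell
# Hypothetical helper: count "Error signing on to the CIB service"
# messages per operation (e.g. p_drbd0_monitor_60000) in syslog-style
# text read from stdin. Each log record must be on a single line.
cib_signon_errors() {
    awk '
        /Error signing on to the CIB service/ {
            for (i = 1; i <= NF; i++) {
                if ($i ~ /:stderr$/) {             # field looks like op:pid:stderr
                    split($i, parts, ":")
                    count[parts[1]]++              # keep only the operation name
                }
            }
        }
        END { for (op in count) print op, count[op] }
    ' | sort
}

# Usage (log path is an assumption; adjust for your syslog setup):
#   cib_signon_errors < /var/log/messages
```

If every monitor operation on a node shows these errors at the same timestamps, that points at the node's connection to the cib daemon and not at any single resource.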
> >
> >
> >>
> >> >
> >> > Any thoughts?
> >> >
> >> > --Eric
> >> --
> >> Ken Gaillot <kgaillot at redhat.com>
> >> _______________________________________________
> >> Users mailing list: Users at clusterlabs.org
> >> https://lists.clusterlabs.org/mailman/listinfo/users
> >>
> >> Project Home: http://www.clusterlabs.org Getting started:
> >> http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> >> Bugs: http://bugs.clusterlabs.org