[ClusterLabs] Antw: Re: Why Won't Resources Move?
Andrei Borzenkov
arvidjaar at gmail.com
Thu Aug 2 06:10:33 EDT 2018
Sent from my iPhone
> On 2 Aug 2018, at 9:27, Ulrich Windl <Ulrich.Windl at rz.uni-regensburg.de> wrote:
>
> Hi!
>
> I'm not familiar with Red Hat, but is this normal?:
>
>>> corosync: active/disabled
>>> pacemaker: active/disabled
>
Some administrators prefer to start the cluster stack manually rather than at boot, so it may be intentional.
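
To make the "active/disabled" reading concrete, here is a minimal Python sketch that splits each "Daemon Status" line into two flags. It assumes the first word is the current runtime state and the second is whether the service starts at boot; the function name parse_daemon_status is made up for illustration and is not part of pcs.

```python
def parse_daemon_status(lines):
    """Map each daemon name to (running_now, starts_at_boot).

    Assumption: lines look like "corosync: active/disabled", where the
    part before the slash is the runtime state and the part after it is
    the boot-time setting.
    """
    status = {}
    for line in lines:
        name, sep, state = line.strip().partition(":")
        # Skip headers such as "Daemon Status:" and anything without a slash.
        if not sep or "/" not in state:
            continue
        runtime, boot = (part.strip() for part in state.split("/", 1))
        status[name.strip()] = (runtime == "active", boot == "enabled")
    return status


sample = [
    "Daemon Status:",
    "corosync: active/disabled",
    "pacemaker: active/disabled",
    "pcsd: active/enabled",
]

for daemon, (running, at_boot) in parse_daemon_status(sample).items():
    print(f"{daemon}: running={running}, starts at boot={at_boot}")
```

Read this way, corosync and pacemaker are running now but would not come back on their own after a reboot, which fits a setup where the stack is started by hand.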
> Regards,
> Ulrich
>
>>>> Eric Robinson <eric.robinson at psmnv.com> wrote on 02.08.2018 at 03:44 in message
> <MWHPR03MB3296276B6B98EEBE8CF4262EFA2C0 at MWHPR03MB3296.namprd03.prod.outlook.com>
>
>>> -----Original Message-----
>>> From: Users [mailto:users-bounces at clusterlabs.org] On Behalf Of Ken Gaillot
>>> Sent: Wednesday, August 01, 2018 2:17 PM
>>> To: Cluster Labs - All topics related to open-source clustering welcomed
>>> <users at clusterlabs.org>
>>> Subject: Re: [ClusterLabs] Why Won't Resources Move?
>>>
>>>> On Wed, 2018-08-01 at 03:49 +0000, Eric Robinson wrote:
>>>> I have what seems to be a healthy cluster, but I can’t get resources
>>>> to move.
>>>>
>>>> Here’s what’s installed…
>>>>
>>>> [root at 001db01a cluster]# yum list installed|egrep "pacem|coro"
>>>> corosync.x86_64 2.4.3-2.el7_5.1 @updates
>>>> corosynclib.x86_64 2.4.3-2.el7_5.1 @updates
>>>> pacemaker.x86_64 1.1.18-11.el7_5.3 @updates
>>>> pacemaker-cli.x86_64 1.1.18-11.el7_5.3 @updates
>>>> pacemaker-cluster-libs.x86_64 1.1.18-11.el7_5.3 @updates
>>>> pacemaker-libs.x86_64 1.1.18-11.el7_5.3 @updates
>>>>
>>>> Cluster status looks good…
>>>>
>>>> [root at 001db01b cluster]# pcs status
>>>> Cluster name: 001db01ab
>>>> Stack: corosync
>>>> Current DC: 001db01b (version 1.1.18-11.el7_5.3-2b07d5c5a9) - partition with quorum
>>>> Last updated: Wed Aug 1 03:44:47 2018
>>>> Last change: Wed Aug 1 03:22:18 2018 by root via cibadmin on 001db01a
>>>>
>>>> 2 nodes configured
>>>> 11 resources configured
>>>>
>>>> Online: [ 001db01a 001db01b ]
>>>>
>>>> Full list of resources:
>>>>
>>>> p_vip_clust01 (ocf::heartbeat:IPaddr2): Started 001db01b
>>>> p_azip_clust01 (ocf::heartbeat:AZaddr2): Started 001db01b
>>>> Master/Slave Set: ms_drbd0 [p_drbd0]
>>>> Masters: [ 001db01b ]
>>>> Slaves: [ 001db01a ]
>>>> Master/Slave Set: ms_drbd1 [p_drbd1]
>>>> Masters: [ 001db01b ]
>>>> Slaves: [ 001db01a ]
>>>> p_fs_clust01 (ocf::heartbeat:Filesystem): Started 001db01b
>>>> p_fs_clust02 (ocf::heartbeat:Filesystem): Started 001db01b
>>>> p_vip_clust02 (ocf::heartbeat:IPaddr2): Started 001db01b
>>>> p_azip_clust02 (ocf::heartbeat:AZaddr2): Started 001db01b
>>>> p_mysql_001 (lsb:mysql_001): Started 001db01b
>>>>
>>>> Daemon Status:
>>>> corosync: active/disabled
>>>> pacemaker: active/disabled
>>>> pcsd: active/enabled
>>>>
>>>> Constraints look like this…
>>>>
>>>> [root at 001db01b cluster]# pcs constraint
>>>> Location Constraints:
>>>> Ordering Constraints:
>>>> promote ms_drbd0 then start p_fs_clust01 (kind:Mandatory)
>>>> promote ms_drbd1 then start p_fs_clust02 (kind:Mandatory)
>>>> start p_fs_clust01 then start p_vip_clust01 (kind:Mandatory)
>>>> start p_vip_clust01 then start p_azip_clust01 (kind:Mandatory)
>>>> start p_fs_clust02 then start p_vip_clust02 (kind:Mandatory)
>>>> start p_vip_clust02 then start p_azip_clust02 (kind:Mandatory)
>>>> start p_vip_clust01 then start p_mysql_001 (kind:Mandatory)
>>>> Colocation Constraints:
>>>> p_azip_clust01 with p_vip_clust01 (score:INFINITY)
>>>> p_fs_clust01 with ms_drbd0 (score:INFINITY) (with-rsc-role:Master)
>>>> p_fs_clust02 with ms_drbd1 (score:INFINITY) (with-rsc-role:Master)
>>>> p_vip_clust01 with p_fs_clust01 (score:INFINITY)
>>>> p_vip_clust02 with p_fs_clust02 (score:INFINITY)
>>>> p_azip_clust02 with p_vip_clust02 (score:INFINITY)
>>>> p_mysql_001 with p_vip_clust01 (score:INFINITY)
>>>> Ticket Constraints:
>>>>
>>>> But when I issue a move command, nothing at all happens.
>>>>
>>>> I see this in the log on one node…
>>>>
>>>> Aug 01 03:21:57 [16550] 001db01b cib: info: cib_perform_op: ++ /cib/configuration/constraints: <rsc_location id="cli-prefer-ms_drbd0" rsc="ms_drbd0" role="Started" node="001db01a" score="INFINITY"/>
>>>> Aug 01 03:21:57 [16550] 001db01b cib: info: cib_process_request: Completed cib_modify operation for section constraints: OK (rc=0, origin=001db01a/crm_resource/4, version=0.138.0)
>>>> Aug 01 03:21:57 [16555] 001db01b crmd: info: abort_transition_graph: Transition aborted by rsc_location.cli-prefer-ms_drbd0 'create': Configuration change | cib=0.138.0 source=te_update_diff:456 path=/cib/configuration/constraints complete=true
>>>>
>>>> And I see this in the log on the other node…
>>>>
>>>> notice: p_drbd1_monitor_60000:69196:stderr [ Error signing on to the
>>>> CIB service: Transport endpoint is not connected ]
>>>
>>> The message likely came from the resource agent calling crm_attribute to set
>>> a node attribute. That message usually means the cluster isn't running on that
>>> node, so it's highly suspect. The cib might have crashed, which should be in the
>>> log as well. I'd look into that first.
>>
>>
>> I rebooted the server and afterwards I'm still getting tons of these...
>>
>> Aug 2 01:43:40 001db01a drbd(p_drbd1)[18628]: ERROR: ha02_mysql: Called /usr/sbin/crm_master -Q -l reboot -v 10000
>> Aug 2 01:43:40 001db01a drbd(p_drbd0)[18627]: ERROR: ha01_mysql: Called /usr/sbin/crm_master -Q -l reboot -v 10000
>> Aug 2 01:43:40 001db01a drbd(p_drbd0)[18627]: ERROR: ha01_mysql: Exit code 107
>> Aug 2 01:43:40 001db01a drbd(p_drbd1)[18628]: ERROR: ha02_mysql: Exit code 107
>> Aug 2 01:43:40 001db01a drbd(p_drbd0)[18627]: ERROR: ha01_mysql: Command output:
>> Aug 2 01:43:40 001db01a drbd(p_drbd1)[18628]: ERROR: ha02_mysql: Command output:
>> Aug 2 01:43:40 001db01a lrmd[2025]: notice: p_drbd0_monitor_60000:18627:stderr [ Error signing on to the CIB service: Transport endpoint is not connected ]
>> Aug 2 01:43:40 001db01a lrmd[2025]: notice: p_drbd1_monitor_60000:18628:stderr [ Error signing on to the CIB service: Transport endpoint is not connected ]
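
The "Exit code 107" and the "Transport endpoint is not connected" text look like two views of the same error. On Linux, errno 107 is ENOTCONN, and that string is its usual description. The sketch below is only a way to look up a numeric code on the local machine; it assumes the crm_master exit code is a plain errno value, which the log suggests but does not prove, and describe_exit_code is a made-up helper name.

```python
import errno
import os


def describe_exit_code(code):
    """Return (symbolic_name, message) for a numeric errno on this platform.

    Assumption: the exit code reported by the resource agent is an errno
    value. Numbers differ between operating systems, so run this on the
    cluster node itself.
    """
    name = errno.errorcode.get(code, "UNKNOWN")
    return name, os.strerror(code)


name, message = describe_exit_code(107)
print(f"107 -> {name}: {message}")
```

If the lookup on the node reports ENOTCONN, that would be consistent with the suggestion that the tool could not reach the CIB daemon at all, as opposed to being refused by it.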
>>
>>
>>>
>>>>
>>>> Any thoughts?
>>>>
>>>> --Eric
>>> --
>>> Ken Gaillot <kgaillot at redhat.com>
>>> _______________________________________________
>>> Users mailing list: Users at clusterlabs.org
>>> https://lists.clusterlabs.org/mailman/listinfo/users
>>>
>>> Project Home: http://www.clusterlabs.org
>>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>>> Bugs: http://bugs.clusterlabs.org