[ClusterLabs] Antw: [EXT] Re: WebSite_start_0 on node2 'error' (1): call=6, status='complete', exitreason='Failed to access httpd status page.'

Tue Mar 23 11:40:42 EDT 2021

>>> Jason Long <hack3rcon at yahoo.com> wrote on 23.03.2021 at 16:07 in
message
<848851435.2725012.1616512048076 at mail.yahoo.com>:
> Thank you.
> So where must I define my node2 IP address? When node1 is disconnected, I
> want node2 to replace it.

As most people (including me) tried to explain: If you want that, you need
another product, not pacemaker.

> 
> On Tuesday, March 23, 2021, 01:03:39 PM GMT+4:30, Klaus Wenninger <kwenning at redhat.com> wrote:
>
> On 3/23/21 9:13 AM, Jason Long wrote:
>> Thank you.
>> But:
>> https://www.clusterlabs.org/pacemaker/doc/en-US/Pacemaker/1.1/html/Clusters_from_Scratch/ch06.html
>> ?
>>
>> The floating IP address is:
>> https://www.clusterlabs.org/pacemaker/doc/en-US/Pacemaker/1.1/html/Clusters_from_Scratch/_add_a_resource.html
>> In the "Warning" it is written: "The chosen address must not already be in
>> use on the network. Do not reuse an IP address one of the nodes already has
>> configured." What does it mean?
> It means that if you used an IP that is already in use
> on your network - by one of your cluster nodes or something else -
> pacemaker would possibly activate that IP and you would have
> a duplicate IP in your network.
> Thus for the question below: Don't use the IP of node2 for
> your floating IP.
> 
> Klaus
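
A minimal sketch of the rule Klaus describes. The node addresses 192.168.122.101 and 192.168.122.102 are made up for illustration (only 192.168.122.120 appears in this thread), and ip_is_free is a hypothetical helper name. It only compares the candidate against the addresses you list, so it cannot detect some other host on the network that already uses the address.

```shell
#!/bin/sh
# Hypothetical helper: succeeds (exit 0) when the candidate address ($1)
# is not equal to any of the addresses listed after it.
ip_is_free() {
    candidate=$1
    shift
    for used in "$@"; do
        if [ "$candidate" = "$used" ]; then
            return 1
        fi
    done
    return 0
}

# Assumed example addresses; replace with the real node addresses.
NODE1_IP=192.168.122.101
NODE2_IP=192.168.122.102
FLOATING_IP=192.168.122.120

if ip_is_free "$FLOATING_IP" "$NODE1_IP" "$NODE2_IP"; then
    echo "ok: $FLOATING_IP is not one of the node addresses"
    # Only then would the command from this thread make sense:
    # pcs resource create ClusterIP ocf:heartbeat:IPaddr2 ip=$FLOATING_IP cidr_netmask=32 op monitor interval=30s
else
    echo "conflict: $FLOATING_IP is already a node address"
fi
```

The floating IP is a third address that the cluster moves between the nodes; it is never the permanent address of node1 or node2.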
> 
>>
>> In the command below, is "ip" the IP address of my node2?
>> # pcs resource create ClusterIP ocf:heartbeat:IPaddr2 ip=192.168.122.120 cidr_netmask=32 op monitor interval=30s
>>
>> If yes, must I then update it with the command below?
>>
>> # pcs resource update floating_ip ocf:heartbeat:IPaddr2 ip="Node2 IP" cidr_netmask=32 op monitor interval=30s
>>
>> On Tuesday, March 23, 2021, 12:02:15 AM GMT+4:30, Ken Gaillot <kgaillot at redhat.com> wrote:
>>
>> On Mon, 2021-03-22 at 17:31 +0000, Jason Long wrote:
>>> Thank you.
>>> From chapter 1 to 6, I never saw anything about configuring the
>>> floating IP address! Am I wrong?
>> Hi,
>>
>> Chapter 6 should be "Create an Active/Passive Cluster", which adds a
>> floating IP, then Chapter 7 is "Add Apache HTTP Server as a Cluster
>> Service".
>>
>>
>>
>>> On Monday, March 22, 2021, 07:06:47 PM GMT+4:30, Ken Gaillot <
>>> kgaillot at redhat.com> wrote:
>>>
>>> On Mon, 2021-03-22 at 08:15 +0000, Jason Long wrote:
>>>> Thank you.
>>>>
>>>> My test lab uses VirtualBox with two VMs as below:
>>>> VM1: This VM has two NICs (NAT, Host-only Adapter)
>>>> VM2: This VM has one NIC (Host-only Adapter)
>>>>
>>>> On VM1, I use the NAT interface for port forwarding:
>>>> "127.0.0.1:2080" on the host forwards to "127.0.0.1:80" on the guest.
>>>>
>>>>
>>>> Yes, "systemctl" tells me:
>>>>
>>>> # systemctl is-enabled httpd.service
>>>> disabled
>>>>
>>>> I rebooted my nodes and one of the problems was solved:
>>>> https://paste.ubuntu.com/p/7cQQtsXFPV/ 
>>>>
>>>> I did:
>>>> # pcs resource defaults resource-stickiness=100
>>>>
>>>>
>>>> When I browse "127.0.0.1:2080" then it shows me "My Test Site -
>>>> node1".
>>>>
>>>> I have two problems:
>>>>
>>>> 1- When I stop the node1 VM and refresh the page, why can't I see
>>>> "My Test Site - node2"?
>>>>
>>>> # pcs cluster stop node1
>>>> node1: Stopping Cluster (pacemaker)...
>>>> node1: Stopping Cluster (corosync)...
>>>>
>>>> # pcs status
>>>> Error: error running crm_mon, is pacemaker running?
>>>> Could not connect to the CIB: Transport endpoint is not connected
>>>> crm_mon: Error: cluster is not available on this node
>>> Hi,
>>>
>>> pcs status doesn't test the web site, it shows the internal cluster
>>> status. Since the cluster isn't running on that node, it can't show
>>> anything.
>>>
>>> However the website is still active on the other node, and reachable
>>> from this node. You can confirm that by using wget or curl with the
>>> public web site URL (the floating IP address).
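
Ken's suggestion could look roughly like the sketch below. The address 192.168.122.120 is the example floating IP from the pcs command quoted in this thread; site_url and check_site are hypothetical helper names, and the 5-second timeout is an arbitrary choice.

```shell
#!/bin/sh
# Floating IP of the ClusterIP resource (example value from this thread).
FLOATING_IP=${FLOATING_IP:-192.168.122.120}

# Hypothetical helper: build the public URL for a given address.
site_url() {
    printf 'http://%s/\n' "$1"
}

# Hypothetical helper: fetch the page through the floating IP.
# --fail makes curl return non-zero on HTTP errors,
# --max-time stops it from hanging if nothing answers.
check_site() {
    curl --silent --show-error --fail --max-time 5 "$(site_url "$1")"
}

# Usage, run from either node or from any host that can reach the address:
#   check_site "$FLOATING_IP"
```

Because the request goes to the floating IP and not to a node's own address, it reaches whichever node currently runs ClusterIP and WebSite.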
>>>
>>>> # pcs resource defaults
>>>> Error: unable to get cib
>>>>
>>>>
>>>> I think it should forward my requests from node1 to node2
>>>> automatically, so that I see the "My Test Site - node2" message.
>>>>
>>>>
>>>> 2- I started node1 again, but when I browse to "IP:80", I can't
>>>> see the "My Test Site - node1" message.
>>>>
>>>> # pcs cluster start node1
>>>> node1: Starting Cluster...
>>>>
>>>>
>>>> # pcs status
>>>> Cluster name: mycluster
>>>> Cluster Summary:
>>>>    * Stack: corosync
>>>>    * Current DC: node2 (version 2.0.5-10.fc33-ba59be7122) - partition with quorum
>>>>    * Last updated: Mon Mar 22 12:26:10 2021
>>>>    * Last change:  Mon Mar 22 12:08:02 2021 by root via cibadmin on node1
>>>>    * 2 nodes configured
>>>>    * 2 resource instances configured
>>>>
>>>> Node List:
>>>>    * Online: [ node1 node2 ]
>>>>
>>>> Full List of Resources:
>>>>    * WebSite    (ocf::heartbeat:apache):    Started node2
>>>>    * ClusterIP    (ocf::heartbeat:IPaddr2):    Started node2
>>>>
>>>> Daemon Status:
>>>>    corosync: active/enabled
>>>>    pacemaker: active/enabled
>>>>    pcsd: active/enabled
>>>>
>>>>
>>>>
>>>> Logs are:
>>>> https://paste.ubuntu.com/p/Yt4K2kPM7b/ 
>>>>
>>>>
>>>> Thank you again.
>>>>
>>>>
>>>> On Monday, March 22, 2021, 01:12:21 AM GMT+4:30, Reid Wahl <
>>>> nwahl at redhat.com> wrote:
>>>>
>>>> Hi, Jason.
>>>>
>>>> On Sun, Mar 21, 2021 at 5:21 AM Jason Long <hack3rcon at yahoo.com>
>>>> wrote:
>>>>> Hello,
>>>>> I used "Clusters from Scratch" to configure two nodes. I got the
>>>>> error below:
>>>>>
>>>>> # pcs status
>>>>> Cluster name: mycluster
>>>>> Cluster Summary:
>>>>>    * Stack: corosync
>>>>>    * Current DC: node1 (version 2.0.5-10.fc33-ba59be7122) - partition with quorum
>>>>>    * Last updated: Sun Mar 21 15:35:18 2021
>>>>>    * Last change:  Sun Mar 21 15:29:38 2021 by root via cibadmin on node1
>>>>>    * 2 nodes configured
>>>>>    * 2 resource instances configured
>>>>>
>>>>> Node List:
>>>>>    * Online: [ node1 node2 ]
>>>>>
>>>>> Full List of Resources:
>>>>>    * WebSite    (ocf::heartbeat:apache):    Stopped
>>>>>    * ClusterIP    (ocf::heartbeat:IPaddr2):    Started node1
>>>>>
>>>>> Failed Resource Actions:
>>>>>    * WebSite_start_0 on node1 'error' (1): call=6, status='complete', exitreason='Failed to access httpd status page.', last-rc-change='2021-03-21 15:23:45 +03:30', queued=0ms, exec=1318ms
>>>>>    * WebSite_start_0 on node2 'error' (1): call=6, status='complete', exitreason='Failed to access httpd status page.', last-rc-change='2021-03-21 15:23:47 +03:30', queued=0ms, exec=1380ms
>>>>>
>>>>> Daemon Status:
>>>>>    corosync: active/enabled
>>>>>    pacemaker: active/enabled
>>>>>    pcsd: active/enabled
>>>>>
>>>>>
>>>>> *********
>>>>> I have some questions:
>>>>>
>>>>> 1- In "Chapter 6. Add Apache HTTP Server as a Cluster Service", an
>>>>> important note says:
>>>>> "Do not enable the httpd service. Services that are intended to be
>>>>> managed via the cluster software should never be managed by the OS.
>>>>> It is often useful, however, to manually start the service, verify
>>>>> that it works, then stop it again, before adding it to the cluster.
>>>>> This allows you to resolve any non-cluster-related problems before
>>>>> continuing. Since this is a simple example, we’ll skip that step
>>>>> here."
>>>>>
>>>>> If the Apache service is not enabled, then how can I connect to it
>>>>> via the command below:
>>>>>    
>>>>> # wget -O - http://localhost/server-status 
>>>>> --2021-03-21 15:38:39--  http://localhost/server-status 
>>>>> Resolving localhost (localhost)... 127.0.0.1, ::1
>>>>> Connecting to localhost (localhost)|127.0.0.1|:80... failed:
>>>>> Connection timed out.
>>>>> Connecting to localhost (localhost)|::1|:80... failed: Network is
>>>>> unreachable.
>>>> Pacemaker starts the httpd service by starting the
>>>> ocf:heartbeat:apache resource. The article is saying that the
>>>> httpd.service systemd unit should not be enabled to start
>>>> automatically at boot; it should only start when the cluster starts
>>>> it. That is, `systemctl is-enabled httpd.service` should print
>>>> "disabled".
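
As a small sketch of that check, to be run on each node: httpd_left_to_cluster is a hypothetical helper name, and the only real command used is the `systemctl is-enabled httpd.service` call mentioned above.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when httpd.service is NOT enabled at boot,
# i.e. starting it is left entirely to the cluster.
httpd_left_to_cluster() {
    # "is-enabled" exits non-zero for a disabled unit, so ignore the
    # exit status and compare the printed state instead.
    state=$(systemctl is-enabled httpd.service 2>/dev/null || true)
    [ "$state" = "disabled" ]
}

# Usage on each node:
#   if httpd_left_to_cluster; then
#       echo "httpd is only started by the cluster"
#   else
#       echo "httpd.service is enabled at boot; run: systemctl disable httpd.service"
#   fi
```

"Disabled" here only concerns automatic start at boot; the apache resource agent can still start httpd when the cluster decides to run WebSite on that node.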
>>>>
>>>>>    
>>>>>
>>>>> 2- Must the commands below be run on both nodes or on just one node?
>>>>>
>>>>> # pcs resource create ClusterIP ocf:heartbeat:IPaddr2
>>>>> ip="IP_That_Never_Used_In_The_Network" cidr_netmask=32 op monitor
>>>>> interval=30s
>>>>>
>>>>> # pcs resource create WebSite ocf:heartbeat:apache
>>>>> configfile=/etc/httpd/conf/httpd.conf
>>>>> statusurl="http://localhost/server-status" op monitor interval=20s
>>>> Just one node.
>>>>
>>>>>    
>>>>>
>>>>> 3- Why "* WebSite    (ocf::heartbeat:apache):    Stopped" ?
>>>> The apache resource agent ran a command similar to
>>>> `wget -O- -q -L --no-proxy --bind-address=127.0.0.1 <status_url>`
>>>> and got an error. It tried this on a start operation on each node,
>>>> and it failed on both nodes. When a resource fails to start on a
>>>> given node, the default response is to prevent it from starting on
>>>> that node again until the failure is cleared.
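
A rough sketch of how one might act on this, assuming httpd is running on the node while you test. The wget options are copied from the command quoted above, which is only "similar to" what the agent runs, not the agent's exact code. probe_status_page and retry_website are hypothetical helper names; `pcs resource cleanup WebSite` is the pcs command that clears the recorded failures for that resource.

```shell
#!/bin/sh
# Status URL configured for the WebSite resource in this thread.
STATUS_URL=${STATUS_URL:-http://localhost/server-status}

# Hypothetical helper: repeat a probe like the one described above.
# It only tells us whether the status page answers; the body is discarded.
probe_status_page() {
    wget -O- -q -L --no-proxy --bind-address=127.0.0.1 "$1" >/dev/null
}

# Hypothetical helper: clear the recorded start failures only once the
# status page is reachable, so the cluster does not fail in the same way.
retry_website() {
    if probe_status_page "$STATUS_URL"; then
        pcs resource cleanup WebSite
    else
        echo "status page still unreachable: $STATUS_URL" >&2
        return 1
    fi
}

# Usage on a node, after fixing the httpd configuration:
#   retry_website
```

Clearing the failure before the status page works would just let the start fail again on both nodes.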
>>>>
>>>>
>>>>
>>>>>    
>>>>> Logs are:
>>>>> https://paste.ubuntu.com/p/MtkfXyRX4P/ 
>>>>>
>>>>>
>>>>> Thank you.
>>>>>
>>>>> _______________________________________________
>>>>> Manage your subscription:
>>>>> https://lists.clusterlabs.org/mailman/listinfo/users 
>>>>>
>>>>> ClusterLabs home: https://www.clusterlabs.org/ 
>>>>>
>>>>