[ClusterLabs] WebSite_start_0 on node2 'error' (1): call=6, status='complete', exitreason='Failed to access httpd status page.'

Tue Mar 23 11:29:50 EDT 2021

On 3/23/21 4:07 PM, Jason Long wrote:
> Thank you.
> So where must I define my node2 IP address? When node1 is disconnected, I want node2 to replace it.
>
You just need a single IP address, which you assign to the virtual IP
resource. Pacemaker will move that IP address, along with the web proxy,
between the two nodes.
Of course node1 and node2 have their own IP addresses that are used for
cluster communication, but those are totally independent of the IP address
your web proxy is reachable at (though in a simple setup they may well be
in the same subnet).
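
To make that concrete, here is a minimal shell sketch, not something taken
from the Pacemaker docs. The helper name vip_is_distinct and the two node
addresses are made up for illustration; 192.168.122.120 is the floating IP
from the command quoted further down. It only checks that the floating IP is
not one of the node addresses, and it prints the pcs command instead of
running it.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the candidate floating IP
# differs from every node address passed after it.
vip_is_distinct() {
    candidate=$1
    shift
    for node_ip in "$@"; do
        if [ "$candidate" = "$node_ip" ]; then
            return 1
        fi
    done
    return 0
}

# Node addresses are assumptions for this sketch, not values from the thread.
NODE1_IP=192.168.122.101
NODE2_IP=192.168.122.102
# Floating IP as used in the pcs command quoted in this thread.
VIP=192.168.122.120

if vip_is_distinct "$VIP" "$NODE1_IP" "$NODE2_IP"; then
    # Printed, not executed, so the sketch is safe to try on any machine.
    echo "pcs resource create ClusterIP ocf:heartbeat:IPaddr2 ip=$VIP cidr_netmask=32 op monitor interval=30s"
else
    echo "refusing: $VIP is already a node address" >&2
fi
```

A string comparison like this cannot tell whether some other host on the
network already uses the address, so that still has to be checked separately.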

Klaus
>
>
>
>
> On Tuesday, March 23, 2021, 01:03:39 PM GMT+4:30, Klaus Wenninger <kwenning at redhat.com> wrote:
>
>
>
>
>
> On 3/23/21 9:13 AM, Jason Long wrote:
>> Thank you.
>> But: https://www.clusterlabs.org/pacemaker/doc/en-US/Pacemaker/1.1/html/Clusters_from_Scratch/ch06.html
>> ?
>>
>> The floating IP address is: https://www.clusterlabs.org/pacemaker/doc/en-US/Pacemaker/1.1/html/Clusters_from_Scratch/_add_a_resource.html
>> The "Warning" there says: "The chosen address must not already be in use on the network. Do not reuse an IP address one of the nodes already has configured." What does that mean?
> It means that if you would be using an IP that is already in use
> on your network - by one of your cluster-nodes or something else -
> pacemaker would possibly activate that IP and you would have
> a duplicate IP in your network.
> Thus for the question below: Don't use the IP of node2 for
> your floating IP.
>
> Klaus
>
>> In the command below, is "ip" the IP address of my node2?
>> # pcs resource create ClusterIP ocf:heartbeat:IPaddr2 ip=192.168.122.120 cidr_netmask=32 op monitor interval=30s
>>
>> If yes, must I then update it with the command below?
>>
>> # pcs resource update floating_ip ocf:heartbeat:IPaddr2 ip="Node2 IP" cidr_netmask=32 op monitor interval=30s
>>
>>
>>
>>
>>
>>
>> On Tuesday, March 23, 2021, 12:02:15 AM GMT+4:30, Ken Gaillot <kgaillot at redhat.com> wrote:
>>
>>
>>
>>
>>
>> On Mon, 2021-03-22 at 17:31 +0000, Jason Long wrote:
>>> Thank you.
>>> From chapters 1 to 6, I never saw anything about configuring the
>>> floating IP address! Am I wrong?
>> Hi,
>>
>> Chapter 6 should be "Create an Active/Passive Cluster", which adds a
>> floating IP, then Chapter 7 is "Add Apache HTTP Server as a Cluster
>> Service".
>>
>>
>>
>>> On Monday, March 22, 2021, 07:06:47 PM GMT+4:30, Ken Gaillot <
>>> kgaillot at redhat.com> wrote:
>>>
>>>
>>>
>>>
>>>
>>> On Mon, 2021-03-22 at 08:15 +0000, Jason Long wrote:
>>>> Thank you.
>>>>
>>>> My test lab uses VirtualBox with two VMs as below:
>>>> VM1: This VM has two NICs (NAT, Host-only Adapter)
>>>> VM2: This VM has one NIC (Host-only Adapter)
>>>>
>>>> On VM1, I use the NAT interface for the port forwarding:
>>>> "127.0.0.1:2080" on Host  FORWARDING TO 127.0.0.1:80 on Guest.
>>>>
>>>>
>>>> Yes, "systemctl" tells me:
>>>>
>>>> # systemctl is-enabled httpd.service
>>>> disabled
>>>>
>>>> I rebooted my nodes and one of the problems was solved:
>>>> https://paste.ubuntu.com/p/7cQQtsXFPV/
>>>>
>>>> I did:
>>>> # pcs resource defaults resource-stickiness=100
>>>>
>>>>
>>>> When I browse "127.0.0.1:2080" then it shows me "My Test Site -
>>>> node1".
>>>>
>>>> I have two problems:
>>>>
>>>> 1- When I stop the node1 VM and refresh the page, I can't see
>>>> "My Test Site - node2".
>>>>
>>>> # pcs cluster stop node1
>>>> node1: Stopping Cluster (pacemaker)...
>>>> node1: Stopping Cluster (corosync)...
>>>>
>>>> # pcs status
>>>> Error: error running crm_mon, is pacemaker running?
>>>> Could not connect to the CIB: Transport endpoint is not connected
>>>> crm_mon: Error: cluster is not available on this node
>>> Hi,
>>>
>>> pcs status doesn't test the web site, it shows the internal cluster
>>> status. Since the cluster isn't running on that node, it can't show
>>> anything.
>>>
>>> However the website is still active on the other node, and reachable
>>> from this node. You can confirm that by using wget or curl with the
>>> public web site URL (the floating IP address).
>>>
>>>> # pcs resource defaults
>>>> Error: unable to get cib
>>>>
>>>>
>>>> I think that it should forward my requests from node1 to node2
>>>> automatically, and I should see the "My Test Site - node2" message.
>>>>
>>>>
>>>> 2- I started node1 again, but when I browse "IP:80", I can't
>>>> see the "My Test Site - node1" message.
>>>>
>>>> # pcs cluster start node1
>>>> node1: Starting Cluster...
>>>>
>>>>
>>>> # pcs status
>>>> Cluster name: mycluster
>>>> Cluster Summary:
>>>>      * Stack: corosync
>>>>      * Current DC: node2 (version 2.0.5-10.fc33-ba59be7122) - partition with quorum
>>>>      * Last updated: Mon Mar 22 12:26:10 2021
>>>>      * Last change:  Mon Mar 22 12:08:02 2021 by root via cibadmin on node1
>>>>      * 2 nodes configured
>>>>      * 2 resource instances configured
>>>>
>>>> Node List:
>>>>      * Online: [ node1 node2 ]
>>>>
>>>> Full List of Resources:
>>>>      * WebSite    (ocf::heartbeat:apache):    Started node2
>>>>      * ClusterIP    (ocf::heartbeat:IPaddr2):    Started node2
>>>>
>>>> Daemon Status:
>>>>      corosync: active/enabled
>>>>      pacemaker: active/enabled
>>>>      pcsd: active/enabled
>>>>
>>>>
>>>>
>>>> Logs are:
>>>> https://paste.ubuntu.com/p/Yt4K2kPM7b/
>>>>
>>>>
>>>> Thank you again.
>>>>
>>>>
>>>> On Monday, March 22, 2021, 01:12:21 AM GMT+4:30, Reid Wahl <
>>>> nwahl at redhat.com> wrote:
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> Hi, Jason.
>>>>
>>>> On Sun, Mar 21, 2021 at 5:21 AM Jason Long <hack3rcon at yahoo.com>
>>>> wrote:
>>>>> Hello,
>>>>> I used "Clusters from Scratch" to configure two nodes. I got the
>>>>> error below:
>>>>>
>>>>> # pcs status
>>>>> Cluster name: mycluster
>>>>> Cluster Summary:
>>>>>      * Stack: corosync
>>>>>      * Current DC: node1 (version 2.0.5-10.fc33-ba59be7122) - partition with quorum
>>>>>      * Last updated: Sun Mar 21 15:35:18 2021
>>>>>      * Last change:  Sun Mar 21 15:29:38 2021 by root via cibadmin on node1
>>>>>      * 2 nodes configured
>>>>>      * 2 resource instances configured
>>>>>
>>>>> Node List:
>>>>>      * Online: [ node1 node2 ]
>>>>>
>>>>> Full List of Resources:
>>>>>      * WebSite    (ocf::heartbeat:apache):    Stopped
>>>>>      * ClusterIP    (ocf::heartbeat:IPaddr2):    Started node1
>>>>>
>>>>> Failed Resource Actions:
>>>>>      * WebSite_start_0 on node1 'error' (1): call=6, status='complete', exitreason='Failed to access httpd status page.', last-rc-change='2021-03-21 15:23:45 +03:30', queued=0ms, exec=1318ms
>>>>>      * WebSite_start_0 on node2 'error' (1): call=6, status='complete', exitreason='Failed to access httpd status page.', last-rc-change='2021-03-21 15:23:47 +03:30', queued=0ms, exec=1380ms
>>>>>
>>>>> Daemon Status:
>>>>>      corosync: active/enabled
>>>>>      pacemaker: active/enabled
>>>>>      pcsd: active/enabled
>>>>>
>>>>>
>>>>> *********
>>>>> I have some questions:
>>>>>
>>>>> 1- In "Chapter 6. Add Apache HTTP Server as a Cluster Service", an
>>>>> important note said:
>>>>> "Do not enable the httpd service. Services that are intended to be
>>>>> managed via the cluster software should never be managed by the OS.
>>>>> It is often useful, however, to manually start the service, verify
>>>>> that it works, then stop it again, before adding it to the cluster.
>>>>> This allows you to resolve any non-cluster-related problems before
>>>>> continuing. Since this is a simple example, we’ll skip that step
>>>>> here."
>>>>>
>>>>> If the Apache service is not enabled, then how can I connect to it
>>>>> via the command below:
>>>>>      
>>>>> # wget -O - http://localhost/server-status
>>>>> --2021-03-21 15:38:39--  http://localhost/server-status
>>>>> Resolving localhost (localhost)... 127.0.0.1, ::1
>>>>> Connecting to localhost (localhost)|127.0.0.1|:80... failed: Connection timed out.
>>>>> Connecting to localhost (localhost)|::1|:80... failed: Network is unreachable.
>>>> Pacemaker starts the httpd service by starting the
>>>> ocf:heartbeat:apache resource. The article is saying that the
>>>> httpd.service systemd unit should not be enabled to start
>>>> automatically at boot; it should only start when the cluster starts
>>>> it. That is, `systemctl is-enabled httpd.service` should print
>>>> "disabled".
>>>>
>>>>>      
>>>>>
>>>>> 2- Must the commands below be run on both nodes or on just one node?
>>>>>
>>>>> # pcs resource create ClusterIP ocf:heartbeat:IPaddr2 ip="IP_That_Never_Used_In_The_Network" cidr_netmask=32 op monitor interval=30s
>>>>>
>>>>> # pcs resource create WebSite ocf:heartbeat:apache configfile=/etc/httpd/conf/httpd.conf statusurl="http://localhost/server-status" op monitor interval=20s
>>>> Just one node.
>>>>
>>>>>      
>>>>>
>>>>> 3- Why "* WebSite    (ocf::heartbeat:apache):    Stopped" ?
>>>> The apache resource agent ran a command similar to
>>>> `wget -O- -q -L --no-proxy --bind-address=127.0.0.1 <status_url>`
>>>> and got an error. It tried this on a start operation on each node,
>>>> and it failed on both nodes. When a resource fails to start on a
>>>> given node, the default response is to prevent it from starting on
>>>> that node again until the failure is cleared.
>>>>
>>>>
>>>>
>>>>>      
>>>>> Logs are:
>>>>> https://paste.ubuntu.com/p/MtkfXyRX4P/
>>>>>
>>>>>
>>>>> Thank you.
>>>>>
>>>>> _______________________________________________
>>>>> Manage your subscription:
>>>>> https://lists.clusterlabs.org/mailman/listinfo/users
>>>>>
>>>>> ClusterLabs home: https://www.clusterlabs.org/
>>>>>