[ClusterLabs] WebSite_start_0 on node2 'error' (1): call=6, status='complete', exitreason='Failed to access httpd status page.'

Thu Mar 25 10:44:27 EDT 2021

Then how can I be sure my configuration is OK?
In a clustering environment, when one node is disconnected, another node must take over for it. Am I right?
I did a test:
I defined a NAT interface for my VM2 (node2) and used port forwarding: "127.0.0.1:2090" on the host, forwarding to 127.0.0.1:80 on the guest.
When node1 is up and I browse to "http://127.0.0.1:2080", it shows me "My Test Site - node1", but when I browse to "http://127.0.0.1:2090", it doesn't show anything.
I stopped node1, and now when I browse to "http://127.0.0.1:2080" it doesn't show anything, but when I browse to "http://127.0.0.1:2090", it shows me "My Test Site - node2".
Could this mean that my cluster is working properly?

On Thursday, March 25, 2021, 05:20:33 PM GMT+4:30, Klaus Wenninger <kwenning at redhat.com> wrote: 

On 3/25/21 9:55 AM, Jason Long wrote:
> Thank you so much.
>>   Now you can proceed with the "Add Apache HTTP" section.
> What does that mean? I did all the steps in the document.
>
>>   Once apache is set up as a cluster resource, you should be able to contact the web server at the floating IP...
> # pcs cluster stop node1
> node1: Stopping Cluster (pacemaker)...
> node1: Stopping Cluster (corosync)...
> #
> # pcs status
> Error: error running crm_mon, is pacemaker running?
>    Could not connect to the CIB: Transport endpoint is not connected
>    crm_mon: Error: cluster is not available on this node
> #
> # curl http://192.168.56.9
> <html>
>   <body>My Test Site - node2</body>
>   </html>
>
> Thank you for that, but I want to use these two VMs as an Apache reverse proxy server. When one of my nodes stops, the other node should start serving requests.
>
> My test lab uses VirtualBox with two VMs as below:
> VM1: This VM has two NICs (NAT, Host-only Adapter)
> VM2: This VM has one NIC (Host-only Adapter)
>
> On VM1, I use the NAT interface for port forwarding: "127.0.0.1:2080" on the host, forwarding to 127.0.0.1:80 on the guest.
>
> When I stop node1 and browse to "http://127.0.0.1:2080", I can't see anything. I want it to show me "My Test Site - node2". I think that is reasonable, because when one of my reverse proxy servers (node1) stops, the other reverse proxy server (node2) should take over.
>
> How can I achieve this goal?
Definitely not by using that NAT interface, I would say.
It will only be able to connect you to a service running on VM1.
And that doesn't make any sense from a high-availability
point of view. Even if you set up NAT that would make the
proxy on node2 visible via VM1, this wouldn't give you
increased availability - rather the opposite, due to increased
complexity. In high availability we speak of a Single
Point of Failure (SPOF), which VM1 would be here, and which you
never ever want to have.
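
One way to test failover without going through the NAT forward is to poll the floating IP directly from a machine on the host-only network while node1 is being stopped. The following is only a minimal sketch: it assumes the floating IP 192.168.56.9 used elsewhere in this thread is reachable from where you run it, and the function name watch_site is made up for illustration.

```shell
# Poll a URL a few times and print what comes back on each attempt.
# Usage (hypothetical): watch_site http://192.168.56.9 20 1
watch_site() {
    url=$1
    attempts=${2:-10}   # how many requests to send
    delay=${3:-1}       # seconds to wait between requests
    i=1
    while [ "$i" -le "$attempts" ]; do
        # -s: no progress output; --max-time: give up after 2 seconds
        if body=$(curl -s --max-time 2 "$url"); then
            # Flatten the page to one line so each attempt is easy to compare
            printf '%s: %s\n' "$i" "$(printf '%s' "$body" | tr -d '\n')"
        else
            printf '%s: no response\n' "$i"
        fi
        i=$((i + 1))
        sleep "$delay"
    done
}
```

If the cluster is doing its job, the page text should change from node1 to node2 after "pcs cluster stop node1", possibly with a few "no response" lines in between while the resources move.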
>
>
>
>
>
> On Wednesday, March 24, 2021, 10:21:09 PM GMT+4:30, Ken Gaillot <kgaillot at redhat.com> wrote:
>
>
>
>
>
> On Wed, 2021-03-24 at 10:50 +0000, Jason Long wrote:
>> Thank you.
>> From node1 and node2, I can ping the floating IP address
>> (192.168.56.9).
>> I stopped node1:
>>
>> # pcs cluster stop node1
>> node1: Stopping Cluster (pacemaker)...
>> node1: Stopping Cluster (corosync)...
>>
>> And from both machines, I can ping the floating IP address:
>>
>> [root@node1 ~]# ping 192.168.56.9
>> PING 192.168.56.9 (192.168.56.9) 56(84) bytes of data.
>> 64 bytes from 192.168.56.9: icmp_seq=1 ttl=64 time=0.504 ms
>> 64 bytes from 192.168.56.9: icmp_seq=2 ttl=64 time=0.750 ms
>> ...
>>
>> [root@node2 ~]# ping 192.168.56.9
>> PING 192.168.56.9 (192.168.56.9) 56(84) bytes of data.
>> 64 bytes from 192.168.56.9: icmp_seq=1 ttl=64 time=0.423 ms
>> 64 bytes from 192.168.56.9: icmp_seq=2 ttl=64 time=0.096 ms
>> ...
>>
>>
>> So?
> Now you can proceed with the "Add Apache HTTP" section. Once apache is
> set up as a cluster resource, you should be able to contact the web
> server at the floating IP (or more realistically whatever name you've
> associated with that IP), and have the cluster fail over both the IP
> address and web server as needed.
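
For the cluster to fail over the IP address and the web server together, the two resources are normally tied to each other with a colocation constraint and an ordering constraint. A sketch of that step, using the resource names ClusterIP and WebSite from this thread; the wrapper function tie_site_to_ip is hypothetical and only groups the two pcs calls:

```shell
# Keep the web server on the same node as the floating IP,
# and start the IP before the web server.
# Usage (hypothetical): tie_site_to_ip ClusterIP WebSite
tie_site_to_ip() {
    ip_res=$1    # name of the IPaddr2 resource
    web_res=$2   # name of the apache resource
    # Colocation: the web server may only run where the IP is running
    pcs constraint colocation add "$web_res" with "$ip_res" INFINITY &&
    # Ordering: bring the IP up first, then the web server
    pcs constraint order "$ip_res" then "$web_res"
}
```

Like the resource creation commands, this only needs to be run on one node.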
>
>
>> On Wednesday, March 24, 2021, 02:41:44 AM GMT+4:30, Ken Gaillot
>> <kgaillot at redhat.com> wrote:
>>
>>
>>
>>
>>
>> On Tue, 2021-03-23 at 20:15 +0000, Jason Long wrote:
>>> Thanks.
>>> The floating IP address must not be used by any other machine. I have two
>>> VMs that are using "192.168.57.6" and "192.168.57.7". Could the floating
>>> IP address be "192.168.57.8"?
>> Yes, if it's in the same subnet and not already in use by some other
>> machine.
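
The "same subnet" condition can be checked with a little shell arithmetic. This is just a sketch: the /24 prefix length is an assumption (the thread does not state the netmask), and the function names ip_to_int and same_subnet are invented for the example. It does not check whether the address is already in use.

```shell
# Convert a dotted IPv4 address to a single integer.
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1            # split the address into its four octets
    IFS=$old_ifs
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Succeed if two addresses share the same network for a given prefix length.
# Usage (hypothetical): same_subnet 192.168.57.6 192.168.57.8 24
same_subnet() {
    mask=$(( (0xFFFFFFFF << (32 - $3)) & 0xFFFFFFFF ))
    [ $(( $(ip_to_int "$1") & mask )) -eq $(( $(ip_to_int "$2") & mask )) ]
}
```

For example, "same_subnet 192.168.57.6 192.168.57.8 24 && echo ok" compares a node address with the proposed floating address.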
>>
>>> Which part of my configuration is wrong? Why doesn't node2 take
>>> over when I disconnect node1?
>> The first thing I would do is configure and test fencing. Once you're
>> confident fencing is working, add the floating IP address. Make sure
>> you can ping the floating IP address from some other machine. Then test
>> fail-over and ensure you can still ping the floating IP. From there it
>> should be straightforward.
>>
>>
>>>
>>>
>>>
>>>
>>> On Wednesday, March 24, 2021, 12:33:53 AM GMT+4:30, Ken Gaillot
>>> <kgaillot at redhat.com> wrote:
>>>
>>>
>>>
>>>
>>>
>>> On Tue, 2021-03-23 at 19:07 +0000, Jason Long wrote:
>>>> Thanks, but I want to have a cluster with two nodes and nothing
>>>> more!
>>> The end result is to have 2 nodes with 3 IP addresses:
>>>
>>> * The first node has a permanently assigned IP address that it brings
>>> up when it boots; this address is not managed by the cluster
>>>
>>> * The second node also has a permanent address not managed by the
>>> cluster
>>>
>>> * A third, unused IP address from the same subnet is used as a
>>> "floating" IP address, which means the cluster can sometimes run it on
>>> the first node and sometimes on the second node. This IP address is the
>>> one that users will use to contact the service.
>>>
>>> That way, users always have a single address that they use, no matter
>>> which node is providing the service.
>>>
>>>>
>>>>
>>>>
>>>>
>>>> On Tuesday, March 23, 2021, 07:59:57 PM GMT+4:30, Klaus Wenninger
>>>> <kwenning at redhat.com> wrote:
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On 3/23/21 4:07 PM, Jason Long wrote:
>>>>> Thank you.
>>>>> So where must I define my node2 IP address? When node1 is
>>>>> disconnected, I want node2 to replace it.
>>>>>
>>>> You just need a single IP address that you assign to the virtual
>>>> IP resource.
>>>> Pacemaker will then move that IP address - along with the web proxy -
>>>> between the 2 nodes.
>>>> Of course node1 & node2 have IP addresses that are used for
>>>> cluster communication, but they are totally independent (well, maybe
>>>> in the same subnet for a simple setup) from the IP address your
>>>> web proxy is reachable at.
>>>>
>>>> Klaus
>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Tuesday, March 23, 2021, 01:03:39 PM GMT+4:30, Klaus Wenninger
>>>>> <kwenning at redhat.com> wrote:
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On 3/23/21 9:13 AM, Jason Long wrote:
>>>>>> Thank you.
>>>>>> But:
>>>>>> https://www.clusterlabs.org/pacemaker/doc/en-US/Pacemaker/1.1/html/Clusters_from_Scratch/ch06.html
>>>>>> ?
>>>>>>
>>>>>> The floating IP address is described at:
>>>>>> https://www.clusterlabs.org/pacemaker/doc/en-US/Pacemaker/1.1/html/Clusters_from_Scratch/_add_a_resource.html
>>>>>> The "Warning" there says: "The chosen address must not already be
>>>>>> in use on the network. Do not reuse an IP address one of the
>>>>>> nodes already has configured." What does that mean?
>>>>> It means that if you used an IP that is already in use
>>>>> on your network - by one of your cluster nodes or something else -
>>>>> pacemaker would possibly activate that IP and you would have
>>>>> a duplicate IP in your network.
>>>>> Thus, for the question below: don't use the IP of node2 for
>>>>> your floating IP.
>>>>>
>>>>> Klaus
>>>>>
>>>>>> In the command below, is "ip" the IP address of my node2?
>>>>>> # pcs resource create ClusterIP
>>>>>> ocf:heartbeat:IPaddr2 ip=192.168.122.120 cidr_netmask=32 op
>>>>>> monitor interval=30s
>>>>>>
>>>>>> If yes, must I then update it with the command below?
>>>>>>
>>>>>> # pcs resource update floating_ip ocf:heartbeat:IPaddr2
>>>>>> ip="Node2 IP" cidr_netmask=32 op monitor interval=30s
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Tuesday, March 23, 2021, 12:02:15 AM GMT+4:30, Ken Gaillot
>>>>>> <kgaillot at redhat.com> wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Mon, 2021-03-22 at 17:31 +0000, Jason Long wrote:
>>>>>>> Thank you.
>>>>>>> From chapter 1 to 6, I never saw anything about configuring the
>>>>>>> floating IP address! Am I wrong?
>>>>>> Hi,
>>>>>>
>>>>>> Chapter 6 should be "Create an Active/Passive Cluster", which adds a
>>>>>> floating IP, then Chapter 7 is "Add Apache HTTP Server as a Cluster
>>>>>> Service".
>>>>>>
>>>>>>
>>>>>>
>>>>>>> On Monday, March 22, 2021, 07:06:47 PM GMT+4:30, Ken Gaillot
>>>>>>> <kgaillot at redhat.com> wrote:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Mon, 2021-03-22 at 08:15 +0000, Jason Long wrote:
>>>>>>>> Thank you.
>>>>>>>>
>>>>>>>> My test lab uses VirtualBox with two VMs as below:
>>>>>>>> VM1: This VM has two NICs (NAT, Host-only Adapter)
>>>>>>>> VM2: This VM has one NIC (Host-only Adapter)
>>>>>>>>
>>>>>>>> On VM1, I use the NAT interface for port forwarding:
>>>>>>>> "127.0.0.1:2080" on the host, forwarding to 127.0.0.1:80 on the
>>>>>>>> guest.
>>>>>>>>
>>>>>>>>
>>>>>>>> Yes, "systemctl" tells me:
>>>>>>>>
>>>>>>>> # systemctl is-enabled httpd.service
>>>>>>>> disabled
>>>>>>>>
>>>>>>>> I rebooted my nodes and one of the problems was solved:
>>>>>>>> https://paste.ubuntu.com/p/7cQQtsXFPV/
>>>>>>>>
>>>>>>>> I did:
>>>>>>>> # pcs resource defaults resource-stickiness=100
>>>>>>>>
>>>>>>>>
>>>>>>>> When I browse to "127.0.0.1:2080", it shows me
>>>>>>>> "My Test Site - node1".
>>>>>>>>
>>>>>>>> I have two problems:
>>>>>>>>
>>>>>>>> 1- When I stop the node1 VM and refresh the page, I can't see
>>>>>>>> "My Test Site - node2".
>>>>>>>>
>>>>>>>> # pcs cluster stop node1
>>>>>>>> node1: Stopping Cluster (pacemaker)...
>>>>>>>> node1: Stopping Cluster (corosync)...
>>>>>>>>
>>>>>>>> # pcs status
>>>>>>>> Error: error running crm_mon, is pacemaker running?
>>>>>>>> Could not connect to the CIB: Transport endpoint is not connected
>>>>>>>> crm_mon: Error: cluster is not available on this node
>>>>>>> Hi,
>>>>>>>
>>>>>>> pcs status doesn't test the web site, it shows the internal cluster
>>>>>>> status. Since the cluster isn't running on that node, it can't show
>>>>>>> anything.
>>>>>>>
>>>>>>> However the website is still active on the other node, and reachable
>>>>>>> from this node. You can confirm that by using wget or curl with the
>>>>>>> public web site URL (the floating IP address).
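
Since pcs status only works on a node where the cluster is still running, one way to see where the site is currently being served is to run it on the surviving node and pull the node name out of the resource list. A rough sketch, assuming the "Full List of Resources" layout shown in this thread (Pacemaker 2.0.5); other versions may print the list differently, and the function name resource_node is made up:

```shell
# Read `pcs status` output on stdin and print the node a resource is
# started on. Prints nothing if the resource is stopped or not listed.
# Usage (hypothetical): pcs status | resource_node WebSite
resource_node() {
    awk -v res="$1" '
        # Resource lines look like:
        #   * WebSite    (ocf::heartbeat:apache):    Started node2
        $1 == "*" && $2 == res {
            for (i = 3; i < NF; i++) {
                if ($i == "Started") { print $(i + 1); exit }
            }
        }
    '
}
```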
>>>>>>>
>>>>>>>> # pcs resource defaults
>>>>>>>> Error: unable to get cib
>>>>>>>>
>>>>>>>>
>>>>>>>> I think that it must forward my requests from node1 to node2
>>>>>>>> automatically, so that I see the "My Test Site - node2" message.
>>>>>>>>
>>>>>>>>
>>>>>>>> 2- I started node1 again, but when I browse to "IP:80", I can't
>>>>>>>> see the "My Test Site - node1" message.
>>>>>>>>
>>>>>>>> # pcs cluster start node1
>>>>>>>> node1: Starting Cluster...
>>>>>>>>
>>>>>>>>
>>>>>>>> # pcs status
>>>>>>>> Cluster name: mycluster
>>>>>>>> Cluster Summary:
>>>>>>>>        * Stack: corosync
>>>>>>>>        * Current DC: node2 (version 2.0.5-10.fc33-ba59be7122) - partition with quorum
>>>>>>>>        * Last updated: Mon Mar 22 12:26:10 2021
>>>>>>>>        * Last change:  Mon Mar 22 12:08:02 2021 by root via cibadmin on node1
>>>>>>>>        * 2 nodes configured
>>>>>>>>        * 2 resource instances configured
>>>>>>>>
>>>>>>>> Node List:
>>>>>>>>        * Online: [ node1 node2 ]
>>>>>>>>
>>>>>>>> Full List of Resources:
>>>>>>>>        * WebSite    (ocf::heartbeat:apache):    Started node2
>>>>>>>>        * ClusterIP    (ocf::heartbeat:IPaddr2):    Started node2
>>>>>>>>
>>>>>>>> Daemon Status:
>>>>>>>>        corosync: active/enabled
>>>>>>>>        pacemaker: active/enabled
>>>>>>>>        pcsd: active/enabled
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Logs are:
>>>>>>>> https://paste.ubuntu.com/p/Yt4K2kPM7b/
>>>>>>>>
>>>>>>>>
>>>>>>>> Thank you again.
>>>>>>>>
>>>>>>>>
>>>>>>>> On Monday, March 22, 2021, 01:12:21 AM GMT+4:30, Reid Wahl
>>>>>>>> <nwahl at redhat.com> wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Hi, Jason.
>>>>>>>>
>>>>>>>> On Sun, Mar 21, 2021 at 5:21 AM Jason Long
>>>>>>>> <hack3rcon at yahoo.com> wrote:
>>>>>>>>> Hello,
>>>>>>>>> I used "Clusters from Scratch" to configure two nodes and got
>>>>>>>>> the error below:
>>>>>>>>>
>>>>>>>>> # pcs status
>>>>>>>>> Cluster name: mycluster
>>>>>>>>> Cluster Summary:
>>>>>>>>>        * Stack: corosync
>>>>>>>>>        * Current DC: node1 (version 2.0.5-10.fc33-ba59be7122) - partition with quorum
>>>>>>>>>        * Last updated: Sun Mar 21 15:35:18 2021
>>>>>>>>>        * Last change:  Sun Mar 21 15:29:38 2021 by root via cibadmin on node1
>>>>>>>>>        * 2 nodes configured
>>>>>>>>>        * 2 resource instances configured
>>>>>>>>>
>>>>>>>>> Node List:
>>>>>>>>>        * Online: [ node1 node2 ]
>>>>>>>>>
>>>>>>>>> Full List of Resources:
>>>>>>>>>        * WebSite    (ocf::heartbeat:apache):    Stopped
>>>>>>>>>        * ClusterIP    (ocf::heartbeat:IPaddr2):    Started node1
>>>>>>>>>
>>>>>>>>> Failed Resource Actions:
>>>>>>>>>        * WebSite_start_0 on node1 'error' (1): call=6, status='complete', exitreason='Failed to access httpd status page.', last-rc-change='2021-03-21 15:23:45 +03:30', queued=0ms, exec=1318ms
>>>>>>>>>        * WebSite_start_0 on node2 'error' (1): call=6, status='complete', exitreason='Failed to access httpd status page.', last-rc-change='2021-03-21 15:23:47 +03:30', queued=0ms, exec=1380ms
>>>>>>>>>
>>>>>>>>> Daemon Status:
>>>>>>>>>        corosync: active/enabled
>>>>>>>>>        pacemaker: active/enabled
>>>>>>>>>        pcsd: active/enabled
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> *********
>>>>>>>>> I have some questions:
>>>>>>>>>
>>>>>>>>> 1- In "Chapter 6. Add Apache HTTP Server as a Cluster Service", an
>>>>>>>>> important note says:
>>>>>>>>> "Do not enable the httpd service. Services that are intended to be
>>>>>>>>> managed via the cluster software should never be managed by the OS.
>>>>>>>>> It is often useful, however, to manually start the service, verify
>>>>>>>>> that it works, then stop it again, before adding it to the cluster.
>>>>>>>>> This allows you to resolve any non-cluster-related problems before
>>>>>>>>> continuing. Since this is a simple example, we’ll skip that step
>>>>>>>>> here."
>>>>>>>>>
>>>>>>>>> If the Apache service is not enabled, then how can I connect to it
>>>>>>>>> via the command below?
>>>>>>>>>        
>>>>>>>>> # wget -O - http://localhost/server-status
>>>>>>>>> --2021-03-21 15:38:39--  http://localhost/server-status
>>>>>>>>> Resolving localhost (localhost)... 127.0.0.1, ::1
>>>>>>>>> Connecting to localhost (localhost)|127.0.0.1|:80... failed: Connection timed out.
>>>>>>>>> Connecting to localhost (localhost)|::1|:80... failed: Network is unreachable.
>>>>>>>> Pacemaker starts the httpd service by starting the
>>>>>>>> ocf:heartbeat:apache resource. The article is saying that the
>>>>>>>> httpd.service systemd unit should not be enabled to start
>>>>>>>> automatically at boot; it should only start when the cluster starts
>>>>>>>> it. That is, `systemctl is-enabled httpd.service` should print
>>>>>>>> "disabled".
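
For the wget test against /server-status to succeed once httpd is running, Apache also has to serve that URL. As far as I recall, "Clusters from Scratch" enables it with a small conf.d fragment on each node; the sketch below follows that idea. The function name write_status_conf and the optional path argument are only for illustration, and the default path assumes a Fedora/RHEL-style httpd layout.

```shell
# Write an Apache config fragment that enables the server-status page
# for local requests only.
# Usage (hypothetical, as root on each node): write_status_conf
write_status_conf() {
    conf=${1:-/etc/httpd/conf.d/status.conf}   # assumed default location
    cat > "$conf" <<'EOF'
<Location /server-status>
    SetHandler server-status
    Require local
</Location>
EOF
}
```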
>>>>>>>>
>>>>>>>>>        
>>>>>>>>>
>>>>>>>>> 2- Must the commands below be run on both nodes or just one
>>>>>>>>> node?
>>>>>>>>>
>>>>>>>>> # pcs resource create ClusterIP ocf:heartbeat:IPaddr2
>>>>>>>>> ip="IP_That_Never_Used_In_The_Network" cidr_netmask=32 op monitor
>>>>>>>>> interval=30s
>>>>>>>>>
>>>>>>>>> # pcs resource create WebSite ocf:heartbeat:apache
>>>>>>>>> configfile=/etc/httpd/conf/httpd.conf
>>>>>>>>> statusurl="http://localhost/server-status" op monitor interval=20s
>>>>>>>> Just one node.
>>>>>>>>
>>>>>>>>>        
>>>>>>>>>
>>>>>>>>> 3- Why "* WebSite    (ocf::heartbeat:apache):    Stopped"?
>>>>>>>> The apache resource agent ran a command similar to
>>>>>>>> `wget -O- -q -L --no-proxy --bind-address=127.0.0.1 <status_url>`
>>>>>>>> and got an error. It tried this on a start operation on each node,
>>>>>>>> and it failed on both nodes. When a resource fails to start on a
>>>>>>>> given node, the default response is to prevent it from starting on
>>>>>>>> that node again until the failure is cleared.
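
Once the underlying problem is fixed, the recorded failure has to be cleared before the cluster will try the resource on that node again. A minimal sketch of that step, using the WebSite resource name from this thread; the wrapper function retry_resource is hypothetical and just runs the two pcs commands in sequence:

```shell
# Show the recorded failures for a resource, then clear them so the
# cluster is allowed to try starting it again.
# Usage (hypothetical): retry_resource WebSite
retry_resource() {
    res=$1
    pcs resource failcount show "$res"   # what the cluster has recorded
    pcs resource cleanup "$res"          # forget the failures and re-probe
}
```

Afterwards, "pcs status" should no longer list the entry under "Failed Resource Actions" if the start succeeds.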
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>>        
>>>>>>>>> Logs are:
>>>>>>>>> https://paste.ubuntu.com/p/MtkfXyRX4P/
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thank you.
>>>>>>>>>
>>>>>>>>> _______________________________________________
>>>>>>>>> Manage your subscription:
>>>>>>>>> https://lists.clusterlabs.org/mailman/listinfo/users
>>>>>>>>>
>>>>>>>>> ClusterLabs home: https://www.clusterlabs.org/
>>>>>>>>>