[ClusterLabs] WebSite_start_0 on node2 'error' (1): call=6, status='complete', exitreason='Failed to access httpd status page.'
Jason Long
hack3rcon at yahoo.com
Tue Mar 23 11:07:28 EDT 2021
Thank you.
So where should I define the IP address of node2? When node1 goes down, I want node2 to take over.
On Tuesday, March 23, 2021, 01:03:39 PM GMT+4:30, Klaus Wenninger <kwenning at redhat.com> wrote:
On 3/23/21 9:13 AM, Jason Long wrote:
> Thank you.
> But: https://www.clusterlabs.org/pacemaker/doc/en-US/Pacemaker/1.1/html/Clusters_from_Scratch/ch06.html
> ?
>
> The floating IP address is: https://www.clusterlabs.org/pacemaker/doc/en-US/Pacemaker/1.1/html/Clusters_from_Scratch/_add_a_resource.html
> The "Warning" there says: "The chosen address must not already be in use on the network. Do not reuse an IP address one of the nodes already has configured." What does that mean?
It means that if you use an IP address that is already in use
on your network - by one of your cluster nodes or something else -
pacemaker might activate that IP as well, and you would have
a duplicate IP in your network.
So, for the question below: don't use the IP of node2 for
your floating IP.
Klaus
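
As a rough illustration of the point above, the following sketch compares a candidate floating IP against the addresses already assigned to the nodes. The helper name ip_is_free and the two node addresses are assumptions made up for this example; only 192.168.122.120 appears in the thread. It checks the listed addresses only, so it says nothing about other hosts on the network.

```shell
#!/bin/sh
# Hypothetical helper: returns 0 when the candidate address ($1) is not
# among the addresses passed as the remaining arguments.
ip_is_free() {
    candidate=$1
    shift
    for addr in "$@"; do
        if [ "$addr" = "$candidate" ]; then
            return 1
        fi
    done
    return 0
}

# Assumed node addresses (not from the thread); adjust to your lab.
NODE_IPS="192.168.122.101 192.168.122.102"
# Floating IP used in the thread's pcs command.
FLOATING_IP="192.168.122.120"

# $NODE_IPS is left unquoted on purpose so it splits into separate arguments.
if ip_is_free "$FLOATING_IP" $NODE_IPS; then
    echo "ok: $FLOATING_IP is not a node address"
else
    echo "conflict: $FLOATING_IP is already assigned to a node"
fi
```

If the check passes, the address can be given to the ClusterIP resource as in the "pcs resource create ClusterIP" command quoted below; the nodes keep their own addresses and the cluster moves only the floating one.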
>
> In the command below, is "IP" the IP address of my node2?
> # pcs resource create ClusterIP ocf:heartbeat:IPaddr2 ip=192.168.122.120 cidr_netmask=32 op monitor interval=30s
>
> If yes, must I then update it with the command below?
>
> # pcs resource update floating_ip ocf:heartbeat:IPaddr2 ip="Node2 IP" cidr_netmask=32 op monitor interval=30s
>
> On Tuesday, March 23, 2021, 12:02:15 AM GMT+4:30, Ken Gaillot <kgaillot at redhat.com> wrote:
>
> On Mon, 2021-03-22 at 17:31 +0000, Jason Long wrote:
>> Thank you.
>> From chapter 1 to 6, I never saw anything about configuring the
>> floating IP address! Am I wrong?
> Hi,
>
> Chapter 6 should be "Create an Active/Passive Cluster", which adds a
> floating IP, then Chapter 7 is "Add Apache HTTP Server as a Cluster
> Service".
>
>
>
>> On Monday, March 22, 2021, 07:06:47 PM GMT+4:30, Ken Gaillot <
>> kgaillot at redhat.com> wrote:
>>
>> On Mon, 2021-03-22 at 08:15 +0000, Jason Long wrote:
>>> Thank you.
>>>
>>> My test lab uses VirtualBox with two VMs, as below:
>>> VM1: This VM has two NICs (NAT, Host-only Adapter)
>>> VM2: This VM has one NIC (Host-only Adapter)
>>>
>>> On VM1, I use the NAT interface for the port forwarding:
>>> "127.0.0.1:2080" on Host FORWARDING TO 127.0.0.1:80 on Guest.
>>>
>>>
>>> Yes, "systemctl" tells me:
>>>
>>> # systemctl is-enabled httpd.service
>>> disabled
>>>
>>> I rebooted my nodes and one of the problems was solved:
>>> https://paste.ubuntu.com/p/7cQQtsXFPV/
>>>
>>> I did:
>>> # pcs resource defaults resource-stickiness=100
>>>
>>>
>>> When I browse "127.0.0.1:2080" then it shows me "My Test Site -
>>> node1".
>>>
>>> I have two problems:
>>>
>>> 1- When I stop the node1 VM and refresh the page, I can't see
>>> "My Test Site - node2".
>>>
>>> # pcs cluster stop node1
>>> node1: Stopping Cluster (pacemaker)...
>>> node1: Stopping Cluster (corosync)...
>>>
>>> # pcs status
>>> Error: error running crm_mon, is pacemaker running?
>>> Could not connect to the CIB: Transport endpoint is not connected
>>> crm_mon: Error: cluster is not available on this node
>> Hi,
>>
>> pcs status doesn't test the web site, it shows the internal cluster
>> status. Since the cluster isn't running on that node, it can't show
>> anything.
>>
>> However the website is still active on the other node, and reachable
>> from this node. You can confirm that by using wget or curl with the
>> public web site URL (the floating IP address).
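
A minimal sketch of that check, assuming the test page contains the text "My Test Site - <hostname>" as quoted earlier in the thread, and that 192.168.122.120 (the address from the pcs command in this thread) is the floating IP. The function names serving_node and check_floating_ip are made up for this example.

```shell
#!/bin/sh
# Extract the node name from page content such as
# "<body>My Test Site - node2</body>" read on standard input.
serving_node() {
    sed -n 's/.*My Test Site - \([A-Za-z0-9_.-]*\).*/\1/p' | head -n 1
}

# Fetch the page through the floating IP ($1) and print which node answered.
# Usage (on a machine that can reach the address):
#   check_floating_ip 192.168.122.120
check_floating_ip() {
    curl --silent --show-error --max-time 5 "http://$1/" | serving_node
}
```

Running check_floating_ip before and after stopping the cluster on one node should show the answering node change if failover works; pcs status on the stopped node cannot show this.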
>>
>>> # pcs resource defaults
>>> Error: unable to get cib
>>>
>>>
>>> I thought it would forward my requests from node1 to node2
>>> automatically and I would see the "My Test Site - node2" message.
>>>
>>>
>>> 2- I started node1 again, but when I browse "IP:80", I can't
>>> see the "My Test Site - node1" message.
>>>
>>> # pcs cluster start node1
>>> node1: Starting Cluster...
>>>
>>>
>>> # pcs status
>>> Cluster name: mycluster
>>> Cluster Summary:
>>> * Stack: corosync
>>> * Current DC: node2 (version 2.0.5-10.fc33-ba59be7122) - partition with quorum
>>> * Last updated: Mon Mar 22 12:26:10 2021
>>> * Last change: Mon Mar 22 12:08:02 2021 by root via cibadmin on node1
>>> * 2 nodes configured
>>> * 2 resource instances configured
>>>
>>> Node List:
>>> * Online: [ node1 node2 ]
>>>
>>> Full List of Resources:
>>> * WebSite (ocf::heartbeat:apache): Started node2
>>> * ClusterIP (ocf::heartbeat:IPaddr2): Started node2
>>>
>>> Daemon Status:
>>> corosync: active/enabled
>>> pacemaker: active/enabled
>>> pcsd: active/enabled
>>>
>>>
>>>
>>> Logs are:
>>> https://paste.ubuntu.com/p/Yt4K2kPM7b/
>>>
>>>
>>> Thank you again.
>>>
>>>
>>> On Monday, March 22, 2021, 01:12:21 AM GMT+4:30, Reid Wahl <
>>> nwahl at redhat.com> wrote:
>>>
>>> Hi, Jason.
>>>
>>> On Sun, Mar 21, 2021 at 5:21 AM Jason Long <hack3rcon at yahoo.com>
>>> wrote:
>>>> Hello,
>>>> I used "Clusters from Scratch" to configure two nodes. I got the
>>>> error below:
>>>>
>>>> # pcs status
>>>> Cluster name: mycluster
>>>> Cluster Summary:
>>>> * Stack: corosync
>>>> * Current DC: node1 (version 2.0.5-10.fc33-ba59be7122) - partition with quorum
>>>> * Last updated: Sun Mar 21 15:35:18 2021
>>>> * Last change: Sun Mar 21 15:29:38 2021 by root via cibadmin on node1
>>>> * 2 nodes configured
>>>> * 2 resource instances configured
>>>>
>>>> Node List:
>>>> * Online: [ node1 node2 ]
>>>>
>>>> Full List of Resources:
>>>> * WebSite (ocf::heartbeat:apache): Stopped
>>>> * ClusterIP (ocf::heartbeat:IPaddr2): Started node1
>>>>
>>>> Failed Resource Actions:
>>>> * WebSite_start_0 on node1 'error' (1): call=6, status='complete', exitreason='Failed to access httpd status page.', last-rc-change='2021-03-21 15:23:45 +03:30', queued=0ms, exec=1318ms
>>>> * WebSite_start_0 on node2 'error' (1): call=6, status='complete', exitreason='Failed to access httpd status page.', last-rc-change='2021-03-21 15:23:47 +03:30', queued=0ms, exec=1380ms
>>>>
>>>> Daemon Status:
>>>> corosync: active/enabled
>>>> pacemaker: active/enabled
>>>> pcsd: active/enabled
>>>>
>>>>
>>>> *********
>>>> I have some questions:
>>>>
>>>> 1- In "Chapter 6. Add Apache HTTP Server as a Cluster Service", an
>>>> important note said:
>>>> "Do not enable the httpd service. Services that are intended to be
>>>> managed via the cluster software should never be managed by the OS.
>>>> It is often useful, however, to manually start the service, verify
>>>> that it works, then stop it again, before adding it to the cluster.
>>>> This allows you to resolve any non-cluster-related problems before
>>>> continuing. Since this is a simple example, we’ll skip that step
>>>> here."
>>>>
>>>> If the Apache service is not enabled, then how can I connect to it
>>>> via the command below:
>>>>
>>>> # wget -O - http://localhost/server-status
>>>> --2021-03-21 15:38:39-- http://localhost/server-status
>>>> Resolving localhost (localhost)... 127.0.0.1, ::1
>>>> Connecting to localhost (localhost)|127.0.0.1|:80... failed:
>>>> Connection timed out.
>>>> Connecting to localhost (localhost)|::1|:80... failed: Network is
>>>> unreachable.
>>> Pacemaker starts the httpd service by starting the
>>> ocf:heartbeat:apache resource. The article is saying that the
>>> httpd.service systemd unit should not be enabled to start
>>> automatically at boot; it should only start when the cluster starts
>>> it. That is, `systemctl is-enabled httpd.service` should print
>>> "disabled".
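
A small sketch of that check, assuming a systemd-based node as in the thread. The helper name httpd_boot_state_ok is made up for this example; it only interprets the state string that systemctl prints.

```shell
#!/bin/sh
# Hypothetical helper: returns 0 only when the reported unit state ($1)
# is "disabled", i.e. httpd will not be started by the OS at boot.
httpd_boot_state_ok() {
    case "$1" in
        disabled) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage on a cluster node (not executed here):
#   state=$(systemctl is-enabled httpd.service || true)
#   httpd_boot_state_ok "$state" || systemctl disable httpd.service
```

The "|| true" is there because systemctl is-enabled exits non-zero for a disabled unit, which would otherwise stop a script running with "set -e".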
>>>
>>>>
>>>>
>>>> 2- Must the commands below be run on both nodes, or on just one node?
>>>>
>>>> # pcs resource create ClusterIP ocf:heartbeat:IPaddr2 ip="IP_That_Never_Used_In_The_Network" cidr_netmask=32 op monitor interval=30s
>>>>
>>>> # pcs resource create WebSite ocf:heartbeat:apache configfile=/etc/httpd/conf/httpd.conf statusurl="http://localhost/server-status" op monitor interval=20s
>>> Just one node.
>>>
>>>>
>>>>
>>>> 3- Why is the resource shown as "* WebSite (ocf::heartbeat:apache): Stopped"?
>>> The apache resource agent ran a command similar to
>>> `wget -O- -q -L --no-proxy --bind-address=127.0.0.1 <status_url>`
>>> and got an error. It tried this on a start operation on each node,
>>> and it failed on both nodes. When a resource fails to start on a
>>> given node, the default response is to prevent it from starting on
>>> that node again until the failure is cleared.
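
One way to act on this, sketched under assumptions: repeat the probe by hand with the wget options quoted above, and only clear the recorded failure once the status page answers. The thread does not show how the failure was cleared; "pcs resource cleanup WebSite" is used here as an assumption about a typical pcs setup, and the function names are made up for this example.

```shell
#!/bin/sh
# statusurl as configured for the WebSite resource in this thread.
STATUS_URL="http://localhost/server-status"

# Roughly the probe described above; succeeds when the status page is served.
probe_status_url() {
    wget -O- -q -L --no-proxy --bind-address=127.0.0.1 "$1" >/dev/null
}

# Clear the recorded start failure only if the probe works, so that
# Pacemaker is allowed to try starting WebSite on this node again.
retry_website() {
    if probe_status_url "$STATUS_URL"; then
        pcs resource cleanup WebSite
    else
        echo "status page still unreachable; fix httpd before cleanup" >&2
        return 1
    fi
}

# Usage on a cluster node, while httpd is running (not executed here):
#   retry_website
```

Since the cluster only starts httpd through the resource, the probe is meaningful only while httpd is actually running, for example during a manual test start as the quoted note suggests.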
>>>
>>>
>>>
>>>>
>>>> Logs are:
>>>> https://paste.ubuntu.com/p/MtkfXyRX4P/
>>>>
>>>>
>>>> Thank you.
>>>>
>>>