[ClusterLabs] WebSite_start_0 on node2 'error' (1): call=6, status='complete', exitreason='Failed to access httpd status page.'

Mon Mar 22 15:31:50 EDT 2021

On Mon, 2021-03-22 at 17:31 +0000, Jason Long wrote:
> Thank you.
> From chapter 1 to 6, I never saw anything about configuring the
> floating IP address! Am I wrong?

Hi,

Chapter 6 should be "Create an Active/Passive Cluster", which adds a
floating IP, then Chapter 7 is "Add Apache HTTP Server as a Cluster
Service".

> On Monday, March 22, 2021, 07:06:47 PM GMT+4:30, Ken Gaillot <
> kgaillot at redhat.com> wrote: 
> 
> On Mon, 2021-03-22 at 08:15 +0000, Jason Long wrote:
> > Thank you.
> > 
> > My test lab uses VirtualBox with two VMs, as below:
> > VM1: This VM has two NICs (NAT, Host-only Adapter)
> > VM2: This VM has one NIC (Host-only Adapter)
> > 
> > On VM1, I use the NAT interface for the port forwarding:
> > "127.0.0.1:2080" on the host forwards to "127.0.0.1:80" on the guest.
> > 
> > 
> > Yes, "systemctl" tells me:
> > 
> > # systemctl is-enabled httpd.service
> > disabled
> > 
> > I rebooted my nodes and one of the problems was solved:
> > https://paste.ubuntu.com/p/7cQQtsXFPV/
> > 
> > I did:
> > # pcs resource defaults resource-stickiness=100
> > 
> > 
> > When I browse "127.0.0.1:2080" then it shows me "My Test Site -
> > node1".
> > 
> > I have two problems:
> > 
> > 1- When I stop the node1 VM and refresh the page, why can't I see
> > "My Test Site - node2"?
> > 
> > # pcs cluster stop node1
> > node1: Stopping Cluster (pacemaker)...
> > node1: Stopping Cluster (corosync)...
> > 
> > # pcs status
> > Error: error running crm_mon, is pacemaker running?
> > Could not connect to the CIB: Transport endpoint is not connected
> > crm_mon: Error: cluster is not available on this node
> 
> Hi,
> 
> pcs status doesn't test the web site; it shows the internal cluster
> status. Since the cluster isn't running on that node, it can't show
> anything.
> 
> However, the website is still active on the other node, and reachable
> from this node. You can confirm that by using wget or curl with the
> public web site URL (the floating IP address).
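
A minimal sketch of that check, assuming curl is installed on the node. The function name fetch_site and the address 192.0.2.10 are placeholders, not values from this thread; substitute whatever address was given as ip= when ClusterIP was created.

```shell
# Hypothetical helper: request the site's front page through the floating IP.
# $1 is the floating IP address (the ip= value of the ClusterIP resource).
fetch_site() {
    # --fail makes curl return non-zero on HTTP errors; --max-time avoids
    # hanging when nothing answers on that address.
    curl --silent --show-error --fail --max-time 5 "http://$1/"
}

# Example call (placeholder address):
#   fetch_site 192.0.2.10
```

If the page that comes back names the surviving node, the failover worked, even though pcs status cannot run on the stopped node.
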
> 
> > 
> > # pcs resource defaults
> > Error: unable to get cib
> > 
> > 
> > I expected it to forward my requests from node1 to node2
> > automatically, so that I would see the "My Test Site - node2" message.
> > 
> > 
> > 2- I started node1 again, but when I browse "IP:80" I can't
> > see the "My Test Site - node1" message.
> > 
> > # pcs cluster start node1
> > node1: Starting Cluster...
> > 
> > 
> > # pcs status
> > Cluster name: mycluster
> > Cluster Summary:
> >   * Stack: corosync
> >   * Current DC: node2 (version 2.0.5-10.fc33-ba59be7122) - partition with quorum
> >   * Last updated: Mon Mar 22 12:26:10 2021
> >   * Last change:  Mon Mar 22 12:08:02 2021 by root via cibadmin on node1
> >   * 2 nodes configured
> >   * 2 resource instances configured
> > 
> > Node List:
> >   * Online: [ node1 node2 ]
> > 
> > Full List of Resources:
> >   * WebSite    (ocf::heartbeat:apache):    Started node2
> >   * ClusterIP    (ocf::heartbeat:IPaddr2):    Started node2
> > 
> > Daemon Status:
> >   corosync: active/enabled
> >   pacemaker: active/enabled
> >   pcsd: active/enabled
> > 
> > 
> > 
> > Logs are:
> > https://paste.ubuntu.com/p/Yt4K2kPM7b/
> > 
> > 
> > Thank you again.
> > 
> > 
> > On Monday, March 22, 2021, 01:12:21 AM GMT+4:30, Reid Wahl <
> > nwahl at redhat.com> wrote: 
> > 
> > Hi, Jason.
> > 
> > On Sun, Mar 21, 2021 at 5:21 AM Jason Long <hack3rcon at yahoo.com>
> > wrote:
> > > Hello,
> > > I used "Clusters from Scratch" to configure two nodes. I got the
> > > error below:
> > > 
> > > # pcs status
> > > Cluster name: mycluster
> > > Cluster Summary:
> > >   * Stack: corosync
> > >   * Current DC: node1 (version 2.0.5-10.fc33-ba59be7122) - partition with quorum
> > >   * Last updated: Sun Mar 21 15:35:18 2021
> > >   * Last change:  Sun Mar 21 15:29:38 2021 by root via cibadmin on node1
> > >   * 2 nodes configured
> > >   * 2 resource instances configured
> > > 
> > > Node List:
> > >   * Online: [ node1 node2 ]
> > > 
> > > Full List of Resources:
> > >   * WebSite    (ocf::heartbeat:apache):    Stopped
> > >   * ClusterIP    (ocf::heartbeat:IPaddr2):    Started node1
> > > 
> > > Failed Resource Actions:
> > >   * WebSite_start_0 on node1 'error' (1): call=6, status='complete', exitreason='Failed to access httpd status page.', last-rc-change='2021-03-21 15:23:45 +03:30', queued=0ms, exec=1318ms
> > >   * WebSite_start_0 on node2 'error' (1): call=6, status='complete', exitreason='Failed to access httpd status page.', last-rc-change='2021-03-21 15:23:47 +03:30', queued=0ms, exec=1380ms
> > > 
> > > Daemon Status:
> > >   corosync: active/enabled
> > >   pacemaker: active/enabled
> > >   pcsd: active/enabled
> > > 
> > > 
> > > *********
> > > I have some questions:
> > > 
> > > 1- In "Chapter 6. Add Apache HTTP Server as a Cluster Service", an
> > > important note said:
> > > "Do not enable the httpd service. Services that are intended to be
> > > managed via the cluster software should never be managed by the OS.
> > > It is often useful, however, to manually start the service, verify
> > > that it works, then stop it again, before adding it to the cluster.
> > > This allows you to resolve any non-cluster-related problems before
> > > continuing. Since this is a simple example, we’ll skip that step
> > > here."
> > > 
> > > If the Apache service is not enabled, then how can I connect to it
> > > via the command below?
> > >   
> > > # wget -O - http://localhost/server-status
> > > --2021-03-21 15:38:39--  http://localhost/server-status
> > > Resolving localhost (localhost)... 127.0.0.1, ::1
> > > Connecting to localhost (localhost)|127.0.0.1|:80... failed: Connection timed out.
> > > Connecting to localhost (localhost)|::1|:80... failed: Network is unreachable.
> > 
> > Pacemaker starts the httpd service by starting the
> > ocf:heartbeat:apache resource. The article is saying that the
> > httpd.service systemd unit should not be enabled to start
> > automatically at boot; it should only start when the cluster starts
> > it. That is, `systemctl is-enabled httpd.service` should print
> > "disabled".
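
A rough sketch of that check as a small shell function, in case it is useful to run on each node. The function name unit_left_to_cluster is made up for illustration; only `systemctl is-enabled` itself comes from the text above.

```shell
# Hypothetical helper: succeed only when the unit is NOT set to start at boot,
# i.e. when starting and stopping it is left to the cluster.
unit_left_to_cluster() {
    # "systemctl is-enabled" exits non-zero for a disabled unit, so the
    # "|| true" keeps this safe under "set -e"; only the printed state matters.
    state=$(systemctl is-enabled "$1" 2>/dev/null || true)
    [ "$state" = "disabled" ]
}

# Example call:
#   unit_left_to_cluster httpd.service || echo "httpd.service is enabled at boot"
```
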
> > 
> > >   
> > > 
> > > 2- Must the commands below be run on both nodes, or on just one node?
> > > 
> > > # pcs resource create ClusterIP ocf:heartbeat:IPaddr2 \
> > >     ip="IP_That_Never_Used_In_The_Network" cidr_netmask=32 \
> > >     op monitor interval=30s
> > > 
> > > # pcs resource create WebSite ocf:heartbeat:apache \
> > >     configfile=/etc/httpd/conf/httpd.conf \
> > >     statusurl="http://localhost/server-status" op monitor interval=20s
> > 
> > Just one node.
> > 
> > >   
> > > 
> > > 3- Why "* WebSite    (ocf::heartbeat:apache):    Stopped" ?
> > 
> > The apache resource agent ran a command similar to
> > `wget -O- -q -L --no-proxy --bind-address=127.0.0.1 <status_url>`
> > and got an error. It tried this on a start operation on each node,
> > and it failed on both nodes. When a resource fails to start on a
> > given node, the default response is to prevent it from starting on
> > that node again until the failure is cleared.
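
For reference, a minimal sketch of inspecting and then clearing that failure with pcs, assuming the pcs 0.10 syntax shipped with this Fedora release. The function name clear_start_failure is hypothetical. Clearing the failure does not fix whatever made the status page unreachable, so the start may simply fail again until that is resolved.

```shell
# Hypothetical helper: show the recorded fail counts for a resource, then
# clear its failure history so the cluster may try to start it again.
# $1 is the resource name, e.g. WebSite.
clear_start_failure() {
    pcs resource failcount show "$1" || return
    pcs resource cleanup "$1"
}

# Example call, run on one node only:
#   clear_start_failure WebSite
```
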
> > 
> > 
> > 
> > >   
> > > Logs are:
> > > https://paste.ubuntu.com/p/MtkfXyRX4P/
> > > 
> > > 
> > > Thank you.
> > > 
-- 
Ken Gaillot <kgaillot at redhat.com>