[Pacemaker] ERROR: Unable to find nic or netmask.

Tue Sep 2 01:47:18 EDT 2014

Hi,

After some investigation, it seems that Apache is having trouble
starting on both nodes. I get the following error message when I try to
restart the service:

Job for httpd.service failed. See 'systemctl status httpd.service' and
'journalctl -xn' for details.

"systemctl status httpd.service" shows the following output:

httpd.service - The Apache HTTP Server
   Loaded: loaded (/usr/lib/systemd/system/httpd.service; enabled)
   Active: failed (Result: exit-code) since Tue 2014-09-02 13:45:52 SGT; 8s ago
  Process: 26095 ExecStop=/bin/kill -WINCH ${MAINPID} (code=exited, status=0/SUCCESS)
  Process: 26093 ExecStart=/usr/sbin/httpd $OPTIONS -DFOREGROUND (code=exited, status=1/FAILURE)
 Main PID: 26093 (code=exited, status=1/FAILURE)

Sep 02 13:45:52 node02 httpd[26093]: AH00558: httpd: Could not reliably det...ge
Sep 02 13:45:52 node02 httpd[26093]: (98)Address already in use: AH00072: m...80
Sep 02 13:45:52 node02 httpd[26093]: no listening sockets available, shutti...wn
Sep 02 13:45:52 node02 httpd[26093]: AH00015: Unable to open logs
Sep 02 13:45:52 node02 systemd[1]: httpd.service: main process exited, code...RE
Sep 02 13:45:52 node02 systemd[1]: Failed to start The Apache HTTP Server.
Sep 02 13:45:52 node02 systemd[1]: Unit httpd.service entered failed state.
Hint: Some lines were ellipsized, use -l to show in full.

/var/log/messages shows similar messages:

Sep  2 13:41:12 node02 systemd: Starting The Apache HTTP Server...
Sep  2 13:41:12 node02 httpd: AH00558: httpd: Could not reliably determine the server's fully qualified domain name, using 192.168.0.112. Set the 'ServerName' directive globally to suppress this message
Sep  2 13:41:12 node02 httpd: (98)Address already in use: AH00072: make_sock: could not bind to address 127.0.0.1:80
Sep  2 13:41:12 node02 httpd: no listening sockets available, shutting down
Sep  2 13:41:12 node02 httpd: AH00015: Unable to open logs
Sep  2 13:41:12 node02 systemd: httpd.service: main process exited, code=exited, status=1/FAILURE
Sep  2 13:41:12 node02 systemd: Failed to start The Apache HTTP Server.
Sep  2 13:41:12 node02 systemd: Unit httpd.service entered failed state.

Is this related to the problem?
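
One way to narrow this down is to see what already holds the address that
httpd failed to bind. The sketch below is only an illustration: the function
names failed_bind_addresses and listeners_on_port are made up for this
example, and it assumes iproute2's ss is installed and that the log lines
have the form shown above. If the listener turns out to be an httpd started
by the cluster's webserver resource (an assumption, not something the logs
confirm), a manual restart through systemctl would be expected to fail in
this way.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of httpd or pcs.

# Read log lines on stdin and print each distinct address that httpd
# reported it "could not bind to".
failed_bind_addresses() {
    sed -n 's/.*could not bind to address \([^ ]*\).*/\1/p' | sort -u
}

# Show listening TCP sockets on local port $1. Run as root to also see
# the name and PID of the owning process.
listeners_on_port() {
    ss -ltnp "sport = :$1"
}
```

For example, "failed_bind_addresses < /var/log/messages" lists the addresses
that could not be bound, and "listeners_on_port 80" shows which process
currently owns the socket.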

On Tue, Sep 2, 2014 at 12:42 PM, Teerapatr Kittiratanachai <maillist.tk at gmail.com> wrote:

> Try to set cidr_netmask=32 for resource only, and let the physical
> interface's netmask be 24.
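
The suggestion above could be applied roughly as follows. This is a sketch,
not a verified procedure: the helper names are hypothetical, and it assumes
the resource is still called virtual_ip, as in the original
"pcs resource create" command further down the thread.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical.

# Read `ip -o -4 addr show` output on stdin and print
# "<interface> <prefix length>" for each IPv4 address.
iface_prefixes() {
    awk '{ split($4, a, "/"); print $2, a[2] }'
}

# Set the prefix length used by the virtual_ip resource (for example 32).
# This leaves the netmask of the physical interface untouched.
set_vip_prefix() {
    pcs resource update virtual_ip cidr_netmask="$1"
}
```

Running "ip -o -4 addr show | iface_prefixes" on each node is a quick way to
confirm that the physical interface reports 24 before calling
"set_vip_prefix 32" on one node.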
>
> On Tue, Sep 2, 2014 at 11:27 AM, Sihan Goi <goister at gmail.com> wrote:
> > Got it. Changed the netmask for both PCs to 255.255.255.0 and changed
> > cidr_netmask to 24 and it works...sort of.
> >
> > It was working for a while, and then I rebooted both PCs, and now each
> > thinks it's online and the other is offline.
> >
> > "pcs status" on my node01 gives the following output:
> > Cluster name: cluster_web
> > Last updated: Tue Sep  2 12:21:25 2014
> > Last change: Tue Sep  2 12:13:27 2014 via cibadmin on node02
> > Stack: corosync
> > Current DC: node01 (1) - partition WITHOUT quorum
> > Version: 1.1.10-32.el7_0-368c726
> > 2 Nodes configured
> > 2 Resources configured
> >
> >
> > Online: [ node01 ]
> > OFFLINE: [ node02 ]
> >
> > Full list of resources:
> >
> >  virtual_ip    (ocf::heartbeat:IPaddr2):    Started node01
> >  webserver    (ocf::heartbeat:apache):    Started node01
> >
> > PCSD Status:
> >   node01: Offline
> >   node02: Online
> >
> > Daemon Status:
> >   corosync: active/disabled
> >   pacemaker: active/disabled
> >   pcsd: active/disabled
> >
> > However, "pcs status" on node02 shows the following output:
> > Cluster name: cluster_web
> > Last updated: Tue Sep  2 12:20:41 2014
> > Last change: Tue Sep  2 11:59:03 2014 via cibadmin on node02
> > Stack: corosync
> > Current DC: node02 (2) - partition WITHOUT quorum
> > Version: 1.1.10-32.el7_0-368c726
> > 2 Nodes configured
> > 2 Resources configured
> >
> >
> > Online: [ node02 ]
> > OFFLINE: [ node01 ]
> >
> > Full list of resources:
> >
> >  virtual_ip    (ocf::heartbeat:IPaddr2):    Started node02
> >  webserver    (ocf::heartbeat:apache):    Started node02
> >
> > PCSD Status:
> >   node01: Offline
> >   node02: Online
> >
> > Daemon Status:
> >   corosync: active/disabled
> >   pacemaker: active/disabled
> >   pcsd: active/disabled
> >
> > Seems like each node thinks it's online and the other is not. I'm running
> > HA on an Apache web server, and if I access the webpage on node01, I get
> > node01's index.html. If I access it on node02, I get node02's index.html.
> > If I access it via another PC connected to the same AP, the webpage is
> > unavailable.
> >
> > What could be wrong?
> >
> >
> > On Mon, Sep 1, 2014 at 9:09 PM, John Lauro <john.lauro at covenanteyes.com> wrote:
> >>
> >> ip=192.168.0.110 cidr_netmask=32
> >> /32 leaves no room for any other IP addresses on that interface and so
> >> you have to specify the nic.  Are you certain 192.168.0.111 and
> >> 192.168.0.112 do not have a different netmask from 255.255.255.255, like
> >> 255.255.255.0 for /24 or 255.255.0.0 for /16?  If they do have
> >> 255.255.255.255 too, then they are probably not set up correctly...
> >>
> >> PS: cidr_netmask is optional.  Assuming a proper netmask (not
> >> 255.255.255.255) is on 192.168.0.111 and 192.168.0.112 it should work
> >> without specifying cidr_netmask.
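
To make the /32 versus /24 point concrete, here is a minimal sketch of the
arithmetic behind it. The function names are invented for this example and
are not part of pcs or the IPaddr2 agent. It only converts a prefix length
to a dotted netmask and counts the addresses that prefix covers.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical.

# Convert a CIDR prefix length (0-32) to a dotted-quad netmask.
prefix_to_netmask() {
    prefix=$1
    # Shift the all-ones mask left and keep only the low 32 bits.
    mask=$(( (0xFFFFFFFF << (32 - prefix)) & 0xFFFFFFFF ))
    printf '%d.%d.%d.%d\n' \
        $(( (mask >> 24) & 255 )) $(( (mask >> 16) & 255 )) \
        $(( (mask >> 8) & 255 )) $(( mask & 255 ))
}

# Number of addresses covered by a prefix of the given length.
addresses_in_prefix() {
    echo $(( 1 << (32 - $1) ))
}
```

With these helpers, a prefix of 32 maps to 255.255.255.255 and covers a
single address, while a prefix of 24 maps to 255.255.255.0 and covers 256
addresses, which is why 192.168.0.110, .111 and .112 can share one network
only under the shorter prefix.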
> >>
> >>
> >> ________________________________
> >>
> >> From: "Sihan Goi" <goister at gmail.com>
> >> To: pacemaker at oss.clusterlabs.org
> >> Sent: Monday, September 1, 2014 4:17:20 AM
> >> Subject: [Pacemaker] ERROR: Unable to find nic or netmask.
> >>
> >>
> >> Hi,
> >>
> >> I'm trying to create an HA cluster with 2 CentOS 7 PCs connected to a
> >> wireless AP. The PCs have the static IP addresses 192.168.0.111 and
> >> 192.168.0.112 and the hostnames node01 and node02 respectively.
> >>
> >> I've tried to create a virtual IP address of 192.168.0.110 using the
> >> following command:
> >>
> >> pcs resource create virtual_ip ocf:heartbeat:IPaddr2 ip=192.168.0.110 cidr_netmask=32 op monitor interval=30s
> >>
> >> However, when I do a "pcs status resources" I get the following output:
> >>
> >>  virtual_ip    (ocf::heartbeat:IPaddr2):    Stopped
> >>
> >> The virtual IP is stopped rather than started. I looked into
> >> /var/log/messages and /var/log/pacemaker.log and found the following
> >> error messages:
> >>
> >> node02 IPaddr2(virtual_ip)[25451]: ERROR: Unable to find nic or netmask.
> >> node02 IPaddr2(virtual_ip)[25451]: ERROR: [findif] failed
> >>
> >> It seems that it's unable to find my nic. How can I fix this?
> >>
> >> Thanks.
> >>
> >> _______________________________________________
> >> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
> >> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
> >>
> >> Project Home: http://www.clusterlabs.org
> >> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> >> Bugs: http://bugs.clusterlabs.org
> >>
> >>
> >
> >
> >
> > --
> > - Goi Sihan
> > goister at gmail.com
> >
>

-- 
- Goi Sihan
goister at gmail.com