[ClusterLabs] newbie questions
Digimer
lists at alteeve.ca
Wed Jun 1 03:34:09 UTC 2016
On 31/05/16 10:41 PM, Jay Scott wrote:
> hooray for me, but, how?
>
> I got about 3/4 of Digimer's list done and got stuck.
> I did a pcs cluster status, and, behold, the cluster was up.
> I pinged the ClusterIP and it answered. I didn't know what
> to do with the 'delay="x"' part, that's the thing I couldn't figure
> out. (I've been assuming the delay part is a big deal.)
Delay works like this:
Both nodes are up, but comms break (switch loop/broadcast storm,
STP/stack renegotiation, iptables oops, whatever)... Both nodes declare
their peer lost.
The stonith config used to fence node 1 includes 'delay="15"'.
Node 1 looks up how to fence node 2, calls the fence agent.
Node 2 looks up how to fence node 1, calls the fence agent (passing to
the agent the delay).
The fence agent running on node 1 executes without delay.
The fence agent running on node 2 sees a delay of 15 seconds, and sleeps.
Node 1 kills node 2 before the sleep exits, thus ensuring that node 1
lived and node 2 died. Assuming you have your services on node 1, then
that means no recovery is needed.
Now assume that node 1 truly died. Node 2's fence agent would exit the
sleep after 15 seconds and proceed to shoot node 1 and then recover any
resources that had been on node 1.
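
If it helps to see the race spelled out, here is a rough sketch in Python.
It is only a toy model of the two cases above, not how pacemaker or any
fence agent is implemented. The function name, the node names and the
2-second 'action_time' (how long the power-off itself takes) are all
made up for illustration.

```python
def fence_race(delays, alive, action_time=2.0):
    """Toy model of a two-node fence race.

    delays      -- maps a target node to the delay (seconds) set on the
                   stonith device that fences that target
    alive       -- set of nodes that are really still running (non-empty)
    action_time -- assumed time for the fence action itself (hypothetical)

    Returns (survivors, seconds until the first fence completes).
    """
    nodes = ("node1", "node2")
    finish = {}
    for attacker in alive:
        # Each live node targets its peer. Its agent first sleeps for the
        # delay configured for that target, then performs the fence.
        target = nodes[1] if attacker == nodes[0] else nodes[0]
        finish[attacker] = delays.get(target, 0) + action_time

    first = min(finish.values())
    winners = {node for node, t in finish.items() if t == first}
    if len(winners) == 2:
        # No delay on either side: both fences land together, both nodes die.
        return set(), first
    # The faster node kills its peer before the peer's sleep exits.
    return winners, first


# Comms break, both nodes alive, delay="15" protects node 1.
print(fence_race({"node1": 15}, {"node1", "node2"}))
# Node 1 truly died: node 2 waits out the delay, then fences it.
print(fence_race({"node1": 15}, {"node2"}))
```

With no delay configured on either side, the model reports no survivors,
which is the dual-fence outcome the delay is there to avoid.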
digimer
> However, there are more things for me to read and more experiments
> for me to try so I'm good for now.
>
> Thanks to everyone for the prompt help.
>
> j.
>
> On Tue, May 31, 2016 at 5:22 PM, Ken Gaillot <kgaillot at redhat.com
> <mailto:kgaillot at redhat.com>> wrote:
>
> On 05/31/2016 03:59 PM, Jay Scott wrote:
> > Greetings,
> >
> > Cluster newbie
> > Centos 7
> > trying to follow the "Clusters from Scratch" intro.
> > 2 nodes (yeah, I know, but I'm just learning)
> > <PRE>
> > [root@smoking ~]# pcs status
> > Cluster name:
> > Last updated: Tue May 31 15:32:18 2016 Last change: Tue May 31 15:02:21 2016 by root via cibadmin on smoking
> > Stack: unknown
>
> "Stack: unknown" is a big problem. The cluster isn't aware of the
> corosync configuration. Did you do the "pcs cluster setup" step?
>
> > Current DC: NONE
> > 2 nodes and 1 resource configured
> >
> > OFFLINE: [ mars smoking ]
> >
> > Full list of resources:
> >
> > ClusterIP (ocf::heartbeat:IPaddr2): Stopped
> >
> > PCSD Status:
> > smoking: Online
> > mars: Online
> >
> > Daemon Status:
> > corosync: active/enabled
> > pacemaker: active/enabled
> > pcsd: active/enabled
> > </PRE>
> >
> > What concerns me at the moment:
> > I did
> > pcs resource enable ClusterIP
> > while simultaneously doing
> > tail -f /var/log/cluster/corosync.log
> > (the only log in there)
>
> The system log (/var/log/messages or whatever your system has
> configured) is usually the best place to start. The cluster software
> sends messages of interest to end users there, and it includes messages
> from all components (corosync, pacemaker, resource agents, etc.).
>
> /var/log/cluster/corosync.log (and in some configurations,
> /var/log/pacemaker.log) have more detailed log information for
> debugging.
>
> > and nothing happens in the log, but the ClusterIP
> > stays "Stopped". Should I be able to ping that addr?
> > I can't.
> > It also says OFFLINE: and both of my machines are offline,
> > though the PCSD says they're online. Which do I trust?
>
> The first online/offline output is most important, and refers to the
> node's status in the actual cluster; the "PCSD" online/offline output
> simply tells whether the pcs daemon is running. Typically, the pcs
> daemon is enabled at boot and is always running. The pcs daemon is not
> part of the clustering itself; it's a front end to configuring and
> administering the cluster.
>
> > [root@smoking ~]# pcs property show stonith-enabled
> > Cluster Properties:
> > stonith-enabled: false
> >
> > yet I see entries in the corosync.log referring to stonith.
> > I'm guessing that's normal.
>
> Yes, you can enable stonith at any time, so the stonith daemon will
> still run, to stay aware of the cluster status.
>
> > My corosync.conf file says the quorum is off.
> >
> > I also don't know what to include in this for any of you to
> > help me debug.
> >
> > Ahh, also, is this considered "long", and if so, where would I post
> > to the web?
> >
> > thx.
> >
> > j.
>
> _______________________________________________
> Users mailing list: Users at clusterlabs.org <mailto:Users at clusterlabs.org>
> http://clusterlabs.org/mailman/listinfo/users
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org
--
Digimer
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without
access to education?