[ClusterLabs] Antw: Re: Dual Primary DRBD + OCFS2 (elias)
Ulrich Windl
Ulrich.Windl at rz.uni-regensburg.de
Wed Nov 20 06:29:57 EST 2019
Maybe show what you did. Did DLM start successfully?
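
For reference, the pattern in the SUSE doc is a cloned group of the DLM resource
(ocf:pacemaker:controld) plus the OCFS2 Filesystem resource. A minimal crmsh
sketch, with device and mount point as placeholders for your DRBD setup, would
look roughly like this:

  # entered via "crm configure"; /dev/drbd0 and /srv/ocfs2 are placeholders
  primitive dlm ocf:pacemaker:controld \
      op monitor interval=60 timeout=60
  primitive ocfs2-fs ocf:heartbeat:Filesystem \
      params device="/dev/drbd0" directory="/srv/ocfs2" fstype="ocfs2" \
      op monitor interval=20 timeout=40
  group base-group dlm ocfs2-fs
  clone base-clone base-group meta interleave=true

If dlm_controld itself never gets running, the OCFS2 mount cannot work, so its
log output would be the first thing to check.
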
>>> Ilya Nasonov <elias at po-mayak.ru> wrote on 20.11.2019 at 06:12 in
message
<20191120051305.052936005F7 at iwtm.local>:
> Thanks Roger!
>
> I configured it according to the SUSE doc for OCFS2, but the DLM resource stops
> with error -107 (no interface found).
> I think it is necessary to configure the OCFS2 cluster manually, but I would
> like to do it correctly through the Pacemaker RA.
>
> Ilya Nasonov
> elias at po-mayak
>
> From: users-request at clusterlabs.org
> Sent: 19 November 2019 at 19:32
> To: users at clusterlabs.org
> Subject: Users Digest, Vol 58, Issue 20
>
> Send Users mailing list submissions to
> users at clusterlabs.org
>
> To subscribe or unsubscribe via the World Wide Web, visit
> https://lists.clusterlabs.org/mailman/listinfo/users
> or, via email, send a message with subject or body 'help' to
> users-request at clusterlabs.org
>
> You can reach the person managing the list at
> users-owner at clusterlabs.org
>
> When replying, please edit your Subject line so it is more specific
> than "Re: Contents of Users digest..."
>
>
> Today's Topics:
>
> 1. Re: Antw: Re: Pacemaker 2.0.3-rc3 now available
> (Jehan-Guillaume de Rorthais)
> 2. corosync 3.0.1 on Debian/Buster reports some MTU errors
> (Jean-Francois Malouin)
> 3. Dual Primary DRBD + OCFS2 (Ilya Nasonov)
> 4. Re: Dual Primary DRBD + OCFS2 (Roger Zhou)
> 5. Q: ldirectord and "checktype = external-perl" broken?
> (Ulrich Windl)
> 6. Q: ocf:pacemaker:ping (Ulrich Windl)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Mon, 18 Nov 2019 18:13:57 +0100
> From: Jehan-Guillaume de Rorthais <jgdr at dalibo.com>
> To: Ken Gaillot <kgaillot at redhat.com>
> Cc: Cluster Labs - All topics related to open-source clustering
> welcomed <users at clusterlabs.org>
> Subject: Re: [ClusterLabs] Antw: Re: Pacemaker 2.0.3-rc3 now
> available
> Message-ID: <20191118181357.6899c051 at firost>
> Content-Type: text/plain; charset=UTF-8
>
> On Mon, 18 Nov 2019 10:45:25 -0600
> Ken Gaillot <kgaillot at redhat.com> wrote:
>
>> On Fri, 2019-11-15 at 14:35 +0100, Jehan-Guillaume de Rorthais wrote:
>> > On Thu, 14 Nov 2019 11:09:57 -0600
>> > Ken Gaillot <kgaillot at redhat.com> wrote:
>> >
>> > > On Thu, 2019-11-14 at 15:22 +0100, Ulrich Windl wrote:
>> > > > >>> Jehan-Guillaume de Rorthais <jgdr at dalibo.com> wrote on
>> > > > >>> 14.11.2019 at 15:17 in
>> > > > message <20191114151719.6cbf4e38 at firost>:
>> > > > > On Wed, 13 Nov 2019 17:30:31 -0600
>> > > > > Ken Gaillot <kgaillot at redhat.com> wrote:
>> > > > > ...
>> > > > > > A longstanding pain point in the logs has been improved.
>> > > > > > Whenever
>> > > > > > the
>> > > > > > scheduler processes resource history, it logs a warning for
>> > > > > > any
>> > > > > > failures it finds, regardless of whether they are new or old,
>> > > > > > which can
>> > > > > > confuse anyone reading the logs. Now, the log will contain
>> > > > > > the
>> > > > > > time of
>> > > > > > the failure, so it's obvious whether you're seeing the same
>> > > > > > event
>> > > > > > or
>> > > > > > not. The log will also contain the exit reason if one was
>> > > > > > provided by
>> > > > > > the resource agent, for easier troubleshooting.
>> > > > >
>> > > > > I've been hurt by this in the past and I was wondering what was
>> > > > > the
>> > > > > point of
>> > > > > warning again and again in the logs for past failures during
>> > > > > scheduling?
>> > > > > What does this information bring to the administrator?
>> > >
>> > > The controller will log an event just once, when it happens.
>> > >
>> > > The scheduler, on the other hand, uses the entire recorded resource
>> > > history to determine the current resource state. Old failures (that
>> > > haven't been cleaned) must be taken into account.
>> >
>> > OK, I wasn't aware of this. If you have a few minutes, I would be
>> > interested to know why the full history is needed rather than just the
>> > latest entry. Or maybe there are some comments in the source code that
>> > already cover this question?
>>
>> The full *recorded* history consists of the most recent operation that
>> affects the state (like start/stop/promote/demote), the most recent
>> failed operation, and the most recent results of any recurring
>> monitors.
>>
>> For example there may be a failed monitor, but whether the resource is
>> considered failed or not would depend on whether there was a more
>> recent successful stop or start. Even if the failed monitor has been
>> superseded, it needs to stay in the history for display purposes until
>> the user has cleaned it up.
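
As an aside, once such a superseded failure is no longer interesting, it can be
cleared from the recorded history with the usual cleanup command, e.g. (resource
and node names are placeholders):

  crm_resource --cleanup --resource my-rsc --node node1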
>
> OK, understood.
>
> Maybe that's why "FAILED" appears briefly in crm_mon during a resource move on
> a clean resource that has past failures? Maybe I should dig into this weird
> behavior and write up a bug report if I confirm it?
>
>> > > Every run of the scheduler is completely independent, so it doesn't
>> > > know about any earlier runs or what they logged. Think of it like
>> > > Frosty the Snowman saying "Happy Birthday!" every time his hat is
>> > > put
>> > > on.
>> >
>> > I don't have this ref :)
>>
>> I figured not everybody would, but it was too fun to pass up :)
>>
>> The snowman comes to life every time his magic hat is put on, but to
>> him each time feels like he's being born for the first time, so he says
>> "Happy Birthday!"
>>
>> https://www.youtube.com/watch?v=1PbWTEYoN8o
>
> heh :)
>
>> > > As far as each run is concerned, it is the first time it's seen the
>> > > history. This is what allows the DC role to move from node to node,
>> > > and
>> > > the scheduler to be run as a simulation using a saved CIB file.
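
As a side note, that is also what makes offline replay possible: a saved
scheduler input can be fed back in with something like the following (the file
name is only an example):

  crm_simulate --simulate --xml-file /var/lib/pacemaker/pengine/pe-input-42.bz2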
>> > >
>> > > We could change the wording further if necessary. The previous
>> > > version
>> > > would log something like:
>> > >
>> > > warning: Processing failed monitor of my-rsc on node1: not running
>> > >
>> > > and this latest change will log it like:
>> > >
>> > > warning: Unexpected result (not running: No process state file
>> > > found)
>> > > was recorded for monitor of my-rsc on node1 at Nov 12 19:19:02 2019
>> >
>> > /result/state/ ?
>>
>> It's the result of a resource agent action, so it could be for example
>> a timeout or a permissions issue.
>
> ok
>
>> > > I wanted to be explicit about the message being about processing
>> > > resource history that may or may not be the first time it's been
>> > > processed and logged, but everything I came up with seemed too long
>> > > for
>> > > a log line. Another possibility might be something like:
>> > >
>> > > warning: Using my-rsc history to determine its current state on
>> > > node1:
>> > > Unexpected result (not running: No process state file found) was
>> > > recorded for monitor at Nov 12 19:19:02 2019
>> >
>> > I like the first one better.
>> >
>> > However, it feels like implementation details exposed to the world,
>> > doesn't it? How useful is this information for the end user? What can the
>> > user do with this information? There's nothing to fix, and this is not
>> > actually an error in the currently running process.
>> >
>> > I still fail to understand why the scheduler doesn't process the history
>> > silently, whatever it finds there, and then warn about something really
>> > important only if the final result is not what was expected...
>>
>> From the scheduler's point of view, it's all relevant information that
>> goes into the decision making. Even an old failure can cause new
>> actions, for example if quorum was not held at the time but has now
>> been reached, or if there is a failure-timeout that just expired. So
>> any failure history is important to understanding whatever the
>> scheduler says needs to be done.
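
For instance, that expiry is driven by the failure-timeout meta attribute, which
could be set per resource with something like this (resource name and value are
placeholders):

  crm_resource --resource my-rsc --meta --set-parameter failure-timeout --parameter-value 10min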
>>
>> Also, the scheduler is run on the DC, which is not necessarily the node
>> that executed the action. So it's useful for troubleshooting to present
>> a picture of the whole cluster on the DC, rather than just what's the
>> situation on the local node.
>
> OK, kind of got it. The scheduler needs to summarize the chain of events to
> determine the state of a resource based on the last event.
>
>> I could see an argument for lowering it from warning to notice, but
>> it's a balance between what's most useful during normal operation and
>> what's most useful during troubleshooting.
>
> So in my humble opinion, the messages should definitely be at notice level.
> Maybe they should even go to debug level. I never had to troubleshoot a bad
> decision from the scheduler because of a bad state summary.
> Moreover, if needed, the admin can still study the history from the CIB backed
> up on disk, can't they?
>
> An alternative would be to spit out the event chain in detail only if the
> result of the summary is different from what the scheduler was expecting?
>
>
> ------------------------------
>
> Message: 2
> Date: Mon, 18 Nov 2019 16:31:34 -0500
> From: Jean-Francois Malouin <Jean-Francois.Malouin at bic.mni.mcgill.ca>
> To: The Pacemaker Cluster List <users at clusterlabs.org>
> Subject: [ClusterLabs] corosync 3.0.1 on Debian/Buster reports some
> MTU errors
> Message-ID: <20191118213134.huecj2xnbtrtdqmm at bic.mni.mcgill.ca>
> Content-Type: text/plain; charset=us-ascii
>
> Hi,
>
> Maybe not directly a Pacemaker question, but maybe some of you have seen this
> problem:
>
> A 2-node Pacemaker cluster running corosync 3.0.1 with dual communication
> rings sometimes reports errors like this in the corosync log file:
>
> [KNET ] pmtud: PMTUD link change for host: 2 link: 0 from 470 to 1366
> [KNET ] pmtud: PMTUD link change for host: 2 link: 1 from 470 to 1366
> [KNET ] pmtud: Global data MTU changed to: 1366
> [CFG ] Modified entry 'totem.netmtu' in corosync.conf cannot be changed at
> run-time
> [CFG ] Modified entry 'totem.netmtu' in corosync.conf cannot be changed at
> run-time
>
> Those do not happen very frequently, once a week or so...
>
> However, the system log on the nodes reports those much more frequently, a
> few times a day:
>
> Nov 17 23:26:20 node1 corosync[2258]: [KNET ] link: host: 2 link: 1 is
> down
> Nov 17 23:26:20 node1 corosync[2258]: [KNET ] host: host: 2 (passive)
> best link: 0 (pri: 0)
> Nov 17 23:26:26 node1 corosync[2258]: [KNET ] rx: host: 2 link: 1 is up
> Nov 17 23:26:26 node1 corosync[2258]: [KNET ] host: host: 2 (passive)
> best link: 1 (pri: 1)
>
> Are those to be dismissed, or are they indicative of a network
> misconfiguration/problem?
> I tried setting 'knet_transport: udpu' in the totem section (the default
> value), but it didn't seem to make a difference... Hard-coding netmtu to 1500
> and allowing a longer (10s) token timeout also didn't seem to affect the
> issue.
>
>
> Corosync config follows:
>
> /etc/corosync/corosync.conf
>
> totem {
>     version: 2
>     cluster_name: bicha
>     transport: knet
>     link_mode: passive
>     ip_version: ipv4
>     token: 10000
>     netmtu: 1500
>     knet_transport: sctp
>     crypto_model: openssl
>     crypto_hash: sha256
>     crypto_cipher: aes256
>     keyfile: /etc/corosync/authkey
>     interface {
>         linknumber: 0
>         knet_transport: udp
>         knet_link_priority: 0
>     }
>     interface {
>         linknumber: 1
>         knet_transport: udp
>         knet_link_priority: 1
>     }
> }
> quorum {
>     provider: corosync_votequorum
>     two_node: 1
>     # expected_votes: 2
> }
> nodelist {
>     node {
>         ring0_addr: xxx.xxx.xxx.xxx
>         ring1_addr: zzz.zzz.zzz.zzx
>         name: node1
>         nodeid: 1
>     }
>     node {
>         ring0_addr: xxx.xxx.xxx.xxy
>         ring1_addr: zzz.zzz.zzz.zzy
>         name: node2
>         nodeid: 2
>     }
> }
> logging {
>     to_logfile: yes
>     to_syslog: yes
>     logfile: /var/log/corosync/corosync.log
>     syslog_facility: daemon
>     debug: off
>     timestamp: on
>     logger_subsys {
>         subsys: QUORUM
>         debug: off
>     }
> }
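
In case it helps while narrowing this down: the per-link status can be checked
at run time with standard corosync 3.x tools (nothing specific to this cluster),
and since the CFG message says totem.netmtu cannot be changed at run time, a
changed netmtu only takes effect after corosync is restarted.

  corosync-cfgtool -s                    # status of every configured knet link
  corosync-cmapctl | grep totem.netmtu   # the statically configured MTU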
>
>
> ------------------------------
>
> Message: 3
> Date: Tue, 19 Nov 2019 13:51:59 +0500
> From: Ilya Nasonov <elias at po-mayak.ru>
> To: "users at clusterlabs.org" <users at clusterlabs.org>
> Subject: [ClusterLabs] Dual Primary DRBD + OCFS2
> Message-ID: <20191119085203.2771960014A at iwtm.local>
> Content-Type: text/plain; charset="utf-8"
>
> Hello!
>
> I configured a cluster (2-node DRBD+DLM+CFS2) and it works.
> I heard the opinion that the OCFS2 file system is better. I found an old cluster
> setup description:
> https://wiki.clusterlabs.org/wiki/Dual_Primary_DRBD_%2B_OCFS2
> but as I understand it, the o2cb service is not supported by Pacemaker on Debian.
> Where can I get the latest information on setting up OCFS2?
>
> Best regards,
> Ilya Nasonov
> elias at po-mayak
>
>
> ------------------------------
>
> Message: 4
> Date: Tue, 19 Nov 2019 10:01:01 +0000
> From: Roger Zhou <ZZhou at suse.com>
> To: "users at clusterlabs.org" <users at clusterlabs.org>
> Subject: Re: [ClusterLabs] Dual Primary DRBD + OCFS2
> Message-ID: <572e29b1-4c05-a985-7419-462310d1c626 at suse.com>
> Content-Type: text/plain; charset="utf-8"
>
>
> On 11/19/19 4:51 PM, Ilya Nasonov wrote:
>> Hello!
>>
>> I configured a cluster (2-node DRBD+DLM+CFS2) and it works.
>>
>> I heard the opinion that the OCFS2 file system is better. I found an old
>> cluster setup
>> description: https://wiki.clusterlabs.org/wiki/Dual_Primary_DRBD_%2B_OCFS2
>>
>> but as I understand it, the o2cb service is not supported by Pacemaker on Debian.
>>
>> Where can I get the latest information on setting up OCFS2?
>
> You can probably refer to the SUSE doc for OCFS2 with Pacemaker [1]. It should
> not be much different to adapt to Debian, I feel.
>
> [1] https://documentation.suse.com/sle-ha/15-SP1/html/SLE-HA-all/cha-ha-ocfs2.html
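
As far as I know, one detail that matters when o2cb is not used: the volume has
to be formatted for the Pacemaker cluster stack rather than the default o2cb
stack, roughly like this (device and cluster name are placeholders):

  mkfs.ocfs2 --cluster-stack=pcmk --cluster-name=mycluster /dev/drbd0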
>
> Cheers,
> Roger
>
>
>>
>> Best regards,
>> Ilya Nasonov
>> elias at po-mayak
>>
>>
>> _______________________________________________
>> Manage your subscription:
>> https://lists.clusterlabs.org/mailman/listinfo/users
>>
>> ClusterLabs home: https://www.clusterlabs.org/
>>
>
> ------------------------------
>
> Message: 5
> Date: Tue, 19 Nov 2019 14:58:08 +0100
> From: "Ulrich Windl" <Ulrich.Windl at rz.uni-regensburg.de>
> To: <users at clusterlabs.org>
> Subject: [ClusterLabs] Q: ldirectord and "checktype = external-perl"
> broken?
> Message-ID: <5DD3F4F0020000A10003544E at gwsmtp.uni-regensburg.de>
> Content-Type: text/plain; charset=US-ASCII
>
> Hi!
>
> In SLES11 I developed a special check program for ldirectord 3.9.5 in
> Perl, but then I discovered that it would not work correctly with "checktype =
> external-perl". Changing to "checktype = external" made it work.
> Today I played with it in SLES12 SP4 and
> ldirectord-4.3.018.a7fb5035-3.25.1.18557.0.PTF.1153889.x86_64, just to
> discover that it still does not work.
>
> So I wonder: has it really been broken all along, or is there something special
> to consider that isn't written in the manual page?
>
> The observable effect is that the weight is set to 0 right after starting
> with weight = 1. If it works, the weight is set to 1.
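
For comparison, the plain external variant (the one that did work) looks roughly
like this in ldirectord.cf (addresses, port and script path are placeholders);
as far as I recall, ldirectord passes the virtual and real server address/port
to the script as arguments and treats exit code 0 as "alive":

  virtual=192.168.1.100:80
      real=192.168.1.10:80 gate 1
      checktype=external
      checkcommand=/usr/local/bin/my-check.pl
      checktimeout=5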
>
> Regards,
> Ulrich
>
>
>
>
>
> ------------------------------
>
> Message: 6
> Date: Tue, 19 Nov 2019 15:32:43 +0100
> From: "Ulrich Windl" <Ulrich.Windl at rz.uni-regensburg.de>
> To: <users at clusterlabs.org>
> Subject: [ClusterLabs] Q: ocf:pacemaker:ping
> Message-ID: <5DD3FD0B020000A100035452 at gwsmtp.uni-regensburg.de>
> Content-Type: text/plain; charset=US-ASCII
>
> Hi!
>
> It seems today I'm digging out old stuff:
> I can remember that in 2011 the documentation for ping's dampen was not very
> helpful. I think it still isn't:
>
> (RA info)
> node connectivity (ocf:pacemaker:ping)
>
> Every time the monitor action is run, this resource agent records (in the
> CIB) the current number of nodes the host can connect to using the system
> fping (preferred) or ping tool.
>
> Parameters (*: required, []: default):
>
> pidfile (string, [/var/run/ping-ping]):
> PID file
>
> dampen (integer, [5s]): Dampening interval
> The time to wait (dampening) further changes occur
>
> name (string, [pingd]): Attribute name
> The name of the attributes to set. This is the name to be used in the
> constraints.
>
> multiplier (integer, [1]): Value multiplier
> The number by which to multiply the number of connected ping nodes by
>
> host_list* (string): Host list
> A space separated list of ping nodes to count.
>
> attempts (integer, [3]): no. of ping attempts
> Number of ping attempts, per host, before declaring it dead
>
> timeout (integer, [2]): ping timeout in seconds
> How long, in seconds, to wait before declaring a ping lost
>
> options (string): Extra Options
> A catch all for any other options that need to be passed to ping.
>
> failure_score (integer):
> Resource is failed if the score is less than failure_score.
> Default never fails.
>
> use_fping (boolean, [1]): Use fping if available
> Use fping rather than ping, if found. If set to 0, fping
> will not be used even if present.
>
> debug (string, [false]): Verbose logging
> Enables to use default attrd_updater verbose logging on every call.
>
> Operations' defaults (advisory minimum):
>
> start timeout=60
> stop timeout=20
> monitor timeout=60 interval=10
> ---------
>
> "The name of the attributes to set.": Why plural ("attributes")?
> "The time to wait (dampening) further changes occur": Is this an English
> sentence at all?
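
For what it's worth, the attribute the RA sets (the "name" parameter, pingd by
default) is normally consumed in a location rule, and dampen is, as I understand
it, simply the attrd delay before a changed value is actually written to the
CIB, so short connectivity blips don't immediately move resources. A typical
crmsh sketch (resource and host names are placeholders):

  primitive ping ocf:pacemaker:ping \
      params host_list="192.168.1.1 192.168.1.2" dampen=5s multiplier=1000 \
      op monitor interval=10 timeout=60
  clone ping-clone ping
  location my-rsc-needs-connectivity my-rsc \
      rule -inf: not_defined pingd or pingd lte 0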
>
> Regards,
> Ulrich
>
>
>
>
>
> ------------------------------
>
> Subject: Digest Footer
>
> _______________________________________________
> Manage your subscription:
> https://lists.clusterlabs.org/mailman/listinfo/users
>
> ClusterLabs home: https://www.clusterlabs.org/
>
> ------------------------------
>
> End of Users Digest, Vol 58, Issue 20
> *************************************