From elias at po-mayak.ru Sun Aug 2 23:26:21 2020 From: elias at po-mayak.ru (=?utf-8?B?0JjQu9GM0Y8g0J3QsNGB0L7QvdC+0LK=?=) Date: Mon, 3 Aug 2020 08:26:21 +0500 Subject: [ClusterLabs] Clear Pending Fencing Action Message-ID: <20200803032623.75154600275@iwtm.local> Hello! After troubleshooting 2-Node cluster, crm_mon deprecated actions are displayed in ?Pending Fencing Action:? list. How can I delete them. ?stonith_admin --cleanup --history=*? does not delete it. ? ?????????, ???? ??????? elias at po-mayak.ru -------------- next part -------------- An HTML attachment was scrubbed... URL: From nwahl at redhat.com Mon Aug 3 01:04:26 2020 From: nwahl at redhat.com (Reid Wahl) Date: Sun, 2 Aug 2020 22:04:26 -0700 Subject: [ClusterLabs] Clear Pending Fencing Action In-Reply-To: <20200803032623.75154600275@iwtm.local> References: <20200803032623.75154600275@iwtm.local> Message-ID: Hi, ????. `stonith_admin --cleanup` doesn't get rid of pending actions, only failed ones. You might be hitting https://bugs.clusterlabs.org/show_bug.cgi?id=5401. I believe a simultaneous reboot of both nodes will clear the pending actions. I don't recall whether there's any other way to clear them. On Sun, Aug 2, 2020 at 8:26 PM ???? ??????? wrote: > Hello! > > > > After troubleshooting 2-Node cluster, crm_mon deprecated actions are > displayed in ?Pending Fencing Action:? list. > > How can I delete them. > > ?stonith_admin --cleanup --history=*? does not delete it. > > > > > > ? ?????????, > ???? ??????? > elias at po-mayak.ru > > > _______________________________________________ > Manage your subscription: > https://lists.clusterlabs.org/mailman/listinfo/users > > ClusterLabs home: https://www.clusterlabs.org/ > -- Regards, Reid Wahl, RHCA Software Maintenance Engineer, Red Hat CEE - Platform Support Delivery - ClusterHA -------------- next part -------------- An HTML attachment was scrubbed... URL: From Ulrich.Windl at rz.uni-regensburg.de Mon Aug 3 01:53:36 2020 From: Ulrich.Windl at rz.uni-regensburg.de (Ulrich Windl) Date: Mon, 03 Aug 2020 07:53:36 +0200 Subject: [ClusterLabs] Antw: Re: Antw: Re: Antw: [EXT] why is node fenced ? In-Reply-To: <577378942.29145077.1596207358970.JavaMail.zimbra@helmholtz-muenchen.de> References: <1894379294.27456141.1596036406000.JavaMail.zimbra@helmholtz-muenchen.de> <5F2276A7020000A10003A4EC@gwsmtp.uni-regensburg.de> <1551037644.28469209.1596140634882.JavaMail.zimbra@helmholtz-muenchen.de> <5F23B44F020000A10003A58B@gwsmtp.uni-regensburg.de> <577378942.29145077.1596207358970.JavaMail.zimbra@helmholtz-muenchen.de> Message-ID: <5F27A660020000A10003A63B@gwsmtp.uni-regensburg.de> >>> "Lentes, Bernd" schrieb am 31.07.2020 um 16:55 in Nachricht <577378942.29145077.1596207358970.JavaMail.zimbra at helmholtz-muenchen.de>: > > ----- On Jul 31, 2020, at 8:03 AM, Ulrich Windl > Ulrich.Windl at rz.uni-regensburg.de wrote: > > >>>> >>>> My guess is that ha-idg-1 was fenced because a failed migration from >>> ha-idg-2 >>>> is treated like a stop failure on ha-idg-2. Stop failures cause fencing. >> You >>>> should have tested your resource before going productive. >>> >>> Migration failed at 16:59:34. >>> Node is fenced at 17:05:35. 6 minutes later. >>> The cluster needs 6 minutes to decide to fence the node ? >>> I don't believe that the failed migration is the cause for the fencing. >> >> What are the values for migration timeout and for stop timeout? 
>> > > > primitive vm_nextcloud VirtualDomain \ > params config="/mnt/share/vm_nextcloud.xml" \ > params hypervisor="qemu:///system" \ > params migration_transport=ssh \ > params migrate_options="--p2p --tunnelled" \ > op start interval=0 timeout=120 \ > op stop interval=0 timeout=180 \ <================== > op monitor interval=30 timeout=25 \ > op migrate_from interval=0 timeout=300 \ <================= > op migrate_to interval=0 timeout=300 \ <================ > meta allow-migrate=true target-role=Started is-managed=true > maintenance=false \ > utilization cpu=1 hv_memory=4096 > > 3 or 5 minutes, not 6 minutes. My guess would rather be: 5 Minutes for migration; if it fails a stop attempt (actually a restart) with another 3 minutes. That would be 8 minutes. > > Bernd > Helmholtz Zentrum M?nchen > > Helmholtz Zentrum Muenchen > Deutsches Forschungszentrum fuer Gesundheit und Umwelt (GmbH) > Ingolstaedter Landstr. 1 > 85764 Neuherberg > www.helmholtz-muenchen.de > Aufsichtsratsvorsitzende: MinDir.in Prof. Dr. Veronika von Messling > Geschaeftsfuehrung: Prof. Dr. med. Dr. h.c. Matthias Tschoep, Kerstin > Guenther > Registergericht: Amtsgericht Muenchen HRB 6466 > USt-IdNr: DE 129521671 > > > _______________________________________________ > Manage your subscription: > https://lists.clusterlabs.org/mailman/listinfo/users > > ClusterLabs home: https://www.clusterlabs.org/ From kgaillot at redhat.com Tue Aug 4 16:38:38 2020 From: kgaillot at redhat.com (Ken Gaillot) Date: Tue, 04 Aug 2020 15:38:38 -0500 Subject: [ClusterLabs] Antw: [EXT] Coming in Pacemaker 2.0.5: finer control over resource and operation defaults In-Reply-To: <5F1A8AAA020000A10003A3EB@gwsmtp.uni-regensburg.de> References: <99c11c73d59560fccd472d09c3b76073dab1b73e.camel@redhat.com> <5F1A8AAA020000A10003A3EB@gwsmtp.uni-regensburg.de> Message-ID: On Fri, 2020-07-24 at 09:15 +0200, Ulrich Windl wrote: > > > > Ken Gaillot schrieb am 23.07.2020 um > > > > 23:54 in > > Nachricht > <99c11c73d59560fccd472d09c3b76073dab1b73e.camel at redhat.com>: > > Hi all, > > > > Pacemaker 2.0.4 is barely out the door, and we're already looking > > ahead > > to 2.0.5, expected at the end of this year. > > > > One of the new features, already available in the master branch, > > will > > be finer?grained control over resource and operation defaults. > > > > Currently, you can set meta?attribute values in the CIB's > > rsc_defaults > > section to apply to all resources, and op_defaults to apply to all > > operations. Rules can be used to apply defaults only during certain > > times. For example, to set a default stickiness of INFINITY during > > business hours and 0 outside those hours: > > > > > > > > > > > operation="date_spec"> > > > weekdays="1?5"/> > > > > > > > name="resource?stickiness" > > value="INFINITY"/> > > > > > > > name="resource?stickiness" > > value="0"/> > > > > > > > > But what if you want to change the default stickiness of just pgsql > > databases? Or the default timeout of only start operations? > > We are using a rather similar scenario, like the stickyness. However > we > distinguish between productive and "no so productive" (test, > developement) > resources. First we release the stickiness of non-essential resources > so that > they can be re-balanced if needed. Later when the productive > resources are > released, the nodes maybe balanced already using the non-essential > resources. > > At the moment we copied the rules to each resource, which is not > nice, of > course. 
> > I'd appreciate: > date_spec be defined once and reused often > rule be defined once and reused often Reusing a rule is already possible -- define it as usual in one place, and in the other use That doesn't work for date_spec though. > > > > > 2.0.5 will add new rule expressions for this purpose. Examples: > > > > > > > > > > > provider="heartbeat" type="pgsqlms"/> > > > > > value="INFINITY"/> > > > > > > > > > > > > > > > interval="0"/> > > > > > > > > > > > > You can combine rsc_expression and op_expression in op_defaults > > rules, > > if for example you want to set a default stop timeout for all > > ocf:heartbeat:docker resources. > > > > This obviously can be convenient if you have many resources of the > > same > > type, but it has one other trick up its sleeve: this is the only > > way > > you can affect the meta?attributes of resources implicitly created > > by > > Pacemaker for bundles. > > > > When you configure a bundle, Pacemaker will implicitly create > > container > > resources (ocf:heartbeat:docker, ocf:heartbeat:rkt, or > > ocf:heartbeat:podman) and if appropriate, IP resources > > (ocf:heartbeat:IPaddr2). Previously, there was no way to directly > > affect these resources, but with these new expressions you can at > > least > > configure defaults that apply to them, without having to use those > > same > > defaults for all your resources. > > ?? > > Ken Gaillot > > > > _______________________________________________ > > Manage your subscription: > > https://lists.clusterlabs.org/mailman/listinfo/users > > > > ClusterLabs home: https://www.clusterlabs.org/ > > > > _______________________________________________ > Manage your subscription: > https://lists.clusterlabs.org/mailman/listinfo/users > > ClusterLabs home: https://www.clusterlabs.org/ -- Ken Gaillot From acecile at le-vert.net Sun Aug 9 15:11:51 2020 From: acecile at le-vert.net (=?UTF-8?Q?Adam_C=c3=a9cile?=) Date: Sun, 9 Aug 2020 21:11:51 +0200 Subject: [ClusterLabs] Automatic recover from split brain ? Message-ID: Hello, I'm experiencing issue with corosync/pacemaker running on Debian Buster. Cluster has three nodes running in VMWare virtual machine and the cluster fails when VEEAM backups the virtual machine (I know it's doing bad things, like freezing completely the VM for a few minutes to make disk snapshot). My biggest issue is that once the backup has been completed, the cluster stays in split brain state, and I'd like it to heal itself. Here current status: One node is isolated: Stack: corosync Current DC: host2.domain.com (version 2.0.1-9e909a5bdd) - partition WITHOUT quorum Last updated: Sat Aug? 8 11:59:46 2020 Last change: Fri Jul 24 07:18:12 2020 by root via cibadmin on host1.domain.com 3 nodes configured 6 resources configured Online: [ host2.domain.com ] OFFLINE: [ host3.domain.com host1.domain.com ] Two others are seeing each others: Stack: corosync Current DC: host3.domain.com (version 2.0.1-9e909a5bdd) - partition with quorum Last updated: Sat Aug? 8 12:07:56 2020 Last change: Fri Jul 24 07:18:12 2020 by root via cibadmin on host1.domain.com 3 nodes configured 6 resources configured Online: [ host3.domain.com host1.domain.com ] OFFLINE: [ host2.domain.com ] The problem is that one of the resources is a floating IP address which is currently assigned to two different hosts... Can you help me configuring the cluster correctly so this cannot occurs ? Thanks in advance, Adam. 
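(For reference, the cluster-wide options that matter most for a question like this -- no-quorum-policy and stonith-enabled -- plus the quorum view each node currently has, can be dumped with the standard tools, for example:

  pcs property show
  pcs quorum status
  corosync-quorumtool -s
  pcs config

That output is usually the first thing asked for when diagnosing a partition like the one above.)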
From kgaillot at redhat.com Mon Aug 10 16:19:10 2020 From: kgaillot at redhat.com (Ken Gaillot) Date: Mon, 10 Aug 2020 15:19:10 -0500 Subject: [ClusterLabs] Automatic recover from split brain ? In-Reply-To: References: Message-ID: On Sun, 2020-08-09 at 21:11 +0200, Adam C?cile wrote: > Hello, > > > I'm experiencing issue with corosync/pacemaker running on Debian > Buster. > Cluster has three nodes running in VMWare virtual machine and the > cluster fails when VEEAM backups the virtual machine (I know it's > doing > bad things, like freezing completely the VM for a few minutes to > make > disk snapshot). > > My biggest issue is that once the backup has been completed, the > cluster > stays in split brain state, and I'd like it to heal itself. Here Fencing is how the cluster prevents split-brain. When one node is lost, the other nodes will not recover any resources from it until it's fenced. For VMWare there's a fence_vmware_soap fence agent. However that's intended for failure scenarios, not a planned outage like a backup snapshot. For planned outages, you can set the cluster-wide property "maintenance-mode" to true. The cluster won't start, monitor, or stop resources while in maintenance mode. You can use rules to automatically put the cluster in maintenance mode at specific times. However I believe even in maintenance mode, the node will get fenced if it drops out of the corosync membership. Ideally you'd put the cluster in maintenance mode, stop pacemaker and corosync on the node, do the backup, then start pacemaker and corosync, wait for them to come up, and take the cluster out of maintenance mode. Alternatively, if you want the resources to move to other nodes while the backup is being done, you could put the node in standby rather than set maintenance mode. > current > status: > > > One node is isolated: > > Stack: corosync > Current DC: host2.domain.com (version 2.0.1-9e909a5bdd) - partition > WITHOUT quorum > Last updated: Sat Aug 8 11:59:46 2020 > Last change: Fri Jul 24 07:18:12 2020 by root via cibadmin on > host1.domain.com > > 3 nodes configured > 6 resources configured > > Online: [ host2.domain.com ] > OFFLINE: [ host3.domain.com host1.domain.com ] > > > Two others are seeing each others: > > Stack: corosync > Current DC: host3.domain.com (version 2.0.1-9e909a5bdd) - partition > with > quorum > Last updated: Sat Aug 8 12:07:56 2020 > Last change: Fri Jul 24 07:18:12 2020 by root via cibadmin on > host1.domain.com > > 3 nodes configured > 6 resources configured > > Online: [ host3.domain.com host1.domain.com ] > OFFLINE: [ host2.domain.com ] > > > The problem is that one of the resources is a floating IP address > which > is currently assigned to two different hosts... > > > Can you help me configuring the cluster correctly so this cannot > occurs ? > > > Thanks in advance, > > Adam. > > > _______________________________________________ > Manage your subscription: > https://lists.clusterlabs.org/mailman/listinfo/users > > ClusterLabs home: https://www.clusterlabs.org/ -- Ken Gaillot From acecile at le-vert.net Sat Aug 8 06:10:08 2020 From: acecile at le-vert.net (=?UTF-8?Q?Adam_C=c3=a9cile?=) Date: Sat, 8 Aug 2020 12:10:08 +0200 Subject: [ClusterLabs] Automatic recover from split brain ? Message-ID: Hello, I'm experiencing issue with corosync/pacemaker running on Debian Buster. 
Cluster has three nodes running in VMWare virtual machine and the cluster fails when VEEAM backups the virtual machine (I know it's doing bad things, like freezing completely the VM for a few minutes to make disk snapshot). My biggest issue is that once the backup has been completed, the cluster stays in split brain state, and I'd like it to heal itself. Here current status: One node is isolated: Stack: corosync Current DC: host2.domain.com (version 2.0.1-9e909a5bdd) - partition WITHOUT quorum Last updated: Sat Aug? 8 11:59:46 2020 Last change: Fri Jul 24 07:18:12 2020 by root via cibadmin on host1.domain.com 3 nodes configured 6 resources configured Online: [ host2.domain.com ] OFFLINE: [ host3.domain.com host1.domain.com ] Two others are seeing each others: Stack: corosync Current DC: host3.domain.com (version 2.0.1-9e909a5bdd) - partition with quorum Last updated: Sat Aug? 8 12:07:56 2020 Last change: Fri Jul 24 07:18:12 2020 by root via cibadmin on host1.domain.com 3 nodes configured 6 resources configured Online: [ host3.domain.com host1.domain.com ] OFFLINE: [ host2.domain.com ] The problem is that one of the resources is a floating IP address which is currently assigned to two different hosts... Can you help me configuring the cluster correctly so this cannot occurs ? Thanks in advance, Adam. From bernd.lentes at helmholtz-muenchen.de Thu Aug 6 05:06:35 2020 From: bernd.lentes at helmholtz-muenchen.de (Lentes, Bernd) Date: Thu, 6 Aug 2020 11:06:35 +0200 (CEST) Subject: [ClusterLabs] why is node fenced ? In-Reply-To: <197754c0efa9654e23c0d3b25d7d1ac8a7e2f919.camel@redhat.com> References: <1894379294.27456141.1596036406000.JavaMail.zimbra@helmholtz-muenchen.de> <197754c0efa9654e23c0d3b25d7d1ac8a7e2f919.camel@redhat.com> Message-ID: <505919068.32962787.1596704795719.JavaMail.zimbra@helmholtz-muenchen.de> ----- Am 29. Jul 2020 um 18:53 schrieb kgaillot kgaillot at redhat.com: > Since the ha-idg-2 is now shutting down, ha-idg-1 becomes DC. The other way round. >> Jul 20 17:05:33 [10690] ha-idg-2 pengine: warning: >> unpack_rsc_op_failure: Processing failed migrate_to of vm_nextcloud >> on ha-idg-1: unknown error | rc=1 >> Jul 20 17:05:33 [10690] ha-idg-2 pengine: warning: >> unpack_rsc_op_failure: Processing failed start of vm_nextcloud on >> ha-idg-2: unknown error | rc >> >> Jul 20 17:05:33 [10690] ha-idg-2 pengine: info: >> native_color: Resource vm_nextcloud cannot run anywhere >> logical >> >> Jul 20 17:05:33 [10690] ha-idg-2 pengine: warning: >> custom_action: Action vm_nextcloud_stop_0 on ha-idg-1 is unrunnable >> (pending) >> ??? > > So this appears to be the problem. From these logs I would guess the > successful stop on ha-idg-1 did not get written to the CIB for some > reason. I'd look at the pe input from this transition on ha-idg-2 to > confirm that. > > Without the DC knowing about the stop, it tries to schedule a new one, > but the node is shutting down so it can't do it, which means it has to > be fenced. 
> >> Jul 20 17:05:35 [10690] ha-idg-2 pengine: warning: >> custom_action: Action vm_nextcloud_stop_0 on ha-idg-1 is unrunnable >> (offline) >> Jul 20 17:05:35 [10690] ha-idg-2 pengine: warning: >> pe_fence_node: Cluster node ha-idg-1 will be fenced: resource >> actions are unrunnable >> Jul 20 17:05:35 [10690] ha-idg-2 pengine: warning: >> stage6: Scheduling Node ha-idg-1 for STONITH >> Jul 20 17:05:35 [10690] ha-idg-2 pengine: info: >> native_stop_constraints: vm_nextcloud_stop_0 is implicit after ha- >> idg-1 is fenced >> Jul 20 17:05:35 [10690] ha-idg-2 pengine: notice: >> LogNodeActions: * Fence (Off) ha-idg-1 'resource actions are >> unrunnable' >> >> >> Why does it say "Jul 20 17:05:35 [10690] ha-idg- >> 2 pengine: warning: custom_action: Action vm_nextcloud_stop_0 >> on ha-idg-1 is unrunnable (offline)" although >> "Jul 20 17:04:06 [23768] ha-idg-1 crmd: notice: >> process_lrm_event: Result of stop operation for vm_nextcloud on >> ha-idg-1: 0 (ok) | call=3197 key=vm_nextcloud_stop_0 confirmed=true >> cib-update=5960" >> says that stop was ok ? Bernd Helmholtz Zentrum M?nchen Helmholtz Zentrum Muenchen Deutsches Forschungszentrum fuer Gesundheit und Umwelt (GmbH) Ingolstaedter Landstr. 1 85764 Neuherberg www.helmholtz-muenchen.de Aufsichtsratsvorsitzende: MinDir.in Prof. Dr. Veronika von Messling Geschaeftsfuehrung: Prof. Dr. med. Dr. h.c. Matthias Tschoep, Kerstin Guenther Registergericht: Amtsgericht Muenchen HRB 6466 USt-IdNr: DE 129521671 From kgaillot at redhat.com Mon Aug 10 12:47:24 2020 From: kgaillot at redhat.com (Ken Gaillot) Date: Mon, 10 Aug 2020 11:47:24 -0500 Subject: [ClusterLabs] Coming in Pacemaker 2.0.5: on-fail=demote / no-quorum-policy=demote Message-ID: Hi all, Looking ahead to the Pacemaker 2.0.5 release expected at the end of this year, here is a new feature already in the master branch. When configuring resource operations, Pacemaker lets you set an "on- fail" policy to specify whether to restart the resource, fence the node, etc., if the operation fails. With 2.0.5, a new possible value will be "demote", which will mean "demote this resource but do not fully restart it". "Demote" will be a valid value only for promote actions, and for recurring monitors with "role" set to "Master". Once the resource is demoted, it will be eligible for promotion again, so if the promotion scores have not changed, a promote on the same node may be attempted. If this is not desired, the agent can change the promotion scores either in the failed monitor or the demote. The intended use case is an application where a successful demote assures a well-functioning service, and a full restart would be unnecessarily heavyweight. A large database might be an example. Similarly, Pacemaker offers the cluster-wide "no-quorum-policy" option to specify what happens to resources when quorum is lost (the default being to stop them). With 2.0.5, "demote" will be a possible value here as well, and will mean "demote all promotable resources and stop all other resources". The intended use case is an application that cannot cause any harm after being demoted, and may be useful in a demoted role even if there is no quorum. A database that operates read-only when demoted and doesn't depend on any non-promotable resources might be an example. Happy clustering :) -- Ken Gaillot From bernd.lentes at helmholtz-muenchen.de Sun Aug 9 16:17:24 2020 From: bernd.lentes at helmholtz-muenchen.de (Lentes, Bernd) Date: Sun, 9 Aug 2020 22:17:24 +0200 (CEST) Subject: [ClusterLabs] why is node fenced ? 
In-Reply-To: <197754c0efa9654e23c0d3b25d7d1ac8a7e2f919.camel@redhat.com> References: <1894379294.27456141.1596036406000.JavaMail.zimbra@helmholtz-muenchen.de> <197754c0efa9654e23c0d3b25d7d1ac8a7e2f919.camel@redhat.com> Message-ID: <989597611.38099136.1597004244408.JavaMail.zimbra@helmholtz-muenchen.de> ----- Am 29. Jul 2020 um 18:53 schrieb kgaillot kgaillot at redhat.com: > On Wed, 2020-07-29 at 17:26 +0200, Lentes, Bernd wrote: >> Hi, >> >> a few days ago one of my nodes was fenced and i don't know why, which >> is something i really don't like. >> What i did: >> I put one node (ha-idg-1) in standby. The resources on it (most of >> all virtual domains) were migrated to ha-idg-2, >> except one domain (vm_nextcloud). On ha-idg-2 a mountpoint was >> missing the xml of the domain points to. >> Then the cluster tries to start vm_nextcloud on ha-idg-2 which of >> course also failed. >> Then ha-idg-1 was fenced. >> I did a "crm history" over the respective time period, you find it >> here: >> https://hmgubox2.helmholtz-muenchen.de/index.php/s/529dfcXf5a72ifF >> >> Here, from my point of view, the most interesting from the logs: >> ha-idg-1: >> Jul 20 16:59:33 [23763] ha-idg-1 cib: info: >> cib_perform_op: Diff: --- 2.16196.19 2 >> Jul 20 16:59:33 [23763] ha-idg-1 cib: info: >> cib_perform_op: Diff: +++ 2.16197.0 bc9a558dfbe6d7196653ce56ad1ee758 >> Jul 20 16:59:33 [23763] ha-idg-1 cib: info: >> cib_perform_op: + /cib: @epoch=16197, @num_updates=0 >> Jul 20 16:59:33 [23763] ha-idg-1 cib: info: >> cib_perform_op: + /cib/configuration/nodes/node[@id='1084777482']/i >> nstance_attributes[@id='nodes-108 >> 4777482']/nvpair[@id='nodes-1084777482-standby']: @value=on >> ha-idg-1 set to standby >> >> Jul 20 16:59:34 [23768] ha-idg-1 crmd: notice: >> process_lrm_event: ha-idg-1-vm_nextcloud_migrate_to_0:3169 [ >> error: Cannot access storage file >> '/mnt/mcd/AG_BioInformatik/Technik/software_und_treiber/linux/ubuntu/ >> ubuntu-18.04.4-live-server-amd64.iso': No such file or >> directory\nocf-exit-reason:vm_nextcloud: live migration to ha-idg-2 >> failed: 1\n ] >> migration failed >> >> Jul 20 17:04:01 [23767] ha-idg-1 pengine: error: >> native_create_actions: Resource vm_nextcloud is active on 2 nodes >> (attempting recovery) >> ??? > > This is standard for a failed live migration -- the cluster doesn't > know how far the migration actually got before failing, so it has to > assume the VM could be active on either node. (The log message would > make more sense saying "might be active" rather than "is active".) > >> Jul 20 17:04:01 [23767] ha-idg-1 pengine: notice: >> LogAction: * >> Recover vm_nextcloud ( ha-idg-2 ) > > The recovery from that situation is a full stop on both nodes, and > start on one of them. > >> Jul 20 17:04:01 [23768] ha-idg-1 crmd: notice: >> te_rsc_command: Initiating stop operation vm_nextcloud_stop_0 on ha- >> idg-2 | action 106 >> Jul 20 17:04:01 [23768] ha-idg-1 crmd: notice: >> te_rsc_command: Initiating stop operation vm_nextcloud_stop_0 >> locally on ha-idg-1 | action 2 >> >> Jul 20 17:04:01 [23768] ha-idg-1 crmd: info: >> match_graph_event: Action vm_nextcloud_stop_0 (106) confirmed >> on ha-idg-2 (rc=0) >> >> Jul 20 17:04:06 [23768] ha-idg-1 crmd: notice: >> process_lrm_event: Result of stop operation for vm_nextcloud on >> ha-idg-1: 0 (ok) | call=3197 key=vm_nextcloud_stop_0 confirmed=true >> cib-update=5960 > > It looks like both stops succeeded. 
> >> Jul 20 17:05:29 [23761] ha-idg-1 pacemakerd: notice: >> crm_signal_dispatch: Caught 'Terminated' signal | 15 (invoking >> handler) >> systemctl stop pacemaker.service >> >> >> ha-idg-2: >> Jul 20 17:04:03 [10691] ha-idg-2 crmd: notice: >> process_lrm_event: Result of stop operation for vm_nextcloud on >> ha-idg-2: 0 (ok) | call=157 key=vm_nextcloud_stop_0 confirmed=true >> cib-update=57 >> the log from ha-idg-2 is two seconds ahead of ha-idg-1 >> >> Jul 20 17:04:08 [10688] ha-idg-2 lrmd: notice: >> log_execute: executing - rsc:vm_nextcloud action:start >> call_id:192 >> Jul 20 17:04:09 [10688] ha-idg-2 lrmd: notice: >> operation_finished: vm_nextcloud_start_0:29107:stderr [ error: >> Failed to create domain from /mnt/share/vm_nextcloud.xml ] >> Jul 20 17:04:09 [10688] ha-idg-2 lrmd: notice: >> operation_finished: vm_nextcloud_start_0:29107:stderr [ error: >> Cannot access storage file >> '/mnt/mcd/AG_BioInformatik/Technik/software_und_treiber/linux/ubuntu/ >> ubuntu-18.04.4-live-server-amd64.iso': No such file or directory ] >> Jul 20 17:04:09 [10688] ha-idg-2 lrmd: notice: >> operation_finished: vm_nextcloud_start_0:29107:stderr [ ocf- >> exit-reason:Failed to start virtual domain vm_nextcloud. ] >> Jul 20 17:04:09 [10688] ha-idg-2 lrmd: notice: >> log_finished: finished - rsc:vm_nextcloud action:start call_id:192 >> pid:29107 exit-code:1 exec-time:581ms queue-time:0ms >> start on ha-idg-2 failed > > The start failed ... > >> Jul 20 17:05:32 [10691] ha-idg-2 crmd: info: >> do_dc_takeover: Taking over DC status for this partition >> ha-idg-1 stopped pacemaker > > Since the ha-idg-2 is now shutting down, ha-idg-1 becomes DC. > >> Jul 20 17:05:33 [10690] ha-idg-2 pengine: warning: >> unpack_rsc_op_failure: Processing failed migrate_to of vm_nextcloud >> on ha-idg-1: unknown error | rc=1 >> Jul 20 17:05:33 [10690] ha-idg-2 pengine: warning: >> unpack_rsc_op_failure: Processing failed start of vm_nextcloud on >> ha-idg-2: unknown error | rc >> >> Jul 20 17:05:33 [10690] ha-idg-2 pengine: info: >> native_color: Resource vm_nextcloud cannot run anywhere >> logical >> >> Jul 20 17:05:33 [10690] ha-idg-2 pengine: warning: >> custom_action: Action vm_nextcloud_stop_0 on ha-idg-1 is unrunnable >> (pending) >> ??? > > So this appears to be the problem. From these logs I would guess the > successful stop on ha-idg-1 did not get written to the CIB for some > reason. I'd look at the pe input from this transition on ha-idg-2 to > confirm that. > > Without the DC knowing about the stop, it tries to schedule a new one, > but the node is shutting down so it can't do it, which means it has to > be fenced. 
> >> Jul 20 17:05:35 [10690] ha-idg-2 pengine: warning: >> custom_action: Action vm_nextcloud_stop_0 on ha-idg-1 is unrunnable >> (offline) >> Jul 20 17:05:35 [10690] ha-idg-2 pengine: warning: >> pe_fence_node: Cluster node ha-idg-1 will be fenced: resource >> actions are unrunnable >> Jul 20 17:05:35 [10690] ha-idg-2 pengine: warning: >> stage6: Scheduling Node ha-idg-1 for STONITH >> Jul 20 17:05:35 [10690] ha-idg-2 pengine: info: >> native_stop_constraints: vm_nextcloud_stop_0 is implicit after ha- >> idg-1 is fenced >> Jul 20 17:05:35 [10690] ha-idg-2 pengine: notice: >> LogNodeActions: * Fence (Off) ha-idg-1 'resource actions are >> unrunnable' >> >> >> Why does it say "Jul 20 17:05:35 [10690] ha-idg- >> 2 pengine: warning: custom_action: Action vm_nextcloud_stop_0 >> on ha-idg-1 is unrunnable (offline)" although >> "Jul 20 17:04:06 [23768] ha-idg-1 crmd: notice: >> process_lrm_event: Result of stop operation for vm_nextcloud on >> ha-idg-1: 0 (ok) | call=3197 key=vm_nextcloud_stop_0 confirmed=true >> cib-update=5960" >> says that stop was ok ? I'm stll digging in the logs trying to understand what happened. What i'm wondering abaout: Jul 20 17:04:26 [23768] ha-idg-1 crmd: notice: run_graph: Transition 4515 (Complete=10, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=/var/lib/pacemaker/pengine/pe-input-3117.bz2): Complete Jul 20 17:05:29 [23768] ha-idg-1 crmd: info: abort_transition_graph: Transition 4515 aborted by status-1084777482-shutdown doing create shutdown=1595257529: Transient attribute change | cib=2.16197.72 source=abort_unless_down:317 path=/cib/status/node_state[@id='1084777482']/transient_attributes[@id='1084777482']/instance_attributes[@id='status-1084777482'] complete=true A transition is completed and three secons later aborted. How can that be? Bernd Helmholtz Zentrum M?nchen Helmholtz Zentrum Muenchen Deutsches Forschungszentrum fuer Gesundheit und Umwelt (GmbH) Ingolstaedter Landstr. 1 85764 Neuherberg www.helmholtz-muenchen.de Aufsichtsratsvorsitzende: MinDir.in Prof. Dr. Veronika von Messling Geschaeftsfuehrung: Prof. Dr. med. Dr. h.c. Matthias Tschoep, Kerstin Guenther Registergericht: Amtsgericht Muenchen HRB 6466 USt-IdNr: DE 129521671 From mariusz.gronczewski at efigence.com Mon Aug 10 08:21:53 2020 From: mariusz.gronczewski at efigence.com (Mariusz Gronczewski) Date: Mon, 10 Aug 2020 14:21:53 +0200 Subject: [ClusterLabs] How to specify which IP pcs should use? Message-ID: <20200810142153.603f56dd@hydra.home.zxz.li> Hi, Pacemaker 2, the current setup is * management network with host's hostname resolving to host's management IP * cluster network for Pacemaker/Corosync communication * corosync set up with node name and IP of the cluster network pcs status shows both nodes online, added config syncs to the other node etc. but pcs cluster status shows one node being offline. After a look in firewall logs it appears all of the communication is going just fine on the cluster network but PCS tries to talk with PCSD on port 2224 via *management* network instead of using IP set as ring0_addr in corosync Is "just use host's hostname regardless of config" something normal ? Is there a separate setting to pcs about which IP it should use ? Regards -- Mariusz Gronczewski, Administrator Efigence S. A. ul. 
Wo?oska 9a, 02-583 Warszawa T: [+48] 22 380 13 13 NOC: [+48] 22 380 10 20 E: admin at efigence.com From kgaillot at redhat.com Mon Aug 10 17:59:07 2020 From: kgaillot at redhat.com (Ken Gaillot) Date: Mon, 10 Aug 2020 16:59:07 -0500 Subject: [ClusterLabs] why is node fenced ? In-Reply-To: <989597611.38099136.1597004244408.JavaMail.zimbra@helmholtz-muenchen.de> References: <1894379294.27456141.1596036406000.JavaMail.zimbra@helmholtz-muenchen.de> <197754c0efa9654e23c0d3b25d7d1ac8a7e2f919.camel@redhat.com> <989597611.38099136.1597004244408.JavaMail.zimbra@helmholtz-muenchen.de> Message-ID: On Sun, 2020-08-09 at 22:17 +0200, Lentes, Bernd wrote: > > ----- Am 29. Jul 2020 um 18:53 schrieb kgaillot kgaillot at redhat.com: > > > On Wed, 2020-07-29 at 17:26 +0200, Lentes, Bernd wrote: > > > Hi, > > > > > > a few days ago one of my nodes was fenced and i don't know why, > > > which > > > is something i really don't like. > > > What i did: > > > I put one node (ha-idg-1) in standby. The resources on it (most > > > of > > > all virtual domains) were migrated to ha-idg-2, > > > except one domain (vm_nextcloud). On ha-idg-2 a mountpoint was > > > missing the xml of the domain points to. > > > Then the cluster tries to start vm_nextcloud on ha-idg-2 which of > > > course also failed. > > > Then ha-idg-1 was fenced. > > > I did a "crm history" over the respective time period, you find > > > it > > > here: > > > https://hmgubox2.helmholtz-muenchen.de/index.php/s/529dfcXf5a72ifF > > > > > > Here, from my point of view, the most interesting from the logs: > > > ha-idg-1: > > > Jul 20 16:59:33 [23763] ha-idg-1 cib: info: > > > cib_perform_op: Diff: --- 2.16196.19 2 > > > Jul 20 16:59:33 [23763] ha-idg-1 cib: info: > > > cib_perform_op: Diff: +++ 2.16197.0 > > > bc9a558dfbe6d7196653ce56ad1ee758 > > > Jul 20 16:59:33 [23763] ha-idg-1 cib: info: > > > cib_perform_op: + /cib: @epoch=16197, @num_updates=0 > > > Jul 20 16:59:33 [23763] ha-idg-1 cib: info: > > > cib_perform_op: + /cib/configuration/nodes/node[@id='1084777482 > > > ']/i > > > nstance_attributes[@id='nodes-108 > > > 4777482']/nvpair[@id='nodes-1084777482-standby']: @value=on > > > ha-idg-1 set to standby > > > > > > Jul 20 16:59:34 [23768] ha-idg-1 crmd: notice: > > > process_lrm_event: ha-idg-1-vm_nextcloud_migrate_to_0:3169 > > > [ > > > error: Cannot access storage file > > > '/mnt/mcd/AG_BioInformatik/Technik/software_und_treiber/linux/ubu > > > ntu/ > > > ubuntu-18.04.4-live-server-amd64.iso': No such file or > > > directory\nocf-exit-reason:vm_nextcloud: live migration to ha- > > > idg-2 > > > failed: 1\n ] > > > migration failed > > > > > > Jul 20 17:04:01 [23767] ha-idg-1 pengine: error: > > > native_create_actions: Resource vm_nextcloud is active on 2 > > > nodes > > > (attempting recovery) > > > ??? > > > > This is standard for a failed live migration -- the cluster doesn't > > know how far the migration actually got before failing, so it has > > to > > assume the VM could be active on either node. (The log message > > would > > make more sense saying "might be active" rather than "is active".) > > > > > Jul 20 17:04:01 [23767] ha-idg-1 pengine: notice: > > > LogAction: * > > > Recover vm_nextcloud ( ha-idg-2 ) > > > > The recovery from that situation is a full stop on both nodes, and > > start on one of them. 
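As an aside, how the cluster recovers a resource it believes may be active in more than one place is controlled by the multiple-active resource meta-attribute; the default, stop_start, is exactly the "stop on both nodes, start on one" behaviour described here. If blocking and waiting for manual cleanup would be preferable for a given VM, it can be set per resource -- an illustrative fragment in crm shell syntax (not the full vm_nextcloud definition):

  meta allow-migrate=true multiple-active=block

Valid values are stop_start (the default), stop_only and block.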
> > > > > Jul 20 17:04:01 [23768] ha-idg-1 crmd: notice: > > > te_rsc_command: Initiating stop operation vm_nextcloud_stop_0 on > > > ha- > > > idg-2 | action 106 > > > Jul 20 17:04:01 [23768] ha-idg-1 crmd: notice: > > > te_rsc_command: Initiating stop operation vm_nextcloud_stop_0 > > > locally on ha-idg-1 | action 2 > > > > > > Jul 20 17:04:01 [23768] ha-idg-1 crmd: info: > > > match_graph_event: Action vm_nextcloud_stop_0 (106) > > > confirmed > > > on ha-idg-2 (rc=0) > > > > > > Jul 20 17:04:06 [23768] ha-idg-1 crmd: notice: > > > process_lrm_event: Result of stop operation for > > > vm_nextcloud on > > > ha-idg-1: 0 (ok) | call=3197 key=vm_nextcloud_stop_0 > > > confirmed=true > > > cib-update=5960 > > > > It looks like both stops succeeded. > > > > > Jul 20 17:05:29 [23761] ha-idg-1 pacemakerd: notice: > > > crm_signal_dispatch: Caught 'Terminated' signal | 15 > > > (invoking > > > handler) > > > systemctl stop pacemaker.service > > > > > > > > > ha-idg-2: > > > Jul 20 17:04:03 [10691] ha-idg-2 crmd: notice: > > > process_lrm_event: Result of stop operation for > > > vm_nextcloud on > > > ha-idg-2: 0 (ok) | call=157 key=vm_nextcloud_stop_0 > > > confirmed=true > > > cib-update=57 > > > the log from ha-idg-2 is two seconds ahead of ha-idg-1 > > > > > > Jul 20 17:04:08 [10688] ha-idg-2 lrmd: notice: > > > log_execute: executing - rsc:vm_nextcloud action:start > > > call_id:192 > > > Jul 20 17:04:09 [10688] ha-idg-2 lrmd: notice: > > > operation_finished: vm_nextcloud_start_0:29107:stderr [ > > > error: > > > Failed to create domain from /mnt/share/vm_nextcloud.xml ] > > > Jul 20 17:04:09 [10688] ha-idg-2 lrmd: notice: > > > operation_finished: vm_nextcloud_start_0:29107:stderr [ > > > error: > > > Cannot access storage file > > > '/mnt/mcd/AG_BioInformatik/Technik/software_und_treiber/linux/ubu > > > ntu/ > > > ubuntu-18.04.4-live-server-amd64.iso': No such file or directory > > > ] > > > Jul 20 17:04:09 [10688] ha-idg-2 lrmd: notice: > > > operation_finished: vm_nextcloud_start_0:29107:stderr [ ocf- > > > exit-reason:Failed to start virtual domain vm_nextcloud. ] > > > Jul 20 17:04:09 [10688] ha-idg-2 lrmd: notice: > > > log_finished: finished - rsc:vm_nextcloud action:start > > > call_id:192 > > > pid:29107 exit-code:1 exec-time:581ms queue-time:0ms > > > start on ha-idg-2 failed > > > > The start failed ... > > > > > Jul 20 17:05:32 [10691] ha-idg-2 crmd: info: > > > do_dc_takeover: Taking over DC status for this partition > > > ha-idg-1 stopped pacemaker > > > > Since the ha-idg-2 is now shutting down, ha-idg-1 becomes DC. > > > > > Jul 20 17:05:33 [10690] ha-idg-2 pengine: warning: > > > unpack_rsc_op_failure: Processing failed migrate_to of > > > vm_nextcloud > > > on ha-idg-1: unknown error | rc=1 > > > Jul 20 17:05:33 [10690] ha-idg-2 pengine: warning: > > > unpack_rsc_op_failure: Processing failed start of vm_nextcloud > > > on > > > ha-idg-2: unknown error | rc > > > > > > Jul 20 17:05:33 [10690] ha-idg-2 pengine: info: > > > native_color: Resource vm_nextcloud cannot run anywhere > > > logical > > > > > > Jul 20 17:05:33 [10690] ha-idg-2 pengine: warning: > > > custom_action: Action vm_nextcloud_stop_0 on ha-idg-1 is > > > unrunnable > > > (pending) > > > ??? > > > > So this appears to be the problem. From these logs I would guess > > the > > successful stop on ha-idg-1 did not get written to the CIB for some > > reason. I'd look at the pe input from this transition on ha-idg-2 > > to > > confirm that. 
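For anyone following along: the saved scheduler inputs can be replayed offline with crm_simulate. The file for a given transition is named in the "saving inputs in ..." and run_graph log lines; the path below is the default location and the file number is only an example taken from this thread:

  # replay a pe-input file and show what the scheduler decided
  crm_simulate -S -x /var/lib/pacemaker/pengine/pe-input-3117.bz2

  # the same, but with allocation scores included
  crm_simulate -s -x /var/lib/pacemaker/pengine/pe-input-3117.bz2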
> > > > Without the DC knowing about the stop, it tries to schedule a new > > one, > > but the node is shutting down so it can't do it, which means it has > > to > > be fenced. > > > > > Jul 20 17:05:35 [10690] ha-idg-2 pengine: warning: > > > custom_action: Action vm_nextcloud_stop_0 on ha-idg-1 is > > > unrunnable > > > (offline) > > > Jul 20 17:05:35 [10690] ha-idg-2 pengine: warning: > > > pe_fence_node: Cluster node ha-idg-1 will be fenced: resource > > > actions are unrunnable > > > Jul 20 17:05:35 [10690] ha-idg-2 pengine: warning: > > > stage6: Scheduling Node ha-idg-1 for STONITH > > > Jul 20 17:05:35 [10690] ha-idg-2 pengine: info: > > > native_stop_constraints: vm_nextcloud_stop_0 is implicit after > > > ha- > > > idg-1 is fenced > > > Jul 20 17:05:35 [10690] ha-idg-2 pengine: notice: > > > LogNodeActions: * Fence (Off) ha-idg-1 'resource actions are > > > unrunnable' > > > > > > > > > Why does it say "Jul 20 17:05:35 [10690] ha-idg- > > > 2 pengine: warning: custom_action: Action > > > vm_nextcloud_stop_0 > > > on ha-idg-1 is unrunnable (offline)" although > > > "Jul 20 17:04:06 [23768] ha-idg-1 crmd: notice: > > > process_lrm_event: Result of stop operation for > > > vm_nextcloud on > > > ha-idg-1: 0 (ok) | call=3197 key=vm_nextcloud_stop_0 > > > confirmed=true > > > cib-update=5960" > > > says that stop was ok ? > > I'm stll digging in the logs trying to understand what happened. > What i'm wondering abaout: > > Jul 20 17:04:26 [23768] ha-idg-1 crmd: notice: > run_graph: Transition 4515 (Complete=10, Pending=0, Fired=0, > Skipped=0, Incomplete=0, Source=/var/lib/pacemaker/pengine/pe-input- > 3117.bz2): Complete > > Jul 20 17:05:29 [23768] ha-idg-1 crmd: info: > abort_transition_graph: Transition 4515 aborted by status- > 1084777482-shutdown doing create shutdown=1595257529: Transient > attribute change | cib=2.16197.72 > source=abort_unless_down:317 path=/cib/status/node_state[@id='10847 > 77482']/transient_attributes[@id='1084777482']/instance_attributes[@i > d='status-1084777482'] complete=true > > > A transition is completed and three secons later aborted. How can > that be? The most recent transition is aborted, but since all its actions are complete, the only effect is to trigger a new transition. We should probably rephrase the log message. In fact, the whole "transition" terminology is kind of obscure. It's hard to come up with something better though. > > Bernd > Helmholtz Zentrum M?nchen > > Helmholtz Zentrum Muenchen > Deutsches Forschungszentrum fuer Gesundheit und Umwelt (GmbH) > Ingolstaedter Landstr. 1 > 85764 Neuherberg > www.helmholtz-muenchen.de > Aufsichtsratsvorsitzende: MinDir.in Prof. Dr. Veronika von Messling > Geschaeftsfuehrung: Prof. Dr. med. Dr. h.c. Matthias Tschoep, Kerstin > Guenther > Registergericht: Amtsgericht Muenchen HRB 6466 > USt-IdNr: DE 129521671 -- Ken Gaillot From arvidjaar at gmail.com Tue Aug 11 02:48:15 2020 From: arvidjaar at gmail.com (Andrei Borzenkov) Date: Tue, 11 Aug 2020 09:48:15 +0300 Subject: [ClusterLabs] Automatic recover from split brain ? In-Reply-To: References: Message-ID: 08.08.2020 13:10, Adam C?cile ?????: > Hello, > > > I'm experiencing issue with corosync/pacemaker running on Debian Buster. > Cluster has three nodes running in VMWare virtual machine and the > cluster fails when VEEAM backups the virtual machine (I know it's doing > bad things, like freezing completely the VM for a few minutes to make > disk snapshot). 
> > My biggest issue is that once the backup has been completed, the cluster > stays in split brain state, and I'd like it to heal itself. Here current > status: > > > One node is isolated: > > Stack: corosync > Current DC: host2.domain.com (version 2.0.1-9e909a5bdd) - partition > WITHOUT quorum > Last updated: Sat Aug? 8 11:59:46 2020 > Last change: Fri Jul 24 07:18:12 2020 by root via cibadmin on > host1.domain.com > > 3 nodes configured > 6 resources configured > > Online: [ host2.domain.com ] > OFFLINE: [ host3.domain.com host1.domain.com ] > > > Two others are seeing each others: > > Stack: corosync > Current DC: host3.domain.com (version 2.0.1-9e909a5bdd) - partition with > quorum > Last updated: Sat Aug? 8 12:07:56 2020 > Last change: Fri Jul 24 07:18:12 2020 by root via cibadmin on > host1.domain.com > > 3 nodes configured > 6 resources configured > > Online: [ host3.domain.com host1.domain.com ] > OFFLINE: [ host2.domain.com ] > Show your full configuration including defined STONITH resources and cluster options (most importantly, no-quorum-policy and stonith-enabled). > > The problem is that one of the resources is a floating IP address which > is currently assigned to two different hosts... > Of course - each partition assumes another partition is dead and so it is free to take over remaining resources. > > Can you help me configuring the cluster correctly so this cannot occurs ? > Define "correctly". The most straightforward text book answer - you need to have STONITH resources that will eliminate "lost" node. But your lost node is in the middle of performing backup. Eliminating it may invalidate backup being created. So another answer would be - put cluster in maintenance mode, perform backup, resume normal operation. Usually backup software allows hooks to be executed before and after backup. It may work too. Or find a way to not freeze VM during backup ... e.g. by using different backup method? From tojeline at redhat.com Tue Aug 11 03:22:25 2020 From: tojeline at redhat.com (Tomas Jelinek) Date: Tue, 11 Aug 2020 09:22:25 +0200 Subject: [ClusterLabs] How to specify which IP pcs should use? In-Reply-To: <20200810142153.603f56dd@hydra.home.zxz.li> References: <20200810142153.603f56dd@hydra.home.zxz.li> Message-ID: <9d6eda77-6ea3-d3e6-95ac-e4068e8dbcd3@redhat.com> Hi Mariusz, You haven't mention pcs version you are running. Based on you mentioning running Pacemaker 2, I suppose you are running pcs 0.10.x. The text bellow applies to pcs 0.10.x. Pcs doesn't depend on or use corosync.conf when connecting to other nodes. The reason is pcs must be able to connect to nodes not specified in corosync.conf, e.g. when there is no cluster created yet. Instead, pcs has its own config file mapping node names to addresses. The easiest way to set it is to specify an address for each node in 'pcs host auth' command like this: pcs host auth addr= addr= ... Specifying addresses is not mandatory. If the addresses are omitted, pcs uses node names as addresses. See man pcs for more details. To fix your issues, run 'pcs host auth' and specify all nodes and their addresses. Running the command on one node of your cluster should be enough. Regards, Tomas Dne 10. 08. 
20 v 14:21 Mariusz Gronczewski napsal(a): > Hi, > > Pacemaker 2, the current setup is > > * management network with host's hostname resolving to host's > management IP > * cluster network for Pacemaker/Corosync communication > * corosync set up with node name and IP of the cluster network > > pcs status shows both nodes online, added config syncs to the other > node etc. but pcs cluster status shows one node being offline. > > After a look in firewall logs it appears all of the communication is > going just fine on the cluster network but PCS tries to talk with PCSD > on port 2224 via *management* network instead of using IP set as > ring0_addr in corosync > > Is "just use host's hostname regardless of config" something normal ? > Is there a separate setting to pcs about which IP it should use ? > > Regards > From acecile at le-vert.net Tue Aug 11 03:34:40 2020 From: acecile at le-vert.net (=?UTF-8?Q?Adam_C=c3=a9cile?=) Date: Tue, 11 Aug 2020 09:34:40 +0200 Subject: [ClusterLabs] Automatic recover from split brain ? In-Reply-To: References: Message-ID: <69719d4e-d990-a2ba-4347-0e679b582d3f@le-vert.net> On 8/11/20 8:48 AM, Andrei Borzenkov wrote: > 08.08.2020 13:10, Adam C?cile ?????: >> Hello, >> >> >> I'm experiencing issue with corosync/pacemaker running on Debian Buster. >> Cluster has three nodes running in VMWare virtual machine and the >> cluster fails when VEEAM backups the virtual machine (I know it's doing >> bad things, like freezing completely the VM for a few minutes to make >> disk snapshot). >> >> My biggest issue is that once the backup has been completed, the cluster >> stays in split brain state, and I'd like it to heal itself. Here current >> status: >> >> >> One node is isolated: >> >> Stack: corosync >> Current DC: host2.domain.com (version 2.0.1-9e909a5bdd) - partition >> WITHOUT quorum >> Last updated: Sat Aug? 8 11:59:46 2020 >> Last change: Fri Jul 24 07:18:12 2020 by root via cibadmin on >> host1.domain.com >> >> 3 nodes configured >> 6 resources configured >> >> Online: [ host2.domain.com ] >> OFFLINE: [ host3.domain.com host1.domain.com ] >> >> >> Two others are seeing each others: >> >> Stack: corosync >> Current DC: host3.domain.com (version 2.0.1-9e909a5bdd) - partition with >> quorum >> Last updated: Sat Aug? 8 12:07:56 2020 >> Last change: Fri Jul 24 07:18:12 2020 by root via cibadmin on >> host1.domain.com >> >> 3 nodes configured >> 6 resources configured >> >> Online: [ host3.domain.com host1.domain.com ] >> OFFLINE: [ host2.domain.com ] >> > Show your full configuration including defined STONITH resources and > cluster options (most importantly, no-quorum-policy and stonith-enabled). Hello, Stonith is disabled and I tried various settings for no-quorum-policy. >> The problem is that one of the resources is a floating IP address which >> is currently assigned to two different hosts... >> > Of course - each partition assumes another partition is dead and so it > is free to take over remaining resources. I understand that but I still don't get why once all nodes are back online, the cluster does not heal from resources running one multiple hosts. > >> Can you help me configuring the cluster correctly so this cannot occurs ? >> > Define "correctly". > > The most straightforward text book answer - you need to have STONITH > resources that will eliminate "lost" node. But your lost node is in the > middle of performing backup. Eliminating it may invalidate backup being > created. Yeah but well, no. 
Killing the node is worse, sensible services are already running in clustering mode at application level so they do not rely on corosync. Basically corosync is providing a floating IP for some external non critical access and starting systemd timers that are pointless to be run on multiple hosts. Nothing critical here. > > So another answer would be - put cluster in maintenance mode, perform > backup, resume normal operation. Usually backup software allows hooks to > be executed before and after backup. It may work too. This in indeed something I might look at, but again, for my trivial needs it sounds a bit overkill to me. > Or find a way to not freeze VM during backup ... e.g. by using different > backup method? Or tweaks some network settings so corosync does not consider the node as being dead too soon ? Backup won't last more than 2 minutes and the freeze is usually way below. I can definitely leave with cluster state being unknown for a couple of minutes. Is that possible ? Removing VEEAM is indeed my last option and the one I used so far, but this time I was hoping someone else would be experiencing the same issue and could help me fixing that in a clean way. Thanks > _______________________________________________ > Manage your subscription: > https://lists.clusterlabs.org/mailman/listinfo/users > > ClusterLabs home: https://www.clusterlabs.org/ From gandalf.levert at gmail.com Sun Aug 9 15:14:00 2020 From: gandalf.levert at gmail.com (Adam Cecile) Date: Sun, 9 Aug 2020 21:14:00 +0200 Subject: [ClusterLabs] Automatic recover from split brain ? Message-ID: <926856b0-bd63-ab6d-f617-f0ed952b3904@gmail.com> Hello, I'm experiencing issue with corosync/pacemaker running on Debian Buster. Cluster has three nodes running in VMWare virtual machine and the cluster fails when VEEAM backups the virtual machine (I know it's doing bad things, like freezing completely the VM for a few minutes to make disk snapshot). My biggest issue is that once the backup has been completed, the cluster stays in split brain state, and I'd like it to heal itself. Here current status: One node is isolated: Stack: corosync Current DC: host2.domain.com (version 2.0.1-9e909a5bdd) - partition WITHOUT quorum Last updated: Sat Aug? 8 11:59:46 2020 Last change: Fri Jul 24 07:18:12 2020 by root via cibadmin on host1.domain.com 3 nodes configured 6 resources configured Online: [ host2.domain.com ] OFFLINE: [ host3.domain.com host1.domain.com ] Two others are seeing each others: Stack: corosync Current DC: host3.domain.com (version 2.0.1-9e909a5bdd) - partition with quorum Last updated: Sat Aug? 8 12:07:56 2020 Last change: Fri Jul 24 07:18:12 2020 by root via cibadmin on host1.domain.com 3 nodes configured 6 resources configured Online: [ host3.domain.com host1.domain.com ] OFFLINE: [ host2.domain.com ] The problem is that one of the resources is a floating IP address which is currently assigned to two different hosts... Can you help me configuring the cluster correctly so this cannot occurs ? Thanks in advance, Adam. From arvidjaar at gmail.com Tue Aug 11 14:30:54 2020 From: arvidjaar at gmail.com (Andrei Borzenkov) Date: Tue, 11 Aug 2020 21:30:54 +0300 Subject: [ClusterLabs] Automatic recover from split brain ? 
In-Reply-To: <69719d4e-d990-a2ba-4347-0e679b582d3f@le-vert.net> References: <69719d4e-d990-a2ba-4347-0e679b582d3f@le-vert.net> Message-ID: <16b699d9-4c97-72a3-afd5-d56bd4ec1c28@gmail.com> 11.08.2020 10:34, Adam C?cile ?????: > On 8/11/20 8:48 AM, Andrei Borzenkov wrote: >> 08.08.2020 13:10, Adam C?cile ?????: >>> Hello, >>> >>> >>> I'm experiencing issue with corosync/pacemaker running on Debian Buster. >>> Cluster has three nodes running in VMWare virtual machine and the >>> cluster fails when VEEAM backups the virtual machine (I know it's doing >>> bad things, like freezing completely the VM for a few minutes to make >>> disk snapshot). >>> >>> My biggest issue is that once the backup has been completed, the cluster >>> stays in split brain state, and I'd like it to heal itself. Here current >>> status: >>> >>> >>> One node is isolated: >>> >>> Stack: corosync >>> Current DC: host2.domain.com (version 2.0.1-9e909a5bdd) - partition >>> WITHOUT quorum >>> Last updated: Sat Aug? 8 11:59:46 2020 >>> Last change: Fri Jul 24 07:18:12 2020 by root via cibadmin on >>> host1.domain.com >>> >>> 3 nodes configured >>> 6 resources configured >>> >>> Online: [ host2.domain.com ] >>> OFFLINE: [ host3.domain.com host1.domain.com ] >>> >>> >>> Two others are seeing each others: >>> >>> Stack: corosync >>> Current DC: host3.domain.com (version 2.0.1-9e909a5bdd) - partition with >>> quorum >>> Last updated: Sat Aug? 8 12:07:56 2020 >>> Last change: Fri Jul 24 07:18:12 2020 by root via cibadmin on >>> host1.domain.com >>> >>> 3 nodes configured >>> 6 resources configured >>> >>> Online: [ host3.domain.com host1.domain.com ] >>> OFFLINE: [ host2.domain.com ] >>> >> Show your full configuration including defined STONITH resources and >> cluster options (most importantly, no-quorum-policy and stonith-enabled). > > Hello, > > Stonith is disabled and I tried various settings for no-quorum-policy. > >>> The problem is that one of the resources is a floating IP address which >>> is currently assigned to two different hosts... >>> >> Of course - each partition assumes another partition is dead and so it >> is free to take over remaining resources. > I understand that but I still don't get why once all nodes are back > online, the cluster does not heal from resources running one multiple > hosts. In my limited testing it does - after nodes see each other pacemaker sees resources active on multiple nodes and tries to fix it. This is with pacemaker 2.0.3. Check logs on all nodes what happens around time node becomes alive again. >> >>> Can you help me configuring the cluster correctly so this cannot >>> occurs ? >>> >> Define "correctly". >> >> The most straightforward text book answer - you need to have STONITH >> resources that will eliminate "lost" node. But your lost node is in the >> middle of performing backup. Eliminating it may invalidate backup being >> created. > Yeah but well, no. Killing the node is worse, sensible services are > already running in clustering mode at application level so they do not > rely on corosync. Basically corosync is providing a floating IP for some > external non critical access and starting systemd timers that are > pointless to be run on multiple hosts. Nothing critical here. >> >> So another answer would be - put cluster in maintenance mode, perform >> backup, resume normal operation. Usually backup software allows hooks to >> be executed before and after backup. It may work too. 
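To make that concrete: the pre-/post-job hooks would only need to toggle maintenance mode, or put the node being snapshotted into standby. A rough sketch, assuming pcs 0.10 and a backup tool that can run a command in the guest before and after the job:

  # pre-backup
  pcs property set maintenance-mode=true
  # post-backup
  pcs property set maintenance-mode=false

or, per node:

  pcs node standby host2.domain.com
  pcs node unstandby host2.domain.com

Standby actually moves the resources away first; maintenance mode just keeps the cluster from reacting while the VM is frozen.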
> This in indeed something I might look at, but again, for my trivial > needs it sounds a bit overkill to me. >> Or find a way to not freeze VM during backup ... e.g. by using different >> backup method? > > Or tweaks some network settings so corosync does not consider the node > as being dead too soon ? Backup won't last more than 2 minutes and the > freeze is usually way below. I can definitely leave with cluster state > being unknown for a couple of minutes. Is that possible ? > You probably can increase token time but that also directly affects how fast pacemaker will start and also how fast it will react to node failures. > Removing VEEAM is indeed my last option and the one I used so far, but > this time I was hoping someone else would be experiencing the same issue > and could help me fixing that in a clean way. > > > Thanks > >> _______________________________________________ >> Manage your subscription: >> https://lists.clusterlabs.org/mailman/listinfo/users >> >> ClusterLabs home: https://www.clusterlabs.org/ > > From rohitsaini111.forum at gmail.com Wed Aug 12 10:18:57 2020 From: rohitsaini111.forum at gmail.com (Rohit Saini) Date: Wed, 12 Aug 2020 19:48:57 +0530 Subject: [ClusterLabs] Alerts for qdevice/qnetd/booth Message-ID: Hi Team, Question-1: Similar to pcs alerts, do we have something similar for qdevice/qnetd? This is to detect asynchronously if any of the member is unreachable/joined/left and if that member is qdevice or qnetd. Question-2: Same above question for booth nodes and arbitrator. Is there any way to receive events from booth daemon? My main objective is to see if these daemons give events related to their internal state transitions and raise some alarms accordingly. For example, boothd arbitrator is unreachable, ticket moved from x to y, etc. Thanks, Rohit -------------- next part -------------- An HTML attachment was scrubbed... URL: From jfriesse at redhat.com Wed Aug 12 11:28:15 2020 From: jfriesse at redhat.com (Jan Friesse) Date: Wed, 12 Aug 2020 17:28:15 +0200 Subject: [ClusterLabs] Alerts for qdevice/qnetd/booth In-Reply-To: References: Message-ID: <942629ca-1035-32b4-2025-1ddd970e1b1a@redhat.com> Hi Rohit, Rohit Saini napsal(a): > Hi Team, > > Question-1: > Similar to pcs alerts, do we have something similar for qdevice/qnetd? This You mean pacemaker alerts right? > is to detect asynchronously if any of the member is unreachable/joined/left > and if that member is qdevice or qnetd. Nope but actually shouldn't be that hard to implement. What exactly would you like to see there? > > Question-2: > Same above question for booth nodes and arbitrator. Is there any way to > receive events from booth daemon? Not directly (again, shouldn't be that hard to implement). But pacemaker alerts should be triggered when service changes state because of ticket grant/reject, isn't it? > > My main objective is to see if these daemons give events related to > their internal state transitions and raise some alarms accordingly. For > example, boothd arbitrator is unreachable, ticket moved from x to y, etc. I don't think "boothd arbitrator is unreachable" alert is really doable. Ticket moved from x to y would be probably two alerts - 1. ticket rejected on X and 2. granted on Y. Would you mind to elaborate a bit more on events you would like to see and potentially open issue for upstream project (or, if you have a RH subscription try to contact GSS, so I get more time to work on this issue). 
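For the pacemaker side, a minimal alert hook is cheap to try out; something along these lines (the alert id and log path are only examples, the sample agents ship with pacemaker):

  pcs alert create path=/usr/share/pacemaker/alerts/alert_file.sh.sample id=my_alert
  pcs alert recipient add my_alert value=/var/log/pcmk_alerts.log

The agent gets CRM_alert_kind, CRM_alert_node, CRM_alert_rsc, CRM_alert_task, CRM_alert_desc and so on in its environment for node, fencing and resource events, so a ticket grant/revoke at least shows up indirectly through the resource starts and stops it triggers.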
Regards, Honza > > Thanks, > Rohit > > > > _______________________________________________ > Manage your subscription: > https://lists.clusterlabs.org/mailman/listinfo/users > > ClusterLabs home: https://www.clusterlabs.org/ > From kgaillot at redhat.com Wed Aug 12 14:57:07 2020 From: kgaillot at redhat.com (Ken Gaillot) Date: Wed, 12 Aug 2020 13:57:07 -0500 Subject: [ClusterLabs] Coming in Pacemaker 2.0.5: on-fail=demote / no-quorum-policy=demote Message-ID: (Apologies if this shows up twice -- clusterlabs.org had a mail configuration issue over the weekend, and I'm not sure if my original message was permanently lost or will eventually show up) Hi all, Looking ahead to the Pacemaker 2.0.5 release expected at the end of this year, here is a new feature already in the master branch. When configuring resource operations, Pacemaker lets you set an "on- fail" policy to specify whether to restart the resource, fence the node, etc., if the operation fails. With 2.0.5, a new possible value will be "demote", which will mean "demote this resource but do not fully restart it". "Demote" will be a valid value only for promote actions, and for recurring monitors with "role" set to "Master". Once the resource is demoted, it will be eligible for promotion again, so if the promotion scores have not changed, a promote on the same node may be attempted. If this is not desired, the agent can change the promotion scores either in the failed monitor or the demote. The intended use case is an application where a successful demote assures a well-functioning service, and a full restart would be unnecessarily heavyweight. A large database might be an example. Similarly, Pacemaker offers the cluster-wide "no-quorum-policy" option to specify what happens to resources when quorum is lost (the default being to stop them). With 2.0.5, "demote" will be a possible value here as well, and will mean "demote all promotable resources and stop all other resources". The intended use case is an application that cannot cause any harm after being demoted, and may be useful in a demoted role even if there is no quorum. A database that operates read-only when demoted and doesn't depend on any non-promotable resources might be an example. Happy clustering :) -- Ken Gaillot From rohitsaini111.forum at gmail.com Thu Aug 13 03:02:45 2020 From: rohitsaini111.forum at gmail.com (Rohit Saini) Date: Thu, 13 Aug 2020 12:32:45 +0530 Subject: [ClusterLabs] Alerts for qdevice/qnetd/booth In-Reply-To: <942629ca-1035-32b4-2025-1ddd970e1b1a@redhat.com> References: <942629ca-1035-32b4-2025-1ddd970e1b1a@redhat.com> Message-ID: Hi Honza, Thanks for your reply. Please find the attached image below: [image: image.png] Yes, I am talking about pacemaker alerts only. Please find my suggestions/requirements below: *Booth:* 1. Node5 booth-arbitrator should be able to give event when any of the booth node joins or leaves. booth-ip can be passed in event. 2. Event when booth-arbitrator is up successfully and has started monitoring the booth nodes. 2. Geo site booth should be able to give event when its booth peers joins/leaves. For example, Geo site1 gives an event when node5 booth-arbitrator joins/leaves OR site2 booth joins/leaves. booth-ip can be passed in event. 3. On ticket movements (revoke/grant), every booth node(Site1/2 and node5) should give events. Note: pacemaker alerts works in a cluster. Since, arbitrator is a non-cluster node, not sure how exactly it will work there. But this is good to have feature. *Qnetd/Qdevice:* This is similar to above. 
1. Node5 qnetd should be able to raise an event when any of the cluster node joins/leaves the quorum. 2. Event when qnetd is up successfully and has started monitoring the cluster nodes 3. Cluster node should be able to give event when any of the quorum node leaves/joins. If you see on high level, then these are kind of node/resource events wrt booth and qnetd/qdevice. As of today wrt booth/qnetd, I don't see any provision where any of the nodes gives any event when its peer leaves/joins. This makes it difficult to know whether geo sites nodes can see booth-arbitrator or not. This is true the other way around also where booth-arbitrator cannot see geo booth sites. I am not sure how others are doing it in today's deployment, but I see need of monitoring of every other booth/qnet node. So that on basis of event, appropriate alarms can be raised and action can be taken accordingly. Please let me know if you agree on the usecases. I'll raise feature-request on the pacemaker upstream project accordingly. Thanks, Rohit On Wed, Aug 12, 2020 at 8:58 PM Jan Friesse wrote: > Hi Rohit, > > Rohit Saini napsal(a): > > Hi Team, > > > > Question-1: > > Similar to pcs alerts, do we have something similar for qdevice/qnetd? > This > > You mean pacemaker alerts right? > > > is to detect asynchronously if any of the member is > unreachable/joined/left > > and if that member is qdevice or qnetd. > > Nope but actually shouldn't be that hard to implement. What exactly > would you like to see there? > > > > > Question-2: > > Same above question for booth nodes and arbitrator. Is there any way to > > receive events from booth daemon? > > Not directly (again, shouldn't be that hard to implement). But pacemaker > alerts should be triggered when service changes state because of ticket > grant/reject, isn't it? > > > > > My main objective is to see if these daemons give events related to > > their internal state transitions and raise some alarms accordingly. For > > example, boothd arbitrator is unreachable, ticket moved from x to y, etc. > > I don't think "boothd arbitrator is unreachable" alert is really doable. > Ticket moved from x to y would be probably two alerts - 1. ticket > rejected on X and 2. granted on Y. > > Would you mind to elaborate a bit more on events you would like to see > and potentially open issue for upstream project (or, if you have a RH > subscription try to contact GSS, so I get more time to work on this issue). > > Regards, > Honza > > > > > Thanks, > > Rohit > > > > > > > > _______________________________________________ > > Manage your subscription: > > https://lists.clusterlabs.org/mailman/listinfo/users > > > > ClusterLabs home: https://www.clusterlabs.org/ > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image.png Type: image/png Size: 75435 bytes Desc: not available URL: From jfriesse at redhat.com Thu Aug 13 03:33:46 2020 From: jfriesse at redhat.com (Jan Friesse) Date: Thu, 13 Aug 2020 09:33:46 +0200 Subject: [ClusterLabs] Alerts for qdevice/qnetd/booth In-Reply-To: References: <942629ca-1035-32b4-2025-1ddd970e1b1a@redhat.com> Message-ID: <4ad3e321-4204-859f-cc2b-18f67e1faf9e@redhat.com> Hi Rohit, > Hi Honza, > Thanks for your reply. Please find the attached image below: > > [image: image.png] > > Yes, I am talking about pacemaker alerts only. > > Please find my suggestions/requirements below: > > *Booth:* > 1. 
Node5 booth-arbitrator should be able to give event when any of the > booth node joins or leaves. booth-ip can be passed in event. This is not how booth works. Ticket leader (so site booth, never arbitrator) executes election and get replies from other sites/arbitrator. Follower executes election when leader hasn't for configured timeout. What I want to say is, that there is no "membership" - as in (for example) corosync fashion. The best we could get is the rough estimation based on election request/replies. > 2. Event when booth-arbitrator is up successfully and has started > monitoring the booth nodes. This is basically start of service. I think it's doable with small change in unit file (something like https://northernlightlabs.se/2014-07-05/systemd-status-mail-on-unit-failure.html) > 2. Geo site booth should be able to give event when its booth peers > joins/leaves. For example, Geo site1 gives an event when node5 > booth-arbitrator joins/leaves OR site2 booth joins/leaves. booth-ip can be > passed in event. > 3. On ticket movements (revoke/grant), every booth node(Site1/2 and node5) > should give events. That would be doable > > Note: pacemaker alerts works in a cluster. Since, arbitrator is a > non-cluster node, not sure how exactly it will work there. But this is good > to have feature. > > *Qnetd/Qdevice:* > This is similar to above. > 1. Node5 qnetd should be able to raise an event when any of the cluster > node joins/leaves the quorum. Doable > 2. Event when qnetd is up successfully and has started monitoring the > cluster nodes Qnetd itself is not monitoring qdevice nodes (it doesn't have list of nodes). It monitors node status after node joins (= it would be possible to trigger event on leave). So that may be enough. > 3. Cluster node should be able to give event when any of the quorum node > leaves/joins. You mean qdevice should be able to trigger event when connected to qnetd? > > If you see on high level, then these are kind of node/resource events wrt > booth and qnetd/qdevice. Yeah > > As of today wrt booth/qnetd, I don't see any provision where any of the > nodes gives any event when its peer leaves/joins. This makes it difficult > to know whether geo sites nodes can see booth-arbitrator or not. This is Got it. That's exactly what would be really problematic to implement, because of no "membership" in booth. It would be, however, possible to implement message when ticket was granted/rejected and have a list of other booths replies and what was their votes. > true the other way around also where booth-arbitrator cannot see geo booth > sites. > I am not sure how others are doing it in today's deployment, but I see need > of monitoring of every other booth/qnet node. So that on basis of event, > appropriate alarms can be raised and action can be taken accordingly. > > Please let me know if you agree on the usecases. I'll raise feature-request I can agree on usecases, but (especially with booth) there are technical problems on realizing them. > on the pacemaker upstream project accordingly. Please use booth (https://github.com/ClusterLabs/booth) and qdevice (https://github.com/corosync/corosync-qdevice) upstream rather than pacemaker, because these requests has really nothing to do with pcmk. Regards, honza > > Thanks, > Rohit > > On Wed, Aug 12, 2020 at 8:58 PM Jan Friesse wrote: > >> Hi Rohit, >> >> Rohit Saini napsal(a): >>> Hi Team, >>> >>> Question-1: >>> Similar to pcs alerts, do we have something similar for qdevice/qnetd? 
>> This >> >> You mean pacemaker alerts right? >> >>> is to detect asynchronously if any of the member is >> unreachable/joined/left >>> and if that member is qdevice or qnetd. >> >> Nope but actually shouldn't be that hard to implement. What exactly >> would you like to see there? >> >>> >>> Question-2: >>> Same above question for booth nodes and arbitrator. Is there any way to >>> receive events from booth daemon? >> >> Not directly (again, shouldn't be that hard to implement). But pacemaker >> alerts should be triggered when service changes state because of ticket >> grant/reject, isn't it? >> >>> >>> My main objective is to see if these daemons give events related to >>> their internal state transitions and raise some alarms accordingly. For >>> example, boothd arbitrator is unreachable, ticket moved from x to y, etc. >> >> I don't think "boothd arbitrator is unreachable" alert is really doable. >> Ticket moved from x to y would be probably two alerts - 1. ticket >> rejected on X and 2. granted on Y. >> >> Would you mind to elaborate a bit more on events you would like to see >> and potentially open issue for upstream project (or, if you have a RH >> subscription try to contact GSS, so I get more time to work on this issue). >> >> Regards, >> Honza >> >>> >>> Thanks, >>> Rohit >>> >>> >>> >>> _______________________________________________ >>> Manage your subscription: >>> https://lists.clusterlabs.org/mailman/listinfo/users >>> >>> ClusterLabs home: https://www.clusterlabs.org/ >>> >> >> > From luckydogxf at gmail.com Thu Aug 13 04:33:41 2020 From: luckydogxf at gmail.com (luckydog xf) Date: Thu, 13 Aug 2020 16:33:41 +0800 Subject: [ClusterLabs] Pacemaker-remote: Connection to cluster failed: Transport endpoint is not connected Message-ID: Hi, guys, I'm running SLES12 sp3 and pacemaker-remote-1.1.16-4.8.x86_64, a few months ago, compute nodes of openstack are running well. But today when I setup a new compute node, it's said," ------------- Aug 13 16:31:04 [43122] db8-ca-3a-69-60-f4 pacemaker_remoted: notice: lrmd_init_remote_tls_server: Starting a tls listener on port 3121. Aug 13 16:31:05 [43122] db8-ca-3a-69-60-f4 pacemaker_remoted: notice: bind_and_listen: Listening on address :: Aug 13 16:31:05 [43122] db8-ca-3a-69-60-f4 pacemaker_remoted: info: qb_ipcs_us_publish: server name: cib_ro Aug 13 16:31:05 [43122] db8-ca-3a-69-60-f4 pacemaker_remoted: info: qb_ipcs_us_publish: server name: cib_rw Aug 13 16:31:05 [43122] db8-ca-3a-69-60-f4 pacemaker_remoted: info: qb_ipcs_us_publish: server name: cib_shm Aug 13 16:31:05 [43122] db8-ca-3a-69-60-f4 pacemaker_remoted: info: qb_ipcs_us_publish: server name: attrd Aug 13 16:31:05 [43122] db8-ca-3a-69-60-f4 pacemaker_remoted: info: qb_ipcs_us_publish: server name: stonith-ng Aug 13 16:31:05 [43122] db8-ca-3a-69-60-f4 pacemaker_remoted: info: qb_ipcs_us_publish: server name: crmd Aug 13 16:31:05 [43122] db8-ca-3a-69-60-f4 pacemaker_remoted: info: main: Starting ----------- and after I run `crm_mon -1r`, two lines are appended of pacemaker.log Aug 13 16:31:38 [43122] db8-ca-3a-69-60-f4 pacemaker_remoted: error: ipc_proxy_accept: No ipc providers available for uid 0 gid 0 Aug 13 16:31:38 [43122] db8-ca-3a-69-60-f4 pacemaker_remoted: error: handle_new_connection: Error in connection setup (43122-43151-15): Remote I/O error (121) And the output of `crm_mon -1r` is Connection to cluster failed: Transport endpoint is not connected My environment is almost the same with other servers. So what's up ? 
Thanks, -------------- next part -------------- An HTML attachment was scrubbed... URL: From luckydogxf at gmail.com Thu Aug 13 05:15:39 2020 From: luckydogxf at gmail.com (luckydog xf) Date: Thu, 13 Aug 2020 17:15:39 +0800 Subject: [ClusterLabs] Pacemaker-remote: Connection to cluster failed: Transport endpoint is not connected In-Reply-To: References: Message-ID: Probably I didn't configure pacemaker resource properly, now it's OK. --- crm configure show remote-db8-ca-3a-69-60-f4 node remote-db8-ca-3a-69-60-f4:remote \ attributes OpenStack-role=compute standby=off primitive remote-db8-ca-3a-69-60-f4 ocf:pacemaker:remote \ params reconnect_interval=60s server=db8-ca-3a-69-60-f4.ipa.pthl.hk \ op monitor interval=30s \ op start interval=0 timeout=60s \ op stop interval=0 timeout=60s \ meta target-role=Started --- On Thu, Aug 13, 2020 at 4:33 PM luckydog xf wrote: > Hi, guys, > > I'm running SLES12 sp3 and pacemaker-remote-1.1.16-4.8.x86_64, a few > months ago, compute nodes of openstack are running well. But today when I > setup a new compute node, it's said," > ------------- > Aug 13 16:31:04 [43122] db8-ca-3a-69-60-f4 pacemaker_remoted: notice: > lrmd_init_remote_tls_server: Starting a tls listener on port 3121. > Aug 13 16:31:05 [43122] db8-ca-3a-69-60-f4 pacemaker_remoted: notice: > bind_and_listen: Listening on address :: > Aug 13 16:31:05 [43122] db8-ca-3a-69-60-f4 pacemaker_remoted: info: > qb_ipcs_us_publish: server name: cib_ro > Aug 13 16:31:05 [43122] db8-ca-3a-69-60-f4 pacemaker_remoted: info: > qb_ipcs_us_publish: server name: cib_rw > Aug 13 16:31:05 [43122] db8-ca-3a-69-60-f4 pacemaker_remoted: info: > qb_ipcs_us_publish: server name: cib_shm > Aug 13 16:31:05 [43122] db8-ca-3a-69-60-f4 pacemaker_remoted: info: > qb_ipcs_us_publish: server name: attrd > Aug 13 16:31:05 [43122] db8-ca-3a-69-60-f4 pacemaker_remoted: info: > qb_ipcs_us_publish: server name: stonith-ng > Aug 13 16:31:05 [43122] db8-ca-3a-69-60-f4 pacemaker_remoted: info: > qb_ipcs_us_publish: server name: crmd > Aug 13 16:31:05 [43122] db8-ca-3a-69-60-f4 pacemaker_remoted: info: > main: Starting > ----------- > and after I run `crm_mon -1r`, two lines are appended of pacemaker.log > > Aug 13 16:31:38 [43122] db8-ca-3a-69-60-f4 pacemaker_remoted: error: > ipc_proxy_accept: No ipc providers available for uid 0 gid 0 > Aug 13 16:31:38 [43122] db8-ca-3a-69-60-f4 pacemaker_remoted: error: > handle_new_connection: Error in connection setup (43122-43151-15): Remote > I/O error (121) > > And the output of `crm_mon -1r` is > > Connection to cluster failed: Transport endpoint is not connected > > My environment is almost the same with other servers. So what's up ? > Thanks, > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From clusterlabs at t.poki.me Thu Aug 13 06:32:32 2020 From: clusterlabs at t.poki.me (clusterlabs at t.poki.me) Date: Thu, 13 Aug 2020 12:32:32 +0200 Subject: [ClusterLabs] Coming in Pacemaker 2.0.5: on-fail=demote / no-quorum-policy=demote In-Reply-To: References: Message-ID: <03646c86-ae3e-c76c-c6cd-589ea2f5d97a@poki.me> Hello, wanted to point out one thing that occurred to me when thinking about the paragraph below. On 8/12/20 8:57 PM, Ken Gaillot wrote: > Similarly, Pacemaker offers the cluster-wide "no-quorum-policy" option > to specify what happens to resources when quorum is lost (the default > being to stop them). With 2.0.5, "demote" will be a possible value here > as well, and will mean "demote all promotable resources and stop all > other resources". 
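(A practical aside on the above, in case someone wants to try it when 2.0.5
lands: this is an ordinary cluster property, so -- assuming the option keeps
the name and value announced above -- enabling it should be the usual
one-liner, e.g.

  # sketch only; needs Pacemaker >= 2.0.5 and the property name/value as
  # announced above
  crm_attribute --type crm_config --name no-quorum-policy --update demote
  # or on a pcs-managed cluster (an older pcs may not know the new value
  # yet and might need --force)
  pcs property set no-quorum-policy=demote

The per-operation on-fail=demote mentioned in the same announcement is set
separately, on the promote and Master-role monitor operations themselves.)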
> > The intended use case is an application that cannot cause any harm > after being demoted, and may be useful in a demoted role even if there > is no quorum. A database that operates read-only when demoted and > doesn't depend on any non-promotable resources might be an example. Perhaps not that expected corollary with this cluster-wide setting (only acknowledged when the cluster is upgraded in its entirety?), if I understand it correctly, is that previously promoted resource will get stopped anyway once it depends on a simple resource that doesn't specify "on-fail" on its own (putting global/resource defaults aside). It is this implicit resource "composability" (a.k.a. resource trees, with some tweaks applicable right at this "composed service" level) idea of long declined RGManager (RIP) that is still quite an appealing and natural way (despite having less expressive power in general) one can think of composed services (i.e. the behaviour of a final brings-me-value unit rather than sum of moving parts behind it). Through the prism of this tree model, if it could be proved that no other simple resources share dependency on any of the prerequisites with this promoted resource to be demoted because of "on fail" event, it would be intuitive to expect they will be kept running and hence will prevent this promoted resource from consequentially being stopped despite just a weaker form of its demotion is the first and foremost choice requested by the user. Similarly with clones and other promotable prerequisites, except it might be wise to demote them as well if it would not conflict with demotion of the dependent promoted resource that just suffered a failure. Rationale for this is that prerequisite resource _solely_ consumed by resources that don't need quorum as well hardly needs quorum on its own, otherwise this is a conflicting/fishy configuration in some way. (It would likewise be interesting and configuration-proofing-friendly to investigate such "composing rules of soundness".) I see there are some practical limits to these semi-recursive and potentially explosive graph problems, but there's also a question what's an intuitively expected behaviour, and possible disconnect could at least be addressed in the documentation. [Alternatively, some kind of "back-propagation" (intuitively, "this terminal resource has powers to steer the behaviour of its prerequisites, as the configuration of this terminal resource is what's wanted by the outer surroundings of this box, afterall) flag could be devised such that it would override any behaviour of the prerequisite resources on a subset of conflicting options, like on-fail, given that there is no conflict with any other resource dependent on the same resource.] Thanks & cheers -- poki From clusterlabs at t.poki.me Thu Aug 13 11:34:30 2020 From: clusterlabs at t.poki.me (clusterlabs at t.poki.me) Date: Thu, 13 Aug 2020 17:34:30 +0200 Subject: [ClusterLabs] Coming in Pacemaker 2.0.5: on-fail=demote / no-quorum-policy=demote In-Reply-To: <03646c86-ae3e-c76c-c6cd-589ea2f5d97a@poki.me> References: <03646c86-ae3e-c76c-c6cd-589ea2f5d97a@poki.me> Message-ID: <7ee8dae1-3f73-80b5-a36c-0b3f6c311b5a@poki.me> Sorry, followed too much into the intermixing of two rather unrelated things, when failure occurs and when quorum is lost. I've meant to dedicate the comment solely to the latter, but managed to cross that line. 
Corrections below: On 8/13/20 12:32 PM, clusterlabs at t.poki.me wrote: > wanted to point out one thing that occurred to me when thinking about > the paragraph below. > > On 8/12/20 8:57 PM, Ken Gaillot wrote: >> Similarly, Pacemaker offers the cluster-wide "no-quorum-policy" option >> to specify what happens to resources when quorum is lost (the default >> being to stop them). With 2.0.5, "demote" will be a possible value here >> as well, and will mean "demote all promotable resources and stop all >> other resources". >> >> The intended use case is an application that cannot cause any harm >> after being demoted, and may be useful in a demoted role even if there >> is no quorum. A database that operates read-only when demoted and >> doesn't depend on any non-promotable resources might be an example. > > Perhaps not that expected corollary with this cluster-wide setting > (only acknowledged when the cluster is upgraded in its entirety?), if > I understand it correctly, is that previously promoted resource will > get stopped anyway once it depends on a simple resource that doesn't > specify "on-fail" on its own (putting global/resource defaults aside). Scratch trailing " that ...", there's no individual per-resource (nor per-resource-operation) override. Would be, though, interesting to consider allowing for an escape "no-quorum-policy=as-active-state-failure" that would turn loss of quorum into started/promoted/demoted state maintenance failure, which could then trigger individual on-fail response. Who knows. > It is this implicit resource "composability" (a.k.a. resource trees, > with some tweaks applicable right at this "composed service" level) > idea of long declined RGManager (RIP) that is still quite an appealing > and natural way (despite having less expressive power in general) one > can think of composed services (i.e. the behaviour of a final > brings-me-value unit rather than sum of moving parts behind it). > Through the prism of this tree model, if it could be proved that no > other simple resources share dependency on any of the prerequisites > with this promoted resource to be demoted because of "on fail" event, s/on fail/quorum lost/ > it would be intuitive to expect they will be kept running and hence > will prevent this promoted resource from consequentially being stopped > despite just a weaker form of its demotion is the first and foremost > choice requested by the user.? Similarly with clones and other > promotable prerequisites, except it might be wise to demote them as > well if it would not conflict with demotion of the dependent promoted > resource that just suffered a failure. > > Rationale for this is that prerequisite resource _solely_ consumed by > resources that don't need quorum as well hardly needs quorum on its own, > otherwise this is a conflicting/fishy configuration in some way. > (It would likewise be interesting and configuration-proofing-friendly > to investigate such "composing rules of soundness".) > > I see there are some practical limits to these semi-recursive and > potentially explosive graph problems, but there's also a question > what's an intuitively expected behaviour, and possible disconnect > could at least be addressed in the documentation. 
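(To keep the shape being discussed concrete, a minimal sketch -- resource
names are purely illustrative and pcs syntax is assumed:

  # a plain resource that the promotable one depends on
  pcs resource create virt-ip ocf:heartbeat:IPaddr2 ip=192.0.2.10 cidr_netmask=24
  # tie an already-defined promotable clone, here called db-clone, to it
  pcs constraint colocation add db-clone with virt-ip INFINITY
  pcs constraint order start virt-ip then start db-clone

With no-quorum-policy=demote, virt-ip falls under "stop all other resources"
on quorum loss, and through the colocation the promoted db-clone instance
ends up stopped as well rather than merely demoted -- which is the corollary
described above.)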
> > [Alternatively, some kind of "back-propagation" (intuitively, > "this terminal resource has powers to steer the behaviour of its > prerequisites, as the configuration of this terminal resource is > what's wanted by the outer surroundings of this box, afterall) > flag could be devised such that it would override any behaviour of > the prerequisite resources on a subset of conflicting options, like > on-fail, given that there is no conflict with any other resource Again, scrach "like on-fail", in this case it would perhaps mean altering individual yet externally not directly configurable approach of Pacemaker towards the resource in particular circumstance. (It's part of the conceivability problem amongst different audiences, since not every internal flag/tracking and conditional handling maps to user configurable item -- main pillar of explaining the mechanics to users -- in particular context.) > dependent on the same resource.] > > Thanks & cheers -- poki From rohitsaini111.forum at gmail.com Fri Aug 14 00:22:48 2020 From: rohitsaini111.forum at gmail.com (Rohit Saini) Date: Fri, 14 Aug 2020 09:52:48 +0530 Subject: [ClusterLabs] Alerts for qdevice/qnetd/booth In-Reply-To: <4ad3e321-4204-859f-cc2b-18f67e1faf9e@redhat.com> References: <942629ca-1035-32b4-2025-1ddd970e1b1a@redhat.com> <4ad3e321-4204-859f-cc2b-18f67e1faf9e@redhat.com> Message-ID: Thanks Honza. I have raised these on both upstream projects. I will leave upto implementer how best this can be done, considering the technical limitations you mentioned. https://github.com/corosync/corosync-qdevice/issues/13 https://github.com/ClusterLabs/booth/issues/99 Thanks, Rohit On Thu, Aug 13, 2020 at 1:03 PM Jan Friesse wrote: > Hi Rohit, > > > Hi Honza, > > Thanks for your reply. Please find the attached image below: > > > > [image: image.png] > > > > Yes, I am talking about pacemaker alerts only. > > > > Please find my suggestions/requirements below: > > > > *Booth:* > > 1. Node5 booth-arbitrator should be able to give event when any of the > > booth node joins or leaves. booth-ip can be passed in event. > > This is not how booth works. Ticket leader (so site booth, never > arbitrator) executes election and get replies from other > sites/arbitrator. Follower executes election when leader hasn't for > configured timeout. > > What I want to say is, that there is no "membership" - as in (for > example) corosync fashion. > > The best we could get is the rough estimation based on election > request/replies. > > > 2. Event when booth-arbitrator is up successfully and has started > > monitoring the booth nodes. > > This is basically start of service. I think it's doable with small > change in unit file (something like > > https://northernlightlabs.se/2014-07-05/systemd-status-mail-on-unit-failure.html > ) > > > 2. Geo site booth should be able to give event when its booth peers > > joins/leaves. For example, Geo site1 gives an event when node5 > > booth-arbitrator joins/leaves OR site2 booth joins/leaves. booth-ip can > be > > passed in event. > > 3. On ticket movements (revoke/grant), every booth node(Site1/2 and > node5) > > should give events. > > That would be doable > > > > > Note: pacemaker alerts works in a cluster. Since, arbitrator is a > > non-cluster node, not sure how exactly it will work there. But this is > good > > to have feature. > > > > *Qnetd/Qdevice:* > > This is similar to above. > > 1. Node5 qnetd should be able to raise an event when any of the cluster > > node joins/leaves the quorum. > > Doable > > > 2. 
Event when qnetd is up successfully and has started monitoring the > > cluster nodes > > Qnetd itself is not monitoring qdevice nodes (it doesn't have list of > nodes). It monitors node status after node joins (= it would be possible > to trigger event on leave). So that may be enough. > > > 3. Cluster node should be able to give event when any of the quorum node > > leaves/joins. > > You mean qdevice should be able to trigger event when connected to qnetd? > > > > > If you see on high level, then these are kind of node/resource events wrt > > booth and qnetd/qdevice. > > Yeah > > > > > As of today wrt booth/qnetd, I don't see any provision where any of the > > nodes gives any event when its peer leaves/joins. This makes it difficult > > to know whether geo sites nodes can see booth-arbitrator or not. This is > > Got it. That's exactly what would be really problematic to implement, > because of no "membership" in booth. It would be, however, possible to > implement message when ticket was granted/rejected and have a list of > other booths replies and what was their votes. > > > true the other way around also where booth-arbitrator cannot see geo > booth > > sites. > > I am not sure how others are doing it in today's deployment, but I see > need > > of monitoring of every other booth/qnet node. So that on basis of event, > > appropriate alarms can be raised and action can be taken accordingly. > > > > Please let me know if you agree on the usecases. I'll raise > feature-request > > I can agree on usecases, but (especially with booth) there are technical > problems on realizing them. > > > on the pacemaker upstream project accordingly. > > Please use booth (https://github.com/ClusterLabs/booth) and qdevice > (https://github.com/corosync/corosync-qdevice) upstream rather than > pacemaker, because these requests has really nothing to do with pcmk. > > Regards, > honza > > > > > Thanks, > > Rohit > > > > On Wed, Aug 12, 2020 at 8:58 PM Jan Friesse wrote: > > > >> Hi Rohit, > >> > >> Rohit Saini napsal(a): > >>> Hi Team, > >>> > >>> Question-1: > >>> Similar to pcs alerts, do we have something similar for qdevice/qnetd? > >> This > >> > >> You mean pacemaker alerts right? > >> > >>> is to detect asynchronously if any of the member is > >> unreachable/joined/left > >>> and if that member is qdevice or qnetd. > >> > >> Nope but actually shouldn't be that hard to implement. What exactly > >> would you like to see there? > >> > >>> > >>> Question-2: > >>> Same above question for booth nodes and arbitrator. Is there any way to > >>> receive events from booth daemon? > >> > >> Not directly (again, shouldn't be that hard to implement). But pacemaker > >> alerts should be triggered when service changes state because of ticket > >> grant/reject, isn't it? > >> > >>> > >>> My main objective is to see if these daemons give events related to > >>> their internal state transitions and raise some alarms accordingly. > For > >>> example, boothd arbitrator is unreachable, ticket moved from x to y, > etc. > >> > >> I don't think "boothd arbitrator is unreachable" alert is really doable. > >> Ticket moved from x to y would be probably two alerts - 1. ticket > >> rejected on X and 2. granted on Y. > >> > >> Would you mind to elaborate a bit more on events you would like to see > >> and potentially open issue for upstream project (or, if you have a RH > >> subscription try to contact GSS, so I get more time to work on this > issue). 
> >> > >> Regards, > >> Honza > >> > >>> > >>> Thanks, > >>> Rohit > >>> > >>> > >>> > >>> _______________________________________________ > >>> Manage your subscription: > >>> https://lists.clusterlabs.org/mailman/listinfo/users > >>> > >>> ClusterLabs home: https://www.clusterlabs.org/ > >>> > >> > >> > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bernd.lentes at helmholtz-muenchen.de Fri Aug 14 06:17:51 2020 From: bernd.lentes at helmholtz-muenchen.de (Lentes, Bernd) Date: Fri, 14 Aug 2020 12:17:51 +0200 (CEST) Subject: [ClusterLabs] why is node fenced ? In-Reply-To: References: <1894379294.27456141.1596036406000.JavaMail.zimbra@helmholtz-muenchen.de> <197754c0efa9654e23c0d3b25d7d1ac8a7e2f919.camel@redhat.com> <989597611.38099136.1597004244408.JavaMail.zimbra@helmholtz-muenchen.de> Message-ID: <1954017432.49729402.1597400271575.JavaMail.zimbra@helmholtz-muenchen.de> ----- On Aug 10, 2020, at 11:59 PM, kgaillot kgaillot at redhat.com wrote: > The most recent transition is aborted, but since all its actions are > complete, the only effect is to trigger a new transition. > > We should probably rephrase the log message. In fact, the whole > "transition" terminology is kind of obscure. It's hard to come up with > something better though. > Hi Ken, i don't get it. How can s.th. be aborted which is already completed ? Bernd Helmholtz Zentrum M?nchen Helmholtz Zentrum Muenchen Deutsches Forschungszentrum fuer Gesundheit und Umwelt (GmbH) Ingolstaedter Landstr. 1 85764 Neuherberg www.helmholtz-muenchen.de Aufsichtsratsvorsitzende: MinDir.in Prof. Dr. Veronika von Messling Geschaeftsfuehrung: Prof. Dr. med. Dr. h.c. Matthias Tschoep, Kerstin Guenther Registergericht: Amtsgericht Muenchen HRB 6466 USt-IdNr: DE 129521671 From gbulfon at sonicle.com Fri Aug 14 09:09:04 2020 From: gbulfon at sonicle.com (Gabriele Bulfon) Date: Fri, 14 Aug 2020 15:09:04 +0200 (CEST) Subject: [ClusterLabs] Antw: [EXT] Stonith failing In-Reply-To: <68622439.459.1596025362188@www> References: <1330096936.11468.1595926619455@www> <5F1FF427020000A10003A47F@gwsmtp.uni-regensburg.de> <1599253492.11645.1595932012959@www> <0557F18C-6718-43B6-9CA0-8FE7AE2A8785@yahoo.com> <603366395.379.1596002482554@www> <68622439.459.1596025362188@www> Message-ID: <1118615335.1895.1597410544148@www> Thanks to all your suggestions, I now have the systems with stonith configured on ipmi. ? Two questions: - how can I simulate a stonith situation to check that everything is ok? - considering that I have both nodes with stonith against the other node, once the two nodes can communicate, how can I be sure the two nodes will not try to stonith each other? ? :) Thanks! Gabriele ? ? Sonicle S.r.l.? :? http://www.sonicle.com Music:? http://www.gabrielebulfon.com Quantum Mechanics :? http://www.cdbaby.com/cd/gabrielebulfon Da: Gabriele Bulfon A: Cluster Labs - All topics related to open-source clustering welcomed Data: 29 luglio 2020 14.22.42 CEST Oggetto: Re: [ClusterLabs] Antw: [EXT] Stonith failing ? It is a ZFS based illumos system. I don't think SBD is an option. Is there a reliable ZFS based stonith? ? Gabriele ? ? Sonicle S.r.l.? :? http://www.sonicle.com Music:? http://www.gabrielebulfon.com Quantum Mechanics :? http://www.cdbaby.com/cd/gabrielebulfon Da: Andrei Borzenkov A: Cluster Labs - All topics related to open-source clustering welcomed Data: 29 luglio 2020 9.46.09 CEST Oggetto: Re: [ClusterLabs] Antw: [EXT] Stonith failing ? 
On Wed, Jul 29, 2020 at 9:01 AM Gabriele Bulfon gbulfon at sonicle.com wrote: That one was taken from a specific implementation on Solaris 11. The situation is a dual node server with shared storage controller: both nodes see the same disks concurrently. Here we must be sure that the two nodes are not going to import/mount the same zpool at the same time, or we will encounter data corruption: ? ssh based "stonith" cannot guarantee it. ? node 1 will be perferred for pool 1, node 2 for pool 2, only in case one of the node goes down or is taken offline the resources should be first free by the leaving node and taken by the other node. ? Would you suggest one of the available stonith in this case? ? ? IPMI, managed PDU, SBD ... In practice, the only stonith method that works in case of complete node outage including any power supply is SBD. _______________________________________________Manage your subscription:https://lists.clusterlabs.org/mailman/listinfo/usersClusterLabs home: https://www.clusterlabs.org/ _______________________________________________Manage your subscription:https://lists.clusterlabs.org/mailman/listinfo/usersClusterLabs home: https://www.clusterlabs.org/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From bernd.lentes at helmholtz-muenchen.de Fri Aug 14 14:37:43 2020 From: bernd.lentes at helmholtz-muenchen.de (Lentes, Bernd) Date: Fri, 14 Aug 2020 20:37:43 +0200 (CEST) Subject: [ClusterLabs] why is node fenced ? In-Reply-To: <989597611.38099136.1597004244408.JavaMail.zimbra@helmholtz-muenchen.de> References: <1894379294.27456141.1596036406000.JavaMail.zimbra@helmholtz-muenchen.de> <197754c0efa9654e23c0d3b25d7d1ac8a7e2f919.camel@redhat.com> <989597611.38099136.1597004244408.JavaMail.zimbra@helmholtz-muenchen.de> Message-ID: <1230152204.50050895.1597430263067.JavaMail.zimbra@helmholtz-muenchen.de> ----- On Aug 9, 2020, at 10:17 PM, Bernd Lentes bernd.lentes at helmholtz-muenchen.de wrote: >> So this appears to be the problem. From these logs I would guess the >> successful stop on ha-idg-1 did not get written to the CIB for some >> reason. I'd look at the pe input from this transition on ha-idg-2 to >> confirm that. >> >> Without the DC knowing about the stop, it tries to schedule a new one, >> but the node is shutting down so it can't do it, which means it has to >> be fenced. I checked all relevant pe-files in this time period. This is what i found out (i just write the important entries): ha-idg-1:~/why-fenced/ha-idg-1/pengine # crm_simulate -S -x pe-input-3116 -G transition-3116.xml -D transition-3116.dot Current cluster status: ... vm_nextcloud (ocf::heartbeat:VirtualDomain): Started ha-idg-1 Transition Summary: ... * Migrate vm_nextcloud ( ha-idg-1 -> ha-idg-2 ) Executing cluster transition: * Resource action: vm_nextcloud migrate_from on ha-idg-2 <======= migrate vm_nextcloud * Resource action: vm_nextcloud stop on ha-idg-1 * Pseudo action: vm_nextcloud_start_0 Revised cluster status: Node ha-idg-1 (1084777482): standby Online: [ ha-idg-2 ] vm_nextcloud (ocf::heartbeat:VirtualDomain): Started ha-idg-2 ha-idg-1:~/why-fenced/ha-idg-1/pengine # crm_simulate -S -x pe-error-48 -G transition-4514.xml -D transition-4514.dot Current cluster status: Node ha-idg-1 (1084777482): standby Online: [ ha-idg-2 ] ... vm_nextcloud (ocf::heartbeat:VirtualDomain): FAILED[ ha-idg-2 ha-idg-1 ] <====== migration failed Transition Summary: .. 
* Recover vm_nextcloud ( ha-idg-2 ) Executing cluster transition: * Resource action: vm_nextcloud stop on ha-idg-2 * Resource action: vm_nextcloud stop on ha-idg-1 * Resource action: vm_nextcloud start on ha-idg-2 * Resource action: vm_nextcloud monitor=30000 on ha-idg-2 Revised cluster status: vm_nextcloud (ocf::heartbeat:VirtualDomain): Started ha-idg-2 ha-idg-1:~/why-fenced/ha-idg-1/pengine # crm_simulate -S -x pe-input-3117 -G transition-3117.xml -D transition-3117.dot Current cluster status: Node ha-idg-1 (1084777482): standby Online: [ ha-idg-2 ] vm_nextcloud (ocf::heartbeat:VirtualDomain): FAILED ha-idg-2 <====== start on ha-idg-2 failed Transition Summary: * Stop vm_nextcloud ( ha-idg-2 ) due to node availability <==== stop vm_nextcloud (what means due to node availability ?) Executing cluster transition: * Resource action: vm_nextcloud stop on ha-idg-2 Revised cluster status: vm_nextcloud (ocf::heartbeat:VirtualDomain): Stopped ha-idg-1:~/why-fenced/ha-idg-1/pengine # crm_simulate -S -x pe-input-3118 -G transition-4516.xml -D transition-4516.dot Current cluster status: Node ha-idg-1 (1084777482): standby Online: [ ha-idg-2 ] vm_nextcloud (ocf::heartbeat:VirtualDomain): Stopped <============== vm_nextcloud is stopped Transition Summary: * Shutdown ha-idg-1 Executing cluster transition: * Resource action: vm_nextcloud stop on ha-idg-1 <==== why stop ? It is already stopped Revised cluster status: vm_nextcloud (ocf::heartbeat:VirtualDomain): Stopped ha-idg-1:~/why-fenced/ha-idg-2/pengine # crm_simulate -S -x pe-input-3545 -G transition-0.xml -D transition-0.dot Current cluster status: Node ha-idg-1 (1084777482): pending Online: [ ha-idg-2 ] vm_nextcloud (ocf::heartbeat:VirtualDomain): Stopped <====== vm_nextcloud is stopped Transition Summary: Executing cluster transition: Using the original execution date of: 2020-07-20 15:05:33Z Revised cluster status: vm_nextcloud (ocf::heartbeat:VirtualDomain): Stopped ha-idg-1:~/why-fenced/ha-idg-2/pengine # crm_simulate -S -x pe-warn-749 -G transition-1.xml -D transition-1.dot Current cluster status: Node ha-idg-1 (1084777482): OFFLINE (standby) Online: [ ha-idg-2 ] vm_nextcloud (ocf::heartbeat:VirtualDomain): Stopped <======= vm_nextcloud is stopped Transition Summary: * Fence (Off) ha-idg-1 'resource actions are unrunnable' Executing cluster transition: * Fencing ha-idg-1 (Off) * Pseudo action: vm_nextcloud_stop_0 <======= why stop ? It is already stopped ? Revised cluster status: Node ha-idg-1 (1084777482): OFFLINE (standby) Online: [ ha-idg-2 ] vm_nextcloud (ocf::heartbeat:VirtualDomain): Stopped I don't understand why the cluster tries to stop a resource which is already stopped. Bernd Helmholtz Zentrum M?nchen Helmholtz Zentrum Muenchen Deutsches Forschungszentrum fuer Gesundheit und Umwelt (GmbH) Ingolstaedter Landstr. 1 85764 Neuherberg www.helmholtz-muenchen.de Aufsichtsratsvorsitzende: MinDir.in Prof. Dr. Veronika von Messling Geschaeftsfuehrung: Prof. Dr. med. Dr. h.c. 
Matthias Tschoep, Kerstin Guenther Registergericht: Amtsgericht Muenchen HRB 6466 USt-IdNr: DE 129521671 From gerry at infinityit.ie Fri Aug 14 11:48:32 2020 From: gerry at infinityit.ie (Gerry Kernan) Date: Fri, 14 Aug 2020 15:48:32 +0000 Subject: [ClusterLabs] DRBD resource not starting Message-ID: Hi Im trying to add a drbd resource to pacemaker cluster on centos 7 But getting this error on pcs status drbd_r0 (ocf::linbit:drbd): ORPHANED FAILED (blocked)[ vps1 vps2 ] if I try and the resource from cli I get out out below [root at VPS1 drbd-utils]# pcs resource debug-start rep-r0 Operation start for rep-r0:0 (ocf:linbit:drbd) returned: 'unknown error' (1) > stdout: drbdsetup - Configure the DRBD kernel module. > stdout: > stdout: USAGE: drbdsetup command {arguments} [options] > stdout: > stdout: Commands: > stdout: primary - Change the role of a node in a resource to primary. > stdout: secondary - Change the role of a node in a resource to secondary. > stdout: attach - Attach a lower-level device to an existing replicated device. > stdout: disk-options - Change the disk options of an attached lower-level device. > stdout: detach - Detach the lower-level device of a replicated device. > stdout: connect - Attempt to (re)establish a replication link to a peer host. > stdout: new-peer - Make a peer host known to a resource. > stdout: del-peer - Remove a connection to a peer host. > stdout: new-path - Add a path (endpoint address pair) where a peer host should be > stdout: reachable. > stdout: del-path - Remove a path (endpoint address pair) from a connection to a peer > stdout: host. > stdout: net-options - Change the network options of a connection. > stdout: disconnect - Unconnect from a peer host. > stdout: resize - Reexamine the lower-level device sizes to resize a replicated > stdout: device. > stdout: resource-options - Change the resource options of an existing resource. > stdout: peer-device-options - Change peer-device options. > stdout: new-current-uuid - Generate a new current UUID. > stdout: invalidate - Replace the local data of a volume with that of a peer. > stdout: invalidate-remote - Replace a peer's data of a volume with the local data. > stdout: pause-sync - Stop resynchronizing between a local and a peer device. > stdout: resume-sync - Allow resynchronization to resume on a replicated device. > stdout: suspend-io - Suspend I/O on a replicated device. > stdout: resume-io - Resume I/O on a replicated device. > stdout: outdate - Mark the data on a lower-level device as outdated. > stdout: verify - Verify the data on a lower-level device against a peer device. > stdout: down - Take a resource down. > stdout: role - Show the current role of a resource. > stdout: cstate - Show the current state of a connection. > stdout: dstate - Show the current disk state of a lower-level device. > stdout: show-gi - Show the data generation identifiers for a device on a particular > stdout: connection, with explanations. > stdout: get-gi - Show the data generation identifiers for a device on a particular > stdout: connection. > stdout: show - Show the current configuration of a resource, or of all resources. > stdout: status - Show the state of a resource, or of all resources. > stdout: check-resize - Remember the current size of a lower-level device. > stdout: events2 - Show the current state and all state changes of a resource, or of > stdout: all resources. > stdout: wait-sync-volume - Wait until resync finished on a volume. 
> stdout: wait-sync-connection - Wait until resync finished on all volumes of a > stdout: connection. > stdout: wait-sync-resource - Wait until resync finished on all volumes. > stdout: wait-connect-volume - Wait until a device on a peer is visible. > stdout: wait-connect-connection - Wait until all peer volumes of connection are > stdout: visible. > stdout: wait-connect-resource - Wait until all connections are establised. > stdout: new-resource - Create a new resource. > stdout: new-minor - Create a new replicated device within a resource. > stdout: del-minor - Remove a replicated device. > stdout: del-resource - Remove a resource. > stdout: forget-peer - Completely remove any reference to a unconnected peer from > stdout: meta-data. > stdout: > stdout: Use 'drbdsetup help command' for command-specific help. > stdout: > stdout: > stdout: invalid command > stdout: > stdout: > stdout: > stderr: /usr/lib/ocf/resource.d/linbit/drbd: line 318: USAGE:: command not found > stderr: /usr/lib/ocf/resource.d/linbit/drbd: line 320: Commands:: command not found > stderr: /usr/lib/ocf/resource.d/linbit/drbd: line 321: primary: command not found > stderr: /usr/lib/ocf/resource.d/linbit/drbd: line 322: secondary: command not found > stderr: /usr/lib/ocf/resource.d/linbit/drbd: line 323: attach: command not found > stderr: /usr/lib/ocf/resource.d/linbit/drbd: line 324: disk-options: command not found > stderr: /usr/lib/ocf/resource.d/linbit/drbd: line 325: detach: command not found > stderr: /usr/lib/ocf/resource.d/linbit/drbd: eval: line 326: syntax error near unexpected token `(' > stderr: /usr/lib/ocf/resource.d/linbit/drbd: eval: line 326: ` connect - Attempt to (re)establish a replication link to a peer host.' > stderr: new-minor r0 0 0: sysfs node '/sys/devices/virtual/block/drbd0' (already? still?) exists > stderr: r0: Failure: (161) Minor or volume exists already (delete it first) > stderr: Command 'drbdsetup new-minor r0 0 0' terminated with exit code 10 > stderr: Aug 14 16:47:38 ERROR: r0: Called drbdadm -c /etc/drbd.conf -S new-minor r0 > stderr: Aug 14 16:47:38 ERROR: r0: Exit code 10 > stderr: Aug 14 16:47:38 ERROR: r0: Command output: Gerry -------------- next part -------------- An HTML attachment was scrubbed... URL: From hmoneta at gmail.com Thu Aug 13 19:12:43 2020 From: hmoneta at gmail.com (Howard) Date: Thu, 13 Aug 2020 16:12:43 -0700 Subject: [ClusterLabs] Filtering info messages from pacemaker.log Message-ID: Hi there. Really getting a lot of good from using Pacemaker. Quick question on filtering entries out of the pacemaker.log. We have other streaming slots that are active for our Pacemaker managed PostgreSQL cluster in addition to the slot handling the replication. The /var/log/pacemaker/pacemaker.log is full of these below entries writing four new rows every second. Is there some way to exclude this noise from the log? "pgsqlms(pgsqld)[xxxxxxx]: xxxxxxxxxx INFO: Ignoring unknown application_name/node "PostgreSQL JDBC Driver"", Thanks for any insight. Howard -------------- next part -------------- An HTML attachment was scrubbed... URL: From jgdr at dalibo.com Sat Aug 15 10:06:55 2020 From: jgdr at dalibo.com (Jehan-Guillaume de Rorthais) Date: Sat, 15 Aug 2020 16:06:55 +0200 Subject: [ClusterLabs] Filtering info messages from pacemaker.log In-Reply-To: References: Message-ID: <20200815160655.0bca507f@firost> Hi, On Thu, 13 Aug 2020 16:12:43 -0700 Howard wrote: > Hi there. Really getting a lot of good from using Pacemaker. 
Quick > question on filtering entries out of the pacemaker.log. > > We have other streaming slots that are active for our Pacemaker managed > PostgreSQL cluster in addition to the slot handling the replication. The > /var/log/pacemaker/pacemaker.log is full of these below entries writing > four new rows every second. Is there some way to exclude this noise from > the log? > > "pgsqlms(pgsqld)[xxxxxxx]: xxxxxxxxxx INFO: Ignoring unknown > application_name/node "PostgreSQL JDBC Driver"", I just created an issue in PAF repository to keep track of this issue. See: https://github.com/ClusterLabs/PAF/issues/178 I'll try to work on this soon. Do not hesitate to post some fix suggestion if you have some. In the meantime, maybe you can filter out this flood with some rsyslog setup? Thank you for your report! From hunter86_bg at yahoo.com Sat Aug 15 11:16:09 2020 From: hunter86_bg at yahoo.com (Strahil Nikolov) Date: Sat, 15 Aug 2020 18:16:09 +0300 Subject: [ClusterLabs] DRBD resource not starting In-Reply-To: References: Message-ID: And how did you define the drbd resource? Best Regards, Strahil Nikolov ?? 14 ?????? 2020 ?. 18:48:32 GMT+03:00, Gerry Kernan ??????: >Hi >Im trying to add a drbd resource to pacemaker cluster on centos 7 > > >But getting this error on pcs status >drbd_r0 (ocf::linbit:drbd): ORPHANED FAILED (blocked)[ vps1 >vps2 ] > > >if I try and the resource from cli I get out out below > >[root at VPS1 drbd-utils]# pcs resource debug-start rep-r0 >Operation start for rep-r0:0 (ocf:linbit:drbd) returned: 'unknown >error' (1) >> stdout: drbdsetup - Configure the DRBD kernel module. >> stdout: >> stdout: USAGE: drbdsetup command {arguments} [options] >> stdout: >> stdout: Commands: >> stdout: primary - Change the role of a node in a resource to >primary. >> stdout: secondary - Change the role of a node in a resource to >secondary. >> stdout: attach - Attach a lower-level device to an existing >replicated device. >> stdout: disk-options - Change the disk options of an attached >lower-level device. >> stdout: detach - Detach the lower-level device of a replicated >device. >> stdout: connect - Attempt to (re)establish a replication link to >a peer host. >> stdout: new-peer - Make a peer host known to a resource. >> stdout: del-peer - Remove a connection to a peer host. >> stdout: new-path - Add a path (endpoint address pair) where a >peer host should be >> stdout: reachable. >> stdout: del-path - Remove a path (endpoint address pair) from a >connection to a peer >> stdout: host. >> stdout: net-options - Change the network options of a >connection. >> stdout: disconnect - Unconnect from a peer host. >> stdout: resize - Reexamine the lower-level device sizes to >resize a replicated >> stdout: device. >> stdout: resource-options - Change the resource options of an >existing resource. >> stdout: peer-device-options - Change peer-device options. >> stdout: new-current-uuid - Generate a new current UUID. >> stdout: invalidate - Replace the local data of a volume with >that of a peer. >> stdout: invalidate-remote - Replace a peer's data of a volume >with the local data. >> stdout: pause-sync - Stop resynchronizing between a local and a >peer device. >> stdout: resume-sync - Allow resynchronization to resume on a >replicated device. >> stdout: suspend-io - Suspend I/O on a replicated device. >> stdout: resume-io - Resume I/O on a replicated device. >> stdout: outdate - Mark the data on a lower-level device as >outdated. 
>> stdout: verify - Verify the data on a lower-level device against >a peer device. >> stdout: down - Take a resource down. >> stdout: role - Show the current role of a resource. >> stdout: cstate - Show the current state of a connection. >> stdout: dstate - Show the current disk state of a lower-level >device. >> stdout: show-gi - Show the data generation identifiers for a >device on a particular >> stdout: connection, with explanations. >> stdout: get-gi - Show the data generation identifiers for a >device on a particular >> stdout: connection. >> stdout: show - Show the current configuration of a resource, or >of all resources. >> stdout: status - Show the state of a resource, or of all >resources. >> stdout: check-resize - Remember the current size of a >lower-level device. >> stdout: events2 - Show the current state and all state changes >of a resource, or of >> stdout: all resources. >> stdout: wait-sync-volume - Wait until resync finished on a >volume. >> stdout: wait-sync-connection - Wait until resync finished on all >volumes of a >> stdout: connection. >> stdout: wait-sync-resource - Wait until resync finished on all >volumes. >> stdout: wait-connect-volume - Wait until a device on a peer is >visible. >> stdout: wait-connect-connection - Wait until all peer volumes of >connection are >> stdout: visible. >> stdout: wait-connect-resource - Wait until all connections are >establised. >> stdout: new-resource - Create a new resource. >> stdout: new-minor - Create a new replicated device within a >resource. >> stdout: del-minor - Remove a replicated device. >> stdout: del-resource - Remove a resource. >> stdout: forget-peer - Completely remove any reference to a >unconnected peer from >> stdout: meta-data. >> stdout: >> stdout: Use 'drbdsetup help command' for command-specific help. >> stdout: >> stdout: >> stdout: invalid command >> stdout: >> stdout: >> stdout: >> stderr: /usr/lib/ocf/resource.d/linbit/drbd: line 318: USAGE:: >command not found >> stderr: /usr/lib/ocf/resource.d/linbit/drbd: line 320: Commands:: >command not found >> stderr: /usr/lib/ocf/resource.d/linbit/drbd: line 321: primary: >command not found >> stderr: /usr/lib/ocf/resource.d/linbit/drbd: line 322: secondary: >command not found >> stderr: /usr/lib/ocf/resource.d/linbit/drbd: line 323: attach: >command not found >> stderr: /usr/lib/ocf/resource.d/linbit/drbd: line 324: disk-options: >command not found >> stderr: /usr/lib/ocf/resource.d/linbit/drbd: line 325: detach: >command not found >> stderr: /usr/lib/ocf/resource.d/linbit/drbd: eval: line 326: syntax >error near unexpected token `(' >> stderr: /usr/lib/ocf/resource.d/linbit/drbd: eval: line 326: ` >connect - Attempt to (re)establish a replication link to a peer host.' >> stderr: new-minor r0 0 0: sysfs node >'/sys/devices/virtual/block/drbd0' (already? still?) 
exists >> stderr: r0: Failure: (161) Minor or volume exists already (delete it >first) >> stderr: Command 'drbdsetup new-minor r0 0 0' terminated with exit >code 10 >> stderr: Aug 14 16:47:38 ERROR: r0: Called drbdadm -c /etc/drbd.conf >-S new-minor r0 >> stderr: Aug 14 16:47:38 ERROR: r0: Exit code 10 >> stderr: Aug 14 16:47:38 ERROR: r0: Command output: > >Gerry From nwahl at redhat.com Sat Aug 15 21:25:07 2020 From: nwahl at redhat.com (Reid Wahl) Date: Sat, 15 Aug 2020 18:25:07 -0700 Subject: [ClusterLabs] Antw: [EXT] Stonith failing In-Reply-To: <1118615335.1895.1597410544148@www> References: <1330096936.11468.1595926619455@www> <5F1FF427020000A10003A47F@gwsmtp.uni-regensburg.de> <1599253492.11645.1595932012959@www> <0557F18C-6718-43B6-9CA0-8FE7AE2A8785@yahoo.com> <603366395.379.1596002482554@www> <68622439.459.1596025362188@www> <1118615335.1895.1597410544148@www> Message-ID: On Fri, Aug 14, 2020 at 6:10 AM Gabriele Bulfon wrote: > Thanks to all your suggestions, I now have the systems with stonith > configured on ipmi. > > Two questions: > - how can I simulate a stonith situation to check that everything is ok? > You can run `stonith_admin -B ` to tell Pacemaker to reboot the node using the configured stonith devices. If you want to test a network failure, you can have iptables block inbound and outbound traffic on the heartbeat IP address on one node. > - considering that I have both nodes with stonith against the other node, > once the two nodes can communicate, how can I be sure the two nodes will > not try to stonith each other? > The simplest option is to add a delay attribute (e.g., delay=10) to one of the stonith devices. That way, if both nodes want to fence each other, the node whose stonith device has a delay configured will wait for the delay to expire before executing the reboot action. Alternatively, you can set up corosync-qdevice, using a separate system running qnetd server as a quorum arbitrator. > :) > Thanks! > Gabriele > > > > *Sonicle S.r.l. *: http://www.sonicle.com > *Music: *http://www.gabrielebulfon.com > *Quantum Mechanics : *http://www.cdbaby.com/cd/gabrielebulfon > > ------------------------------ > > > *Da:* Gabriele Bulfon > *A:* Cluster Labs - All topics related to open-source clustering welcomed > > *Data:* 29 luglio 2020 14.22.42 CEST > *Oggetto:* Re: [ClusterLabs] Antw: [EXT] Stonith failing > > > > It is a ZFS based illumos system. > I don't think SBD is an option. > Is there a reliable ZFS based stonith? > > Gabriele > > > > *Sonicle S.r.l. *: http://www.sonicle.com > *Music: *http://www.gabrielebulfon.com > *Quantum Mechanics : *http://www.cdbaby.com/cd/gabrielebulfon > > ------------------------------ > > > *Da:* Andrei Borzenkov > *A:* Cluster Labs - All topics related to open-source clustering welcomed > > *Data:* 29 luglio 2020 9.46.09 CEST > *Oggetto:* Re: [ClusterLabs] Antw: [EXT] Stonith failing > > > > > On Wed, Jul 29, 2020 at 9:01 AM Gabriele Bulfon > wrote: > >> That one was taken from a specific implementation on Solaris 11. >> The situation is a dual node server with shared storage controller: both >> nodes see the same disks concurrently. >> Here we must be sure that the two nodes are not going to import/mount the >> same zpool at the same time, or we will encounter data corruption: >> > > ssh based "stonith" cannot guarantee it. 
> > >> node 1 will be perferred for pool 1, node 2 for pool 2, only in case one >> of the node goes down or is taken offline the resources should be first >> free by the leaving node and taken by the other node. >> >> Would you suggest one of the available stonith in this case? >> >> > > IPMI, managed PDU, SBD ... > In practice, the only stonith method that works in case of complete node > outage including any power supply is SBD. > > _______________________________________________ > Manage your subscription:https://lists.clusterlabs.org/mailman/listinfo/users > > ClusterLabs home: https://www.clusterlabs.org/ > > _______________________________________________ > Manage your subscription:https://lists.clusterlabs.org/mailman/listinfo/users > > ClusterLabs home: https://www.clusterlabs.org/ > > _______________________________________________ > Manage your subscription: > https://lists.clusterlabs.org/mailman/listinfo/users > > ClusterLabs home: https://www.clusterlabs.org/ > -- Regards, Reid Wahl, RHCA Software Maintenance Engineer, Red Hat CEE - Platform Support Delivery - ClusterHA -------------- next part -------------- An HTML attachment was scrubbed... URL: From arvidjaar at gmail.com Sun Aug 16 05:40:05 2020 From: arvidjaar at gmail.com (Andrei Borzenkov) Date: Sun, 16 Aug 2020 12:40:05 +0300 Subject: [ClusterLabs] Antw: [EXT] Stonith failing In-Reply-To: References: <1330096936.11468.1595926619455@www> <5F1FF427020000A10003A47F@gwsmtp.uni-regensburg.de> <1599253492.11645.1595932012959@www> <0557F18C-6718-43B6-9CA0-8FE7AE2A8785@yahoo.com> <603366395.379.1596002482554@www> <68622439.459.1596025362188@www> <1118615335.1895.1597410544148@www> Message-ID: <032dd57d-957f-3e64-c42c-d03ded631a27@gmail.com> 16.08.2020 04:25, Reid Wahl ?????: > > >> - considering that I have both nodes with stonith against the other node, >> once the two nodes can communicate, how can I be sure the two nodes will >> not try to stonith each other? >> > > The simplest option is to add a delay attribute (e.g., delay=10) to one of > the stonith devices. That way, if both nodes want to fence each other, the > node whose stonith device has a delay configured will wait for the delay to > expire before executing the reboot action. > Current pacemaker (2.0.4) also supports priority-fencing-delay option that computes delay based on which resources are active on specific node, so favoring node with "more important" resources. > Alternatively, you can set up corosync-qdevice, using a separate system > running qnetd server as a quorum arbitrator. > Any solution that is based on node suicide is prone to complete cluster loss. In particular, in two node cluster with qdevice surviving node will commit suicide is qnetd is not accessible. As long as external stonith is reasonably reliable it is much preferred to any solution based on quorum (unless you have very specific requirements and can tolerate running remaining nodes in "frozen" mode to limit unavailability). And before someone jumps in - SBD falls into "solution based on suicide" as well. From gerry at infinityit.ie Sat Aug 15 12:02:05 2020 From: gerry at infinityit.ie (Gerry Kernan) Date: Sat, 15 Aug 2020 16:02:05 +0000 Subject: [ClusterLabs] DRBD resource not starting In-Reply-To: References: , Message-ID: <6550F8AA-40D1-41A3-B568-131E293C9C80@infinityit.ie> Hi Got it sorted Removed drbd-pacemaker and reinstalled drbd90-utilise All ok now Gerry Sent from my iPhone > On 15 Aug 2020, at 16:18, Strahil Nikolov wrote: > > ?And how did you define the drbd resource? 
> > Best Regards, > Strahil Nikolov > > ?? 14 ?????? 2020 ?. 18:48:32 GMT+03:00, Gerry Kernan ??????: >> Hi >> Im trying to add a drbd resource to pacemaker cluster on centos 7 >> >> >> But getting this error on pcs status >> drbd_r0 (ocf::linbit:drbd): ORPHANED FAILED (blocked)[ vps1 >> vps2 ] >> >> >> if I try and the resource from cli I get out out below >> >> [root at VPS1 drbd-utils]# pcs resource debug-start rep-r0 >> Operation start for rep-r0:0 (ocf:linbit:drbd) returned: 'unknown >> error' (1) >>> stdout: drbdsetup - Configure the DRBD kernel module. >>> stdout: >>> stdout: USAGE: drbdsetup command {arguments} [options] >>> stdout: >>> stdout: Commands: >>> stdout: primary - Change the role of a node in a resource to >> primary. >>> stdout: secondary - Change the role of a node in a resource to >> secondary. >>> stdout: attach - Attach a lower-level device to an existing >> replicated device. >>> stdout: disk-options - Change the disk options of an attached >> lower-level device. >>> stdout: detach - Detach the lower-level device of a replicated >> device. >>> stdout: connect - Attempt to (re)establish a replication link to >> a peer host. >>> stdout: new-peer - Make a peer host known to a resource. >>> stdout: del-peer - Remove a connection to a peer host. >>> stdout: new-path - Add a path (endpoint address pair) where a >> peer host should be >>> stdout: reachable. >>> stdout: del-path - Remove a path (endpoint address pair) from a >> connection to a peer >>> stdout: host. >>> stdout: net-options - Change the network options of a >> connection. >>> stdout: disconnect - Unconnect from a peer host. >>> stdout: resize - Reexamine the lower-level device sizes to >> resize a replicated >>> stdout: device. >>> stdout: resource-options - Change the resource options of an >> existing resource. >>> stdout: peer-device-options - Change peer-device options. >>> stdout: new-current-uuid - Generate a new current UUID. >>> stdout: invalidate - Replace the local data of a volume with >> that of a peer. >>> stdout: invalidate-remote - Replace a peer's data of a volume >> with the local data. >>> stdout: pause-sync - Stop resynchronizing between a local and a >> peer device. >>> stdout: resume-sync - Allow resynchronization to resume on a >> replicated device. >>> stdout: suspend-io - Suspend I/O on a replicated device. >>> stdout: resume-io - Resume I/O on a replicated device. >>> stdout: outdate - Mark the data on a lower-level device as >> outdated. >>> stdout: verify - Verify the data on a lower-level device against >> a peer device. >>> stdout: down - Take a resource down. >>> stdout: role - Show the current role of a resource. >>> stdout: cstate - Show the current state of a connection. >>> stdout: dstate - Show the current disk state of a lower-level >> device. >>> stdout: show-gi - Show the data generation identifiers for a >> device on a particular >>> stdout: connection, with explanations. >>> stdout: get-gi - Show the data generation identifiers for a >> device on a particular >>> stdout: connection. >>> stdout: show - Show the current configuration of a resource, or >> of all resources. >>> stdout: status - Show the state of a resource, or of all >> resources. >>> stdout: check-resize - Remember the current size of a >> lower-level device. >>> stdout: events2 - Show the current state and all state changes >> of a resource, or of >>> stdout: all resources. >>> stdout: wait-sync-volume - Wait until resync finished on a >> volume. 
>>> stdout: wait-sync-connection - Wait until resync finished on all >> volumes of a >>> stdout: connection. >>> stdout: wait-sync-resource - Wait until resync finished on all >> volumes. >>> stdout: wait-connect-volume - Wait until a device on a peer is >> visible. >>> stdout: wait-connect-connection - Wait until all peer volumes of >> connection are >>> stdout: visible. >>> stdout: wait-connect-resource - Wait until all connections are >> establised. >>> stdout: new-resource - Create a new resource. >>> stdout: new-minor - Create a new replicated device within a >> resource. >>> stdout: del-minor - Remove a replicated device. >>> stdout: del-resource - Remove a resource. >>> stdout: forget-peer - Completely remove any reference to a >> unconnected peer from >>> stdout: meta-data. >>> stdout: >>> stdout: Use 'drbdsetup help command' for command-specific help. >>> stdout: >>> stdout: >>> stdout: invalid command >>> stdout: >>> stdout: >>> stdout: >>> stderr: /usr/lib/ocf/resource.d/linbit/drbd: line 318: USAGE:: >> command not found >>> stderr: /usr/lib/ocf/resource.d/linbit/drbd: line 320: Commands:: >> command not found >>> stderr: /usr/lib/ocf/resource.d/linbit/drbd: line 321: primary: >> command not found >>> stderr: /usr/lib/ocf/resource.d/linbit/drbd: line 322: secondary: >> command not found >>> stderr: /usr/lib/ocf/resource.d/linbit/drbd: line 323: attach: >> command not found >>> stderr: /usr/lib/ocf/resource.d/linbit/drbd: line 324: disk-options: >> command not found >>> stderr: /usr/lib/ocf/resource.d/linbit/drbd: line 325: detach: >> command not found >>> stderr: /usr/lib/ocf/resource.d/linbit/drbd: eval: line 326: syntax >> error near unexpected token `(' >>> stderr: /usr/lib/ocf/resource.d/linbit/drbd: eval: line 326: ` >> connect - Attempt to (re)establish a replication link to a peer host.' >>> stderr: new-minor r0 0 0: sysfs node >> '/sys/devices/virtual/block/drbd0' (already? still?) exists >>> stderr: r0: Failure: (161) Minor or volume exists already (delete it >> first) >>> stderr: Command 'drbdsetup new-minor r0 0 0' terminated with exit >> code 10 >>> stderr: Aug 14 16:47:38 ERROR: r0: Called drbdadm -c /etc/drbd.conf >> -S new-minor r0 >>> stderr: Aug 14 16:47:38 ERROR: r0: Exit code 10 >>> stderr: Aug 14 16:47:38 ERROR: r0: Command output: >> >> Gerry From kwenning at redhat.com Mon Aug 17 03:06:12 2020 From: kwenning at redhat.com (Klaus Wenninger) Date: Mon, 17 Aug 2020 09:06:12 +0200 Subject: [ClusterLabs] Antw: [EXT] Stonith failing In-Reply-To: <032dd57d-957f-3e64-c42c-d03ded631a27@gmail.com> References: <1330096936.11468.1595926619455@www> <5F1FF427020000A10003A47F@gwsmtp.uni-regensburg.de> <1599253492.11645.1595932012959@www> <0557F18C-6718-43B6-9CA0-8FE7AE2A8785@yahoo.com> <603366395.379.1596002482554@www> <68622439.459.1596025362188@www> <1118615335.1895.1597410544148@www> <032dd57d-957f-3e64-c42c-d03ded631a27@gmail.com> Message-ID: <96e70278-7fab-d86d-4df8-2f1c8b4291a0@redhat.com> On 8/16/20 11:40 AM, Andrei Borzenkov wrote: > 16.08.2020 04:25, Reid Wahl ?????: >> >>> - considering that I have both nodes with stonith against the other node, >>> once the two nodes can communicate, how can I be sure the two nodes will >>> not try to stonith each other? >>> >> The simplest option is to add a delay attribute (e.g., delay=10) to one of >> the stonith devices. That way, if both nodes want to fence each other, the >> node whose stonith device has a delay configured will wait for the delay to >> expire before executing the reboot action. 
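(In pcs terms the suggestion quoted above would look roughly like this --
the device name here is made up, adjust to your configuration:

  pcs stonith update fence-node2 delay=10

i.e. only one of the two stonith devices gets the static delay, so only one
side waits before shooting.)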
If your fence-agent supports a delay attribute you can of course use that. As this isn't available with every fence-agent or is looking differently depending on the fence-agent we've introduced pcmk_delay_max & pcmk_delay_base. These are applied prior to actually calling the fence-agent and thus are always available and always look the same. The delay is gonna be some random time between pcmk_delay_base and pcmk_delay_max. This takes us to another approach how you can reduce chances of a fatal fence-race. Assuming that the reason why the fence-race is triggered is detected around the same time when just adding a random time you will very likely prevent them killing each other. This is especially interesting when there is no clear / easy way to determine which of the nodes is more important at this time. >> > Current pacemaker (2.0.4) also supports priority-fencing-delay option > that computes delay based on which resources are active on specific > node, so favoring node with "more important" resources. > >> Alternatively, you can set up corosync-qdevice, using a separate system >> running qnetd server as a quorum arbitrator. >> > Any solution that is based on node suicide is prone to complete cluster > loss. In particular, in two node cluster with qdevice surviving node > will commit suicide is qnetd is not accessible. I don't think that what Reid suggested was going for nodes that loose quorum to commit suicide right away. You can use quorum simply as a means of preventing fence-races otherwise inherent to 2-node-clusters. > > As long as external stonith is reasonably reliable it is much preferred > to any solution based on quorum (unless you have very specific > requirements and can tolerate running remaining nodes in "frozen" mode > to limit unavailability). Well we can name the predominant scenario why one might not want to depend on fencing-devices like ipmi: If you want to cover a scenario where the nodes don't just loose corosync connectivity but as well access from one node to the fencing device of the other is interrupted you probably won't get around an approach that involves some kind of arbitrator. > > And before someone jumps in - SBD falls into "solution based on suicide" > as well. Got your point without that hint ;-) Klaus > _______________________________________________ > Manage your subscription: > https://lists.clusterlabs.org/mailman/listinfo/users > > ClusterLabs home: https://www.clusterlabs.org/ From arvidjaar at gmail.com Mon Aug 17 03:19:33 2020 From: arvidjaar at gmail.com (Andrei Borzenkov) Date: Mon, 17 Aug 2020 10:19:33 +0300 Subject: [ClusterLabs] Antw: [EXT] Stonith failing In-Reply-To: <96e70278-7fab-d86d-4df8-2f1c8b4291a0@redhat.com> References: <1330096936.11468.1595926619455@www> <5F1FF427020000A10003A47F@gwsmtp.uni-regensburg.de> <1599253492.11645.1595932012959@www> <0557F18C-6718-43B6-9CA0-8FE7AE2A8785@yahoo.com> <603366395.379.1596002482554@www> <68622439.459.1596025362188@www> <1118615335.1895.1597410544148@www> <032dd57d-957f-3e64-c42c-d03ded631a27@gmail.com> <96e70278-7fab-d86d-4df8-2f1c8b4291a0@redhat.com> Message-ID: 17.08.2020 10:06, Klaus Wenninger ?????: >> >>> Alternatively, you can set up corosync-qdevice, using a separate system >>> running qnetd server as a quorum arbitrator. >>> >> Any solution that is based on node suicide is prone to complete cluster >> loss. In particular, in two node cluster with qdevice surviving node >> will commit suicide is qnetd is not accessible. 
> I don't think that what Reid suggested was going for nodes > that loose quorum to commit suicide right away. > You can use quorum simply as a means of preventing fence-races > otherwise inherent to 2-node-clusters. Can you please show the configuration example how to do it? Sorry, but I do not understand how is it possible. >> >> As long as external stonith is reasonably reliable it is much preferred >> to any solution based on quorum (unless you have very specific >> requirements and can tolerate running remaining nodes in "frozen" mode >> to limit unavailability). > Well we can name the predominant scenario why one might not want to depend > on fencing-devices like ipmi: If you want to cover a scenario where the > nodes don't > just loose corosync connectivity but as well access from one node to the > fencing > device of the other is interrupted you probably won't get around an > approach that > involves some kind of arbitrator. Sure. Which is why I said "reasonably reliable". Still even in this case one must understand all pros and cons to decide which risk is more important to mitigate. From kwenning at redhat.com Mon Aug 17 03:25:30 2020 From: kwenning at redhat.com (Klaus Wenninger) Date: Mon, 17 Aug 2020 09:25:30 +0200 Subject: [ClusterLabs] Antw: [EXT] Stonith failing In-Reply-To: References: <1330096936.11468.1595926619455@www> <5F1FF427020000A10003A47F@gwsmtp.uni-regensburg.de> <1599253492.11645.1595932012959@www> <0557F18C-6718-43B6-9CA0-8FE7AE2A8785@yahoo.com> <603366395.379.1596002482554@www> <68622439.459.1596025362188@www> <1118615335.1895.1597410544148@www> <032dd57d-957f-3e64-c42c-d03ded631a27@gmail.com> <96e70278-7fab-d86d-4df8-2f1c8b4291a0@redhat.com> Message-ID: On 8/17/20 9:19 AM, Andrei Borzenkov wrote: > 17.08.2020 10:06, Klaus Wenninger ?????: >>>> Alternatively, you can set up corosync-qdevice, using a separate system >>>> running qnetd server as a quorum arbitrator. >>>> >>> Any solution that is based on node suicide is prone to complete cluster >>> loss. In particular, in two node cluster with qdevice surviving node >>> will commit suicide is qnetd is not accessible. >> I don't think that what Reid suggested was going for nodes >> that loose quorum to commit suicide right away. >> You can use quorum simply as a means of preventing fence-races >> otherwise inherent to 2-node-clusters. > Can you please show the configuration example how to do it? Sorry, but I > do not understand how is it possible. Simply don't set the 2-node-flag. So just one of the nodes will have quorum and just one of them will attempt fencing. > >>> As long as external stonith is reasonably reliable it is much preferred >>> to any solution based on quorum (unless you have very specific >>> requirements and can tolerate running remaining nodes in "frozen" mode >>> to limit unavailability). >> Well we can name the predominant scenario why one might not want to depend >> on fencing-devices like ipmi: If you want to cover a scenario where the >> nodes don't >> just loose corosync connectivity but as well access from one node to the >> fencing >> device of the other is interrupted you probably won't get around an >> approach that >> involves some kind of arbitrator. > Sure. Which is why I said "reasonably reliable". Still even in this case > one must understand all pros and cons to decide which risk is more > important to mitigate. > Exactly! Which is why I tried to give some flesh to the idea to foster this kind of understanding. 
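To make that a bit more concrete, a minimal sketch of the qdevice variant
discussed above, using pcs (the qnetd host name is made up, adjust to
your environment):

  # on the separate arbitrator host
  pcs qdevice setup model net --enable --start

  # on one of the two cluster nodes
  pcs quorum device add model net host=qnetd.example.com algorithm=ffsplit

  # verify: the quorum section should now list the device and should
  # no longer carry the two_node flag
  pcs quorum config
  pcs quorum status

With the arbitrator vote in place only the partition qnetd sides with keeps
quorum, so only that side will go on to fence the other.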
You always have to be aware of the failure scenarios you want to cover and what it might cost you elsewhere to cover one specific scenario. Klaus From jfriesse at redhat.com Mon Aug 17 03:38:49 2020 From: jfriesse at redhat.com (Jan Friesse) Date: Mon, 17 Aug 2020 09:38:49 +0200 Subject: [ClusterLabs] Alerts for qdevice/qnetd/booth In-Reply-To: References: <942629ca-1035-32b4-2025-1ddd970e1b1a@redhat.com> <4ad3e321-4204-859f-cc2b-18f67e1faf9e@redhat.com> Message-ID: > Thanks Honza. I have raised these on both upstream projects. Thanks > I will leave upto implementer how best this can be done, considering the > technical limitations you mentioned. > > https://github.com/corosync/corosync-qdevice/issues/13 > https://github.com/ClusterLabs/booth/issues/99 > > Thanks, > Rohit > > On Thu, Aug 13, 2020 at 1:03 PM Jan Friesse wrote: > >> Hi Rohit, >> >>> Hi Honza, >>> Thanks for your reply. Please find the attached image below: >>> >>> [image: image.png] >>> >>> Yes, I am talking about pacemaker alerts only. >>> >>> Please find my suggestions/requirements below: >>> >>> *Booth:* >>> 1. Node5 booth-arbitrator should be able to give event when any of the >>> booth node joins or leaves. booth-ip can be passed in event. >> >> This is not how booth works. Ticket leader (so site booth, never >> arbitrator) executes election and get replies from other >> sites/arbitrator. Follower executes election when leader hasn't for >> configured timeout. >> >> What I want to say is, that there is no "membership" - as in (for >> example) corosync fashion. >> >> The best we could get is the rough estimation based on election >> request/replies. >> >>> 2. Event when booth-arbitrator is up successfully and has started >>> monitoring the booth nodes. >> >> This is basically start of service. I think it's doable with small >> change in unit file (something like >> >> https://northernlightlabs.se/2014-07-05/systemd-status-mail-on-unit-failure.html >> ) >> >>> 2. Geo site booth should be able to give event when its booth peers >>> joins/leaves. For example, Geo site1 gives an event when node5 >>> booth-arbitrator joins/leaves OR site2 booth joins/leaves. booth-ip can >> be >>> passed in event. >>> 3. On ticket movements (revoke/grant), every booth node(Site1/2 and >> node5) >>> should give events. >> >> That would be doable >> >>> >>> Note: pacemaker alerts works in a cluster. Since, arbitrator is a >>> non-cluster node, not sure how exactly it will work there. But this is >> good >>> to have feature. >>> >>> *Qnetd/Qdevice:* >>> This is similar to above. >>> 1. Node5 qnetd should be able to raise an event when any of the cluster >>> node joins/leaves the quorum. >> >> Doable >> >>> 2. Event when qnetd is up successfully and has started monitoring the >>> cluster nodes >> >> Qnetd itself is not monitoring qdevice nodes (it doesn't have list of >> nodes). It monitors node status after node joins (= it would be possible >> to trigger event on leave). So that may be enough. >> >>> 3. Cluster node should be able to give event when any of the quorum node >>> leaves/joins. >> >> You mean qdevice should be able to trigger event when connected to qnetd? >> >>> >>> If you see on high level, then these are kind of node/resource events wrt >>> booth and qnetd/qdevice. >> >> Yeah >> >>> >>> As of today wrt booth/qnetd, I don't see any provision where any of the >>> nodes gives any event when its peer leaves/joins. This makes it difficult >>> to know whether geo sites nodes can see booth-arbitrator or not. This is >> >> Got it. 
That's exactly what would be really problematic to implement, >> because of no "membership" in booth. It would be, however, possible to >> implement message when ticket was granted/rejected and have a list of >> other booths replies and what was their votes. >> >>> true the other way around also where booth-arbitrator cannot see geo >> booth >>> sites. >>> I am not sure how others are doing it in today's deployment, but I see >> need >>> of monitoring of every other booth/qnet node. So that on basis of event, >>> appropriate alarms can be raised and action can be taken accordingly. >>> >>> Please let me know if you agree on the usecases. I'll raise >> feature-request >> >> I can agree on usecases, but (especially with booth) there are technical >> problems on realizing them. >> >>> on the pacemaker upstream project accordingly. >> >> Please use booth (https://github.com/ClusterLabs/booth) and qdevice >> (https://github.com/corosync/corosync-qdevice) upstream rather than >> pacemaker, because these requests has really nothing to do with pcmk. >> >> Regards, >> honza >> >>> >>> Thanks, >>> Rohit >>> >>> On Wed, Aug 12, 2020 at 8:58 PM Jan Friesse wrote: >>> >>>> Hi Rohit, >>>> >>>> Rohit Saini napsal(a): >>>>> Hi Team, >>>>> >>>>> Question-1: >>>>> Similar to pcs alerts, do we have something similar for qdevice/qnetd? >>>> This >>>> >>>> You mean pacemaker alerts right? >>>> >>>>> is to detect asynchronously if any of the member is >>>> unreachable/joined/left >>>>> and if that member is qdevice or qnetd. >>>> >>>> Nope but actually shouldn't be that hard to implement. What exactly >>>> would you like to see there? >>>> >>>>> >>>>> Question-2: >>>>> Same above question for booth nodes and arbitrator. Is there any way to >>>>> receive events from booth daemon? >>>> >>>> Not directly (again, shouldn't be that hard to implement). But pacemaker >>>> alerts should be triggered when service changes state because of ticket >>>> grant/reject, isn't it? >>>> >>>>> >>>>> My main objective is to see if these daemons give events related to >>>>> their internal state transitions and raise some alarms accordingly. >> For >>>>> example, boothd arbitrator is unreachable, ticket moved from x to y, >> etc. >>>> >>>> I don't think "boothd arbitrator is unreachable" alert is really doable. >>>> Ticket moved from x to y would be probably two alerts - 1. ticket >>>> rejected on X and 2. granted on Y. >>>> >>>> Would you mind to elaborate a bit more on events you would like to see >>>> and potentially open issue for upstream project (or, if you have a RH >>>> subscription try to contact GSS, so I get more time to work on this >> issue). >>>> >>>> Regards, >>>> Honza >>>> >>>>> >>>>> Thanks, >>>>> Rohit >>>>> >>>>> >>>>> >>>>> _______________________________________________ >>>>> Manage your subscription: >>>>> https://lists.clusterlabs.org/mailman/listinfo/users >>>>> >>>>> ClusterLabs home: https://www.clusterlabs.org/ >>>>> >>>> >>>> >>> >> >> > From kwenning at redhat.com Mon Aug 17 03:52:02 2020 From: kwenning at redhat.com (Klaus Wenninger) Date: Mon, 17 Aug 2020 09:52:02 +0200 Subject: [ClusterLabs] Clear Pending Fencing Action In-Reply-To: References: <20200803032623.75154600275@iwtm.local> Message-ID: <0f8d1c4f-850f-0c1c-368d-2f668af75ebe@redhat.com> On 8/3/20 7:04 AM, Reid Wahl wrote: > Hi, ????. `stonith_admin --cleanup` doesn't get rid of pending > actions, only failed ones. You might be hitting > https://bugs.clusterlabs.org/show_bug.cgi?id=5401. 
> > I believe a simultaneous reboot of both nodes will clear the pending > actions. I don't recall whether there's any other way to clear them. Even simultaneous rebooting might be some kind of a challenge. When a node is coming up it will request the history from the running nodes. Thus a simultaneous reboot might not be simultaneous enough so that the nodes aren't still able to pass this list from one to another. To be on the safe side you would have to shut down all nodes and fire them up again. If it is not the bug stated above and we are talking about a pending fence-action that was going on on node that itself just got fenced (and is still down) pacemaker coming up on that node will remove (fail) the pending fence-action. Just for completeness: 'stonith_admin --cleanup' cleans everything that is 'just' history (failed & successful) whereas weather an attempt is still pending does have an effect on how fencing is working. As nobody would expect a history-cleanup to influence behavior, not touching pending actions is a safety measure and not a bug. Klaus > > On Sun, Aug 2, 2020 at 8:26 PM ???? ??????? > wrote: > > Hello! > > ? > > After troubleshooting 2-Node cluster, crm_mon deprecated actions > are displayed in ?Pending Fencing Action:? list. > > How can I delete them. > > ?stonith_admin --cleanup --history=*? does not delete it. > > ? > > ? > > ? ?????????, > ???? ??????? > elias at po-mayak.ru > > ? > > _______________________________________________ > Manage your subscription: > https://lists.clusterlabs.org/mailman/listinfo/users > > ClusterLabs home: https://www.clusterlabs.org/ > > > > -- > Regards, > > Reid Wahl, RHCA > Software Maintenance Engineer, Red Hat > CEE - Platform Support Delivery - ClusterHA > > _______________________________________________ > Manage your subscription: > https://lists.clusterlabs.org/mailman/listinfo/users > > ClusterLabs home: https://www.clusterlabs.org/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From kwenning at redhat.com Mon Aug 17 04:09:32 2020 From: kwenning at redhat.com (Klaus Wenninger) Date: Mon, 17 Aug 2020 10:09:32 +0200 Subject: [ClusterLabs] Coming in Pacemaker 2.0.5: on-fail=demote / no-quorum-policy=demote In-Reply-To: References: Message-ID: On 8/10/20 6:47 PM, Ken Gaillot wrote: > Hi all, > > Looking ahead to the Pacemaker 2.0.5 release expected at the end of > this year, here is a new feature already in the master branch. > > When configuring resource operations, Pacemaker lets you set an "on- > fail" policy to specify whether to restart the resource, fence the > node, etc., if the operation fails. With 2.0.5, a new possible value > will be "demote", which will mean "demote this resource but do not > fully restart it". > > "Demote" will be a valid value only for promote actions, and for > recurring monitors with "role" set to "Master". > > Once the resource is demoted, it will be eligible for promotion again, > so if the promotion scores have not changed, a promote on the same node > may be attempted. If this is not desired, the agent can change the > promotion scores either in the failed monitor or the demote. > > The intended use case is an application where a successful demote > assures a well-functioning service, and a full restart would be > unnecessarily heavyweight. A large database might be an example. > > Similarly, Pacemaker offers the cluster-wide "no-quorum-policy" option > to specify what happens to resources when quorum is lost (the default > being to stop them). 
With 2.0.5, "demote" will be a possible value here > as well, and will mean "demote all promotable resources and stop all > other resources". When using the new "no-quorum-policy" together with SBD please be sure to use an SBD version that has https://github.com/ClusterLabs/sbd/pull/111. Klaus > > The intended use case is an application that cannot cause any harm > after being demoted, and may be useful in a demoted role even if there > is no quorum. A database that operates read-only when demoted and > doesn't depend on any non-promotable resources might be an example. > > Happy clustering :) From fabbione at fabbione.net Mon Aug 17 05:33:05 2020 From: fabbione at fabbione.net (Fabio M. Di Nitto) Date: Mon, 17 Aug 2020 11:33:05 +0200 Subject: [ClusterLabs] kronosnet v1.19 released Message-ID: All, We are pleased to announce the general availability of kronosnet v1.19 kronosnet (or knet for short) is the new underlying network protocol for Linux HA components (corosync), that features the ability to use multiple links between nodes, active/active and active/passive link failover policies, automatic link recovery, FIPS compliant encryption (nss and/or openssl), automatic PMTUd and in general better performance compared to the old network protocol. Highlights in this release: * Add native support for openssl 3.0 (drop API COMPAT macros). * Code cleanup of public APIs. Lots of lines of code moved around, no functional changes. * Removed kronosnetd unsupported code completely * Removed unused poc-code from the source tree * Make sure to initialize epoll events structures Known issues in this release: * None The source tarballs can be downloaded here: https://www.kronosnet.org/releases/ Upstream resources and contacts: https://kronosnet.org/ https://github.com/kronosnet/kronosnet/ https://ci.kronosnet.org/ https://trello.com/kronosnet (TODO list and activities tracking) https://goo.gl/9ZvkLS (google shared drive with presentations and diagrams) IRC: #kronosnet on Freenode https://lists.kronosnet.org/mailman/listinfo/users https://lists.kronosnet.org/mailman/listinfo/devel https://lists.kronosnet.org/mailman/listinfo/commits Cheers, The knet developer team From fabbione at fabbione.net Mon Aug 17 05:35:13 2020 From: fabbione at fabbione.net (Fabio M. Di Nitto) Date: Mon, 17 Aug 2020 11:35:13 +0200 Subject: [ClusterLabs] kronosnet v1.x series and future support / development Message-ID: <02f339d5-f6e5-28e0-e5d9-68e1ac910bd9@fabbione.net> All, kronosnet (or knet for short) is the new underlying network protocol for Linux HA components (corosync), that features the ability to use multiple links between nodes, active/active and active/passive link failover policies, automatic link recovery, FIPS compliant encryption (nss and/or openssl), automatic PMTUd and in general better performance compared to the old network protocol. After several weeks / months without any major bug reported, starting with v1.19 release, we are going to lock down the 1.x series to only 2 kind of changes: * Bug fixes * Onwire compatibility changes to allow rolling upgrades with the v2.x series (if necessary at all) Upstream will continue to support v1.x for at least 12 months after v2.x will be released (date unknown, no really, we didn't even start the development). 
If you have any amazing ideas on what v2.x should include, please file issues here: https://github.com/kronosnet/kronosnet/issues Or check the current TODO list here: https://trello.com/kronosnet Cheers, The knet developer team From kgaillot at redhat.com Mon Aug 17 10:54:21 2020 From: kgaillot at redhat.com (Ken Gaillot) Date: Mon, 17 Aug 2020 09:54:21 -0500 Subject: [ClusterLabs] why is node fenced ? In-Reply-To: <1954017432.49729402.1597400271575.JavaMail.zimbra@helmholtz-muenchen.de> References: <1894379294.27456141.1596036406000.JavaMail.zimbra@helmholtz-muenchen.de> <197754c0efa9654e23c0d3b25d7d1ac8a7e2f919.camel@redhat.com> <989597611.38099136.1597004244408.JavaMail.zimbra@helmholtz-muenchen.de> <1954017432.49729402.1597400271575.JavaMail.zimbra@helmholtz-muenchen.de> Message-ID: <426bbd8063885706b0e8fdd3dab81be7e0c9f25d.camel@redhat.com> On Fri, 2020-08-14 at 12:17 +0200, Lentes, Bernd wrote: > > ----- On Aug 10, 2020, at 11:59 PM, kgaillot kgaillot at redhat.com > wrote: > > The most recent transition is aborted, but since all its actions > > are > > complete, the only effect is to trigger a new transition. > > > > We should probably rephrase the log message. In fact, the whole > > "transition" terminology is kind of obscure. It's hard to come up > > with > > something better though. > > > > Hi Ken, > > i don't get it. How can s.th. be aborted which is already completed ? I agree the wording is confusing :) >From the code's point of view, the actions in the transition are complete, but the transition itself (as an abstract entity) remains current until the next one starts. However that's academic and meaningless from a user's point of view, so the log messages should be reworded. > Bernd > Helmholtz Zentrum M?nchen > > Helmholtz Zentrum Muenchen > Deutsches Forschungszentrum fuer Gesundheit und Umwelt (GmbH) > Ingolstaedter Landstr. 1 > 85764 Neuherberg > www.helmholtz-muenchen.de > Aufsichtsratsvorsitzende: MinDir.in Prof. Dr. Veronika von Messling > Geschaeftsfuehrung: Prof. Dr. med. Dr. h.c. Matthias Tschoep, Kerstin > Guenther > Registergericht: Amtsgericht Muenchen HRB 6466 > USt-IdNr: DE 129521671 -- Ken Gaillot From kgaillot at redhat.com Mon Aug 17 11:09:02 2020 From: kgaillot at redhat.com (Ken Gaillot) Date: Mon, 17 Aug 2020 10:09:02 -0500 Subject: [ClusterLabs] why is node fenced ? In-Reply-To: <1230152204.50050895.1597430263067.JavaMail.zimbra@helmholtz-muenchen.de> References: <1894379294.27456141.1596036406000.JavaMail.zimbra@helmholtz-muenchen.de> <197754c0efa9654e23c0d3b25d7d1ac8a7e2f919.camel@redhat.com> <989597611.38099136.1597004244408.JavaMail.zimbra@helmholtz-muenchen.de> <1230152204.50050895.1597430263067.JavaMail.zimbra@helmholtz-muenchen.de> Message-ID: <84a98a74897f0b5523f44b23a415fcde5bff0de6.camel@redhat.com> On Fri, 2020-08-14 at 20:37 +0200, Lentes, Bernd wrote: > ----- On Aug 9, 2020, at 10:17 PM, Bernd Lentes > bernd.lentes at helmholtz-muenchen.de wrote: > > > > > So this appears to be the problem. From these logs I would guess > > > the > > > successful stop on ha-idg-1 did not get written to the CIB for > > > some > > > reason. I'd look at the pe input from this transition on ha-idg-2 > > > to > > > confirm that. > > > > > > Without the DC knowing about the stop, it tries to schedule a new > > > one, > > > but the node is shutting down so it can't do it, which means it > > > has to > > > be fenced. > > I checked all relevant pe-files in this time period. 
> This is what i found out (i just write the important entries): > > ha-idg-1:~/why-fenced/ha-idg-1/pengine # crm_simulate -S -x pe-input- > 3116 -G transition-3116.xml -D transition-3116.dot > Current cluster status: > ... > vm_nextcloud (ocf::heartbeat:VirtualDomain): Started ha-idg-1 > Transition Summary: > ... > * Migrate vm_nextcloud ( ha-idg-1 -> ha-idg-2 ) > Executing cluster transition: > * Resource action: vm_nextcloud migrate_from on ha-idg-2 <======= > migrate vm_nextcloud > * Resource action: vm_nextcloud stop on ha-idg-1 > * Pseudo action: vm_nextcloud_start_0 > Revised cluster status: > Node ha-idg-1 (1084777482): standby > Online: [ ha-idg-2 ] > vm_nextcloud (ocf::heartbeat:VirtualDomain): Started ha-idg-2 > > > ha-idg-1:~/why-fenced/ha-idg-1/pengine # crm_simulate -S -x pe-error- > 48 -G transition-4514.xml -D transition-4514.dot > Current cluster status: > Node ha-idg-1 (1084777482): standby > Online: [ ha-idg-2 ] > ... > vm_nextcloud (ocf::heartbeat:VirtualDomain): FAILED[ ha-idg-2 ha- > idg-1 ] <====== migration failed > Transition Summary: > .. > * Recover vm_nextcloud ( ha-idg-2 ) > Executing cluster transition: > * Resource action: vm_nextcloud stop on ha-idg-2 > * Resource action: vm_nextcloud stop on ha-idg-1 > * Resource action: vm_nextcloud start on ha-idg-2 > * Resource action: vm_nextcloud monitor=30000 on ha-idg-2 > Revised cluster status: > vm_nextcloud (ocf::heartbeat:VirtualDomain): Started ha-idg-2 > > ha-idg-1:~/why-fenced/ha-idg-1/pengine # crm_simulate -S -x pe-input- > 3117 -G transition-3117.xml -D transition-3117.dot > Current cluster status: > Node ha-idg-1 (1084777482): standby > Online: [ ha-idg-2 ] > vm_nextcloud (ocf::heartbeat:VirtualDomain): FAILED ha-idg-2 > <====== start on ha-idg-2 failed > Transition Summary: > * Stop vm_nextcloud ( ha-idg-2 ) due to node > availability <==== stop vm_nextcloud (what means due to node > availability ?) "Due to node availability" means no node is allowed to run the resource, so it has to be stopped. > Executing cluster transition: > * Resource action: vm_nextcloud stop on ha-idg-2 > Revised cluster status: > vm_nextcloud (ocf::heartbeat:VirtualDomain): Stopped > > ha-idg-1:~/why-fenced/ha-idg-1/pengine # crm_simulate -S -x pe-input- > 3118 -G transition-4516.xml -D transition-4516.dot > Current cluster status: > Node ha-idg-1 (1084777482): standby > Online: [ ha-idg-2 ] > vm_nextcloud (ocf::heartbeat:VirtualDomain): Stopped > <============== vm_nextcloud is stopped > Transition Summary: > * Shutdown ha-idg-1 > Executing cluster transition: > * Resource action: vm_nextcloud stop on ha-idg-1 <==== why stop ? > It is already stopped I'm not sure, I'd have to see the pe input. 
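(Side note on tooling, not an answer to the extra stop: the same pe files
can also show the allocation scores behind each decision, e.g.

  crm_simulate -Ss -x pe-input-3118 | grep vm_nextcloud

and on the live cluster recent Pacemaker versions offer

  crm_resource --resource vm_nextcloud --why

to print the reason a resource is not running. File and resource names are
just the ones from this thread.)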
> Revised cluster status: > vm_nextcloud (ocf::heartbeat:VirtualDomain): Stopped > > ha-idg-1:~/why-fenced/ha-idg-2/pengine # crm_simulate -S -x pe-input- > 3545 -G transition-0.xml -D transition-0.dot > Current cluster status: > Node ha-idg-1 (1084777482): pending > Online: [ ha-idg-2 ] > vm_nextcloud (ocf::heartbeat:VirtualDomain): Stopped <====== > vm_nextcloud is stopped > Transition Summary: > > Executing cluster transition: > Using the original execution date of: 2020-07-20 15:05:33Z > Revised cluster status: > vm_nextcloud (ocf::heartbeat:VirtualDomain): Stopped > > ha-idg-1:~/why-fenced/ha-idg-2/pengine # crm_simulate -S -x pe-warn- > 749 -G transition-1.xml -D transition-1.dot > Current cluster status: > Node ha-idg-1 (1084777482): OFFLINE (standby) > Online: [ ha-idg-2 ] > vm_nextcloud (ocf::heartbeat:VirtualDomain): Stopped <======= > vm_nextcloud is stopped > Transition Summary: > * Fence (Off) ha-idg-1 'resource actions are unrunnable' > Executing cluster transition: > * Fencing ha-idg-1 (Off) > * Pseudo action: vm_nextcloud_stop_0 <======= why stop ? It is > already stopped ? > Revised cluster status: > Node ha-idg-1 (1084777482): OFFLINE (standby) > Online: [ ha-idg-2 ] > vm_nextcloud (ocf::heartbeat:VirtualDomain): Stopped > > I don't understand why the cluster tries to stop a resource which is > already stopped. > > Bernd > Helmholtz Zentrum M?nchen > > Helmholtz Zentrum Muenchen > Deutsches Forschungszentrum fuer Gesundheit und Umwelt (GmbH) > Ingolstaedter Landstr. 1 > 85764 Neuherberg > www.helmholtz-muenchen.de > Aufsichtsratsvorsitzende: MinDir.in Prof. Dr. Veronika von Messling > Geschaeftsfuehrung: Prof. Dr. med. Dr. h.c. Matthias Tschoep, Kerstin > Guenther > Registergericht: Amtsgericht Muenchen HRB 6466 > USt-IdNr: DE 129521671 > > > _______________________________________________ > Manage your subscription: > https://lists.clusterlabs.org/mailman/listinfo/users > > ClusterLabs home: https://www.clusterlabs.org/ -- Ken Gaillot From kgaillot at redhat.com Mon Aug 17 11:19:45 2020 From: kgaillot at redhat.com (Ken Gaillot) Date: Mon, 17 Aug 2020 10:19:45 -0500 Subject: [ClusterLabs] Antw: [EXT] Stonith failing In-Reply-To: <1118615335.1895.1597410544148@www> References: <1330096936.11468.1595926619455@www> <5F1FF427020000A10003A47F@gwsmtp.uni-regensburg.de> <1599253492.11645.1595932012959@www> <0557F18C-6718-43B6-9CA0-8FE7AE2A8785@yahoo.com> <603366395.379.1596002482554@www> <68622439.459.1596025362188@www> <1118615335.1895.1597410544148@www> Message-ID: <73d6ecf113098a3154a2e7db2e2a59557272024a.camel@redhat.com> On Fri, 2020-08-14 at 15:09 +0200, Gabriele Bulfon wrote: > Thanks to all your suggestions, I now have the systems with stonith > configured on ipmi. A word of caution: if the IPMI is on-board -- i.e. it shares the same power supply as the computer -- power becomes a single point of failure. If the node loses power, the other node can't fence because the IPMI is also down, and the cluster can't recover. Some on-board IPMI controllers can share an Ethernet port with the main computer, which would be a similar situation. It's best to have a backup fencing method when using IPMI as the primary fencing method. An example would be an intelligent power switch or sbd. > Two questions: > - how can I simulate a stonith situation to check that everything is > ok? > - considering that I have both nodes with stonith against the other > node, once the two nodes can communicate, how can I be sure the two > nodes will not try to stonith each other? > > :) > Thanks! 
> Gabriele > > > > Sonicle S.r.l. : http://www.sonicle.com > Music: http://www.gabrielebulfon.com > Quantum Mechanics : http://www.cdbaby.com/cd/gabrielebulfon > > > > Da: Gabriele Bulfon > A: Cluster Labs - All topics related to open-source clustering > welcomed > Data: 29 luglio 2020 14.22.42 CEST > Oggetto: Re: [ClusterLabs] Antw: [EXT] Stonith failing > > > > > > It is a ZFS based illumos system. > > I don't think SBD is an option. > > Is there a reliable ZFS based stonith? > > > > Gabriele > > > > > > > > Sonicle S.r.l. : http://www.sonicle.com > > Music: http://www.gabrielebulfon.com > > Quantum Mechanics : http://www.cdbaby.com/cd/gabrielebulfon > > > > > > > > Da: Andrei Borzenkov > > A: Cluster Labs - All topics related to open-source clustering > > welcomed > > Data: 29 luglio 2020 9.46.09 CEST > > Oggetto: Re: [ClusterLabs] Antw: [EXT] Stonith failing > > > > > > > > > > > > > On Wed, Jul 29, 2020 at 9:01 AM Gabriele Bulfon < > > > gbulfon at sonicle.com> wrote: > > > > That one was taken from a specific implementation on Solaris > > > > 11. > > > > The situation is a dual node server with shared storage > > > > controller: both nodes see the same disks concurrently. > > > > Here we must be sure that the two nodes are not going to > > > > import/mount the same zpool at the same time, or we will > > > > encounter data corruption: > > > > > > > > > > > > > ssh based "stonith" cannot guarantee it. > > > > > > > node 1 will be perferred for pool 1, node 2 for pool 2, only in > > > > case one of the node goes down or is taken offline the > > > > resources should be first free by the leaving node and taken by > > > > the other node. > > > > > > > > Would you suggest one of the available stonith in this case? > > > > > > > > > > > > > > > > > IPMI, managed PDU, SBD ... > > > In practice, the only stonith method that works in case of > > > complete node outage including any power supply is SBD. -- Ken Gaillot From kadlecsik.jozsef at wigner.hu Mon Aug 17 06:12:36 2020 From: kadlecsik.jozsef at wigner.hu (=?UTF-8?Q?Kadlecsik_J=C3=B3zsef?=) Date: Mon, 17 Aug 2020 12:12:36 +0200 (CEST) Subject: [ClusterLabs] node utilization attributes are lost during upgrade Message-ID: Hello, At upgrading a corosync/pacemaker/libvirt/KVM cluster from Debian stretch to buster, all the node utilization attributes were erased from the configuration. However, the same attributes were kept at the VirtualDomain resources. This resulted that all resources with utilization attributes were stopped. The documentation says: "You can name utilization attributes according to your preferences and define as many name/value pairs as your configuration needs.", so one assumes utilization attributes are kept during upgrades, for nodes and resources as well. The corosync incompatibility made the upgrade more stressful anyway and the stopping of the resources came out of the blue. The resources could not be started of course - and there were no log warning/error messages that the resources are not started because the utilization constrains could not be satisfied. Pacemaker logs a lot (from admin point of view it is too much), but in this case there was no indication why the resources could not be started (or we were unable to find it in the logs?). So we wasted a lot of time with debugging the VirtualDomain agent. Currently we run the cluster with the placement-strategy set to default. In my opinion node attributes should be kept and preserved during an upgrade. 
Also, it should be logged when a resource must be stopped/cannot be started because the utilization constrains cannot be satisfied. Best regards, Jozsef -- E-mail : kadlecsik.jozsef at wigner.hu PGP key: https://wigner.hu/~kadlec/pgp_public_key.txt Address: Wigner Research Centre for Physics H-1525 Budapest 114, POB. 49, Hungary From sameerdhiman at gmail.com Mon Aug 17 08:40:22 2020 From: sameerdhiman at gmail.com (Sameer Dhiman) Date: Mon, 17 Aug 2020 18:10:22 +0530 Subject: [ClusterLabs] Beginner Question about VirtualDomain Message-ID: Hi, I am a beginner using pacemaker and corosync. I am trying to set up a cluster of HA KVM guests as described by Alteeve wiki (CentOS-6) but in CentOS-8.2. My R&D setup is described below Physical Host running CentOS-8.2 with Nested Virtualization 2 x CentOS-8.2 guest machines as Cluster Node 1 and 2. WinXP as a HA guest. 1. drbd --> dlm --> lvmlockd --> LVM-activate --> gfs2 (guest machine definitions) 2. drbd --> dlm --> lvmlockd --> LVM-activate --> raw-lv (guest machine HDD) Question(s): 1. How to prevent guest startup until gfs2 and raw-lv are available? In CentOS-6 Alteeve used autostart=0 in the tag. Is there any similar option in pacemaker because I did not find it in the documentation? 2. Suppose, If I configure constraint order gfs2 and raw-lv then guest machine. Stopping the guest machine would also stop the complete service tree so how can I prevent this? -- Sameer Dhiman -------------- next part -------------- An HTML attachment was scrubbed... URL: From kgaillot at redhat.com Mon Aug 17 16:38:09 2020 From: kgaillot at redhat.com (Ken Gaillot) Date: Mon, 17 Aug 2020 15:38:09 -0500 Subject: [ClusterLabs] node utilization attributes are lost during upgrade In-Reply-To: References: Message-ID: <08e1ec020aef4f25f2eadc8294135d2f611a2b76.camel@redhat.com> On Mon, 2020-08-17 at 12:12 +0200, Kadlecsik J?zsef wrote: > Hello, > > At upgrading a corosync/pacemaker/libvirt/KVM cluster from Debian > stretch > to buster, all the node utilization attributes were erased from the > configuration. However, the same attributes were kept at the > VirtualDomain > resources. This resulted that all resources with utilization > attributes > were stopped. Ouch :( There are two types of node attributes, transient and permanent. Transient attributes last only until pacemaker is next stopped on the node, while permanent attributes persist between reboots/restarts. If you configured the utilization attributes with crm_attribute -z/ --utilization, it will default to permanent, but it's possible to override that with -l/--lifetime reboot (or equivalently, -t/--type status). Permanent node attributes should definitely not be erased in an upgrade. > > The documentation says: "You can name utilization attributes > according to > your preferences and define as many name/value pairs as your > configuration > needs.", so one assumes utilization attributes are kept during > upgrades, > for nodes and resources as well. > > The corosync incompatibility made the upgrade more stressful anyway > and > the stopping of the resources came out of the blue. The resources > could > not be started of course - and there were no log warning/error > messages > that the resources are not started because the utilization > constrains > could not be satisfied. Pacemaker logs a lot (from admin point of > view it > is too much), but in this case there was no indication why the > resources > could not be started (or we were unable to find it in the logs?). 
So > we > wasted a lot of time with debugging the VirtualDomain agent. > > Currently we run the cluster with the placement-strategy set to > default. > > In my opinion node attributes should be kept and preserved during an > upgrade. Also, it should be logged when a resource must be > stopped/cannot > be started because the utilization constrains cannot be satisfied. > > Best regards, > Jozsef > -- > E-mail : kadlecsik.jozsef at wigner.hu > PGP key: https://wigner.hu/~kadlec/pgp_public_key.txt > Address: Wigner Research Centre for Physics > H-1525 Budapest 114, POB. 49, Hungary -- Ken Gaillot From jgdr at dalibo.com Mon Aug 17 16:39:19 2020 From: jgdr at dalibo.com (Jehan-Guillaume de Rorthais) Date: Mon, 17 Aug 2020 22:39:19 +0200 Subject: [ClusterLabs] Antw: [EXT] Stonith failing In-Reply-To: <73d6ecf113098a3154a2e7db2e2a59557272024a.camel@redhat.com> References: <1330096936.11468.1595926619455@www> <5F1FF427020000A10003A47F@gwsmtp.uni-regensburg.de> <1599253492.11645.1595932012959@www> <0557F18C-6718-43B6-9CA0-8FE7AE2A8785@yahoo.com> <603366395.379.1596002482554@www> <68622439.459.1596025362188@www> <1118615335.1895.1597410544148@www> <73d6ecf113098a3154a2e7db2e2a59557272024a.camel@redhat.com> Message-ID: <20200817223919.6187c711@firost> On Mon, 17 Aug 2020 10:19:45 -0500 Ken Gaillot wrote: > On Fri, 2020-08-14 at 15:09 +0200, Gabriele Bulfon wrote: > > Thanks to all your suggestions, I now have the systems with stonith > > configured on ipmi. > > A word of caution: if the IPMI is on-board -- i.e. it shares the same > power supply as the computer -- power becomes a single point of > failure. If the node loses power, the other node can't fence because > the IPMI is also down, and the cluster can't recover. > > Some on-board IPMI controllers can share an Ethernet port with the main > computer, which would be a similar situation. > > It's best to have a backup fencing method when using IPMI as the > primary fencing method. An example would be an intelligent power switch > or sbd. How SBD would be useful in this scenario? Poison pill will not be swallowed by the dead node... Is it just to wait for the watchdog timeout? Regards, From kgaillot at redhat.com Mon Aug 17 17:07:52 2020 From: kgaillot at redhat.com (Ken Gaillot) Date: Mon, 17 Aug 2020 16:07:52 -0500 Subject: [ClusterLabs] Antw: [EXT] Stonith failing In-Reply-To: <20200817223919.6187c711@firost> References: <1330096936.11468.1595926619455@www> <5F1FF427020000A10003A47F@gwsmtp.uni-regensburg.de> <1599253492.11645.1595932012959@www> <0557F18C-6718-43B6-9CA0-8FE7AE2A8785@yahoo.com> <603366395.379.1596002482554@www> <68622439.459.1596025362188@www> <1118615335.1895.1597410544148@www> <73d6ecf113098a3154a2e7db2e2a59557272024a.camel@redhat.com> <20200817223919.6187c711@firost> Message-ID: <8a2c89afdc005dc102e35df222056d08e08c21af.camel@redhat.com> On Mon, 2020-08-17 at 22:39 +0200, Jehan-Guillaume de Rorthais wrote: > On Mon, 17 Aug 2020 10:19:45 -0500 > Ken Gaillot wrote: > > > On Fri, 2020-08-14 at 15:09 +0200, Gabriele Bulfon wrote: > > > Thanks to all your suggestions, I now have the systems with > > > stonith > > > configured on ipmi. > > > > A word of caution: if the IPMI is on-board -- i.e. it shares the > > same > > power supply as the computer -- power becomes a single point of > > failure. If the node loses power, the other node can't fence > > because > > the IPMI is also down, and the cluster can't recover. 
> > > > Some on-board IPMI controllers can share an Ethernet port with the > > main > > computer, which would be a similar situation. > > > > It's best to have a backup fencing method when using IPMI as the > > primary fencing method. An example would be an intelligent power > > switch > > or sbd. > > How SBD would be useful in this scenario? Poison pill will not be > swallowed by > the dead node... Is it just to wait for the watchdog timeout? Right, I meant watchdog-only SBD. Although now that I think about it, I'm not sure of the details of if/how that would work. Klaus Wenninger might have some insight. -- Ken Gaillot From arvidjaar at gmail.com Tue Aug 18 01:49:48 2020 From: arvidjaar at gmail.com (Andrei Borzenkov) Date: Tue, 18 Aug 2020 08:49:48 +0300 Subject: [ClusterLabs] Antw: [EXT] Stonith failing In-Reply-To: <20200817223919.6187c711@firost> References: <1330096936.11468.1595926619455@www> <5F1FF427020000A10003A47F@gwsmtp.uni-regensburg.de> <1599253492.11645.1595932012959@www> <0557F18C-6718-43B6-9CA0-8FE7AE2A8785@yahoo.com> <603366395.379.1596002482554@www> <68622439.459.1596025362188@www> <1118615335.1895.1597410544148@www> <73d6ecf113098a3154a2e7db2e2a59557272024a.camel@redhat.com> <20200817223919.6187c711@firost> Message-ID: 17.08.2020 23:39, Jehan-Guillaume de Rorthais ?????: > On Mon, 17 Aug 2020 10:19:45 -0500 > Ken Gaillot wrote: > >> On Fri, 2020-08-14 at 15:09 +0200, Gabriele Bulfon wrote: >>> Thanks to all your suggestions, I now have the systems with stonith >>> configured on ipmi. >> >> A word of caution: if the IPMI is on-board -- i.e. it shares the same >> power supply as the computer -- power becomes a single point of >> failure. If the node loses power, the other node can't fence because >> the IPMI is also down, and the cluster can't recover. >> >> Some on-board IPMI controllers can share an Ethernet port with the main >> computer, which would be a similar situation. >> >> It's best to have a backup fencing method when using IPMI as the >> primary fencing method. An example would be an intelligent power switch >> or sbd. > > How SBD would be useful in this scenario? Poison pill will not be swallowed by > the dead node... Is it just to wait for the watchdog timeout? > Node is expected to commit suicide if SBD lost access to shared block device. So either node swallowed poison pill and died or node died because it realized it was impossible to see poison pill or node was dead already. After watchdog timeout (twice watchdog timeout for safety) we assume node is dead. From kwenning at redhat.com Tue Aug 18 02:21:50 2020 From: kwenning at redhat.com (Klaus Wenninger) Date: Tue, 18 Aug 2020 08:21:50 +0200 Subject: [ClusterLabs] Antw: [EXT] Stonith failing In-Reply-To: References: <1330096936.11468.1595926619455@www> <5F1FF427020000A10003A47F@gwsmtp.uni-regensburg.de> <1599253492.11645.1595932012959@www> <0557F18C-6718-43B6-9CA0-8FE7AE2A8785@yahoo.com> <603366395.379.1596002482554@www> <68622439.459.1596025362188@www> <1118615335.1895.1597410544148@www> <73d6ecf113098a3154a2e7db2e2a59557272024a.camel@redhat.com> <20200817223919.6187c711@firost> Message-ID: <9fdd714d-ccb7-9cfa-8952-23d0e5336695@redhat.com> On 8/18/20 7:49 AM, Andrei Borzenkov wrote: > 17.08.2020 23:39, Jehan-Guillaume de Rorthais ?????: >> On Mon, 17 Aug 2020 10:19:45 -0500 >> Ken Gaillot wrote: >> >>> On Fri, 2020-08-14 at 15:09 +0200, Gabriele Bulfon wrote: >>>> Thanks to all your suggestions, I now have the systems with stonith >>>> configured on ipmi. 
>>> A word of caution: if the IPMI is on-board -- i.e. it shares the same
>>> power supply as the computer -- power becomes a single point of
>>> failure. If the node loses power, the other node can't fence because
>>> the IPMI is also down, and the cluster can't recover.
>>>
>>> Some on-board IPMI controllers can share an Ethernet port with the main
>>> computer, which would be a similar situation.
>>>
>>> It's best to have a backup fencing method when using IPMI as the
>>> primary fencing method. An example would be an intelligent power switch
>>> or sbd.
>>
>> How SBD would be useful in this scenario? Poison pill will not be swallowed by
>> the dead node... Is it just to wait for the watchdog timeout?
>>
> Node is expected to commit suicide if SBD lost access to shared block
> device. So either node swallowed poison pill and died or node died
> because it realized it was impossible to see poison pill or node was
> dead already. After watchdog timeout (twice watchdog timeout for safety)
> we assume node is dead.
Yes, like this a suicide via watchdog will be triggered if there are issues
with the disk. This is why it is important to have a reliable watchdog with SBD
even when using poison pill.
As this alone would make a single shared disk a SPOF, running with pacemaker
integration (default) a node with SBD will survive despite of loosing the disk
when it has quorum and pacemaker looks healthy.
As corosync-quorum in 2-node-mode obviously won't be fit for this purpose
SBD will switch to checking for presence of both nodes if 2-node-flag is set.
Sorry for the lengthy explanation but the full picture is required to
understand why it is sufficiently reliable and useful if configured correctly.

Klaus

From Ulrich.Windl at rz.uni-regensburg.de  Tue Aug 18 03:04:44 2020
From: Ulrich.Windl at rz.uni-regensburg.de (Ulrich Windl)
Date: Tue, 18 Aug 2020 09:04:44 +0200
Subject: [ClusterLabs] Antw: [EXT] Re: why is node fenced ?
In-Reply-To: <426bbd8063885706b0e8fdd3dab81be7e0c9f25d.camel@redhat.com>
References: <1894379294.27456141.1596036406000.JavaMail.zimbra@helmholtz-muenchen.de> <197754c0efa9654e23c0d3b25d7d1ac8a7e2f919.camel@redhat.com> <989597611.38099136.1597004244408.JavaMail.zimbra@helmholtz-muenchen.de> <1954017432.49729402.1597400271575.JavaMail.zimbra@helmholtz-muenchen.de> <426bbd8063885706b0e8fdd3dab81be7e0c9f25d.camel@redhat.com>
Message-ID: <5F3B7D8C020000A10003AACF@gwsmtp.uni-regensburg.de>

>>> Ken Gaillot <kgaillot at redhat.com> schrieb am 17.08.2020 um 16:54 in
Nachricht
<426bbd8063885706b0e8fdd3dab81be7e0c9f25d.camel at redhat.com>:
> On Fri, 2020-08-14 at 12:17 +0200, Lentes, Bernd wrote:
>> 
>> ----- On Aug 10, 2020, at 11:59 PM, kgaillot kgaillot at redhat.com
>> wrote:
>> > The most recent transition is aborted, but since all its actions
>> > are
>> > complete, the only effect is to trigger a new transition.
>> > 
>> > We should probably rephrase the log message. In fact, the whole
>> > "transition" terminology is kind of obscure. It's hard to come up
>> > with
>> > something better though.
>> > 
>> 
>> Hi Ken,
>> 
>> i don't get it. How can s.th. be aborted which is already completed ?
> 
> I agree the wording is confusing :)
> 
> From the code's point of view, the actions in the transition are
> complete, but the transition itself (as an abstract entity) remains
> current until the next one starts. However that's academic and

Hi!

So when the "transition" had completed, it became a "state" ;-)
Aborting a "state" seems to be a "transition" started in my view.
> meaningless from a user's point of view, so the log messages should be > reworded. Another thing that always had confused me in the past was that concept of "synapse"... Good wording in error messages is a valuable resource! Regards, Ulrich > >> Bernd >> Helmholtz Zentrum M?nchen >> >> Helmholtz Zentrum Muenchen >> Deutsches Forschungszentrum fuer Gesundheit und Umwelt (GmbH) >> Ingolstaedter Landstr. 1 >> 85764 Neuherberg >> www.helmholtz-muenchen.de >> Aufsichtsratsvorsitzende: MinDir.in Prof. Dr. Veronika von Messling >> Geschaeftsfuehrung: Prof. Dr. med. Dr. h.c. Matthias Tschoep, Kerstin >> Guenther >> Registergericht: Amtsgericht Muenchen HRB 6466 >> USt-IdNr: DE 129521671 > -- > Ken Gaillot > > _______________________________________________ > Manage your subscription: > https://lists.clusterlabs.org/mailman/listinfo/users > > ClusterLabs home: https://www.clusterlabs.org/ From Ulrich.Windl at rz.uni-regensburg.de Tue Aug 18 03:10:07 2020 From: Ulrich.Windl at rz.uni-regensburg.de (Ulrich Windl) Date: Tue, 18 Aug 2020 09:10:07 +0200 Subject: [ClusterLabs] Antw: Re: Antw: [EXT] Stonith failing In-Reply-To: <73d6ecf113098a3154a2e7db2e2a59557272024a.camel@redhat.com> References: <1330096936.11468.1595926619455@www> <5F1FF427020000A10003A47F@gwsmtp.uni-regensburg.de> <1599253492.11645.1595932012959@www> <0557F18C-6718-43B6-9CA0-8FE7AE2A8785@yahoo.com> <603366395.379.1596002482554@www> <68622439.459.1596025362188@www> <1118615335.1895.1597410544148@www> <73d6ecf113098a3154a2e7db2e2a59557272024a.camel@redhat.com> Message-ID: <5F3B7ECF020000A10003AAD3@gwsmtp.uni-regensburg.de> >>> Ken Gaillot schrieb am 17.08.2020 um 17:19 in Nachricht <73d6ecf113098a3154a2e7db2e2a59557272024a.camel at redhat.com>: > On Fri, 2020?08?14 at 15:09 +0200, Gabriele Bulfon wrote: >> Thanks to all your suggestions, I now have the systems with stonith >> configured on ipmi. > > A word of caution: if the IPMI is on?board ?? i.e. it shares the same > power supply as the computer ?? power becomes a single point of > failure. If the node loses power, the other node can't fence because > the IPMI is also down, and the cluster can't recover. This may not always be true: We had servers with three(!) power supplies and a BMC (what today is called "light-out management"). You could "power down" the server, while the BMC was still operational (and thus could "power up" the server again). With standard PC architecture these days things seem to be a bit more compicated (meaning "primitive")... > > Some on?board IPMI controllers can share an Ethernet port with the main > computer, which would be a similar situation. > > It's best to have a backup fencing method when using IPMI as the > primary fencing method. An example would be an intelligent power switch > or sbd. > >> Two questions: >> ? how can I simulate a stonith situation to check that everything is >> ok? >> ? considering that I have both nodes with stonith against the other >> node, once the two nodes can communicate, how can I be sure the two >> nodes will not try to stonith each other? >> >> :) >> Thanks! >> Gabriele >> >> >> >> Sonicle S.r.l. : http://www.sonicle.com >> Music: http://www.gabrielebulfon.com >> Quantum Mechanics : http://www.cdbaby.com/cd/gabrielebulfon >> >> >> >> Da: Gabriele Bulfon >> A: Cluster Labs ? All topics related to open?source clustering >> welcomed >> Data: 29 luglio 2020 14.22.42 CEST >> Oggetto: Re: [ClusterLabs] Antw: [EXT] Stonith failing >> >> >> > >> > It is a ZFS based illumos system. >> > I don't think SBD is an option. 
>> > Is there a reliable ZFS based stonith? >> > >> > Gabriele >> > >> > >> > >> > Sonicle S.r.l. : http://www.sonicle.com >> > Music: http://www.gabrielebulfon.com >> > Quantum Mechanics : http://www.cdbaby.com/cd/gabrielebulfon >> > >> > >> > >> > Da: Andrei Borzenkov >> > A: Cluster Labs ? All topics related to open?source clustering >> > welcomed >> > Data: 29 luglio 2020 9.46.09 CEST >> > Oggetto: Re: [ClusterLabs] Antw: [EXT] Stonith failing >> > >> > >> > > >> > > >> > > On Wed, Jul 29, 2020 at 9:01 AM Gabriele Bulfon < >> > > gbulfon at sonicle.com> wrote: >> > > > That one was taken from a specific implementation on Solaris >> > > > 11. >> > > > The situation is a dual node server with shared storage >> > > > controller: both nodes see the same disks concurrently. >> > > > Here we must be sure that the two nodes are not going to >> > > > import/mount the same zpool at the same time, or we will >> > > > encounter data corruption: >> > > > >> > > >> > > >> > > ssh based "stonith" cannot guarantee it. >> > > >> > > > node 1 will be perferred for pool 1, node 2 for pool 2, only in >> > > > case one of the node goes down or is taken offline the >> > > > resources should be first free by the leaving node and taken by >> > > > the other node. >> > > > >> > > > Would you suggest one of the available stonith in this case? >> > > > >> > > > >> > > >> > > >> > > IPMI, managed PDU, SBD ... >> > > In practice, the only stonith method that works in case of >> > > complete node outage including any power supply is SBD. > ?? > Ken Gaillot > > _______________________________________________ > Manage your subscription: > https://lists.clusterlabs.org/mailman/listinfo/users > > ClusterLabs home: https://www.clusterlabs.org/ From Ulrich.Windl at rz.uni-regensburg.de Tue Aug 18 03:21:07 2020 From: Ulrich.Windl at rz.uni-regensburg.de (Ulrich Windl) Date: Tue, 18 Aug 2020 09:21:07 +0200 Subject: [ClusterLabs] Antw: [EXT] node utilization attributes are lost during upgrade In-Reply-To: References: Message-ID: <5F3B8163020000A10003AADB@gwsmtp.uni-regensburg.de> >>> Kadlecsik J?zsef schrieb am 17.08.2020 um 12:12 in Nachricht : > Hello, > > At upgrading a corosync/pacemaker/libvirt/KVM cluster from Debian stretch > to buster, all the node utilization attributes were erased from the > configuration. However, the same attributes were kept at the VirtualDomain > resources. This resulted that all resources with utilization attributes > were stopped. > > The documentation says: "You can name utilization attributes according to > your preferences and define as many name/value pairs as your configuration > needs.", so one assumes utilization attributes are kept during upgrades, > for nodes and resources as well. Now that you mention it, I think we had it in the past with SLES, too. > > The corosync incompatibility made the upgrade more stressful anyway and > the stopping of the resources came out of the blue. The resources could > not be started of course ? and there were no log warning/error messages > that the resources are not started because the utilization constrains > could not be satisfied. Pacemaker logs a lot (from admin point of view it > is too much), but in this case there was no indication why the resources > could not be started (or we were unable to find it in the logs?). So we > wasted a lot of time with debugging the VirtualDomain agent. Also true: It's not very obvious when resources are not started due to utilization constraints. 
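As a side note to the observation above: the scheduler's view of utilization can at least be inspected by hand. A hedged illustration, assuming a reasonably recent crm_simulate (option names may vary slightly between versions):

    # show node capacities and remaining capacity as the scheduler sees them
    crm_simulate --live-check --show-utilization

    # utilization attributes only influence placement with a non-default
    # strategy, e.g. (illustrative value):
    crm configure property placement-strategy=balanced

This does not add the missing "why was this resource not started" log message, but it makes the capacity bookkeeping visible.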
> > Currently we run the cluster with the placement?strategy set to default. > > In my opinion node attributes should be kept and preserved during an > upgrade. Also, it should be logged when a resource must be stopped/cannot > be started because the utilization constrains cannot be satisfied. +1 > > Best regards, > Jozsef > ?? > E?mail : kadlecsik.jozsef at wigner.hu > PGP key: https://wigner.hu/~kadlec/pgp_public_key.txt > Address: Wigner Research Centre for Physics > H?1525 Budapest 114, POB. 49, Hungary > _______________________________________________ > Manage your subscription: > https://lists.clusterlabs.org/mailman/listinfo/users > > ClusterLabs home: https://www.clusterlabs.org/ From arvidjaar at gmail.com Tue Aug 18 03:24:01 2020 From: arvidjaar at gmail.com (Andrei Borzenkov) Date: Tue, 18 Aug 2020 10:24:01 +0300 Subject: [ClusterLabs] Antw: Re: Antw: [EXT] Stonith failing In-Reply-To: <5F3B7ECF020000A10003AAD3@gwsmtp.uni-regensburg.de> References: <1330096936.11468.1595926619455@www> <5F1FF427020000A10003A47F@gwsmtp.uni-regensburg.de> <1599253492.11645.1595932012959@www> <0557F18C-6718-43B6-9CA0-8FE7AE2A8785@yahoo.com> <603366395.379.1596002482554@www> <68622439.459.1596025362188@www> <1118615335.1895.1597410544148@www> <73d6ecf113098a3154a2e7db2e2a59557272024a.camel@redhat.com> <5F3B7ECF020000A10003AAD3@gwsmtp.uni-regensburg.de> Message-ID: <83aba38d-c9ea-1dff-e53b-14a9e0623d9d@gmail.com> 18.08.2020 10:10, Ulrich Windl ?????: >>>> Ken Gaillot schrieb am 17.08.2020 um 17:19 in > Nachricht > <73d6ecf113098a3154a2e7db2e2a59557272024a.camel at redhat.com>: >> On Fri, 2020?08?14 at 15:09 +0200, Gabriele Bulfon wrote: >>> Thanks to all your suggestions, I now have the systems with stonith >>> configured on ipmi. >> >> A word of caution: if the IPMI is on?board ?? i.e. it shares the same >> power supply as the computer ?? power becomes a single point of >> failure. If the node loses power, the other node can't fence because >> the IPMI is also down, and the cluster can't recover. > > This may not always be true: We had servers with three(!) power supplies and a > BMC (what today is called "light-out management"). You could "power down" the > server, while the BMC was still operational (and thus could "power up" the > server again). > With standard PC architecture these days things seem to be a bit more > compicated (meaning "primitive")... > BMC is powered by standby voltage. If AC input to all of your power supplies is cut off, there is no standby voltage anymore. Just try to unplug all power cables and see if BMC is still accessible. 
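Andrei's point is exactly why a backup fencing method was suggested earlier in this thread. As an illustration only, a rough crmsh sketch of a two-level fencing topology (IPMI first, SBD as fallback); every name, address and password below is a placeholder, and the exact syntax may differ between crmsh versions:

    primitive fence_node1_ipmi stonith:external/ipmi \
        params hostname=node1 ipaddr=10.0.0.11 userid=admin passwd=secret \
        op monitor interval=60s
    primitive fence_node2_ipmi stonith:external/ipmi \
        params hostname=node2 ipaddr=10.0.0.12 userid=admin passwd=secret \
        op monitor interval=60s
    primitive fence_sbd stonith:external/sbd params pcmk_delay_max=15
    # per node: space-separated entries are tried as successive levels
    fencing_topology \
        node1: fence_node1_ipmi fence_sbd \
        node2: fence_node2_ipmi fence_sbd

With such a topology the cluster only falls back to the second level when the IPMI device fails, e.g. because the node has lost all power.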
From Ulrich.Windl at rz.uni-regensburg.de Tue Aug 18 03:35:28 2020 From: Ulrich.Windl at rz.uni-regensburg.de (Ulrich Windl) Date: Tue, 18 Aug 2020 09:35:28 +0200 Subject: [ClusterLabs] Antw: Re: Antw: Re: Antw: [EXT] Stonith failing In-Reply-To: <83aba38d-c9ea-1dff-e53b-14a9e0623d9d@gmail.com> References: <1330096936.11468.1595926619455@www> <5F1FF427020000A10003A47F@gwsmtp.uni-regensburg.de> <1599253492.11645.1595932012959@www> <0557F18C-6718-43B6-9CA0-8FE7AE2A8785@yahoo.com> <603366395.379.1596002482554@www> <68622439.459.1596025362188@www> <1118615335.1895.1597410544148@www> <73d6ecf113098a3154a2e7db2e2a59557272024a.camel@redhat.com> <5F3B7ECF020000A10003AAD3@gwsmtp.uni-regensburg.de> <83aba38d-c9ea-1dff-e53b-14a9e0623d9d@gmail.com> Message-ID: <5F3B84C0020000A10003AAE4@gwsmtp.uni-regensburg.de> >>> Andrei Borzenkov schrieb am 18.08.2020 um 09:24 in Nachricht <83aba38d-c9ea-1dff-e53b-14a9e0623d9d at gmail.com>: > 18.08.2020 10:10, Ulrich Windl ?????: >>>>> Ken Gaillot schrieb am 17.08.2020 um 17:19 in >> Nachricht >> <73d6ecf113098a3154a2e7db2e2a59557272024a.camel at redhat.com>: >>> On Fri, 2020?08?14 at 15:09 +0200, Gabriele Bulfon wrote: >>>> Thanks to all your suggestions, I now have the systems with stonith >>>> configured on ipmi. >>> >>> A word of caution: if the IPMI is on?board ?? i.e. it shares the same >>> power supply as the computer ?? power becomes a single point of >>> failure. If the node loses power, the other node can't fence because >>> the IPMI is also down, and the cluster can't recover. >> >> This may not always be true: We had servers with three(!) power supplies and > a >> BMC (what today is called "light-out management"). You could "power down" > the >> server, while the BMC was still operational (and thus could "power up" the >> server again). >> With standard PC architecture these days things seem to be a bit more >> compicated (meaning "primitive")... >> > > BMC is powered by standby voltage. If AC input to all of your power > supplies is cut off, there is no standby voltage anymore. Just try to > unplug all power cables and see if BMC is still accessible. Of course! What I tried to point out is: With a proper BMC, you DON'T need to cut off the server power. 
Regards, Ulrich > _______________________________________________ > Manage your subscription: > https://lists.clusterlabs.org/mailman/listinfo/users > > ClusterLabs home: https://www.clusterlabs.org/ From arvidjaar at gmail.com Tue Aug 18 03:56:27 2020 From: arvidjaar at gmail.com (Andrei Borzenkov) Date: Tue, 18 Aug 2020 10:56:27 +0300 Subject: [ClusterLabs] Antw: Re: Antw: Re: Antw: [EXT] Stonith failing In-Reply-To: <5F3B84C0020000A10003AAE4@gwsmtp.uni-regensburg.de> References: <1330096936.11468.1595926619455@www> <5F1FF427020000A10003A47F@gwsmtp.uni-regensburg.de> <1599253492.11645.1595932012959@www> <0557F18C-6718-43B6-9CA0-8FE7AE2A8785@yahoo.com> <603366395.379.1596002482554@www> <68622439.459.1596025362188@www> <1118615335.1895.1597410544148@www> <73d6ecf113098a3154a2e7db2e2a59557272024a.camel@redhat.com> <5F3B7ECF020000A10003AAD3@gwsmtp.uni-regensburg.de> <83aba38d-c9ea-1dff-e53b-14a9e0623d9d@gmail.com> <5F3B84C0020000A10003AAE4@gwsmtp.uni-regensburg.de> Message-ID: <7ac5e8da-2f4f-2327-8a1f-8b007f5d7f9c@gmail.com> 18.08.2020 10:35, Ulrich Windl ?????: >>>> Andrei Borzenkov schrieb am 18.08.2020 um 09:24 in > Nachricht <83aba38d-c9ea-1dff-e53b-14a9e0623d9d at gmail.com>: >> 18.08.2020 10:10, Ulrich Windl ?????: >>>>>> Ken Gaillot schrieb am 17.08.2020 um 17:19 in >>> Nachricht >>> <73d6ecf113098a3154a2e7db2e2a59557272024a.camel at redhat.com>: >>>> On Fri, 2020?08?14 at 15:09 +0200, Gabriele Bulfon wrote: >>>>> Thanks to all your suggestions, I now have the systems with stonith >>>>> configured on ipmi. >>>> >>>> A word of caution: if the IPMI is on?board ?? i.e. it shares the same >>>> power supply as the computer ?? power becomes a single point of >>>> failure. If the node loses power, the other node can't fence because >>>> the IPMI is also down, and the cluster can't recover. >>> >>> This may not always be true: We had servers with three(!) power supplies > and >> a >>> BMC (what today is called "light-out management"). You could "power down" >> the >>> server, while the BMC was still operational (and thus could "power up" the >>> server again). >>> With standard PC architecture these days things seem to be a bit more >>> compicated (meaning "primitive")... >>> >> >> BMC is powered by standby voltage. If AC input to all of your power >> supplies is cut off, there is no standby voltage anymore. Just try to >> unplug all power cables and see if BMC is still accessible. > > Of course! What I tried to point out is: With a proper BMC, you DON'T need to > cut off the server power. > You seem to completely misunderstand the problem - if external power is cut off, it is impossible to stonith node via IPMI because BMC is not accessible. 
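Since sbd keeps coming up as the method that still works when a node has lost power completely (the survivor waits out the agreed timeout instead of having to reach the dead node), here is a minimal sketch of a disk-based SBD setup. Device path, watchdog settings and agent name are illustrative, and the stonith agent name differs between distributions:

    # initialize one small shared LUN once, from any node
    sbd -d /dev/disk/by-id/scsi-SHARED_SBD_LUN create

    # /etc/sysconfig/sbd (location/name of this file varies by distribution)
    SBD_DEVICE="/dev/disk/by-id/scsi-SHARED_SBD_LUN"
    SBD_WATCHDOG_DEV="/dev/watchdog"
    SBD_WATCHDOG_TIMEOUT="5"

    # cluster side (crmsh syntax)
    crm configure primitive stonith-sbd stonith:external/sbd params pcmk_delay_max=30
    crm configure property stonith-enabled=true

A reliable hardware watchdog on every node is a precondition, as discussed above.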
From jgdr at dalibo.com Tue Aug 18 06:07:08 2020 From: jgdr at dalibo.com (Jehan-Guillaume de Rorthais) Date: Tue, 18 Aug 2020 12:07:08 +0200 Subject: [ClusterLabs] Antw: [EXT] Stonith failing In-Reply-To: <9fdd714d-ccb7-9cfa-8952-23d0e5336695@redhat.com> References: <1330096936.11468.1595926619455@www> <5F1FF427020000A10003A47F@gwsmtp.uni-regensburg.de> <1599253492.11645.1595932012959@www> <0557F18C-6718-43B6-9CA0-8FE7AE2A8785@yahoo.com> <603366395.379.1596002482554@www> <68622439.459.1596025362188@www> <1118615335.1895.1597410544148@www> <73d6ecf113098a3154a2e7db2e2a59557272024a.camel@redhat.com> <20200817223919.6187c711@firost> <9fdd714d-ccb7-9cfa-8952-23d0e5336695@redhat.com> Message-ID: <20200818120708.5b192b54@firost> On Tue, 18 Aug 2020 08:21:50 +0200 Klaus Wenninger wrote: > On 8/18/20 7:49 AM, Andrei Borzenkov wrote: > > 17.08.2020 23:39, Jehan-Guillaume de Rorthais ?????: > >> On Mon, 17 Aug 2020 10:19:45 -0500 > >> Ken Gaillot wrote: > >> > >>> On Fri, 2020-08-14 at 15:09 +0200, Gabriele Bulfon wrote: > >>>> Thanks to all your suggestions, I now have the systems with stonith > >>>> configured on ipmi. > >>> A word of caution: if the IPMI is on-board -- i.e. it shares the same > >>> power supply as the computer -- power becomes a single point of > >>> failure. If the node loses power, the other node can't fence because > >>> the IPMI is also down, and the cluster can't recover. > >>> > >>> Some on-board IPMI controllers can share an Ethernet port with the main > >>> computer, which would be a similar situation. > >>> > >>> It's best to have a backup fencing method when using IPMI as the > >>> primary fencing method. An example would be an intelligent power switch > >>> or sbd. > >> How SBD would be useful in this scenario? Poison pill will not be > >> swallowed by the dead node... Is it just to wait for the watchdog timeout? > >> > > Node is expected to commit suicide if SBD lost access to shared block > > device. So either node swallowed poison pill and died or node died > > because it realized it was impossible to see poison pill or node was > > dead already. After watchdog timeout (twice watchdog timeout for safety) > > we assume node is dead. > Yes, like this a suicide via watchdog will be triggered if there are > issues with thedisk. This is why it is important to have a reliable > watchdog with SBD even whenusing poison pill. As this alone would > make a single shared disk a SPOF, runningwith pacemaker integration > (default) a node with SBD will survive despite ofloosing the disk > when it has quorum and pacemaker looks healthy. As corosync-quorum > in 2-node-mode obviously won't be fit for this purpose SBD will switch > to checking for presence of both nodes if 2-node-flag is set. > > Sorry for the lengthy explanation but the full picture is required > to understand whyit is sufficiently reliable and useful if configured Thank you Andrei and Klaus for the explanation. 
Regards, From kadlecsik.jozsef at wigner.hu Tue Aug 18 08:35:27 2020 From: kadlecsik.jozsef at wigner.hu (=?UTF-8?Q?Kadlecsik_J=C3=B3zsef?=) Date: Tue, 18 Aug 2020 14:35:27 +0200 (CEST) Subject: [ClusterLabs] node utilization attributes are lost during upgrade In-Reply-To: <08e1ec020aef4f25f2eadc8294135d2f611a2b76.camel@redhat.com> References: <08e1ec020aef4f25f2eadc8294135d2f611a2b76.camel@redhat.com> Message-ID: Hi, On Mon, 17 Aug 2020, Ken Gaillot wrote: > On Mon, 2020-08-17 at 12:12 +0200, Kadlecsik J?zsef wrote: > > > > At upgrading a corosync/pacemaker/libvirt/KVM cluster from Debian > > stretch to buster, all the node utilization attributes were erased > > from the configuration. However, the same attributes were kept at the > > VirtualDomain resources. This resulted that all resources with > > utilization attributes were stopped. > > Ouch :( > > There are two types of node attributes, transient and permanent. > Transient attributes last only until pacemaker is next stopped on the > node, while permanent attributes persist between reboots/restarts. > > If you configured the utilization attributes with crm_attribute -z/ > --utilization, it will default to permanent, but it's possible to > override that with -l/--lifetime reboot (or equivalently, -t/--type > status). The attributes were defined by "crm configure edit", simply stating: node 1084762113: atlas0 \ utilization hv_memory=192 cpu=32 \ attributes standby=off ... node 1084762119: atlas6 \ utilization hv_memory=192 cpu=32 \ But I believe now that corosync caused the problem, because the nodes had been renumbered: node 3232245761: atlas0 ... node 3232245767: atlas6 The upgrade process was: for each node do set the "hold" mark on the corosync package put the node standby wait for the resources to be migrated off upgrade from stretch to buster reboot put the node online wait for the resources to be migrated (back) done Up to this point all resources were running fine. In order to upgrade corosync, we followed the next steps: enable maintenance mode stop pacemaker and corosync on all nodes for each node do delete the hold mark and upgrade corosync install new config file (nodeid not specified) restart corosync, start pacemaker done We could see that all resources were running unmanaged. When disabling the maintenance mode, then those were stopped. So I think corosync renumbered the nodes and I suspect the reason for that was that "clear_node_high_bit: yes" was not specified in the new config file. It means it was an admin error then. Best regards, Jozsef -- E-mail : kadlecsik.jozsef at wigner.hu PGP key: https://wigner.hu/~kadlec/pgp_public_key.txt Address: Wigner Research Centre for Physics H-1525 Budapest 114, POB. 
49, Hungary From kgaillot at redhat.com Tue Aug 18 10:02:33 2020 From: kgaillot at redhat.com (Ken Gaillot) Date: Tue, 18 Aug 2020 09:02:33 -0500 Subject: [ClusterLabs] Antw: [EXT] Stonith failing In-Reply-To: <9fdd714d-ccb7-9cfa-8952-23d0e5336695@redhat.com> References: <1330096936.11468.1595926619455@www> <5F1FF427020000A10003A47F@gwsmtp.uni-regensburg.de> <1599253492.11645.1595932012959@www> <0557F18C-6718-43B6-9CA0-8FE7AE2A8785@yahoo.com> <603366395.379.1596002482554@www> <68622439.459.1596025362188@www> <1118615335.1895.1597410544148@www> <73d6ecf113098a3154a2e7db2e2a59557272024a.camel@redhat.com> <20200817223919.6187c711@firost> <9fdd714d-ccb7-9cfa-8952-23d0e5336695@redhat.com> Message-ID: On Tue, 2020-08-18 at 08:21 +0200, Klaus Wenninger wrote: > On 8/18/20 7:49 AM, Andrei Borzenkov wrote: > > 17.08.2020 23:39, Jehan-Guillaume de Rorthais ?????: > > > On Mon, 17 Aug 2020 10:19:45 -0500 > > > Ken Gaillot wrote: > > > > > > > On Fri, 2020-08-14 at 15:09 +0200, Gabriele Bulfon wrote: > > > > > Thanks to all your suggestions, I now have the systems with > > > > > stonith > > > > > configured on ipmi. > > > > > > > > A word of caution: if the IPMI is on-board -- i.e. it shares > > > > the same > > > > power supply as the computer -- power becomes a single point of > > > > failure. If the node loses power, the other node can't fence > > > > because > > > > the IPMI is also down, and the cluster can't recover. > > > > > > > > Some on-board IPMI controllers can share an Ethernet port with > > > > the main > > > > computer, which would be a similar situation. > > > > > > > > It's best to have a backup fencing method when using IPMI as > > > > the > > > > primary fencing method. An example would be an intelligent > > > > power switch > > > > or sbd. > > > > > > How SBD would be useful in this scenario? Poison pill will not be > > > swallowed by > > > the dead node... Is it just to wait for the watchdog timeout? > > > > > > > Node is expected to commit suicide if SBD lost access to shared > > block > > device. So either node swallowed poison pill and died or node died > > because it realized it was impossible to see poison pill or node > > was > > dead already. After watchdog timeout (twice watchdog timeout for > > safety) > > we assume node is dead. > > Yes, like this a suicide via watchdog will be triggered if there are > issues with thedisk. This is why it is important to have a reliable > watchdog with SBD even whenusing poison pill. As this alone would > make a single shared disk a SPOF, runningwith pacemaker integration > (default) a node with SBD will survive despite ofloosing the disk > when it has quorum and pacemaker looks healthy. As corosync-quorum > in 2-node-mode obviously won't be fit for this purpose SBD will > switch > to checking for presence of both nodes if 2-node-flag is set. > > Sorry for the lengthy explanation but the full picture is required > to understand whyit is sufficiently reliable and useful if configured > correctly. > > Klaus What I'm not sure about is how watchdog-only sbd would behave as a fail-back method for a regular fence device. Will the cluster wait for the sbd timeout no matter what, or only if the regular fencing fails, or ...? 
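For reference while reading the question above: watchdog-only (diskless) SBD is not configured as a fence resource at all; it is driven by the sbd sysconfig file plus one cluster property. A sketch with illustrative values:

    # /etc/sysconfig/sbd -- note: no SBD_DEVICE set
    SBD_WATCHDOG_DEV="/dev/watchdog"
    SBD_WATCHDOG_TIMEOUT="5"

    # cluster property, conventionally about twice SBD_WATCHDOG_TIMEOUT
    crm configure property stonith-watchdog-timeout=10s
    crm configure property stonith-enabled=true

How and when the cluster decides to rely on that timeout rather than on a configured fence device is what the follow-ups below dig into.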
-- Ken Gaillot From kgaillot at redhat.com Tue Aug 18 10:15:49 2020 From: kgaillot at redhat.com (Ken Gaillot) Date: Tue, 18 Aug 2020 09:15:49 -0500 Subject: [ClusterLabs] node utilization attributes are lost during upgrade In-Reply-To: References: <08e1ec020aef4f25f2eadc8294135d2f611a2b76.camel@redhat.com> Message-ID: <04b9e334c828e1e1a718643a75e60d6e40bd4eb0.camel@redhat.com> On Tue, 2020-08-18 at 14:35 +0200, Kadlecsik J?zsef wrote: > Hi, > > On Mon, 17 Aug 2020, Ken Gaillot wrote: > > > On Mon, 2020-08-17 at 12:12 +0200, Kadlecsik J?zsef wrote: > > > > > > At upgrading a corosync/pacemaker/libvirt/KVM cluster from > > > Debian > > > stretch to buster, all the node utilization attributes were > > > erased > > > from the configuration. However, the same attributes were kept at > > > the > > > VirtualDomain resources. This resulted that all resources with > > > utilization attributes were stopped. > > > > Ouch :( > > > > There are two types of node attributes, transient and permanent. > > Transient attributes last only until pacemaker is next stopped on > > the > > node, while permanent attributes persist between reboots/restarts. > > > > If you configured the utilization attributes with crm_attribute > > -z/ > > --utilization, it will default to permanent, but it's possible to > > override that with -l/--lifetime reboot (or equivalently, -t/ > > --type > > status). > > The attributes were defined by "crm configure edit", simply stating: > > node 1084762113: atlas0 \ > utilization hv_memory=192 cpu=32 \ > attributes standby=off > ... > node 1084762119: atlas6 \ > utilization hv_memory=192 cpu=32 \ > > But I believe now that corosync caused the problem, because the nodes > had > been renumbered: Ah yes, that would do it. Pacemaker would consider them different nodes with the same names. The "other" node's attributes would not apply to the "new" node. The upgrade procedure would be similar except that you would start corosync by itself after each upgrade. After all nodes were upgraded, you would modify the CIB on one node (while pacemaker is not running) with: CIB_file=/var/lib/pacemaker/cib/cib.xml cibadmin --modify --scope=nodes -X '...' where '...' is a XML entry from the CIB with the "id" value changed to the new ID, and repeat that for each node. Then, start pacemaker on that node and wait for it to come up, then start pacemaker on the other nodes. > > node 3232245761: atlas0 > ... > node 3232245767: atlas6 > > The upgrade process was: > > for each node do > set the "hold" mark on the corosync package > put the node standby > wait for the resources to be migrated off > upgrade from stretch to buster > reboot > put the node online > wait for the resources to be migrated (back) > done > > Up to this point all resources were running fine. > > In order to upgrade corosync, we followed the next steps: > > enable maintenance mode > stop pacemaker and corosync on all nodes > for each node do > delete the hold mark and upgrade corosync > install new config file (nodeid not specified) > restart corosync, start pacemaker > done > > We could see that all resources were running unmanaged. When > disabling the > maintenance mode, then those were stopped. > > So I think corosync renumbered the nodes and I suspect the reason for > that > was that "clear_node_high_bit: yes" was not specified in the new > config > file. It means it was an admin error then. 
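To make the procedure above concrete, a hedged example using the atlas0 IDs quoted in this thread (take the actual <node> entry from your own CIB, adjust only the id value, and keep pacemaker stopped while editing):

    # old CIB entry:        <node id="1084762113" uname="atlas0"/>
    # new corosync node id: 3232245761
    CIB_file=/var/lib/pacemaker/cib/cib.xml cibadmin --modify --scope=nodes \
        -X '<node id="3232245761" uname="atlas0"/>'

As noted later in the thread, the renumbering can be avoided altogether by pinning the IDs in corosync.conf before the upgrade, e.g.:

    nodelist {
        node {
            ring0_addr: atlas0
            nodeid: 1084762113
        }
    }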
> > Best regards, > Jozsef > -- > E-mail : kadlecsik.jozsef at wigner.hu > PGP key: https://wigner.hu/~kadlec/pgp_public_key.txt > Address: Wigner Research Centre for Physics > H-1525 Budapest 114, POB. 49, Hungary -- Ken Gaillot From bernd.lentes at helmholtz-muenchen.de Tue Aug 18 10:47:54 2020 From: bernd.lentes at helmholtz-muenchen.de (Lentes, Bernd) Date: Tue, 18 Aug 2020 16:47:54 +0200 (CEST) Subject: [ClusterLabs] why is node fenced ? In-Reply-To: <84a98a74897f0b5523f44b23a415fcde5bff0de6.camel@redhat.com> References: <1894379294.27456141.1596036406000.JavaMail.zimbra@helmholtz-muenchen.de> <197754c0efa9654e23c0d3b25d7d1ac8a7e2f919.camel@redhat.com> <989597611.38099136.1597004244408.JavaMail.zimbra@helmholtz-muenchen.de> <1230152204.50050895.1597430263067.JavaMail.zimbra@helmholtz-muenchen.de> <84a98a74897f0b5523f44b23a415fcde5bff0de6.camel@redhat.com> Message-ID: <453537998.54408252.1597762074665.JavaMail.zimbra@helmholtz-muenchen.de> ----- On Aug 17, 2020, at 5:09 PM, kgaillot kgaillot at redhat.com wrote: >> I checked all relevant pe-files in this time period. >> This is what i found out (i just write the important entries): >> Executing cluster transition: >> * Resource action: vm_nextcloud stop on ha-idg-2 >> Revised cluster status: >> vm_nextcloud (ocf::heartbeat:VirtualDomain): Stopped >> >> ha-idg-1:~/why-fenced/ha-idg-1/pengine # crm_simulate -S -x pe-input- >> 3118 -G transition-4516.xml -D transition-4516.dot >> Current cluster status: >> Node ha-idg-1 (1084777482): standby >> Online: [ ha-idg-2 ] >> vm_nextcloud (ocf::heartbeat:VirtualDomain): Stopped >> <============== vm_nextcloud is stopped >> Transition Summary: >> * Shutdown ha-idg-1 >> Executing cluster transition: >> * Resource action: vm_nextcloud stop on ha-idg-1 <==== why stop ? >> It is already stopped > > I'm not sure, I'd have to see the pe input. You find it here: https://hmgubox2.helmholtz-muenchen.de/index.php/s/WJGtodMZ9k7rN29 >> vm_nextcloud (ocf::heartbeat:VirtualDomain): Stopped <======= >> vm_nextcloud is stopped >> Transition Summary: >> * Fence (Off) ha-idg-1 'resource actions are unrunnable' >> Executing cluster transition: >> * Fencing ha-idg-1 (Off) >> * Pseudo action: vm_nextcloud_stop_0 <======= why stop ? It is >> already stopped ? >> Revised cluster status: >> Node ha-idg-1 (1084777482): OFFLINE (standby) >> Online: [ ha-idg-2 ] >> vm_nextcloud (ocf::heartbeat:VirtualDomain): Stopped >> >> I don't understand why the cluster tries to stop a resource which is >> already stopped. Bernd Helmholtz Zentrum M?nchen Helmholtz Zentrum Muenchen Deutsches Forschungszentrum fuer Gesundheit und Umwelt (GmbH) Ingolstaedter Landstr. 1 85764 Neuherberg www.helmholtz-muenchen.de Aufsichtsratsvorsitzende: MinDir.in Prof. Dr. Veronika von Messling Geschaeftsfuehrung: Prof. Dr. med. Dr. h.c. Matthias Tschoep, Kerstin Guenther Registergericht: Amtsgericht Muenchen HRB 6466 USt-IdNr: DE 129521671 From kgaillot at redhat.com Tue Aug 18 13:30:34 2020 From: kgaillot at redhat.com (Ken Gaillot) Date: Tue, 18 Aug 2020 12:30:34 -0500 Subject: [ClusterLabs] why is node fenced ? 
In-Reply-To: <453537998.54408252.1597762074665.JavaMail.zimbra@helmholtz-muenchen.de> References: <1894379294.27456141.1596036406000.JavaMail.zimbra@helmholtz-muenchen.de> <197754c0efa9654e23c0d3b25d7d1ac8a7e2f919.camel@redhat.com> <989597611.38099136.1597004244408.JavaMail.zimbra@helmholtz-muenchen.de> <1230152204.50050895.1597430263067.JavaMail.zimbra@helmholtz-muenchen.de> <84a98a74897f0b5523f44b23a415fcde5bff0de6.camel@redhat.com> <453537998.54408252.1597762074665.JavaMail.zimbra@helmholtz-muenchen.de> Message-ID: <104df2890058a7948e93251043ba4fb43ed28446.camel@redhat.com> On Tue, 2020-08-18 at 16:47 +0200, Lentes, Bernd wrote: > > ----- On Aug 17, 2020, at 5:09 PM, kgaillot kgaillot at redhat.com > wrote: > > > > > I checked all relevant pe-files in this time period. > > > This is what i found out (i just write the important entries): > > > > > Executing cluster transition: > > > * Resource action: vm_nextcloud stop on ha-idg-2 > > > Revised cluster status: > > > vm_nextcloud (ocf::heartbeat:VirtualDomain): Stopped > > > > > > ha-idg-1:~/why-fenced/ha-idg-1/pengine # crm_simulate -S -x pe- > > > input- > > > 3118 -G transition-4516.xml -D transition-4516.dot > > > Current cluster status: > > > Node ha-idg-1 (1084777482): standby > > > Online: [ ha-idg-2 ] > > > vm_nextcloud (ocf::heartbeat:VirtualDomain): Stopped > > > <============== vm_nextcloud is stopped > > > Transition Summary: > > > * Shutdown ha-idg-1 > > > Executing cluster transition: > > > * Resource action: vm_nextcloud stop on ha-idg-1 <==== why > > > stop ? > > > It is already stopped > > > > I'm not sure, I'd have to see the pe input. > > You find it here: > https://hmgubox2.helmholtz-muenchen.de/index.php/s/WJGtodMZ9k7rN29 This appears to be a scheduler bug. The scheduler considers a migration to be "dangling" if it has a record of a failed migrate_to on the source node, but no migrate_from on the target node (and no migrate_from or start on the source node, which would indicate a later full restart or reverse migration). In this case, any migrate_from on the target has since been superseded by a failed start and a successful stop, so there is no longer a record of it. Therefore the migration is considered dangling, which requires a full stop on the source node. However in this case we already have a successful stop on the source node after the failed migrate_to, and I believe that should be sufficient to consider it no longer dangling. > > > vm_nextcloud (ocf::heartbeat:VirtualDomain): Stopped <======= > > > vm_nextcloud is stopped > > > Transition Summary: > > > * Fence (Off) ha-idg-1 'resource actions are unrunnable' > > > Executing cluster transition: > > > * Fencing ha-idg-1 (Off) > > > * Pseudo action: vm_nextcloud_stop_0 <======= why stop ? It is > > > already stopped ? > > > Revised cluster status: > > > Node ha-idg-1 (1084777482): OFFLINE (standby) > > > Online: [ ha-idg-2 ] > > > vm_nextcloud (ocf::heartbeat:VirtualDomain): Stopped > > > > > > I don't understand why the cluster tries to stop a resource which > > > is > > > already stopped. > > Bernd > Helmholtz Zentrum M?nchen > > Helmholtz Zentrum Muenchen > Deutsches Forschungszentrum fuer Gesundheit und Umwelt (GmbH) > Ingolstaedter Landstr. 1 > 85764 Neuherberg > www.helmholtz-muenchen.de > Aufsichtsratsvorsitzende: MinDir.in Prof. Dr. Veronika von Messling > Geschaeftsfuehrung: Prof. Dr. med. Dr. h.c. 
Matthias Tschoep, Kerstin > Guenther > Registergericht: Amtsgericht Muenchen HRB 6466 > USt-IdNr: DE 129521671 -- Ken Gaillot From arvidjaar at gmail.com Tue Aug 18 15:07:44 2020 From: arvidjaar at gmail.com (Andrei Borzenkov) Date: Tue, 18 Aug 2020 22:07:44 +0300 Subject: [ClusterLabs] Antw: [EXT] Stonith failing In-Reply-To: References: <1330096936.11468.1595926619455@www> <5F1FF427020000A10003A47F@gwsmtp.uni-regensburg.de> <1599253492.11645.1595932012959@www> <0557F18C-6718-43B6-9CA0-8FE7AE2A8785@yahoo.com> <603366395.379.1596002482554@www> <68622439.459.1596025362188@www> <1118615335.1895.1597410544148@www> <73d6ecf113098a3154a2e7db2e2a59557272024a.camel@redhat.com> <20200817223919.6187c711@firost> <9fdd714d-ccb7-9cfa-8952-23d0e5336695@redhat.com> Message-ID: 18.08.2020 17:02, Ken Gaillot ?????: > On Tue, 2020-08-18 at 08:21 +0200, Klaus Wenninger wrote: >> On 8/18/20 7:49 AM, Andrei Borzenkov wrote: >>> 17.08.2020 23:39, Jehan-Guillaume de Rorthais ?????: >>>> On Mon, 17 Aug 2020 10:19:45 -0500 >>>> Ken Gaillot wrote: >>>> >>>>> On Fri, 2020-08-14 at 15:09 +0200, Gabriele Bulfon wrote: >>>>>> Thanks to all your suggestions, I now have the systems with >>>>>> stonith >>>>>> configured on ipmi. >>>>> >>>>> A word of caution: if the IPMI is on-board -- i.e. it shares >>>>> the same >>>>> power supply as the computer -- power becomes a single point of >>>>> failure. If the node loses power, the other node can't fence >>>>> because >>>>> the IPMI is also down, and the cluster can't recover. >>>>> >>>>> Some on-board IPMI controllers can share an Ethernet port with >>>>> the main >>>>> computer, which would be a similar situation. >>>>> >>>>> It's best to have a backup fencing method when using IPMI as >>>>> the >>>>> primary fencing method. An example would be an intelligent >>>>> power switch >>>>> or sbd. >>>> >>>> How SBD would be useful in this scenario? Poison pill will not be >>>> swallowed by >>>> the dead node... Is it just to wait for the watchdog timeout? >>>> >>> >>> Node is expected to commit suicide if SBD lost access to shared >>> block >>> device. So either node swallowed poison pill and died or node died >>> because it realized it was impossible to see poison pill or node >>> was >>> dead already. After watchdog timeout (twice watchdog timeout for >>> safety) >>> we assume node is dead. >> >> Yes, like this a suicide via watchdog will be triggered if there are >> issues with thedisk. This is why it is important to have a reliable >> watchdog with SBD even whenusing poison pill. As this alone would >> make a single shared disk a SPOF, runningwith pacemaker integration >> (default) a node with SBD will survive despite ofloosing the disk >> when it has quorum and pacemaker looks healthy. As corosync-quorum >> in 2-node-mode obviously won't be fit for this purpose SBD will >> switch >> to checking for presence of both nodes if 2-node-flag is set. >> >> Sorry for the lengthy explanation but the full picture is required >> to understand whyit is sufficiently reliable and useful if configured >> correctly. >> >> Klaus > > What I'm not sure about is how watchdog-only sbd would behave as a > fail-back method for a regular fence device. Will the cluster wait for > the sbd timeout no matter what, or only if the regular fencing fails, > or ...? > Diskless SBD implicitly creates fencing device ("watchdog"), timeout starts only when this device is selected for fencing. This device appears to be completely invisible to normal stonith_admin operation, I do not know how to query for it. 
In my testing explicit stonith resource was always called first and only if it failed was "watchdog" self fencing attempted. I tried to set negative priority for CIB stonith resource but it did not change anything. From hunter86_bg at yahoo.com Tue Aug 18 15:45:29 2020 From: hunter86_bg at yahoo.com (Strahil Nikolov) Date: Tue, 18 Aug 2020 22:45:29 +0300 Subject: [ClusterLabs] node utilization attributes are lost during upgrade In-Reply-To: <04b9e334c828e1e1a718643a75e60d6e40bd4eb0.camel@redhat.com> References: <08e1ec020aef4f25f2eadc8294135d2f611a2b76.camel@redhat.com> <04b9e334c828e1e1a718643a75e60d6e40bd4eb0.camel@redhat.com> Message-ID: Won't it be easier if: - set a node in standby - stop a node - remove the node - add again with the new hostname Best Regards, Strahil Nikolov ?? 18 ?????? 2020 ?. 17:15:49 GMT+03:00, Ken Gaillot ??????: >On Tue, 2020-08-18 at 14:35 +0200, Kadlecsik J?zsef wrote: >> Hi, >> >> On Mon, 17 Aug 2020, Ken Gaillot wrote: >> >> > On Mon, 2020-08-17 at 12:12 +0200, Kadlecsik J?zsef wrote: >> > > >> > > At upgrading a corosync/pacemaker/libvirt/KVM cluster from >> > > Debian >> > > stretch to buster, all the node utilization attributes were >> > > erased >> > > from the configuration. However, the same attributes were kept at >> > > the >> > > VirtualDomain resources. This resulted that all resources with >> > > utilization attributes were stopped. >> > >> > Ouch :( >> > >> > There are two types of node attributes, transient and permanent. >> > Transient attributes last only until pacemaker is next stopped on >> > the >> > node, while permanent attributes persist between reboots/restarts. >> > >> > If you configured the utilization attributes with crm_attribute >> > -z/ >> > --utilization, it will default to permanent, but it's possible to >> > override that with -l/--lifetime reboot (or equivalently, -t/ >> > --type >> > status). >> >> The attributes were defined by "crm configure edit", simply stating: >> >> node 1084762113: atlas0 \ >> utilization hv_memory=192 cpu=32 \ >> attributes standby=off >> ... >> node 1084762119: atlas6 \ >> utilization hv_memory=192 cpu=32 \ >> >> But I believe now that corosync caused the problem, because the nodes >> had >> been renumbered: > >Ah yes, that would do it. Pacemaker would consider them different nodes >with the same names. The "other" node's attributes would not apply to >the "new" node. > >The upgrade procedure would be similar except that you would start >corosync by itself after each upgrade. After all nodes were upgraded, >you would modify the CIB on one node (while pacemaker is not running) >with: > >CIB_file=/var/lib/pacemaker/cib/cib.xml cibadmin --modify --scope=nodes >-X '...' > >where '...' is a XML entry from the CIB with the "id" value >changed to the new ID, and repeat that for each node. Then, start >pacemaker on that node and wait for it to come up, then start pacemaker >on the other nodes. > >> >> node 3232245761: atlas0 >> ... >> node 3232245767: atlas6 >> >> The upgrade process was: >> >> for each node do >> set the "hold" mark on the corosync package >> put the node standby >> wait for the resources to be migrated off >> upgrade from stretch to buster >> reboot >> put the node online >> wait for the resources to be migrated (back) >> done >> >> Up to this point all resources were running fine. 
>> >> In order to upgrade corosync, we followed the next steps: >> >> enable maintenance mode >> stop pacemaker and corosync on all nodes >> for each node do >> delete the hold mark and upgrade corosync >> install new config file (nodeid not specified) >> restart corosync, start pacemaker >> done >> >> We could see that all resources were running unmanaged. When >> disabling the >> maintenance mode, then those were stopped. >> >> So I think corosync renumbered the nodes and I suspect the reason for >> that >> was that "clear_node_high_bit: yes" was not specified in the new >> config >> file. It means it was an admin error then. >> >> Best regards, >> Jozsef >> -- >> E-mail : kadlecsik.jozsef at wigner.hu >> PGP key: https://wigner.hu/~kadlec/pgp_public_key.txt >> Address: Wigner Research Centre for Physics >> H-1525 Budapest 114, POB. 49, Hungary >-- >Ken Gaillot > >_______________________________________________ >Manage your subscription: >https://lists.clusterlabs.org/mailman/listinfo/users > >ClusterLabs home: https://www.clusterlabs.org/ From kwenning at redhat.com Tue Aug 18 15:49:13 2020 From: kwenning at redhat.com (Klaus Wenninger) Date: Tue, 18 Aug 2020 21:49:13 +0200 Subject: [ClusterLabs] Antw: [EXT] Stonith failing In-Reply-To: References: <1330096936.11468.1595926619455@www> <5F1FF427020000A10003A47F@gwsmtp.uni-regensburg.de> <1599253492.11645.1595932012959@www> <0557F18C-6718-43B6-9CA0-8FE7AE2A8785@yahoo.com> <603366395.379.1596002482554@www> <68622439.459.1596025362188@www> <1118615335.1895.1597410544148@www> <73d6ecf113098a3154a2e7db2e2a59557272024a.camel@redhat.com> <20200817223919.6187c711@firost> <9fdd714d-ccb7-9cfa-8952-23d0e5336695@redhat.com> Message-ID: On 8/18/20 9:07 PM, Andrei Borzenkov wrote: > 18.08.2020 17:02, Ken Gaillot ?????: >> On Tue, 2020-08-18 at 08:21 +0200, Klaus Wenninger wrote: >>> On 8/18/20 7:49 AM, Andrei Borzenkov wrote: >>>> 17.08.2020 23:39, Jehan-Guillaume de Rorthais ?????: >>>>> On Mon, 17 Aug 2020 10:19:45 -0500 >>>>> Ken Gaillot wrote: >>>>> >>>>>> On Fri, 2020-08-14 at 15:09 +0200, Gabriele Bulfon wrote: >>>>>>> Thanks to all your suggestions, I now have the systems with >>>>>>> stonith >>>>>>> configured on ipmi. >>>>>> A word of caution: if the IPMI is on-board -- i.e. it shares >>>>>> the same >>>>>> power supply as the computer -- power becomes a single point of >>>>>> failure. If the node loses power, the other node can't fence >>>>>> because >>>>>> the IPMI is also down, and the cluster can't recover. >>>>>> >>>>>> Some on-board IPMI controllers can share an Ethernet port with >>>>>> the main >>>>>> computer, which would be a similar situation. >>>>>> >>>>>> It's best to have a backup fencing method when using IPMI as >>>>>> the >>>>>> primary fencing method. An example would be an intelligent >>>>>> power switch >>>>>> or sbd. >>>>> How SBD would be useful in this scenario? Poison pill will not be >>>>> swallowed by >>>>> the dead node... Is it just to wait for the watchdog timeout? >>>>> >>>> Node is expected to commit suicide if SBD lost access to shared >>>> block >>>> device. So either node swallowed poison pill and died or node died >>>> because it realized it was impossible to see poison pill or node >>>> was >>>> dead already. After watchdog timeout (twice watchdog timeout for >>>> safety) >>>> we assume node is dead. >>> Yes, like this a suicide via watchdog will be triggered if there are >>> issues with thedisk. This is why it is important to have a reliable >>> watchdog with SBD even whenusing poison pill. 
As this alone would >>> make a single shared disk a SPOF, runningwith pacemaker integration >>> (default) a node with SBD will survive despite ofloosing the disk >>> when it has quorum and pacemaker looks healthy. As corosync-quorum >>> in 2-node-mode obviously won't be fit for this purpose SBD will >>> switch >>> to checking for presence of both nodes if 2-node-flag is set. >>> >>> Sorry for the lengthy explanation but the full picture is required >>> to understand whyit is sufficiently reliable and useful if configured >>> correctly. >>> >>> Klaus >> What I'm not sure about is how watchdog-only sbd would behave as a >> fail-back method for a regular fence device. Will the cluster wait for >> the sbd timeout no matter what, or only if the regular fencing fails, >> or ...? >> > Diskless SBD implicitly creates fencing device ("watchdog"), timeout > starts only when this device is selected for fencing. This device > appears to be completely invisible to normal stonith_admin operation, I > do not know how to query for it. In my testing explicit stonith resource > was always called first and only if it failed was "watchdog" self > fencing attempted. I tried to set negative priority for CIB stonith > resource but it did not change anything. > This matches with what I remember from going through the code ... like with lowest prio but not at all if there is a topology defined ... which probably should be overhauled ... If interested there is a branch about having just certain nodes watchdog-fenced on my pacemaker-clone that makes the watchdog device visible. Klaus From kgaillot at redhat.com Tue Aug 18 16:02:07 2020 From: kgaillot at redhat.com (Ken Gaillot) Date: Tue, 18 Aug 2020 15:02:07 -0500 Subject: [ClusterLabs] node utilization attributes are lost during upgrade In-Reply-To: References: <08e1ec020aef4f25f2eadc8294135d2f611a2b76.camel@redhat.com> <04b9e334c828e1e1a718643a75e60d6e40bd4eb0.camel@redhat.com> Message-ID: On Tue, 2020-08-18 at 22:45 +0300, Strahil Nikolov wrote: > Won't it be easier if: > - set a node in standby > - stop a node > - remove the node > - add again with the new hostname The hostname stays the same, but corosync is changing the numeric node ID as part of the upgrade. If they remove the node, they'll lose its utilization attributes, which is what they want to keep. Looking at it again, I'm guessing there are no explicit node IDs in corosync.conf, and corosync is choosing the IDs. In that case the easiest approach would be to explicitly set the original node IDs in corosync.conf before the upgrade, so they don't change. > > Best Regards, > Strahil Nikolov > > ?? 18 ?????? 2020 ?. 17:15:49 GMT+03:00, Ken Gaillot < > kgaillot at redhat.com> ??????: > > On Tue, 2020-08-18 at 14:35 +0200, Kadlecsik J?zsef wrote: > > > Hi, > > > > > > On Mon, 17 Aug 2020, Ken Gaillot wrote: > > > > > > > On Mon, 2020-08-17 at 12:12 +0200, Kadlecsik J?zsef wrote: > > > > > > > > > > At upgrading a corosync/pacemaker/libvirt/KVM cluster from > > > > > Debian > > > > > stretch to buster, all the node utilization attributes were > > > > > erased > > > > > from the configuration. However, the same attributes were > > > > > kept at > > > > > the > > > > > VirtualDomain resources. This resulted that all resources > > > > > with > > > > > utilization attributes were stopped. > > > > > > > > Ouch :( > > > > > > > > There are two types of node attributes, transient and > > > > permanent. 
> > > > Transient attributes last only until pacemaker is next stopped > > > > on > > > > the > > > > node, while permanent attributes persist between > > > > reboots/restarts. > > > > > > > > If you configured the utilization attributes with crm_attribute > > > > -z/ > > > > --utilization, it will default to permanent, but it's possible > > > > to > > > > override that with -l/--lifetime reboot (or equivalently, -t/ > > > > --type > > > > status). > > > > > > The attributes were defined by "crm configure edit", simply > > > stating: > > > > > > node 1084762113: atlas0 \ > > > utilization hv_memory=192 cpu=32 \ > > > attributes standby=off > > > ... > > > node 1084762119: atlas6 \ > > > utilization hv_memory=192 cpu=32 \ > > > > > > But I believe now that corosync caused the problem, because the > > > nodes > > > had > > > been renumbered: > > > > Ah yes, that would do it. Pacemaker would consider them different > > nodes > > with the same names. The "other" node's attributes would not apply > > to > > the "new" node. > > > > The upgrade procedure would be similar except that you would start > > corosync by itself after each upgrade. After all nodes were > > upgraded, > > you would modify the CIB on one node (while pacemaker is not > > running) > > with: > > > > CIB_file=/var/lib/pacemaker/cib/cib.xml cibadmin --modify -- > > scope=nodes > > -X '...' > > > > where '...' is a XML entry from the CIB with the "id" value > > changed to the new ID, and repeat that for each node. Then, start > > pacemaker on that node and wait for it to come up, then start > > pacemaker > > on the other nodes. > > > > > > > > node 3232245761: atlas0 > > > ... > > > node 3232245767: atlas6 > > > > > > The upgrade process was: > > > > > > for each node do > > > set the "hold" mark on the corosync package > > > put the node standby > > > wait for the resources to be migrated off > > > upgrade from stretch to buster > > > reboot > > > put the node online > > > wait for the resources to be migrated (back) > > > done > > > > > > Up to this point all resources were running fine. > > > > > > In order to upgrade corosync, we followed the next steps: > > > > > > enable maintenance mode > > > stop pacemaker and corosync on all nodes > > > for each node do > > > delete the hold mark and upgrade corosync > > > install new config file (nodeid not specified) > > > restart corosync, start pacemaker > > > done > > > > > > We could see that all resources were running unmanaged. When > > > disabling the > > > maintenance mode, then those were stopped. > > > > > > So I think corosync renumbered the nodes and I suspect the reason > > > for > > > that > > > was that "clear_node_high_bit: yes" was not specified in the new > > > config > > > file. It means it was an admin error then. > > > > > > Best regards, > > > Jozsef > > > -- > > > E-mail : kadlecsik.jozsef at wigner.hu > > > PGP key: https://wigner.hu/~kadlec/pgp_public_key.txt > > > Address: Wigner Research Centre for Physics > > > H-1525 Budapest 114, POB. 
49, Hungary > > > > -- > > Ken Gaillot > > > > _______________________________________________ > > Manage your subscription: > > https://lists.clusterlabs.org/mailman/listinfo/users > > > > ClusterLabs home: https://www.clusterlabs.org/ > > -- Ken Gaillot From lists at alteeve.ca Wed Aug 19 01:10:08 2020 From: lists at alteeve.ca (Digimer) Date: Wed, 19 Aug 2020 01:10:08 -0400 Subject: [ClusterLabs] Beginner Question about VirtualDomain In-Reply-To: References: Message-ID: On 2020-08-17 8:40 a.m., Sameer Dhiman wrote: > Hi, > > I am a beginner using pacemaker and corosync. I am trying to set up > a?cluster of HA KVM guests as described by Alteeve wiki (CentOS-6) but > in CentOS-8.2. My R&D? setup is described below > > Physical Host running CentOS-8.2 with Nested Virtualization > 2 x CentOS-8.2 guest machines as Cluster Node 1 and 2. > WinXP as a HA guest. > > 1. drbd --> dlm --> lvmlockd --> LVM-activate --> gfs2 (guest machine > definitions) > 2. drbd --> dlm --> lvmlockd --> LVM-activate --> raw-lv (guest machine HDD) > > Question(s): > 1. How to prevent guest startup until gfs2 and raw-lv are available? In > CentOS-6 Alteeve used autostart=0 in the tag. Is there any similar > option in pacemaker because I did not find it in the documentation? > > 2. Suppose, If I configure constraint order gfs2 and raw-lv then guest > machine. Stopping the guest machine would also stop the complete service > tree so how can I prevent this? > > -- > Sameer Dhiman Hi Sameer, I'm the author of that wiki. It's quite out of date, as you noted, and we're actively developing a new release for EL8. Though, it would be ready until near the end if the year. There are a few changes we've made that you might want to consider; 1. We never were too happy with DLM, and so we've reworked things to no longer need it. So we use normal LVM backing DRBD resources. One resource per VM, on volume per virtual disk backed by an LV. Our tools will automate this, but you can easily enough manually create them if your environment is fairly stable. 2. To get around GFS2, we create a /mnt/shared/{provision,definitions,files,archive} directory (note /shared -> /mnt/shared to be more LFS friendly). We'll again automate management of files in Striker, but you can copy the files manually and rsync out changes as needed (again, if your environment doesn't change much). 3. We changed DRBD from v8.4 to 9.0, and this meant a few things had to change. We will integrate support for short-throw DR hosts (async "third node" in DRBD that is outside pacemaker). We run the resources to only allow a single primary normally and enable auto-promote. For live-migration, we temporarily enable live migration, promote the target, migrate, demote the old host and disable dual-primary. This makes it safer as it's far less likely that someone could accidentally start a VM on the passive node (not that it ever happened as our tools prevented it, but it was _possible_, so we wanted to improve that). That handle #3, we've written our own custom RA (ocf:alteeve:server [1]). This RA is smart enough to watch/wait for things to become ready before starting. It also handles the DRBD stuff I mentioned, and the virsh call to do the migration. So it means the pacemaker config is extremely simple. Note though it depends on the rest of our tools so it won't work outside the Anvil!. That said, if you wanted to use it before we release Anvil! M3, you could probably adapt it easily enough. If you have any questions, please let me know and I'll help as best I can. 
Cheers, digimer (Note: during development, this code base is kept outside of Clusterlabs. We'll move it in when it reaches beta). 1. https://github.com/digimer/anvil/blob/master/ocf/alteeve/server -- Digimer Papers and Projects: https://alteeve.com/w/ "I am, somehow, less interested in the weight and convolutions of Einstein?s brain than in the near certainty that people of equal talent have lived and died in cotton fields and sweatshops." - Stephen Jay Gould From arvidjaar at gmail.com Wed Aug 19 02:44:02 2020 From: arvidjaar at gmail.com (Andrei Borzenkov) Date: Wed, 19 Aug 2020 09:44:02 +0300 Subject: [ClusterLabs] Antw: [EXT] Stonith failing In-Reply-To: References: <1330096936.11468.1595926619455@www> <5F1FF427020000A10003A47F@gwsmtp.uni-regensburg.de> <1599253492.11645.1595932012959@www> <0557F18C-6718-43B6-9CA0-8FE7AE2A8785@yahoo.com> <603366395.379.1596002482554@www> <68622439.459.1596025362188@www> <1118615335.1895.1597410544148@www> <73d6ecf113098a3154a2e7db2e2a59557272024a.camel@redhat.com> <20200817223919.6187c711@firost> <9fdd714d-ccb7-9cfa-8952-23d0e5336695@redhat.com> Message-ID: 18.08.2020 22:49, Klaus Wenninger ?????: >>> What I'm not sure about is how watchdog-only sbd would behave as a >>> fail-back method for a regular fence device. Will the cluster wait for >>> the sbd timeout no matter what, or only if the regular fencing fails, >>> or ...? >>> >> Diskless SBD implicitly creates fencing device ("watchdog"), timeout >> starts only when this device is selected for fencing. This device >> appears to be completely invisible to normal stonith_admin operation, I >> do not know how to query for it. In my testing explicit stonith resource >> was always called first and only if it failed was "watchdog" self >> fencing attempted. I tried to set negative priority for CIB stonith >> resource but it did not change anything. >> > This matches with what I remember from going through the code ... > like with lowest prio but not at all if there is a topology defined ... > which probably should be overhauled ... > If interested there is a branch about having just certain nodes > watchdog-fenced on my pacemaker-clone that makes the watchdog > device visible. > Actually I was wrong. fenced indeed looks like it creates "watchdog" device, but it happens too early - at this time fenced still does not have value of watchdog timeout so it always skips this device. That explains why device is "invisible" - it does not exist :) Otherwise what I said applies - watchdog fencing is used as fallback if all other devices failed and timeout starts when pacemaker decided to use watchdog fencing. From kgaillot at redhat.com Wed Aug 19 10:04:11 2020 From: kgaillot at redhat.com (Ken Gaillot) Date: Wed, 19 Aug 2020 09:04:11 -0500 Subject: [ClusterLabs] why is node fenced ? 
In-Reply-To: <104df2890058a7948e93251043ba4fb43ed28446.camel@redhat.com> References: <1894379294.27456141.1596036406000.JavaMail.zimbra@helmholtz-muenchen.de> <197754c0efa9654e23c0d3b25d7d1ac8a7e2f919.camel@redhat.com> <989597611.38099136.1597004244408.JavaMail.zimbra@helmholtz-muenchen.de> <1230152204.50050895.1597430263067.JavaMail.zimbra@helmholtz-muenchen.de> <84a98a74897f0b5523f44b23a415fcde5bff0de6.camel@redhat.com> <453537998.54408252.1597762074665.JavaMail.zimbra@helmholtz-muenchen.de> <104df2890058a7948e93251043ba4fb43ed28446.camel@redhat.com> Message-ID: <61ea6f4310eac7382e19ba655dcb09cac7348b54.camel@redhat.com> On Tue, 2020-08-18 at 12:30 -0500, Ken Gaillot wrote: > On Tue, 2020-08-18 at 16:47 +0200, Lentes, Bernd wrote: > > > > ----- On Aug 17, 2020, at 5:09 PM, kgaillot kgaillot at redhat.com > > wrote: > > > > > > > > I checked all relevant pe-files in this time period. > > > > This is what i found out (i just write the important entries): > > > > > > > > Executing cluster transition: > > > > * Resource action: vm_nextcloud stop on ha-idg-2 > > > > Revised cluster status: > > > > vm_nextcloud (ocf::heartbeat:VirtualDomain): Stopped > > > > > > > > ha-idg-1:~/why-fenced/ha-idg-1/pengine # crm_simulate -S -x pe- > > > > input- > > > > 3118 -G transition-4516.xml -D transition-4516.dot > > > > Current cluster status: > > > > Node ha-idg-1 (1084777482): standby > > > > Online: [ ha-idg-2 ] > > > > vm_nextcloud (ocf::heartbeat:VirtualDomain): Stopped > > > > <============== vm_nextcloud is stopped > > > > Transition Summary: > > > > * Shutdown ha-idg-1 > > > > Executing cluster transition: > > > > * Resource action: vm_nextcloud stop on ha-idg-1 <==== why > > > > stop ? > > > > It is already stopped > > > > > > I'm not sure, I'd have to see the pe input. > > > > You find it here: > > https://hmgubox2.helmholtz-muenchen.de/index.php/s/WJGtodMZ9k7rN29 > > This appears to be a scheduler bug. Fix is in master branch and will land in 2.0.5 expected at end of the year https://github.com/ClusterLabs/pacemaker/pull/2146 > The scheduler considers a migration to be "dangling" if it has a > record > of a failed migrate_to on the source node, but no migrate_from on the > target node (and no migrate_from or start on the source node, which > would indicate a later full restart or reverse migration). > > In this case, any migrate_from on the target has since been > superseded > by a failed start and a successful stop, so there is no longer a > record > of it. Therefore the migration is considered dangling, which requires > a > full stop on the source node. > > However in this case we already have a successful stop on the source > node after the failed migrate_to, and I believe that should be > sufficient to consider it no longer dangling. > > > > > vm_nextcloud (ocf::heartbeat:VirtualDomain): Stopped > > > > <======= > > > > vm_nextcloud is stopped > > > > Transition Summary: > > > > * Fence (Off) ha-idg-1 'resource actions are unrunnable' > > > > Executing cluster transition: > > > > * Fencing ha-idg-1 (Off) > > > > * Pseudo action: vm_nextcloud_stop_0 <======= why stop ? It > > > > is > > > > already stopped ? > > > > Revised cluster status: > > > > Node ha-idg-1 (1084777482): OFFLINE (standby) > > > > Online: [ ha-idg-2 ] > > > > vm_nextcloud (ocf::heartbeat:VirtualDomain): Stopped > > > > > > > > I don't understand why the cluster tries to stop a resource > > > > which > > > > is > > > > already stopped. 
> > > > Bernd > > Helmholtz Zentrum M?nchen > > > > Helmholtz Zentrum Muenchen > > Deutsches Forschungszentrum fuer Gesundheit und Umwelt (GmbH) > > Ingolstaedter Landstr. 1 > > 85764 Neuherberg > > www.helmholtz-muenchen.de > > Aufsichtsratsvorsitzende: MinDir.in Prof. Dr. Veronika von Messling > > Geschaeftsfuehrung: Prof. Dr. med. Dr. h.c. Matthias Tschoep, > > Kerstin > > Guenther > > Registergericht: Amtsgericht Muenchen HRB 6466 > > USt-IdNr: DE 129521671 -- Ken Gaillot From bernd.lentes at helmholtz-muenchen.de Wed Aug 19 10:22:23 2020 From: bernd.lentes at helmholtz-muenchen.de (Lentes, Bernd) Date: Wed, 19 Aug 2020 16:22:23 +0200 (CEST) Subject: [ClusterLabs] why is node fenced ? In-Reply-To: <104df2890058a7948e93251043ba4fb43ed28446.camel@redhat.com> References: <1894379294.27456141.1596036406000.JavaMail.zimbra@helmholtz-muenchen.de> <197754c0efa9654e23c0d3b25d7d1ac8a7e2f919.camel@redhat.com> <989597611.38099136.1597004244408.JavaMail.zimbra@helmholtz-muenchen.de> <1230152204.50050895.1597430263067.JavaMail.zimbra@helmholtz-muenchen.de> <84a98a74897f0b5523f44b23a415fcde5bff0de6.camel@redhat.com> <453537998.54408252.1597762074665.JavaMail.zimbra@helmholtz-muenchen.de> <104df2890058a7948e93251043ba4fb43ed28446.camel@redhat.com> Message-ID: <268993835.55008340.1597846943328.JavaMail.zimbra@helmholtz-muenchen.de> ----- On Aug 18, 2020, at 7:30 PM, kgaillot kgaillot at redhat.com wrote: >> > I'm not sure, I'd have to see the pe input. >> >> You find it here: >> https://hmgubox2.helmholtz-muenchen.de/index.php/s/WJGtodMZ9k7rN29 > > This appears to be a scheduler bug. > > The scheduler considers a migration to be "dangling" if it has a record > of a failed migrate_to on the source node, but no migrate_from on the > target node (and no migrate_from or start on the source node, which > would indicate a later full restart or reverse migration). > > In this case, any migrate_from on the target has since been superseded > by a failed start and a successful stop, so there is no longer a record > of it. Therefore the migration is considered dangling, which requires a > full stop on the source node. > > However in this case we already have a successful stop on the source > node after the failed migrate_to, and I believe that should be > sufficient to consider it no longer dangling. > Thanks for your explananation Ken. For me a Fence i don't understand is the worst that can happen to a HA cluster. Bernd Helmholtz Zentrum M?nchen Helmholtz Zentrum Muenchen Deutsches Forschungszentrum fuer Gesundheit und Umwelt (GmbH) Ingolstaedter Landstr. 1 85764 Neuherberg www.helmholtz-muenchen.de Aufsichtsratsvorsitzende: MinDir.in Prof. Dr. Veronika von Messling Geschaeftsfuehrung: Prof. Dr. med. Dr. h.c. Matthias Tschoep, Kerstin Guenther Registergericht: Amtsgericht Muenchen HRB 6466 USt-IdNr: DE 129521671 From bernd.lentes at helmholtz-muenchen.de Wed Aug 19 10:29:32 2020 From: bernd.lentes at helmholtz-muenchen.de (Lentes, Bernd) Date: Wed, 19 Aug 2020 16:29:32 +0200 (CEST) Subject: [ClusterLabs] why is node fenced ? 
In-Reply-To: <61ea6f4310eac7382e19ba655dcb09cac7348b54.camel@redhat.com> References: <1894379294.27456141.1596036406000.JavaMail.zimbra@helmholtz-muenchen.de> <197754c0efa9654e23c0d3b25d7d1ac8a7e2f919.camel@redhat.com> <989597611.38099136.1597004244408.JavaMail.zimbra@helmholtz-muenchen.de> <1230152204.50050895.1597430263067.JavaMail.zimbra@helmholtz-muenchen.de> <84a98a74897f0b5523f44b23a415fcde5bff0de6.camel@redhat.com> <453537998.54408252.1597762074665.JavaMail.zimbra@helmholtz-muenchen.de> <104df2890058a7948e93251043ba4fb43ed28446.camel@redhat.com> <61ea6f4310eac7382e19ba655dcb09cac7348b54.camel@redhat.com> Message-ID: <2116742364.55012219.1597847372185.JavaMail.zimbra@helmholtz-muenchen.de> ----- On Aug 19, 2020, at 4:04 PM, kgaillot kgaillot at redhat.com wrote: >> This appears to be a scheduler bug. > > Fix is in master branch and will land in 2.0.5 expected at end of the > year > > https://github.com/ClusterLabs/pacemaker/pull/2146 A principal question: I have SLES 12 and i'm using the pacemaker version provided with the distribution. If this fix is backported depends on Suse. If i install und update pacemaker manually (not the version provided by Suse), i loose my support from them, but have always the most recent code and fixes. If i stay with the version from Suse i have support from them, but maybe not all fixes and not the most recent code. What is your approach ? Recommendations ? Thanks. Bernd Helmholtz Zentrum M?nchen Helmholtz Zentrum Muenchen Deutsches Forschungszentrum fuer Gesundheit und Umwelt (GmbH) Ingolstaedter Landstr. 1 85764 Neuherberg www.helmholtz-muenchen.de Aufsichtsratsvorsitzende: MinDir.in Prof. Dr. Veronika von Messling Geschaeftsfuehrung: Prof. Dr. med. Dr. h.c. Matthias Tschoep, Kerstin Guenther Registergericht: Amtsgericht Muenchen HRB 6466 USt-IdNr: DE 129521671 From hunter86_bg at yahoo.com Wed Aug 19 13:41:01 2020 From: hunter86_bg at yahoo.com (Strahil Nikolov) Date: Wed, 19 Aug 2020 20:41:01 +0300 Subject: [ClusterLabs] why is node fenced ? In-Reply-To: <2116742364.55012219.1597847372185.JavaMail.zimbra@helmholtz-muenchen.de> References: <1894379294.27456141.1596036406000.JavaMail.zimbra@helmholtz-muenchen.de> <197754c0efa9654e23c0d3b25d7d1ac8a7e2f919.camel@redhat.com> <989597611.38099136.1597004244408.JavaMail.zimbra@helmholtz-muenchen.de> <1230152204.50050895.1597430263067.JavaMail.zimbra@helmholtz-muenchen.de> <84a98a74897f0b5523f44b23a415fcde5bff0de6.camel@redhat.com> <453537998.54408252.1597762074665.JavaMail.zimbra@helmholtz-muenchen.de> <104df2890058a7948e93251043ba4fb43ed28446.camel@redhat.com> <61ea6f4310eac7382e19ba655dcb09cac7348b54.camel@redhat.com> <2116742364.55012219.1597847372185.JavaMail.zimbra@helmholtz-muenchen.de> Message-ID: <488C2994-6696-4B26-B414-D581747B8BA0@yahoo.com> Hi Bernd, As SLES 12 is in a such a support phase, I guess SUSE will provide fixes only for SLES 15. It will be best if you open them a case and ask them about that. Best Regards, Strahil Nikolov ?? 19 ?????? 2020 ?. 17:29:32 GMT+03:00, "Lentes, Bernd" ??????: > >----- On Aug 19, 2020, at 4:04 PM, kgaillot kgaillot at redhat.com wrote: >>> This appears to be a scheduler bug. >> >> Fix is in master branch and will land in 2.0.5 expected at end of the >> year >> >> https://github.com/ClusterLabs/pacemaker/pull/2146 > >A principal question: >I have SLES 12 and i'm using the pacemaker version provided with the >distribution. >If this fix is backported depends on Suse. 
> >If i install und update pacemaker manually (not the version provided by >Suse), >i loose my support from them, but have always the most recent code and >fixes. > >If i stay with the version from Suse i have support from them, but >maybe not all fixes and not the most recent code. > >What is your approach ? >Recommendations ? > >Thanks. > >Bernd >Helmholtz Zentrum M?nchen > >Helmholtz Zentrum Muenchen >Deutsches Forschungszentrum fuer Gesundheit und Umwelt (GmbH) >Ingolstaedter Landstr. 1 >85764 Neuherberg >www.helmholtz-muenchen.de >Aufsichtsratsvorsitzende: MinDir.in Prof. Dr. Veronika von Messling >Geschaeftsfuehrung: Prof. Dr. med. Dr. h.c. Matthias Tschoep, Kerstin >Guenther >Registergericht: Amtsgericht Muenchen HRB 6466 >USt-IdNr: DE 129521671 > > >_______________________________________________ >Manage your subscription: >https://lists.clusterlabs.org/mailman/listinfo/users > >ClusterLabs home: https://www.clusterlabs.org/ From kgaillot at redhat.com Wed Aug 19 13:46:34 2020 From: kgaillot at redhat.com (Ken Gaillot) Date: Wed, 19 Aug 2020 12:46:34 -0500 Subject: [ClusterLabs] why is node fenced ? In-Reply-To: <2116742364.55012219.1597847372185.JavaMail.zimbra@helmholtz-muenchen.de> References: <1894379294.27456141.1596036406000.JavaMail.zimbra@helmholtz-muenchen.de> <197754c0efa9654e23c0d3b25d7d1ac8a7e2f919.camel@redhat.com> <989597611.38099136.1597004244408.JavaMail.zimbra@helmholtz-muenchen.de> <1230152204.50050895.1597430263067.JavaMail.zimbra@helmholtz-muenchen.de> <84a98a74897f0b5523f44b23a415fcde5bff0de6.camel@redhat.com> <453537998.54408252.1597762074665.JavaMail.zimbra@helmholtz-muenchen.de> <104df2890058a7948e93251043ba4fb43ed28446.camel@redhat.com> <61ea6f4310eac7382e19ba655dcb09cac7348b54.camel@redhat.com> <2116742364.55012219.1597847372185.JavaMail.zimbra@helmholtz-muenchen.de> Message-ID: <9c91355be4a33a5e505e7e37eda2784c065289f3.camel@redhat.com> On Wed, 2020-08-19 at 16:29 +0200, Lentes, Bernd wrote: > ----- On Aug 19, 2020, at 4:04 PM, kgaillot kgaillot at redhat.com > wrote: > > > This appears to be a scheduler bug. > > > > Fix is in master branch and will land in 2.0.5 expected at end of > > the > > year > > > > https://github.com/ClusterLabs/pacemaker/pull/2146 > > A principal question: > I have SLES 12 and i'm using the pacemaker version provided with the > distribution. > If this fix is backported depends on Suse. > > If i install und update pacemaker manually (not the version provided > by Suse), > i loose my support from them, but have always the most recent code > and fixes. > > If i stay with the version from Suse i have support from them, but > maybe not all fixes and not the most recent code. > > What is your approach ? > Recommendations ? I'd recommend sticking with the supported version, and filing a bug report with the distro asking for a specific fix to be backported when you have the need. Regarding the upstream project, running a release is fine, but I wouldn't recommend running the master branch in production. Important details of new features can change before release, and with frequent development it's more likely to have regressions. > > Thanks. > > Bernd > Helmholtz Zentrum M?nchen > > Helmholtz Zentrum Muenchen > Deutsches Forschungszentrum fuer Gesundheit und Umwelt (GmbH) > Ingolstaedter Landstr. 1 > 85764 Neuherberg > www.helmholtz-muenchen.de > Aufsichtsratsvorsitzende: MinDir.in Prof. Dr. Veronika von Messling > Geschaeftsfuehrung: Prof. Dr. med. Dr. h.c. 
Matthias Tschoep, Kerstin > Guenther > Registergericht: Amtsgericht Muenchen HRB 6466 > USt-IdNr: DE 129521671 -- Ken Gaillot From Ulrich.Windl at rz.uni-regensburg.de Thu Aug 20 01:56:32 2020 From: Ulrich.Windl at rz.uni-regensburg.de (Ulrich Windl) Date: Thu, 20 Aug 2020 07:56:32 +0200 Subject: [ClusterLabs] anothe rcompletion bug in crm shell (2.1.2+git132.gbc9fde0 of SLES11) Message-ID: <5F3E1090020000A10003AB41@gwsmtp.uni-regensburg.de> Hi! After having booted a cluster node yesterday, I checked the logs today (as the resources were expected to be rebalanced in the evening, controlled by rules). When trying to check the utilization interactively using crm shell, I realized that completion is wrong: When typing "utilization show " in "crm(live)node#", I see a list of cluster resources, but not the list of utilization attributes. (Here it's easy as all utilization attributes start with "utl_", while primitives and closes start with "prm_" or "cln_", respectively. This is for SLES11 SP4 without LTSS (Long-term support). However this bug is fixed in crmsh-4.1.0+git.1585823743.3acb5567 of SLES12 SP5 already. Regards, Ulrich From Ulrich.Windl at rz.uni-regensburg.de Thu Aug 20 02:10:56 2020 From: Ulrich.Windl at rz.uni-regensburg.de (Ulrich Windl) Date: Thu, 20 Aug 2020 08:10:56 +0200 Subject: [ClusterLabs] Antw: [EXT] anothe rcompletion bug in crm shell (2.1.2+git132.gbc9fde0 of SLES11) In-Reply-To: <5F3E1090020000A10003AB41@gwsmtp.uni-regensburg.de> References: <5F3E1090020000A10003AB41@gwsmtp.uni-regensburg.de> Message-ID: <5F3E13F0020000A10003AB45@gwsmtp.uni-regensburg.de> Hi again! While talking about it, I can imagine two useful extensions to "crm node utilization": 1) Allow leaving out the attribute name, showing all utilization atributes (much like "crm configure show" shows all resources) 2) Allow "*" as valid host name (to fulfill the syntax requirement of having an argument at the position, expanding to all hosts crm node utilization node|* show [attribute] So "crm node utilization node1 show " would show all utilization attributes for node1. So "crm node utilization * show attr" would show all utilization attribute"attr" for all nodes. So "crm node utilization *" would show all utilization attributes for all nodes. Maybe repositioning the arguments would make it even easier: Instead of "crm utilization " using "crm utilization [ []]" would make the syntax even more straight-forward: "crm node utilization show" would show all utilization attributes for all nodes, "crm node utilization show node" would show all utilization attributes for "node", and "crm node utilization show node attr" would show attribute "attr" for node "node". The probably less common case would be "crm node utilization show * attr" to show all attribute "attr" for all nodes... Maybe "[]" could even be "[...]" (a list of attributes to show) Maybe even more clever syntax exists... ;-) Regards, Ulrich >>> "Ulrich Windl" schrieb am 20.08.2020 um 07:56 in Nachricht <5F3E1090020000A10003AB41 at gwsmtp.uni-regensburg.de>: > Hi! > > After having booted a cluster node yesterday, I checked the logs today (as > the resources were expected to be rebalanced in the evening, controlled by > rules). > When trying to check the utilization interactively using crm shell, I > realized that completion is wrong: > > When typing "utilization show " in "crm(live)node#", I see a list of > cluster resources, but not the list of utilization attributes. 
(Here it's > easy as all utilization attributes start with "utl_", while primitives and > closes start with "prm_" or "cln_", respectively. > > This is for SLES11 SP4 without LTSS (Long?term support). However this bug is > fixed in crmsh?4.1.0+git.1585823743.3acb5567 of SLES12 SP5 already. > > Regards, > Ulrich > > > > _______________________________________________ > Manage your subscription: > https://lists.clusterlabs.org/mailman/listinfo/users > > ClusterLabs home: https://www.clusterlabs.org/ From Ulrich.Windl at rz.uni-regensburg.de Thu Aug 20 02:24:02 2020 From: Ulrich.Windl at rz.uni-regensburg.de (Ulrich Windl) Date: Thu, 20 Aug 2020 08:24:02 +0200 Subject: [ClusterLabs] Antw: Antw: [EXT] anothe rcompletion bug in crm shell (2.1.2+git132.gbc9fde0 of SLES11) In-Reply-To: <5F3E13F0020000A10003AB45@gwsmtp.uni-regensburg.de> References: <5F3E1090020000A10003AB41@gwsmtp.uni-regensburg.de> <5F3E13F0020000A10003AB45@gwsmtp.uni-regensburg.de> Message-ID: <5F3E1702020000A10003AB49@gwsmtp.uni-regensburg.de> Hi another time! I apologize for not having used crm shell a lot recently. After having sent the last message, I realized that the node attributes displayed by the command are the static ("original") values, not the dynamic (current) ones. For example "ptest -LU" would show: Utilization information: Original: n01 capacity: utl_cpu=200 utl_ram=1240 Original: n01 capacity: utl_cpu=200 utl_ram=1240 Original: n03 capacity: utl_cpu=200 utl_ram=1240 ... Remaining: n01 capacity: utl_cpu=140 utl_ram=748 Remaining: n02 capacity: utl_cpu=100 utl_ram=728 Remaining: n03 capacity: utl_cpu=110 utl_ram=584 Still after reading the man poage of crm shell, I could not identify the command to output the current values of the utilization atributes. Browsing the output of "cibadmin -Q" I realized that the "current" utilization values don't seem to be part of the CIB, not even the status section... 8-( Regards, Ulrich >>> "Ulrich Windl" schrieb am 20.08.2020 um 08:10 in Nachricht <5F3E13F0020000A10003AB45 at gwsmtp.uni-regensburg.de>: > Hi again! > > While talking about it, I can imagine two useful extensions to "crm node > utilization": > > 1) Allow leaving out the attribute name, showing all utilization atributes > (much like "crm configure show" shows all resources) > 2) Allow "*" as valid host name (to fulfill the syntax requirement of having > an argument at the position, expanding to all hosts > > crm node utilization node|* show [attribute] > > So "crm node utilization node1 show " would show all utilization attributes > for node1. > So "crm node utilization * show attr" would show all utilization > attribute"attr" for all nodes. > So "crm node utilization *" would show all utilization attributes for all > nodes. > > Maybe repositioning the arguments would make it even easier: > Instead of "crm utilization " using "crm > utilization [ []]" would make the syntax even more > straight-forward: > "crm node utilization show" would show all utilization attributes for all > nodes, > "crm node utilization show node" would show all utilization attributes for > "node", and > "crm node utilization show node attr" would show attribute "attr" for node > "node". > The probably less common case would be "crm node utilization show * attr" to > show all attribute "attr" for all nodes... > > Maybe "[]" could even be "[...]" (a list of attributes > to show) > > Maybe even more clever syntax exists... 
;-) > > Regards, > Ulrich > >>>> "Ulrich Windl" schrieb am 20.08.2020 > um > 07:56 in Nachricht <5F3E1090020000A10003AB41 at gwsmtp.uni-regensburg.de>: >> Hi! >> >> After having booted a cluster node yesterday, I checked the logs today (as >> the resources were expected to be rebalanced in the evening, controlled by >> rules). >> When trying to check the utilization interactively using crm shell, I >> realized that completion is wrong: >> >> When typing "utilization show " in "crm(live)node#", I see a list of >> cluster resources, but not the list of utilization attributes. (Here it's >> easy as all utilization attributes start with "utl_", while primitives and >> closes start with "prm_" or "cln_", respectively. >> >> This is for SLES11 SP4 without LTSS (Long?term support). However this bug is > >> fixed in crmsh?4.1.0+git.1585823743.3acb5567 of SLES12 SP5 already. >> >> Regards, >> Ulrich >> >> >> >> _______________________________________________ >> Manage your subscription: >> https://lists.clusterlabs.org/mailman/listinfo/users >> >> ClusterLabs home: https://www.clusterlabs.org/ > > > > _______________________________________________ > Manage your subscription: > https://lists.clusterlabs.org/mailman/listinfo/users > > ClusterLabs home: https://www.clusterlabs.org/ From Ulrich.Windl at rz.uni-regensburg.de Thu Aug 20 02:54:40 2020 From: Ulrich.Windl at rz.uni-regensburg.de (Ulrich Windl) Date: Thu, 20 Aug 2020 08:54:40 +0200 Subject: [ClusterLabs] Antw: [EXT] Re: why is node fenced ? In-Reply-To: <2116742364.55012219.1597847372185.JavaMail.zimbra@helmholtz-muenchen.de> References: <1894379294.27456141.1596036406000.JavaMail.zimbra@helmholtz-muenchen.de> <197754c0efa9654e23c0d3b25d7d1ac8a7e2f919.camel@redhat.com> <989597611.38099136.1597004244408.JavaMail.zimbra@helmholtz-muenchen.de> <1230152204.50050895.1597430263067.JavaMail.zimbra@helmholtz-muenchen.de> <84a98a74897f0b5523f44b23a415fcde5bff0de6.camel@redhat.com> <453537998.54408252.1597762074665.JavaMail.zimbra@helmholtz-muenchen.de> <104df2890058a7948e93251043ba4fb43ed28446.camel@redhat.com> <61ea6f4310eac7382e19ba655dcb09cac7348b54.camel@redhat.com> <2116742364.55012219.1597847372185.JavaMail.zimbra@helmholtz-muenchen.de> Message-ID: <5F3E1E30020000A10003AB59@gwsmtp.uni-regensburg.de> >>> "Lentes, Bernd" schrieb am 19.08.2020 um 16:29 in Nachricht <2116742364.55012219.1597847372185.JavaMail.zimbra at helmholtz-muenchen.de>: > ----- On Aug 19, 2020, at 4:04 PM, kgaillot kgaillot at redhat.com wrote: >>> This appears to be a scheduler bug. >> >> Fix is in master branch and will land in 2.0.5 expected at end of the >> year >> >> https://github.com/ClusterLabs/pacemaker/pull/2146 > > A principal question: > I have SLES 12 and i'm using the pacemaker version provided with the > distribution. > If this fix is backported depends on Suse. > > If i install und update pacemaker manually (not the version provided by > Suse), > i loose my support from them, but have always the most recent code and > fixes. > > If i stay with the version from Suse i have support from them, but maybe not > all fixes and not the most recent code. > > What is your approach ? > Recommendations ? Convince SUSE that it is a bug ;-) > > Thanks. > > Bernd > Helmholtz Zentrum M?nchen > > Helmholtz Zentrum Muenchen > Deutsches Forschungszentrum fuer Gesundheit und Umwelt (GmbH) > Ingolstaedter Landstr. 1 > 85764 Neuherberg > www.helmholtz-muenchen.de > Aufsichtsratsvorsitzende: MinDir.in Prof. Dr. Veronika von Messling > Geschaeftsfuehrung: Prof. Dr. med. Dr. h.c. 
Matthias Tschoep, Kerstin > Guenther > Registergericht: Amtsgericht Muenchen HRB 6466 > USt-IdNr: DE 129521671 > > > _______________________________________________ > Manage your subscription: > https://lists.clusterlabs.org/mailman/listinfo/users > > ClusterLabs home: https://www.clusterlabs.org/ From lists at alteeve.ca Thu Aug 20 06:40:16 2020 From: lists at alteeve.ca (Digimer) Date: Thu, 20 Aug 2020 06:40:16 -0400 Subject: [ClusterLabs] Format of '--lifetime' in 'pcs resource move' Message-ID: <743e6d36-c0b8-5352-5ca9-738628e38239@alteeve.ca> Hi all, Reading the pcs man page for the 'move' action, it talks about '--lifetime' switch that appears to control when the location constraint is removed; ==== move [destination node] [--master] [life? time=] [--wait[=n]] Move the resource off the node it is currently running on by creating a -INFINITY location constraint to ban the node. If destination node is specified the resource will be moved to that node by creating an INFINITY loca? tion constraint to prefer the destination node. If --master is used the scope of the command is limited to the master role and you must use the promotable clone id (instead of the resource id). If lifetime is specified then the constraint will expire after that time, other? wise it defaults to infinity and the constraint can be cleared manually with 'pcs resource clear' or 'pcs con? straint delete'. If --wait is specified, pcs will wait up to 'n' seconds for the resource to move and then return 0 on success or 1 on error. If 'n' is not speci? fied it defaults to 60 minutes. If you want the resource to preferably avoid running on some nodes but be able to failover to them use 'pcs constraint location avoids'. ==== I think I want to use this, as we move resources manually for various reasons where the old host is still able to host the resource should a node failure occur. So we'd love to immediately remove the location constraint as soon as the move completes. I tries using '--lifetime=60' as a test, assuming the format was 'seconds', but that was invalid. How is this switch meant to be used? Cheers -- Digimer Papers and Projects: https://alteeve.com/w/ "I am, somehow, less interested in the weight and convolutions of Einstein?s brain than in the near certainty that people of equal talent have lived and died in cotton fields and sweatshops." - Stephen Jay Gould From hunter86_bg at yahoo.com Thu Aug 20 12:25:45 2020 From: hunter86_bg at yahoo.com (Strahil Nikolov) Date: Thu, 20 Aug 2020 19:25:45 +0300 Subject: [ClusterLabs] Format of '--lifetime' in 'pcs resource move' In-Reply-To: <743e6d36-c0b8-5352-5ca9-738628e38239@alteeve.ca> References: <743e6d36-c0b8-5352-5ca9-738628e38239@alteeve.ca> Message-ID: <329B5D02-2BCB-4A2C-BC2B-CA3030E6A3B7@yahoo.com> Have you tried ISO 8601 format. For example: 'PT20M' The ISo format is described at: https://manpages.debian.org/testing/crmsh/crm.8.en.html Best Regards, Strahil Nikolov ?? 20 ?????? 2020 ?. 13:40:16 GMT+03:00, Digimer ??????: >Hi all, > > Reading the pcs man page for the 'move' action, it talks about >'--lifetime' switch that appears to control when the location >constraint >is removed; > >==== > move [destination node] [--master] [life? > time=] [--wait[=n]] > Move the resource off the node it is currently running > on by creating a -INFINITY location constraint to ban > the node. If destination node is specified the resource > will be moved to that node by creating an INFINITY loca? > tion constraint to prefer the destination node. 
If > --master is used the scope of the command is limited to > the master role and you must use the promotable clone id > (instead of the resource id). If lifetime is specified > then the constraint will expire after that time, other? > wise it defaults to infinity and the constraint can be > cleared manually with 'pcs resource clear' or 'pcs con? > straint delete'. If --wait is specified, pcs will wait > up to 'n' seconds for the resource to move and then > return 0 on success or 1 on error. If 'n' is not speci? > fied it defaults to 60 minutes. If you want the resource > to preferably avoid running on some nodes but be able to > failover to them use 'pcs constraint location avoids'. >==== > >I think I want to use this, as we move resources manually for various >reasons where the old host is still able to host the resource should a >node failure occur. So we'd love to immediately remove the location >constraint as soon as the move completes. > >I tries using '--lifetime=60' as a test, assuming the format was >'seconds', but that was invalid. How is this switch meant to be used? > >Cheers > >-- >Digimer >Papers and Projects: https://alteeve.com/w/ >"I am, somehow, less interested in the weight and convolutions of >Einstein?s brain than in the near certainty that people of equal talent >have lived and died in cotton fields and sweatshops." - Stephen Jay >Gould >_______________________________________________ >Manage your subscription: >https://lists.clusterlabs.org/mailman/listinfo/users > >ClusterLabs home: https://www.clusterlabs.org/ From Ulrich.Windl at rz.uni-regensburg.de Fri Aug 21 01:56:53 2020 From: Ulrich.Windl at rz.uni-regensburg.de (Ulrich Windl) Date: Fri, 21 Aug 2020 07:56:53 +0200 Subject: [ClusterLabs] Antw: [EXT] Re: Format of '--lifetime' in 'pcs resource move' In-Reply-To: <329B5D02-2BCB-4A2C-BC2B-CA3030E6A3B7@yahoo.com> References: <743e6d36-c0b8-5352-5ca9-738628e38239@alteeve.ca> <329B5D02-2BCB-4A2C-BC2B-CA3030E6A3B7@yahoo.com> Message-ID: <5F3F6225020000A10003ABA8@gwsmtp.uni-regensburg.de> >>> Strahil Nikolov schrieb am 20.08.2020 um 18:25 in Nachricht <329B5D02-2BCB-4A2C-BC2B-CA3030E6A3B7 at yahoo.com>: > Have you tried ISO 8601 format. > For example: 'PT20M' And watch out not to mix Minutes wth Months ;-) > > The ISo format is described at: > https://manpages.debian.org/testing/crmsh/crm.8.en.html > > Best Regards, > Strahil Nikolov > > ?? 20 ?????? 2020 ?. 13:40:16 GMT+03:00, Digimer ??????: >>Hi all, >> >> Reading the pcs man page for the 'move' action, it talks about >>'--lifetime' switch that appears to control when the location >>constraint >>is removed; >> >>==== >> move [destination node] [--master] [life? >> time=] [--wait[=n]] >> Move the resource off the node it is currently running >> on by creating a -INFINITY location constraint to ban >> the node. If destination node is specified the resource >> will be moved to that node by creating an INFINITY loca? >> tion constraint to prefer the destination node. If >> --master is used the scope of the command is limited to >> the master role and you must use the promotable clone id >> (instead of the resource id). If lifetime is specified >> then the constraint will expire after that time, other? >> wise it defaults to infinity and the constraint can be >> cleared manually with 'pcs resource clear' or 'pcs con? >> straint delete'. If --wait is specified, pcs will wait >> up to 'n' seconds for the resource to move and then >> return 0 on success or 1 on error. If 'n' is not speci? 
>> fied it defaults to 60 minutes. If you want the resource >> to preferably avoid running on some nodes but be able to >> failover to them use 'pcs constraint location avoids'. >>==== >> >>I think I want to use this, as we move resources manually for various >>reasons where the old host is still able to host the resource should a >>node failure occur. So we'd love to immediately remove the location >>constraint as soon as the move completes. >> >>I tries using '--lifetime=60' as a test, assuming the format was >>'seconds', but that was invalid. How is this switch meant to be used? >> >>Cheers >> >>-- >>Digimer >>Papers and Projects: https://alteeve.com/w/ >>"I am, somehow, less interested in the weight and convolutions of >>Einstein?s brain than in the near certainty that people of equal talent >>have lived and died in cotton fields and sweatshops." - Stephen Jay >>Gould >>_______________________________________________ >>Manage your subscription: >>https://lists.clusterlabs.org/mailman/listinfo/users >> >>ClusterLabs home: https://www.clusterlabs.org/ > _______________________________________________ > Manage your subscription: > https://lists.clusterlabs.org/mailman/listinfo/users > > ClusterLabs home: https://www.clusterlabs.org/ From mark.battistella at outlook.com Fri Aug 21 03:04:37 2020 From: mark.battistella at outlook.com (Mark Battistella) Date: Fri, 21 Aug 2020 07:04:37 +0000 Subject: [ClusterLabs] Active-Active cluster CentOS 8 Message-ID: Hi, I was wondering if I could get some help when following along with the Clusters from Scratch part 9: https://clusterlabs.org/pacemaker/doc/en-US/Pacemaker/2.0/html-single/Clusters_from_Scratch/index.html#_install_cluster_filesystem_software The first step is to install dlm which cant be found - but dlm-lib can. Is there any update to the installation or alternative? I'd love to be able to have an active-active filesystem. I've looked through the repo to see if there were any answers but I cant find anything Thanks, Mark -------------- next part -------------- An HTML attachment was scrubbed... URL: From kgaillot at redhat.com Fri Aug 21 14:16:11 2020 From: kgaillot at redhat.com (Ken Gaillot) Date: Fri, 21 Aug 2020 13:16:11 -0500 Subject: [ClusterLabs] Coming in Pacemaker 2.0.5: better start-up/shutdown coordination with sbd Message-ID: Hi all, Looking ahead to the Pacemaker 2.0.5 release expected toward the end of this year, we will have improvements of interest to anyone running clusters with sbd. Previously at start-up, if sbd was blocked from contacting Pacemaker's CIB in a way that looked like pacemaker wasn't running (SELinux being a good example), pacemaker would run resources without protection from sbd. Now, if sbd is running, pacemaker will wait until sbd contacts it before it will start any resources, so the cluster is protected in this situation. Additionally, sbd will now periodically contact the main pacemaker daemon for a status report. Currently, this is just an immediate response, but it ensures that the main pacemaker daemon is responsive to IPC requests. This is a bit more assurance that pacemaker is not only running, but functioning properly. In future versions, we will have even more in-depth health checks as part of this feature. Previously at shutdown, sbd determined a clean pacemaker shutdown by checking whether any resources were running at shutdown. This would lead to sbd fencing if pacemaker shut down in maintenance mode with resources active. 
Now, sbd will determine clean shutdowns as part of the status report described above, avoiding that situation. These behaviors will be controlled by a new option in /etc/sysconfig/sbd or /etc/default/sbd, SBD_SYNC_RESOURCE_STARTUP. This defaults to "no" for backward compatibility when a newer sbd is used with an older pacemaker or vice versa. Distributions may change the value to "yes" since they can ensure both sbd and pacemaker versions support it; users who build their own installations can set it themselves if both versions support it. -- Ken Gaillot From bubble at hoster-ok.com Fri Aug 21 14:55:16 2020 From: bubble at hoster-ok.com (Vladislav Bogdanov) Date: Fri, 21 Aug 2020 21:55:16 +0300 Subject: [ClusterLabs] Coming in Pacemaker 2.0.5: better start-up/shutdown coordination with sbd In-Reply-To: References: Message-ID: Hi, btw, is sbd is now able to handle cib diffs internally? Last time I tried to use it with frequently changing CIB, it became a CPU hog - it requested full CIB copy on every change. Fri, 21/08/2020 ? 13:16 -0500, Ken Gaillot wrote: > Hi all, > > Looking ahead to the Pacemaker 2.0.5 release expected toward the end of > this year, we will have improvements of interest to anyone running > clusters with sbd. > > Previously at start-up, if sbd was blocked from contacting Pacemaker's > CIB in a way that looked like pacemaker wasn't running (SELinux being a > good example), pacemaker would run resources without protection from > sbd. Now, if sbd is running, pacemaker will wait until sbd contacts it > before it will start any resources, so the cluster is protected in this > situation. > > Additionally, sbd will now periodically contact the main pacemaker > daemon for a status report. Currently, this is just an immediate > response, but it ensures that the main pacemaker daemon is responsive > to IPC requests. This is a bit more assurance that pacemaker is not > only running, but functioning properly. In future versions, we will > have even more in-depth health checks as part of this feature. > > Previously at shutdown, sbd determined a clean pacemaker shutdown by > checking whether any resources were running at shutdown. This would > lead to sbd fencing if pacemaker shut down in maintenance mode with > resources active. Now, sbd will determine clean shutdowns as part of > the status report described above, avoiding that situation. > > These behaviors will be controlled by a new option in > /etc/sysconfig/sbd or /etc/default/sbd, SBD_SYNC_RESOURCE_STARTUP. This > defaults to "no" for backward compatibility when a newer sbd is used > with an older pacemaker or vice versa. Distributions may change the > value to "yes" since they can ensure both sbd and pacemaker versions > support it; users who build their own installations can set it > themselves if both versions support it. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From kgaillot at redhat.com Fri Aug 21 16:35:45 2020 From: kgaillot at redhat.com (Ken Gaillot) Date: Fri, 21 Aug 2020 15:35:45 -0500 Subject: [ClusterLabs] Active-Active cluster CentOS 8 In-Reply-To: References: Message-ID: <33db3777fa23969b31a93b505a5aba46416f4a7c.camel@redhat.com> On Fri, 2020-08-21 at 07:04 +0000, Mark Battistella wrote: > Hi, > > I was wondering if I could get some help when following along with > the Clusters from Scratch part 9: > https://clusterlabs.org/pacemaker/doc/en-US/Pacemaker/2.0/html-single/Clusters_from_Scratch/index.html#_install_cluster_filesystem_software > > The first step is to install dlm which cant be found - but dlm-lib > can. > > Is there any update to the installation or alternative? I'd love to > be able to have an active-active filesystem. > > I've looked through the repo to see if there were any answers but I > cant find anything > > Thanks, > Mark Looks like it's a known CentOS packaging issue: https://bugs.centos.org/view.php?id=16939 -- Ken Gaillot From arvidjaar at gmail.com Sat Aug 22 05:02:00 2020 From: arvidjaar at gmail.com (Andrei Borzenkov) Date: Sat, 22 Aug 2020 12:02:00 +0300 Subject: [ClusterLabs] Coming in Pacemaker 2.0.5: better start-up/shutdown coordination with sbd In-Reply-To: References: Message-ID: 21.08.2020 21:16, Ken Gaillot ?????: > > Previously at shutdown, sbd determined a clean pacemaker shutdown by > checking whether any resources were running at shutdown. This would > lead to sbd fencing if pacemaker shut down in maintenance mode with > resources active. What conditions lead to it? I tried to reproduce it, but so far I was able to set node in maintenance mode and stop pacemaker on it without any ill effects. This is openSUSE Tumebleweed with sbd at commit 4b617a1b8f16033ac464cf2d562f0646082c40fa. From kwenning at redhat.com Sun Aug 23 04:45:25 2020 From: kwenning at redhat.com (Klaus Wenninger) Date: Sun, 23 Aug 2020 10:45:25 +0200 Subject: [ClusterLabs] Coming in Pacemaker 2.0.5: better start-up/shutdown coordination with sbd In-Reply-To: References: Message-ID: <8ad9b08b-abb4-c18b-8965-45fc4be3f2a6@redhat.com> On 8/22/20 11:02 AM, Andrei Borzenkov wrote: > 21.08.2020 21:16, Ken Gaillot ?????: >> Previously at shutdown, sbd determined a clean pacemaker shutdown by >> checking whether any resources were running at shutdown. This would >> lead to sbd fencing if pacemaker shut down in maintenance mode with >> resources active. > > What conditions lead to it? I tried to reproduce it, but so far I was > able to set node in maintenance mode and stop pacemaker on it without > any ill effects. This is openSUSE Tumebleweed with sbd at commit > 4b617a1b8f16033ac464cf2d562f0646082c40fa. You shouldn't be able to reproduce this as there was an individual fix for this condition: commit 824fe834c67fb7bae7feb87607381f9fa8fa2945 Author: Klaus Wenninger Date:?? Fri Jun 7 19:09:06 2019 +0200 ??? Fix: sbd-pacemaker: assume graceful exit if leftovers are unmanged Point is that sbd now doesn't need to make complicated assumptions anymore if a pacemaker-shutdown was graceful or not as we've introduced a channel via which pacemaker is explicitly telling sbd that it has successfully reached a clean shutdown state. 
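For reference, that explicit handshake is only used when it is enabled on the sbd side; a minimal sketch of the sysconfig switch Ken mentioned in his announcement (the other two lines are just typical example settings, not taken from any particular setup):

# /etc/sysconfig/sbd (or /etc/default/sbd, depending on the distribution)
SBD_DEVICE="/dev/disk/by-id/example-shared-disk"   # example only; omit for watchdog-only setups
SBD_WATCHDOG_DEV="/dev/watchdog"                   # example only
SBD_SYNC_RESOURCE_STARTUP="yes"                    # defaults to "no" for compatibility with older versions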
Klaus > _______________________________________________ > Manage your subscription: > https://lists.clusterlabs.org/mailman/listinfo/users > > ClusterLabs home: https://www.clusterlabs.org/ From kwenning at redhat.com Sun Aug 23 06:08:22 2020 From: kwenning at redhat.com (Klaus Wenninger) Date: Sun, 23 Aug 2020 12:08:22 +0200 Subject: [ClusterLabs] Coming in Pacemaker 2.0.5: better start-up/shutdown coordination with sbd In-Reply-To: References: Message-ID: On 8/21/20 8:55 PM, Vladislav Bogdanov wrote: > Hi, > > btw, is sbd is now able to handle cib diffs internally? > Last time I tried to use it with frequently changing CIB, it became a > CPU hog - it requested full CIB copy on every change. Actually sbd should have been able to handle cib-diffs since ever. Are you sure it requested a full copy of the CIB with every change? Atm it should request a full update roughly twice every watchdog-timeout and in between just noop-pings to the cib-api - as long as imbedding the diffs goes OK of course. In general we need full cib-updates as otherwise loss of a cib-diff would mean possibly missing node-state updates. What it on top does is convert the cib to a cluster-state roughly every second or with every 10th cib-diff. The latter might impose some cpu-usage when cib is updating at a high rate of course and might not be really needed. With the new pacemakerd-API we don't need the cib-diffs anymore for graceful-shutdown-detection. Thus easiest might be to disable diff-handling completely when pacemakerd-API is used. > > > Fri, 21/08/2020 ? 13:16 -0500, Ken Gaillot wrote: >> Hi all, >> Looking ahead to the Pacemaker 2.0.5 release expected toward the end of >> this year, we will have improvements of interest to anyone running >> clusters with sbd. >> Previously at start-up, if sbd was blocked from contacting Pacemaker's >> CIB in a way that looked like pacemaker wasn't running (SELinux being a >> good example), pacemaker would run resources without protection from >> sbd. Now, if sbd is running, pacemaker will wait until sbd contacts it >> before it will start any resources, so the cluster is protected in this >> situation. >> Additionally, sbd will now periodically contact the main pacemaker >> daemon for a status report. Currently, this is just an immediate >> response, but it ensures that the main pacemaker daemon is responsive >> to IPC requests. This is a bit more assurance that pacemaker is not >> only running, but functioning properly. In future versions, we will >> have even more in-depth health checks as part of this feature. >> Previously at shutdown, sbd determined a clean pacemaker shutdown by >> checking whether any resources were running at shutdown. This would >> lead to sbd fencing if pacemaker shut down in maintenance mode with >> resources active. Now, sbd will determine clean shutdowns as part of >> the status report described above, avoiding that situation. >> These behaviors will be controlled by a new option in >> /etc/sysconfig/sbd or /etc/default/sbd, SBD_SYNC_RESOURCE_STARTUP. This >> defaults to "no" for backward compatibility when a newer sbd is used >> with an older pacemaker or vice versa. Distributions may change the >> value to "yes" since they can ensure both sbd and pacemaker versions >> support it; users who build their own installations can set it >> themselves if both versions support it. 
> > > _______________________________________________ > Manage your subscription: > https://lists.clusterlabs.org/mailman/listinfo/users > > ClusterLabs home: https://www.clusterlabs.org/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From hunter86_bg at yahoo.com Sun Aug 23 06:09:25 2020 From: hunter86_bg at yahoo.com (Strahil Nikolov) Date: Sun, 23 Aug 2020 10:09:25 +0000 (UTC) Subject: [ClusterLabs] Active-Active cluster CentOS 8 In-Reply-To: References: Message-ID: <1562701513.4604011.1598177365088@mail.yahoo.com> There is a topic about that at?https://bugs.centos.org/view.php?id=16939 Based on the comments you can obtain it from?https://koji.mbox.centos.org/koji/buildinfo?buildID=4801 , but I haven' tested it. Best Regards, Strahil Nikolov ? ?????, 21 ?????? 2020 ?., 18:30:31 ???????+3, Mark Battistella ??????: ?? Hi, I was wondering?if I could get some help when following along with the Clusters from Scratch part 9:?https://clusterlabs.org/pacemaker/doc/en-US/Pacemaker/2.0/html-single/Clusters_from_Scratch/index.html#_install_cluster_filesystem_software The first step is to install dlm which cant be found - but dlm-lib can. Is there any update to the installation or alternative? I'd love to be able to have an active-active filesystem. I've looked through the repo to see if there were any answers but I cant find anything Thanks, Mark _______________________________________________ Manage your subscription: https://lists.clusterlabs.org/mailman/listinfo/users ClusterLabs home: https://www.clusterlabs.org/ From bubble at hoster-ok.com Sun Aug 23 06:39:43 2020 From: bubble at hoster-ok.com (Vladislav Bogdanov) Date: Sun, 23 Aug 2020 13:39:43 +0300 Subject: [ClusterLabs] Coming in Pacemaker 2.0.5: better start-up/shutdown coordination with sbd In-Reply-To: References: Message-ID: <1741ae70998.27ed.79e33407a6dbe711362be892215840ad@hoster-ok.com> Good to know it is now not needed. You are correct about logic, yes, I just forgot details. I just recall that sbd added too much load during cluster start and recoveries. Thank you! On August 23, 2020 1:23:37 PM Klaus Wenninger wrote: > On 8/21/20 8:55 PM, Vladislav Bogdanov wrote: >> Hi, >> >> btw, is sbd is now able to handle cib diffs internally? >> Last time I tried to use it with frequently changing CIB, it became a CPU >> hog - it requested full CIB copy on every change. > Actually sbd should have been able to handle cib-diffs since ever. > Are you sure it requested a full copy of the CIB with every change? > Atm it should request a full update roughly twice every watchdog-timeout > and in between just noop-pings to the cib-api - as long as imbedding the > diffs goes OK of course. > In general we need full cib-updates as otherwise loss of a cib-diff > would mean possibly missing node-state updates. > What it on top does is convert the cib to a cluster-state roughly > every second or with every 10th cib-diff. The latter might impose > some cpu-usage when cib is updating at a high rate of course and > might not be really needed. > With the new pacemakerd-API we don't need the cib-diffs anymore > for graceful-shutdown-detection. Thus easiest might be to disable > diff-handling completely when pacemakerd-API is used. >> >> >> Fri, 21/08/2020 ? 13:16 -0500, Ken Gaillot wrote: >>> Hi all, >>> >>> Looking ahead to the Pacemaker 2.0.5 release expected toward the end of >>> this year, we will have improvements of interest to anyone running >>> clusters with sbd. 
>>> >>> Previously at start-up, if sbd was blocked from contacting Pacemaker's >>> CIB in a way that looked like pacemaker wasn't running (SELinux being a >>> good example), pacemaker would run resources without protection from >>> sbd. Now, if sbd is running, pacemaker will wait until sbd contacts it >>> before it will start any resources, so the cluster is protected in this >>> situation. >>> >>> Additionally, sbd will now periodically contact the main pacemaker >>> daemon for a status report. Currently, this is just an immediate >>> response, but it ensures that the main pacemaker daemon is responsive >>> to IPC requests. This is a bit more assurance that pacemaker is not >>> only running, but functioning properly. In future versions, we will >>> have even more in-depth health checks as part of this feature. >>> >>> Previously at shutdown, sbd determined a clean pacemaker shutdown by >>> checking whether any resources were running at shutdown. This would >>> lead to sbd fencing if pacemaker shut down in maintenance mode with >>> resources active. Now, sbd will determine clean shutdowns as part of >>> the status report described above, avoiding that situation. >>> >>> These behaviors will be controlled by a new option in >>> /etc/sysconfig/sbd or /etc/default/sbd, SBD_SYNC_RESOURCE_STARTUP. This >>> defaults to "no" for backward compatibility when a newer sbd is used >>> with an older pacemaker or vice versa. Distributions may change the >>> value to "yes" since they can ensure both sbd and pacemaker versions >>> support it; users who build their own installations can set it >>> themselves if both versions support it. >> >> >> _______________________________________________ >> Manage your subscription: >> https://lists.clusterlabs.org/mailman/listinfo/users ClusterLabs home: >> https://www.clusterlabs.org/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From Ulrich.Windl at rz.uni-regensburg.de Mon Aug 24 01:56:58 2020 From: Ulrich.Windl at rz.uni-regensburg.de (Ulrich Windl) Date: Mon, 24 Aug 2020 07:56:58 +0200 Subject: [ClusterLabs] Antw: [EXT] Active-Active cluster CentOS 8 In-Reply-To: References: Message-ID: <5F4356AA020000A10003AC39@gwsmtp.uni-regensburg.de> >>> Mark Battistella schrieb am 21.08.2020 um 09:04 in Nachricht : > Hi, > > I was wondering if I could get some help when following along with the > Clusters from Scratch part 9: > https://clusterlabs.org/pacemaker/doc/en?US/Pacemaker/2.0/html?single/Cluster > s_from_Scratch/index.html#_install_cluster_filesystem_software > > The first step is to install dlm which cant be found ? but dlm?lib can. > > Is there any update to the installation or alternative? I'd love to be able > to have an active?active filesystem. In SLES DLM seems to be a part of the kernel, with only dlm_controld.pcmk being known in user-land, which in turn is part of package libdlm. > > I've looked through the repo to see if there were any answers but I cant > find anything > > Thanks, > Mark From Ulrich.Windl at rz.uni-regensburg.de Mon Aug 24 02:04:26 2020 From: Ulrich.Windl at rz.uni-regensburg.de (Ulrich Windl) Date: Mon, 24 Aug 2020 08:04:26 +0200 Subject: [ClusterLabs] Antw: [EXT] Re: Coming in Pacemaker 2.0.5: better start-up/shutdown coordination with sbd In-Reply-To: References: Message-ID: <5F43586A020000A10003AC3D@gwsmtp.uni-regensburg.de> >>> Vladislav Bogdanov schrieb am 21.08.2020 um 20:55 in Nachricht : > Hi, > > btw, is sbd is now able to handle cib diffs internally? 
> Last time I tried to use it with frequently changing CIB, it became a > CPU hog - it requested full CIB copy on every change. Hi! I also wonder whether sbd is a tool to fence hosts, or a node-quorum maker that controls the cluster. I think the cluster should control sbd, not the other way 'round. Regards, Ulrich > > > Fri, 21/08/2020 ? 13:16 -0500, Ken Gaillot wrote: >> Hi all, >> >> Looking ahead to the Pacemaker 2.0.5 release expected toward the end of >> this year, we will have improvements of interest to anyone running >> clusters with sbd. >> >> Previously at start-up, if sbd was blocked from contacting Pacemaker's >> CIB in a way that looked like pacemaker wasn't running (SELinux being a >> good example), pacemaker would run resources without protection from >> sbd. Now, if sbd is running, pacemaker will wait until sbd contacts it >> before it will start any resources, so the cluster is protected in this >> situation. >> >> Additionally, sbd will now periodically contact the main pacemaker >> daemon for a status report. Currently, this is just an immediate >> response, but it ensures that the main pacemaker daemon is responsive >> to IPC requests. This is a bit more assurance that pacemaker is not >> only running, but functioning properly. In future versions, we will >> have even more in-depth health checks as part of this feature. >> >> Previously at shutdown, sbd determined a clean pacemaker shutdown by >> checking whether any resources were running at shutdown. This would >> lead to sbd fencing if pacemaker shut down in maintenance mode with >> resources active. Now, sbd will determine clean shutdowns as part of >> the status report described above, avoiding that situation. >> >> These behaviors will be controlled by a new option in >> /etc/sysconfig/sbd or /etc/default/sbd, SBD_SYNC_RESOURCE_STARTUP. This >> defaults to "no" for backward compatibility when a newer sbd is used >> with an older pacemaker or vice versa. Distributions may change the >> value to "yes" since they can ensure both sbd and pacemaker versions >> support it; users who build their own installations can set it >> themselves if both versions support it. From kwenning at redhat.com Mon Aug 24 03:52:19 2020 From: kwenning at redhat.com (Klaus Wenninger) Date: Mon, 24 Aug 2020 09:52:19 +0200 Subject: [ClusterLabs] Antw: [EXT] Re: Coming in Pacemaker 2.0.5: better start-up/shutdown coordination with sbd In-Reply-To: <5F43586A020000A10003AC3D@gwsmtp.uni-regensburg.de> References: <5F43586A020000A10003AC3D@gwsmtp.uni-regensburg.de> Message-ID: <20c772b6-8573-f1db-4a1d-50f0a0fb8c2a@redhat.com> On 8/24/20 8:04 AM, Ulrich Windl wrote: >>>> Vladislav Bogdanov schrieb am 21.08.2020 um 20:55 > in > Nachricht : >> Hi, >> >> btw, is sbd is now able to handle cib diffs internally? >> Last time I tried to use it with frequently changing CIB, it became a >> CPU hog - it requested full CIB copy on every change. > Hi! > > I also wonder whether sbd is a tool to fence hosts, or a node-quorum maker > that controls the cluster. > I think the cluster should control sbd, not the other way 'round. None of those I suppose - at least not solely ;-) sbd definitely has no influence on quorum - so no quorum-maker. In its purest way of operation with 3 shared disks and no cluster-awarenesssbd probably is close to what you envision. Unfortunately 3 disks are quite some extra effort and you may want to go with a single disk that doesn't become a spof. For certain scenariospure watchdog-fencing may become interesting as well. 
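To make the two ends of that spectrum a bit more concrete, a rough sketch only - device paths, resource names and the choice of the fence_sbd agent are illustrative assumptions, not taken from any setup discussed here:

# disk-based sbd: initialize the shared device(s) and point a fence resource at them
sbd -d /dev/disk/by-id/example-shared-disk create
# /etc/sysconfig/sbd would then carry SBD_DEVICE="/dev/disk/by-id/example-shared-disk"
pcs stonith create sbd-fencing fence_sbd devices=/dev/disk/by-id/example-shared-disk

# watchdog-only ("diskless") sbd: no SBD_DEVICE at all; a misbehaving node
# self-fences via the hardware watchdog, and the cluster only needs to know
# how long to wait before it may assume that has happened:
pcs property set stonith-watchdog-timeout=10s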
For all of these cases sbd needs to observe the local node health and if it
is part of a quorate cluster-partition (sees the peer in the 2-node case).
And as we are not living in a perfect world we can of course not rely on
this supervision never to be stuck or something. Fortunately sbd is simple
enough that the main supervision is done in a simple loop that can be easily
supervised by a hardware watchdog.
Of course we could have done the hardware watchdog supervision in pacemaker.
But sbd & corosync need such a thing as well so that would have meant to
pull that all into pacemaker - losing that simplicity mentioned.
(For completeness: There meanwhile is a heartbeat between corosync and sbd
as well to compensate for the hardware-watchdog-interface corosync would
offer alternatively but which would hog the hardware watchdog.)
We could have used one of the mechanisms that provide multiple watchdogs
supervised by a hardware watchdog (e.g. systemd - that per design makes it
hard to name a time when you actually can assume a node to be rebooted when
misbehaving) but there is actually nothing you find on every Linux platform.
So you see that the architecture is kind of natural and makes sense.
Introduction of the pacemakerd-API and it being used in sbd definitely goes
in the direction of moving intelligence out of sbd.
Not saying everything is perfect and no improvement possible - of course ;-)

Klaus
>
> Regards,
> Ulrich
>
>>
>> Fri, 21/08/2020 at 13:16 -0500, Ken Gaillot wrote:
>>> Hi all,
>>>
>>> Looking ahead to the Pacemaker 2.0.5 release expected toward the end of
>>> this year, we will have improvements of interest to anyone running
>>> clusters with sbd.
>>>
>>> Previously at start-up, if sbd was blocked from contacting Pacemaker's
>>> CIB in a way that looked like pacemaker wasn't running (SELinux being a
>>> good example), pacemaker would run resources without protection from
>>> sbd. Now, if sbd is running, pacemaker will wait until sbd contacts it
>>> before it will start any resources, so the cluster is protected in this
>>> situation.
>>>
>>> Additionally, sbd will now periodically contact the main pacemaker
>>> daemon for a status report. Currently, this is just an immediate
>>> response, but it ensures that the main pacemaker daemon is responsive
>>> to IPC requests. This is a bit more assurance that pacemaker is not
>>> only running, but functioning properly. In future versions, we will
>>> have even more in-depth health checks as part of this feature.
>>>
>>> Previously at shutdown, sbd determined a clean pacemaker shutdown by
>>> checking whether any resources were running at shutdown. This would
>>> lead to sbd fencing if pacemaker shut down in maintenance mode with
>>> resources active. Now, sbd will determine clean shutdowns as part of
>>> the status report described above, avoiding that situation.
>>>
>>> These behaviors will be controlled by a new option in
>>> /etc/sysconfig/sbd or /etc/default/sbd, SBD_SYNC_RESOURCE_STARTUP. This
>>> defaults to "no" for backward compatibility when a newer sbd is used
>>> with an older pacemaker or vice versa. Distributions may change the
>>> value to "yes" since they can ensure both sbd and pacemaker versions
>>> support it; users who build their own installations can set it
>>> themselves if both versions support it.
> > > _______________________________________________ > Manage your subscription: > https://lists.clusterlabs.org/mailman/listinfo/users > > ClusterLabs home: https://www.clusterlabs.org/ From rohitsaini111.forum at gmail.com Tue Aug 25 02:58:53 2020 From: rohitsaini111.forum at gmail.com (Rohit Saini) Date: Tue, 25 Aug 2020 12:28:53 +0530 Subject: [ClusterLabs] Behavior of corosync kill Message-ID: Hi All, I am seeing the following behavior. Can someone clarify if this is intended behavior. If yes, then why so? Please let me know if logs are needed for better clarity. 1. Without Stonith: Continuous corosync kill on master causes switchover and makes another node as master. But as soon as this corosync recovers, it becomes master again. Shouldn't it become slave now? 2. With Stonith: Sometimes, on corosync kill, that node gets shooted by stonith but sometimes not. Not able to understand this fluctuating behavior. Does it have to do anything with faster recovery of corosync, which stonith fails to detect? I am using corosync-2.4.5-4.el7.x86_64 pacemaker-1.1.19-8.el7.x86_64 centos 7.6.1810 Thanks, Rohit -------------- next part -------------- An HTML attachment was scrubbed... URL: From arvidjaar at gmail.com Tue Aug 25 04:28:49 2020 From: arvidjaar at gmail.com (Andrei Borzenkov) Date: Tue, 25 Aug 2020 11:28:49 +0300 Subject: [ClusterLabs] Behavior of corosync kill In-Reply-To: References: Message-ID: On Tue, Aug 25, 2020 at 10:00 AM Rohit Saini wrote: > > Hi All, > I am seeing the following behavior. Can someone clarify if this is intended behavior. If yes, then why so? Please let me know if logs are needed for better clarity. > > 1. Without Stonith: > Continuous corosync kill on master causes switchover and makes another node as master. But as soon as this corosync recovers, it becomes master again. Shouldn't it become slave now? It is rather unclear what you are asking. Nodes cannot be master or slave. Do you mean specific master/slave resource in pacemaker configuration? > > 2. With Stonith: > Sometimes, on corosync kill, that node gets shooted by stonith but sometimes not. Not able to understand this fluctuating behavior. Does it have to do anything with faster recovery of corosync, which stonith fails to detect? This could be, but logs in both cases may give more hints. > > I am using > corosync-2.4.5-4.el7.x86_64 > pacemaker-1.1.19-8.el7.x86_64 > centos 7.6.1810 > > Thanks, > Rohit > _______________________________________________ > Manage your subscription: > https://lists.clusterlabs.org/mailman/listinfo/users > > ClusterLabs home: https://www.clusterlabs.org/ From tojeline at redhat.com Tue Aug 25 04:50:14 2020 From: tojeline at redhat.com (Tomas Jelinek) Date: Tue, 25 Aug 2020 10:50:14 +0200 Subject: [ClusterLabs] Antw: [EXT] Re: Format of '--lifetime' in 'pcs resource move' In-Reply-To: <5F3F6225020000A10003ABA8@gwsmtp.uni-regensburg.de> References: <743e6d36-c0b8-5352-5ca9-738628e38239@alteeve.ca> <329B5D02-2BCB-4A2C-BC2B-CA3030E6A3B7@yahoo.com> <5F3F6225020000A10003ABA8@gwsmtp.uni-regensburg.de> Message-ID: <8bd5fada-f11f-50e4-94dc-4d6c8e429674@redhat.com> Hi all, The lifetime value is indeed expected to be ISO 8601 duration. I updated pcs documentation to clarify that: https://github.com/ClusterLabs/pcs/commit/1e9650a8fd5b8a0a22911ddca1010de582684971 Please note constraints are not removed from CIB when their lifetime expires. They are rendered ineffective but still preserved in CIB. 
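In practice that looks roughly like this (resource and node names are invented for the example):

# move with a constraint that expires after ten minutes (ISO 8601 duration)
pcs resource move my_resource node2 lifetime=PT10M

# after expiry the rule no longer matches, but the constraint is still in the CIB:
pcs constraint --full

# to actually remove it:
pcs resource clear my_resource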
See the following bugzilla for more details: https://bugzilla.redhat.com/show_bug.cgi?id=1442116 Regards, Tomas Dne 21. 08. 20 v 7:56 Ulrich Windl napsal(a): >>>> Strahil Nikolov schrieb am 20.08.2020 um 18:25 in > Nachricht <329B5D02-2BCB-4A2C-BC2B-CA3030E6A3B7 at yahoo.com>: >> Have you tried ISO 8601 format. >> For example: 'PT20M' > > And watch out not to mix Minutes wth Months ;-) > >> >> The ISo format is described at: >> https://manpages.debian.org/testing/crmsh/crm.8.en.html >> >> Best Regards, >> Strahil Nikolov >> >> ?? 20 ?????? 2020 ?. 13:40:16 GMT+03:00, Digimer ??????: >>> Hi all, >>> >>> Reading the pcs man page for the 'move' action, it talks about >>> '--lifetime' switch that appears to control when the location >>> constraint >>> is removed; >>> >>> ==== >>> move [destination node] [--master] [life? >>> time=] [--wait[=n]] >>> Move the resource off the node it is currently running >>> on by creating a -INFINITY location constraint to ban >>> the node. If destination node is specified the resource >>> will be moved to that node by creating an INFINITY loca? >>> tion constraint to prefer the destination node. If >>> --master is used the scope of the command is limited to >>> the master role and you must use the promotable clone id >>> (instead of the resource id). If lifetime is specified >>> then the constraint will expire after that time, other? >>> wise it defaults to infinity and the constraint can be >>> cleared manually with 'pcs resource clear' or 'pcs con? >>> straint delete'. If --wait is specified, pcs will wait >>> up to 'n' seconds for the resource to move and then >>> return 0 on success or 1 on error. If 'n' is not speci? >>> fied it defaults to 60 minutes. If you want the resource >>> to preferably avoid running on some nodes but be able to >>> failover to them use 'pcs constraint location avoids'. >>> ==== >>> >>> I think I want to use this, as we move resources manually for various >>> reasons where the old host is still able to host the resource should a >>> node failure occur. So we'd love to immediately remove the location >>> constraint as soon as the move completes. >>> >>> I tries using '--lifetime=60' as a test, assuming the format was >>> 'seconds', but that was invalid. How is this switch meant to be used? >>> >>> Cheers >>> >>> -- >>> Digimer >>> Papers and Projects: https://alteeve.com/w/ >>> "I am, somehow, less interested in the weight and convolutions of >>> Einstein?s brain than in the near certainty that people of equal talent >>> have lived and died in cotton fields and sweatshops." - Stephen Jay >>> Gould >>> _______________________________________________ >>> Manage your subscription: >>> https://lists.clusterlabs.org/mailman/listinfo/users >>> >>> ClusterLabs home: https://www.clusterlabs.org/ >> _______________________________________________ >> Manage your subscription: >> https://lists.clusterlabs.org/mailman/listinfo/users >> >> ClusterLabs home: https://www.clusterlabs.org/ > > > > _______________________________________________ > Manage your subscription: > https://lists.clusterlabs.org/mailman/listinfo/users > > ClusterLabs home: https://www.clusterlabs.org/ > From kgaillot at redhat.com Tue Aug 25 10:36:50 2020 From: kgaillot at redhat.com (Ken Gaillot) Date: Tue, 25 Aug 2020 09:36:50 -0500 Subject: [ClusterLabs] Behavior of corosync kill In-Reply-To: References: Message-ID: On Tue, 2020-08-25 at 12:28 +0530, Rohit Saini wrote: > Hi All, > I am seeing the following behavior. Can someone clarify if this is > intended behavior. 
If yes, then why so? Please let me know if logs > are needed for better clarity. > > 1. Without Stonith: > Continuous corosync kill on master causes switchover and makes > another node as master. But as soon as this corosync recovers, it > becomes master again. Shouldn't it become slave now? Where resources are active or take on the master role depends on the cluster configuration, not past node issues. You may be interested in the resource-stickiness property: https://clusterlabs.org/pacemaker/doc/en-US/Pacemaker/2.0/html-single/Pacemaker_Explained/index.html#_resource_meta_attributes > 2. With Stonith: > Sometimes, on corosync kill, that node gets shooted by stonith but > sometimes not. Not able to understand this fluctuating behavior. Does > it have to do anything with faster recovery of corosync, which > stonith fails to detect? It's not failing to detect it, but recovering satisfactorily without fencing. At any given time, one of the cluster nodes is elected the designated controller (DC). When new events occur, such as a node leaving the corosync ring unexpectedly, the DC runs pacemaker's scheduler to see what needs to be done about it. In the case of a lost node, it will also erase the node's resource history, to indicate that the state of resources on the node is no longer accurately known. If no further events happened during that time, the scheduler would schedule fencing, and the cluster would carry it out. However, systemd monitors corosync and will restart it if it dies. If systemd respawns corosync fast enough (it often is sub-second), the node will rejoin the cluster before the scheduler completes its calculations and fencing is initiated. Rejoining the cluster includes re-sync'ing its resource history with the other nodes. The node join is considered new information, so the former scheduler run is cancelled (the "transition" is "aborted") and a new one is started. Since the node is now happily part of the cluster, and the resource history tells us the state of all resources on the node, no fencing is needed. > I am using > corosync-2.4.5-4.el7.x86_64 > pacemaker-1.1.19-8.el7.x86_64 > centos 7.6.1810 > > Thanks, > Rohit > _______________________________________________ > Manage your subscription: > https://lists.clusterlabs.org/mailman/listinfo/users > > ClusterLabs home: https://www.clusterlabs.org/ -- Ken Gaillot From rohitsaini111.forum at gmail.com Tue Aug 25 10:45:47 2020 From: rohitsaini111.forum at gmail.com (Rohit Saini) Date: Tue, 25 Aug 2020 20:15:47 +0530 Subject: [ClusterLabs] Behavior of corosync kill In-Reply-To: References: Message-ID: Thanks Ken. Let me check resource-stickiness property at my end. Regards, Rohit On Tue, Aug 25, 2020 at 8:07 PM Ken Gaillot wrote: > On Tue, 2020-08-25 at 12:28 +0530, Rohit Saini wrote: > > Hi All, > > I am seeing the following behavior. Can someone clarify if this is > > intended behavior. If yes, then why so? Please let me know if logs > > are needed for better clarity. > > > > 1. Without Stonith: > > Continuous corosync kill on master causes switchover and makes > > another node as master. But as soon as this corosync recovers, it > > becomes master again. Shouldn't it become slave now? > > Where resources are active or take on the master role depends on the > cluster configuration, not past node issues. > > You may be interested in the resource-stickiness property: > > > https://clusterlabs.org/pacemaker/doc/en-US/Pacemaker/2.0/html-single/Pacemaker_Explained/index.html#_resource_meta_attributes > > > > 2. 
With Stonith: > > Sometimes, on corosync kill, that node gets shooted by stonith but > > sometimes not. Not able to understand this fluctuating behavior. Does > > it have to do anything with faster recovery of corosync, which > > stonith fails to detect? > > It's not failing to detect it, but recovering satisfactorily without > fencing. > > At any given time, one of the cluster nodes is elected the designated > controller (DC). When new events occur, such as a node leaving the > corosync ring unexpectedly, the DC runs pacemaker's scheduler to see > what needs to be done about it. In the case of a lost node, it will > also erase the node's resource history, to indicate that the state of > resources on the node is no longer accurately known. > > If no further events happened during that time, the scheduler would > schedule fencing, and the cluster would carry it out. > > However, systemd monitors corosync and will restart it if it dies. If > systemd respawns corosync fast enough (it often is sub-second), the > node will rejoin the cluster before the scheduler completes its > calculations and fencing is initiated. Rejoining the cluster includes > re-sync'ing its resource history with the other nodes. > > The node join is considered new information, so the former scheduler > run is cancelled (the "transition" is "aborted") and a new one is > started. Since the node is now happily part of the cluster, and the > resource history tells us the state of all resources on the node, no > fencing is needed. > > > > I am using > > corosync-2.4.5-4.el7.x86_64 > > pacemaker-1.1.19-8.el7.x86_64 > > centos 7.6.1810 > > > > Thanks, > > Rohit > > _______________________________________________ > > Manage your subscription: > > https://lists.clusterlabs.org/mailman/listinfo/users > > > > ClusterLabs home: https://www.clusterlabs.org/ > -- > Ken Gaillot > > _______________________________________________ > Manage your subscription: > https://lists.clusterlabs.org/mailman/listinfo/users > > ClusterLabs home: https://www.clusterlabs.org/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From citron_vert at hotmail.com Wed Aug 26 09:24:04 2020 From: citron_vert at hotmail.com (Citron Vert) Date: Wed, 26 Aug 2020 15:24:04 +0200 Subject: [ClusterLabs] Resources restart when a node joins in Message-ID: Hello, I am contacting you because I have a problem with my cluster and I cannot find (nor understand) any information that can help me. I have a 2-node cluster (pacemaker, corosync, pcs) installed on CentOS 7 with a set of configuration. Everything seems to work fine, but here is what happens: * Node1 and Node2 are running well with Node1 as primary * I reboot Node2 which is passive (no changes on Node1) * Node2 comes back in the cluster as passive * corosync logs show resources getting started then stopped on Node2 * "crm_mon" command shows some resources on Node1 getting restarted I don't understand how it should work. If a node comes back, and becomes passive (since Node1 is running primary), there is no reason for the resources to be started then stopped on the new passive node ? One of my resources becomes unstable because it gets started and then stopped too quickly on Node2, which seems to make it restart on Node1 without a failover. I tried several things and solutions proposed by different sites and forums, but without success. Is there a way so that the node, which joins the cluster as passive, does not start its own resources ?
thanks in advance Here are some information just in case : $?rpm?-qa?|?grep?-E "corosync|pacemaker|pcs" corosync-2.4.5-4.el7.x86_64 pacemaker-cli-1.1.21-4.el7.x86_64 pacemaker-1.1.21-4.el7.x86_64 pcs-0.9.168-4.el7.centos.x86_64 corosynclib-2.4.5-4.el7.x86_64 pacemaker-libs-1.1.21-4.el7.x86_64 pacemaker-cluster-libs-1.1.21-4.el7.x86_64 ???????? ???????? ???????? ???????? ???????? ???????? ???????? ???????? ???????? -------------- next part -------------- An HTML attachment was scrubbed... URL: From nwahl at redhat.com Wed Aug 26 14:16:53 2020 From: nwahl at redhat.com (Reid Wahl) Date: Wed, 26 Aug 2020 11:16:53 -0700 Subject: [ClusterLabs] Resources restart when a node joins in In-Reply-To: References: Message-ID: Hi, Citron. Based on your description, it sounds like some resources **might** be moving from node 1 to node 2, failing on node 2, and then moving back to node 1. If that's what's happening (and even if it's not), then it's probably smart to set some resource stickiness as a resource default. The below command sets a resource stickiness score of 1. # pcs resource defaults resource-stickiness=1 Also note that the "default-resource-stickiness" cluster property is deprecated and should not be used. Finally, an explicit default resource stickiness score of 0 can interfere with the placement of cloned resource instances. If you don't want any stickiness, then it's better to leave stickiness unset. That way, primitives will have a stickiness of 0, but clone instances will have a stickiness of 1. If adding stickiness does not resolve the issue, can you share your cluster configuration and some logs that show the issue happening? Off the top of my head I'm not sure why resources would start and stop on node 2 without moving away from node1, unless they're clone instances that are starting and then failing a monitor operation on node 2. On Wed, Aug 26, 2020 at 8:42 AM Citron Vert wrote: > Hello, > I am contacting you because I have a problem with my cluster and I cannot > find (nor understand) any information that can help me. > > I have a 2 nodes cluster (pacemaker, corosync, pcs) installed on CentOS 7 > with a set of configuration. > Everything seems to works fine, but here is what happens: > > - Node1 and Node2 are running well with Node1 as primary > - I reboot Node2 wich is passive (no changes on Node1) > - Node2 comes back in the cluster as passive > - corosync logs shows resources getting started then stopped on Node2 > - "crm_mon" command shows some ressources on Node1 getting restarted > > I don't understand how it should work. > If a node comes back, and becomes passive (since Node1 is running > primary), there is no reason for the resources to be started then stopped > on the new passive node ? > > One of my resources becomes unstable because it gets started and then > stoped too quickly on Node2, wich seems to make it restart on Node1 without > a failover. > > I tried several things and solution proposed by different sites and forums > but without success. > > > Is there a way so that the node, which joins the cluster as passive, does > not start its own resources ? 
> > > thanks in advance > > > Here are some information just in case : > $ rpm -qa | grep -E "corosync|pacemaker|pcs" > corosync-2.4.5-4.el7.x86_64 > pacemaker-cli-1.1.21-4.el7.x86_64 > pacemaker-1.1.21-4.el7.x86_64 > pcs-0.9.168-4.el7.centos.x86_64 > corosynclib-2.4.5-4.el7.x86_64 > pacemaker-libs-1.1.21-4.el7.x86_64 > pacemaker-cluster-libs-1.1.21-4.el7.x86_64 > > > "stonith-enabled" value="false"/> > "no-quorum-policy" value="ignore"/> > value="120s"/> > "have-watchdog" value="false"/> > value="1.1.21-4.el7-f14e36fd43"/> > "cluster-infrastructure" value="corosync"/> > "cluster-name" value="CLUSTER"/> > "last-lrm-refresh" value="1598446314"/> > name="default-resource-stickiness" value="0"/> > > > > > _______________________________________________ > Manage your subscription: > https://lists.clusterlabs.org/mailman/listinfo/users > > ClusterLabs home: https://www.clusterlabs.org/ > -- Regards, Reid Wahl, RHCA Software Maintenance Engineer, Red Hat CEE - Platform Support Delivery - ClusterHA -------------- next part -------------- An HTML attachment was scrubbed... URL: From arvidjaar at gmail.com Thu Aug 27 01:34:58 2020 From: arvidjaar at gmail.com (Andrei Borzenkov) Date: Thu, 27 Aug 2020 08:34:58 +0300 Subject: [ClusterLabs] [ClusterLabs Developers] Fencing with a Quorum Device In-Reply-To: References: Message-ID: <2dedd93d-3ebd-8aba-4049-0f06aef6bcd6@gmail.com> I changed list to users because it is general usage question, not development topic. 26.08.2020 23:33, Hayden Pfeiffer ?????: > Hello, > > > I am in the process of configuring fencing in an AWS cluster of two > hosts. I have done so and nodes are correctly fenced when > communication is broken with ifconfig down tests. > It is highly recommended to test split brain by blocking communication between nodes (either on external LAN equipment or e.g. using iptables on host). Setting interface down does not always do what one expects with corosync. > > However, I would also like to use a quorum device in my setup. I have Why? What exactly do you need it for? > configured a third host to act as a tiebreaker, but when I try to > break communication on one of the two pacemaker hosts, the affected > host is marked as "unclean" and is not fenced. > It is unclear what you did. Do you corosync-qdevice? Post actual configuration (corosync.conf and pacemaker config) and your corosync/pacemaker versions. > Is it not recommended to use fencing devices and a quorum device in > the same cluster? I cannot find any documentation relating to this > issue. Thanks, > corosync-qdevice just affects quorate state of individual nodes. What happens when node goes out of quorum is determined entirely by pacemaker configuration. For a long time I have been using two node cluster with no-quorum-policy=ignore where corosync-device is entirely useless. From nwahl at redhat.com Thu Aug 27 03:56:29 2020 From: nwahl at redhat.com (Reid Wahl) Date: Thu, 27 Aug 2020 00:56:29 -0700 Subject: [ClusterLabs] Resources restart when a node joins in In-Reply-To: References: Message-ID: Hi, Quentin. Thanks for the logs! I see you highlighted the fact that SERVICE1 was in "Stopping" state on both node 1 and node 2 when node 1 was rejoining the cluster. 
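One quick sanity check at this point (just a sketch; 'service1' is a placeholder for whichever systemd units the cluster manages) is to ask systemd directly on each node:

# systemctl is-active service1
# systemctl is-enabled service1

If a unit reports 'enabled', systemd starts it at boot on its own, outside Pacemaker's control, which can leave the cluster finding the service active on a node where it never started it.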
I also noted the following later in the logs, as well as some similar messages earlier: Aug 27 08:47:02 [1330] NODE2 pengine: info: determine_op_status: Operation monitor found resource SERVICE1 active on NODE1 Aug 27 08:47:02 [1330] NODE2 pengine: info: determine_op_status: Operation monitor found resource SERVICE1 active on NODE1 Aug 27 08:47:02 [1330] NODE2 pengine: info: determine_op_status: Operation monitor found resource SERVICE4 active on NODE2 Aug 27 08:47:02 [1330] NODE2 pengine: info: determine_op_status: Operation monitor found resource SERVICE1 active on NODE2 ... Aug 27 08:47:02 [1330] NODE2 pengine: info: common_print: 1 : NODE1 Aug 27 08:47:02 [1330] NODE2 pengine: info: common_print: 2 : NODE2 ... Aug 27 08:47:02 [1330] NODE2 pengine: error: native_create_actions: Resource SERVICE1 is active on 2 nodes (attempting recovery) Aug 27 08:47:02 [1330] NODE2 pengine: notice: native_create_actions: See https://wiki.clusterlabs.org/wiki/FAQ#Resource_is_Too_Active for more information Can you make sure that all the cluster-managed systemd services are disabled from starting at boot (i.e., `systemctl is-enabled service1`, and the same for all the others) on both nodes? If they are enabled, disable them. On Thu, Aug 27, 2020 at 12:46 AM Citron Vert wrote: > Hi, > > Sorry for using this email adress, my name is Quentin. Thank you for your > reply. > > I have already tried the stickiness solution (with the deprecated value). > I tried the one you gave me, and it does not change anything. > > Resources don't seem to move from node to node (i don't see the changes > with crm_mon command). > > > In the logs i found this line *"error: native_create_actions: > Resource SERVICE1 is active on 2 nodes*" > > Which led me to contact you to understand and learn a little more about > this cluster. And why there are running resources on the passive node. > > > You will find attached the logs during the reboot of the passive node and > my cluster configuration. > > I think I'm missing out on something in the configuration / logs that I > don't understand.. > > > Thank you in advance for your help, > > Quentin > > > Le 26/08/2020 ? 20:16, Reid Wahl a ?crit : > > Hi, Citron. > > Based on your description, it sounds like some resources **might** be > moving from node 1 to node 2, failing on node 2, and then moving back to > node 1. If that's what's happening (and even if it's not), then it's > probably smart to set some resource stickiness as a resource default. The > below command sets a resource stickiness score of 1. > > # pcs resource defaults resource-stickiness=1 > > Also note that the "default-resource-stickiness" cluster property is > deprecated and should not be used. > > Finally, an explicit default resource stickiness score of 0 can interfere > with the placement of cloned resource instances. If you don't want any > stickiness, then it's better to leave stickiness unset. That way, > primitives will have a stickiness of 0, but clone instances will have a > stickiness of 1. > > If adding stickiness does not resolve the issue, can you share your > cluster configuration and some logs that show the issue happening? Off the > top of my head I'm not sure why resources would start and stop on node 2 > without moving away from node1, unless they're clone instances that are > starting and then failing a monitor operation on node 2. 
> > On Wed, Aug 26, 2020 at 8:42 AM Citron Vert wrote: > >> Hello, >> I am contacting you because I have a problem with my cluster and I cannot >> find (nor understand) any information that can help me. >> >> I have a 2 nodes cluster (pacemaker, corosync, pcs) installed on CentOS 7 >> with a set of configuration. >> Everything seems to works fine, but here is what happens: >> >> - Node1 and Node2 are running well with Node1 as primary >> - I reboot Node2 wich is passive (no changes on Node1) >> - Node2 comes back in the cluster as passive >> - corosync logs shows resources getting started then stopped on Node2 >> - "crm_mon" command shows some ressources on Node1 getting restarted >> >> I don't understand how it should work. >> If a node comes back, and becomes passive (since Node1 is running >> primary), there is no reason for the resources to be started then stopped >> on the new passive node ? >> >> One of my resources becomes unstable because it gets started and then >> stoped too quickly on Node2, wich seems to make it restart on Node1 without >> a failover. >> >> I tried several things and solution proposed by different sites and >> forums but without success. >> >> >> Is there a way so that the node, which joins the cluster as passive, does >> not start its own resources ? >> >> >> thanks in advance >> >> >> Here are some information just in case : >> $ rpm -qa | grep -E "corosync|pacemaker|pcs" >> corosync-2.4.5-4.el7.x86_64 >> pacemaker-cli-1.1.21-4.el7.x86_64 >> pacemaker-1.1.21-4.el7.x86_64 >> pcs-0.9.168-4.el7.centos.x86_64 >> corosynclib-2.4.5-4.el7.x86_64 >> pacemaker-libs-1.1.21-4.el7.x86_64 >> pacemaker-cluster-libs-1.1.21-4.el7.x86_64 >> >> >> > "stonith-enabled" value="false"/> >> > "no-quorum-policy" value="ignore"/> >> > value="120s"/> >> > "have-watchdog" value="false"/> >> > value="1.1.21-4.el7-f14e36fd43"/> >> > "cluster-infrastructure" value="corosync"/> >> > "cluster-name" value="CLUSTER"/> >> > "last-lrm-refresh" value="1598446314"/> >> > name="default-resource-stickiness" value="0"/> >> >> >> >> >> _______________________________________________ >> Manage your subscription: >> https://lists.clusterlabs.org/mailman/listinfo/users >> >> ClusterLabs home: https://www.clusterlabs.org/ >> > > > -- > Regards, > > Reid Wahl, RHCA > Software Maintenance Engineer, Red Hat > CEE - Platform Support Delivery - ClusterHA > > -- Regards, Reid Wahl, RHCA Software Maintenance Engineer, Red Hat CEE - Platform Support Delivery - ClusterHA -------------- next part -------------- An HTML attachment was scrubbed... URL: From citron_vert at hotmail.com Thu Aug 27 03:46:14 2020 From: citron_vert at hotmail.com (Citron Vert) Date: Thu, 27 Aug 2020 09:46:14 +0200 Subject: [ClusterLabs] Resources restart when a node joins in In-Reply-To: References: Message-ID: Hi, Sorry for using this email address, my name is Quentin. Thank you for your reply. I have already tried the stickiness solution (with the deprecated value). I tried the one you gave me, and it does not change anything. Resources don't seem to move from node to node (I don't see the changes with the crm_mon command). In the logs I found this line: "error: native_create_actions: Resource SERVICE1 is active on 2 nodes", which led me to contact you to understand and learn a little more about this cluster, and why there are resources running on the passive node. You will find attached the logs from the reboot of the passive node and my cluster configuration.
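(For completeness, and only as a sketch to run on either node, the following shows which stickiness settings actually ended up in the CIB:

# pcs resource defaults
# pcs property list --all | grep -i stickiness

The first command lists the configured resource defaults; the second shows whether the deprecated default-resource-stickiness cluster property is still set.)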
I think I'm missing out on something in the configuration / logs that I don't understand.. Thank you in advance for your help, Quentin Le 26/08/2020 ? 20:16, Reid Wahl a ?crit?: > Hi, Citron. > > Based on your description, it sounds like some resources **might** be > moving from node 1 to node 2, failing on node 2, and then moving back > to node 1. If that's what's happening (and even if it's not), then > it's probably smart to set some resource stickiness as a resource > default. The below command sets a resource stickiness score of 1. > > ??? # pcs resource defaults resource-stickiness=1 > > Also note that the "default-resource-stickiness" cluster property is > deprecated and should not be used. > > Finally, an explicit default resource stickiness score of 0 can > interfere with the placement of cloned resource instances. If you > don't want any stickiness, then it's better to leave stickiness unset. > That way, primitives will have a stickiness of 0, but clone instances > will have a stickiness of 1. > > If adding stickiness does not resolve the issue, can you share your > cluster configuration and some logs that show the issue happening? Off > the top of my head I'm not sure why resources would start and stop on > node 2 without moving away from node1, unless they're clone instances > that are starting and then failing a monitor operation on node 2. > > On Wed, Aug 26, 2020 at 8:42 AM Citron Vert > wrote: > > Hello, > I am contacting you because I have a problem with my cluster and I > cannot find (nor understand) any information that can help me. > > I have a 2 nodes cluster (pacemaker, corosync, pcs) installed on > CentOS 7 with a set of configuration. > Everything seems to works fine, but here is what happens: > > * Node1 and Node2 are running well with Node1 as primary > * I reboot Node2 wich is passive (no changes on Node1) > * Node2 comes back in the cluster as passive > * corosync logs shows resources getting started then stopped on > Node2 > * "crm_mon" command shows some ressources on Node1 getting > restarted > > I don't understand how it should work. > If a node comes back, and becomes passive (since Node1 is running > primary), there is no reason for the resources to be started then > stopped on the new passive node ? > > One of my resources becomes unstable because it gets started and > then stoped too quickly on Node2, wich seems to make it restart on > Node1 without a failover. > > I tried several things and solution proposed by different sites > and forums but without success. > > > Is there a way so that the node, which joins the cluster as > passive, does not start its own resources ? > > > thanks in advance > > > Here are some information just in case : > > $?rpm?-qa?|?grep?-E "corosync|pacemaker|pcs" > corosync-2.4.5-4.el7.x86_64 > pacemaker-cli-1.1.21-4.el7.x86_64 > pacemaker-1.1.21-4.el7.x86_64 > pcs-0.9.168-4.el7.centos.x86_64 > corosynclib-2.4.5-4.el7.x86_64 > pacemaker-libs-1.1.21-4.el7.x86_64 > pacemaker-cluster-libs-1.1.21-4.el7.x86_64 > > > ???????? > ???????? > ???????? > ???????? > ???????? > ???????? > ???????? > ???????? > ???????? > > > > > _______________________________________________ > Manage your subscription: > https://lists.clusterlabs.org/mailman/listinfo/users > > ClusterLabs home: https://www.clusterlabs.org/ > > > > -- > Regards, > > Reid Wahl, RHCA > Software Maintenance Engineer, Red Hat > CEE - Platform Support Delivery - ClusterHA -------------- next part -------------- An HTML attachment was scrubbed... 
URL: -------------- next part -------------- $ pcs cluster cib -------------- next part -------------- ## CLUSTER IS RUNNING ***** Stack: corosync Current DC: NODE2 (version 1.1.21-4.el7-f14e36fd43) - partition with quorum Last updated: Thu Aug 27 08:42:21 2020 Last change: Thu Aug 27 08:42:18 2020 by hacluster via crmd on NODE1 2 nodes configured 19 resources configured Online: [ NODE1 NODE2 ] Active resources: Resource Group: IPV VIRTUALIP (ocf::heartbeat:IPaddr2): Started NODE2 SOURCEIP (ocf::heartbeat:IPsrcaddr): Started NODE2 SERVICE1 (systemd:service1): Started NODE2 SERVICE2 (systemd:service2): Started NODE2 SERVICE3 (systemd:service3): Started NODE2 SERVICE4 (systemd:service4): Started NODE2 SERVICE5 (systemd:service5): Started NODE2 SERVICE6 (systemd:service6): Started NODE2 SERVICE7 (systemd:service7): Started NODE2 SERVICE8 (systemd:service8): Started NODE2 SERVICE9 (systemd:service9): Started NODE2 SERVICE10 (systemd:service10): Started NODE2 SERVICE11 (systemd:service11): Started NODE2 SERVICE12 (systemd:service12): Started NODE2 SERVICE13 (systemd:service13): Started NODE2 Clone Set: SERVICE14-clone [SERVICE14] Started: [ NODE1 NODE2 ] Clone Set: SERVICE15-clone [SERVICE15] Started: [ NODE1 NODE2 ] ***** ## REBOOTING NODE1 # /var/log/cluster/corosync.log Aug 27 08:46:16 [1331] NODE2 crmd: info: crm_update_peer_expected: handle_request: Node NODE1[1] - expected state is now down (was member) Aug 27 08:46:16 [1331] NODE2 crmd: info: handle_shutdown_request: Creating shutdown request for NODE1 (state=S_IDLE) Aug 27 08:46:16 [1329] NODE2 attrd: info: attrd_peer_update: Setting shutdown[NODE1]: (null) -> 1598510776 from NODE2 Aug 27 08:46:16 [1329] NODE2 attrd: info: write_attribute: Sent CIB request 167 with 1 change for shutdown (id n/a, set n/a) Aug 27 08:46:16 [1326] NODE2 cib: info: cib_process_request: Forwarding cib_modify operation for section status to all (origin=local/attrd/167) Aug 27 08:46:16 [1326] NODE2 cib: info: cib_perform_op: Diff: --- 0.434.2 2 Aug 27 08:46:16 [1326] NODE2 cib: info: cib_perform_op: Diff: +++ 0.434.3 (null) Aug 27 08:46:16 [1326] NODE2 cib: info: cib_perform_op: + /cib: @num_updates=3 Aug 27 08:46:16 [1326] NODE2 cib: info: cib_perform_op: ++ /cib/status/node_state[@id='1']/transient_attributes[@id='1']/instance_attributes[@id='status-1']: Aug 27 08:46:16 [1331] NODE2 crmd: info: abort_transition_graph: Transition aborted by status-1-shutdown doing create shutdown=1598510776: Transient attribute change | cib=0.434.3 source=abort_unless_down:356 path=/cib/status/node_state[@id='1']/transient_attributes[@id='1']/instance_attributes[@id='status-1'] complete=true Aug 27 08:46:16 [1331] NODE2 crmd: notice: do_state_transition: State transition S_IDLE -> S_POLICY_ENGINE | input=I_PE_CALC cause=C_FSA_INTERNAL origin=abort_transition_graph Aug 27 08:46:16 [1326] NODE2 cib: info: cib_process_request: Completed cib_modify operation for section status: OK (rc=0, origin=NODE2/attrd/167, version=0.434.3) Aug 27 08:46:16 [1329] NODE2 attrd: info: attrd_cib_callback: CIB update 167 result for shutdown: OK | rc=0 Aug 27 08:46:16 [1329] NODE2 attrd: info: attrd_cib_callback: * shutdown[NODE1]=1598510776 Aug 27 08:46:16 [1330] NODE2 pengine: notice: unpack_config: On loss of CCM Quorum: Ignore Aug 27 08:46:16 [1330] NODE2 pengine: info: determine_online_status: Node NODE1 is shutting down Aug 27 08:46:16 [1330] NODE2 pengine: info: determine_online_status: Node NODE2 is online Aug 27 08:46:16 [1330] NODE2 pengine: info: determine_op_status: Operation monitor 
found resource SERVICE9 active on NODE1 Aug 27 08:46:16 [1330] NODE2 pengine: info: determine_op_status: Operation monitor found resource SERVICE4 active on NODE2 Aug 27 08:46:16 [1330] NODE2 pengine: info: determine_op_status: Operation monitor found resource SERVICE1 active on NODE2 Aug 27 08:46:16 [1330] NODE2 pengine: info: unpack_node_loop: Node 1 is already processed Aug 27 08:46:16 [1330] NODE2 pengine: info: unpack_node_loop: Node 2 is already processed Aug 27 08:46:16 [1330] NODE2 pengine: info: unpack_node_loop: Node 1 is already processed Aug 27 08:46:16 [1330] NODE2 pengine: info: unpack_node_loop: Node 2 is already processed Aug 27 08:46:16 [1330] NODE2 pengine: info: group_print: Resource Group: IPV Aug 27 08:46:16 [1330] NODE2 pengine: info: common_print: VIRTUALIP (ocf::heartbeat:IPaddr2): Started NODE2 Aug 27 08:46:16 [1330] NODE2 pengine: info: common_print: SOURCEIP (ocf::heartbeat:IPsrcaddr): Started NODE2 Aug 27 08:46:16 [1330] NODE2 pengine: info: common_print: SERVICE1 (systemd:service1): Started NODE2 Aug 27 08:46:16 [1330] NODE2 pengine: info: common_print: SERVICE2 (systemd:service2): Started NODE2 Aug 27 08:46:16 [1330] NODE2 pengine: info: common_print: SERVICE3 (systemd:service3): Started NODE2 Aug 27 08:46:16 [1330] NODE2 pengine: info: common_print: SERVICE4 (systemd:service4): Started NODE2 Aug 27 08:46:16 [1330] NODE2 pengine: info: common_print: SERVICE5 (systemd:service5): Started NODE2 Aug 27 08:46:16 [1330] NODE2 pengine: info: common_print: SERVICE6 (systemd:service6): Started NODE2 Aug 27 08:46:16 [1330] NODE2 pengine: info: common_print: SERVICE7 (systemd:service7): Started NODE2 Aug 27 08:46:16 [1330] NODE2 pengine: info: common_print: SERVICE8 (systemd:service8): Started NODE2 Aug 27 08:46:16 [1330] NODE2 pengine: info: common_print: SERVICE9 (systemd:service9): Started NODE2 Aug 27 08:46:16 [1330] NODE2 pengine: info: common_print: SERVICE10 (systemd:service10): Started NODE2 Aug 27 08:46:16 [1330] NODE2 pengine: info: common_print: SERVICE11 (systemd:service11): Started NODE2 Aug 27 08:46:16 [1330] NODE2 pengine: info: common_print: SERVICE12 (systemd:service12): Started NODE2 Aug 27 08:46:16 [1330] NODE2 pengine: info: common_print: SERVICE13 (systemd:service13): Started NODE2 Aug 27 08:46:16 [1330] NODE2 pengine: info: clone_print: Clone Set: SERVICE14-clone [SERVICE14] Aug 27 08:46:16 [1330] NODE2 pengine: info: short_print: Started: [ NODE1 NODE2 ] Aug 27 08:46:16 [1330] NODE2 pengine: info: clone_print: Clone Set: SERVICE15-clone [SERVICE15] Aug 27 08:46:16 [1330] NODE2 pengine: info: short_print: Started: [ NODE1 NODE2 ] Aug 27 08:46:16 [1330] NODE2 pengine: info: native_color: Resource SERVICE14:0 cannot run anywhere Aug 27 08:46:16 [1330] NODE2 pengine: info: native_color: Resource SERVICE15:0 cannot run anywhere Aug 27 08:46:16 [1330] NODE2 pengine: notice: sched_shutdown_op: Scheduling shutdown of node NODE1 Aug 27 08:46:16 [1330] NODE2 pengine: notice: LogNodeActions: * Shutdown NODE1 Aug 27 08:46:16 [1330] NODE2 pengine: info: LogActions: Leave VIRTUALIP (Started NODE2) Aug 27 08:46:16 [1330] NODE2 pengine: info: LogActions: Leave SOURCEIP (Started NODE2) Aug 27 08:46:16 [1330] NODE2 pengine: info: LogActions: Leave SERVICE1 (Started NODE2) Aug 27 08:46:16 [1330] NODE2 pengine: info: LogActions: Leave SERVICE2 (Started NODE2) Aug 27 08:46:16 [1330] NODE2 pengine: info: LogActions: Leave SERVICE3 (Started NODE2) Aug 27 08:46:16 [1330] NODE2 pengine: info: LogActions: Leave SERVICE4 (Started NODE2) Aug 27 08:46:16 [1330] NODE2 pengine: 
info: LogActions: Leave SERVICE5 (Started NODE2) Aug 27 08:46:16 [1330] NODE2 pengine: info: LogActions: Leave SERVICE6 (Started NODE2) Aug 27 08:46:16 [1330] NODE2 pengine: info: LogActions: Leave SERVICE7 (Started NODE2) Aug 27 08:46:16 [1330] NODE2 pengine: info: LogActions: Leave SERVICE8 (Started NODE2) Aug 27 08:46:16 [1330] NODE2 pengine: info: LogActions: Leave SERVICE9 (Started NODE2) Aug 27 08:46:16 [1330] NODE2 pengine: info: LogActions: Leave SERVICE10 (Started NODE2) Aug 27 08:46:16 [1330] NODE2 pengine: info: LogActions: Leave SERVICE11 (Started NODE2) Aug 27 08:46:16 [1330] NODE2 pengine: info: LogActions: Leave SERVICE12 (Started NODE2) Aug 27 08:46:16 [1330] NODE2 pengine: info: LogActions: Leave SERVICE13 (Started NODE2) Aug 27 08:46:16 [1330] NODE2 pengine: notice: LogAction: * Stop SERVICE14:0 ( NODE1 ) due to node availability Aug 27 08:46:16 [1330] NODE2 pengine: info: LogActions: Leave SERVICE14:1 (Started NODE2) Aug 27 08:46:16 [1330] NODE2 pengine: notice: LogAction: * Stop SERVICE15:0 ( NODE1 ) due to node availability Aug 27 08:46:16 [1330] NODE2 pengine: info: LogActions: Leave SERVICE15:1 (Started NODE2) Aug 27 08:46:16 [1330] NODE2 pengine: notice: process_pe_message: Calculated transition 331, saving inputs in /var/lib/pacemaker/pengine/pe-input-1508.bz2 Aug 27 08:46:16 [1331] NODE2 crmd: info: do_state_transition: State transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE | input=I_PE_SUCCESS cause=C_IPC_MESSAGE origin=handle_response Aug 27 08:46:16 [1331] NODE2 crmd: info: do_te_invoke: Processing graph 331 (ref=pe_calc-dc-1598510776-962) derived from /var/lib/pacemaker/pengine/pe-input-1508.bz2 Aug 27 08:46:16 [1331] NODE2 crmd: notice: te_rsc_command: Initiating stop operation SERVICE14_stop_0 on NODE1 | action 54 Aug 27 08:46:16 [1331] NODE2 crmd: notice: te_rsc_command: Initiating stop operation SERVICE15_stop_0 on NODE1 | action 61 Aug 27 08:46:16 [1326] NODE2 cib: info: cib_perform_op: Diff: --- 0.434.3 2 Aug 27 08:46:16 [1326] NODE2 cib: info: cib_perform_op: Diff: +++ 0.434.4 (null) Aug 27 08:46:16 [1326] NODE2 cib: info: cib_perform_op: + /cib: @num_updates=4 Aug 27 08:46:16 [1326] NODE2 cib: info: cib_perform_op: + /cib/status/node_state[@id='1']/lrm[@id='1']/lrm_resources/lrm_resource[@id='SERVICE14']/lrm_rsc_op[@id='SERVICE14_last_0']: @operation_key=SERVICE14_stop_0, @operation=stop, @transition-key=54:331:0:d10dd5e7-af4d-4bba-a226-516824f8f60e, @transition-magic=-1:193;54:331:0:d10dd5e7-af4d-4bba-a226-516824f8f60e, @call-id=-1, @rc-code=193, @op-status=-1, @last-run=1598510798, @last-rc-change=1598510798, @exec-time=0 Aug 27 08:46:16 [1326] NODE2 cib: info: cib_process_request: Completed cib_modify operation for section status: OK (rc=0, origin=NODE1/crmd/198, version=0.434.4) Aug 27 08:46:16 [1326] NODE2 cib: info: cib_perform_op: Diff: --- 0.434.4 2 Aug 27 08:46:16 [1326] NODE2 cib: info: cib_perform_op: Diff: +++ 0.434.5 (null) Aug 27 08:46:16 [1326] NODE2 cib: info: cib_perform_op: + /cib: @num_updates=5 Aug 27 08:46:16 [1326] NODE2 cib: info: cib_perform_op: + /cib/status/node_state[@id='1']/lrm[@id='1']/lrm_resources/lrm_resource[@id='SERVICE15']/lrm_rsc_op[@id='SERVICE15_last_0']: @operation_key=SERVICE15_stop_0, @operation=stop, @transition-key=61:331:0:d10dd5e7-af4d-4bba-a226-516824f8f60e, @transition-magic=-1:193;61:331:0:d10dd5e7-af4d-4bba-a226-516824f8f60e, @call-id=-1, @rc-code=193, @op-status=-1, @last-run=1598510798, @last-rc-change=1598510798, @exec-time=0 Aug 27 08:46:16 [1326] NODE2 cib: info: cib_process_request: Completed 
cib_modify operation for section status: OK (rc=0, origin=NODE1/crmd/199, version=0.434.5) Aug 27 08:46:18 [1326] NODE2 cib: info: cib_perform_op: Diff: --- 0.434.5 2 Aug 27 08:46:18 [1326] NODE2 cib: info: cib_perform_op: Diff: +++ 0.434.6 (null) Aug 27 08:46:18 [1326] NODE2 cib: info: cib_perform_op: + /cib: @num_updates=6 Aug 27 08:46:18 [1326] NODE2 cib: info: cib_perform_op: + /cib/status/node_state[@id='1']/lrm[@id='1']/lrm_resources/lrm_resource[@id='SERVICE14']/lrm_rsc_op[@id='SERVICE14_last_0']: @transition-magic=0:0;54:331:0:d10dd5e7-af4d-4bba-a226-516824f8f60e, @call-id=186, @rc-code=0, @op-status=0, @exec-time=2052 Aug 27 08:46:18 [1326] NODE2 cib: info: cib_process_request: Completed cib_modify operation for section status: OK (rc=0, origin=NODE1/crmd/200, version=0.434.6) Aug 27 08:46:18 [1331] NODE2 crmd: info: match_graph_event: Action SERVICE14_stop_0 (54) confirmed on NODE1 (rc=0) Aug 27 08:46:18 [1326] NODE2 cib: info: cib_perform_op: Diff: --- 0.434.6 2 Aug 27 08:46:18 [1326] NODE2 cib: info: cib_perform_op: Diff: +++ 0.434.7 (null) Aug 27 08:46:18 [1326] NODE2 cib: info: cib_perform_op: + /cib: @num_updates=7 Aug 27 08:46:18 [1326] NODE2 cib: info: cib_perform_op: + /cib/status/node_state[@id='1']/lrm[@id='1']/lrm_resources/lrm_resource[@id='SERVICE15']/lrm_rsc_op[@id='SERVICE15_last_0']: @transition-magic=0:0;61:331:0:d10dd5e7-af4d-4bba-a226-516824f8f60e, @call-id=188, @rc-code=0, @op-status=0, @exec-time=2104 Aug 27 08:46:18 [1326] NODE2 cib: info: cib_process_request: Completed cib_modify operation for section status: OK (rc=0, origin=NODE1/crmd/201, version=0.434.7) Aug 27 08:46:18 [1331] NODE2 crmd: info: match_graph_event: Action SERVICE15_stop_0 (61) confirmed on NODE1 (rc=0) Aug 27 08:46:18 [1331] NODE2 crmd: info: te_crm_command: Executing crm-event (68): do_shutdown on NODE1 Aug 27 08:46:18 [1331] NODE2 crmd: notice: run_graph: Transition 331 (Complete=7, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=/var/lib/pacemaker/pengine/pe-input-1508.bz2): Complete Aug 27 08:46:18 [1331] NODE2 crmd: info: do_log: Input I_TE_SUCCESS received in state S_TRANSITION_ENGINE from notify_crmd Aug 27 08:46:18 [1331] NODE2 crmd: notice: do_state_transition: State transition S_TRANSITION_ENGINE -> S_IDLE | input=I_TE_SUCCESS cause=C_FSA_INTERNAL origin=notify_crmd Aug 27 08:46:18 [1331] NODE2 crmd: info: pcmk_cpg_membership: Group crmd event 13: NODE1 (node 1 pid 1261) left via cpg_leave Aug 27 08:46:18 [1331] NODE2 crmd: info: crm_update_peer_proc: pcmk_cpg_membership: Node NODE1[1] - corosync-cpg is now offline Aug 27 08:46:18 [1331] NODE2 crmd: info: peer_update_callback: Client NODE1/peer now has status [offline] (DC=true, changed=4000000) Aug 27 08:46:18 [1331] NODE2 crmd: info: controld_delete_node_state: Deleting transient attributes for node NODE1 (via CIB call 1357) | xpath=//node_state[@uname='NODE1']/transient_attributes Aug 27 08:46:18 [1331] NODE2 crmd: notice: peer_update_callback: do_shutdown of peer NODE1 is complete | op=68 Aug 27 08:46:18 [1331] NODE2 crmd: info: pcmk_cpg_membership: Group crmd event 13: NODE2 (node 2 pid 1331) is member Aug 27 08:46:18 [1326] NODE2 cib: info: cib_process_request: Forwarding cib_delete operation for section //node_state[@uname='NODE1']/transient_attributes to all (origin=local/crmd/1357) Aug 27 08:46:18 [1326] NODE2 cib: info: cib_process_request: Forwarding cib_modify operation for section status to all (origin=local/crmd/1358) Aug 27 08:46:18 [1326] NODE2 cib: info: cib_perform_op: Diff: --- 0.434.7 2 Aug 27 08:46:18 
[1326] NODE2 cib: info: cib_perform_op: Diff: +++ 0.434.8 b48848cd3dacdd25352599c5c5add0cc Aug 27 08:46:18 [1326] NODE2 cib: info: cib_perform_op: -- /cib/status/node_state[@id='1']/transient_attributes[@id='1'] Aug 27 08:46:18 [1326] NODE2 cib: info: cib_perform_op: + /cib: @num_updates=8 Aug 27 08:46:18 [1326] NODE2 cib: info: cib_process_request: Completed cib_delete operation for section //node_state[@uname='NODE1']/transient_attributes: OK (rc=0, origin=NODE2/crmd/1357, version=0.434.8) Aug 27 08:46:18 [1326] NODE2 cib: info: cib_perform_op: Diff: --- 0.434.8 2 Aug 27 08:46:18 [1326] NODE2 cib: info: cib_perform_op: Diff: +++ 0.434.9 (null) Aug 27 08:46:18 [1326] NODE2 cib: info: cib_perform_op: + /cib: @num_updates=9 Aug 27 08:46:18 [1326] NODE2 cib: info: cib_perform_op: + /cib/status/node_state[@id='1']: @crmd=offline, @crm-debug-origin=peer_update_callback, @join=down, @expected=down Aug 27 08:46:18 [1326] NODE2 cib: info: cib_process_request: Completed cib_modify operation for section status: OK (rc=0, origin=NODE2/crmd/1358, version=0.434.9) Aug 27 08:46:18 [1329] NODE2 attrd: info: pcmk_cpg_membership: Group attrd event 13: NODE1 (node 1 pid 1259) left for unknown reason Aug 27 08:46:18 [1329] NODE2 attrd: info: crm_update_peer_proc: pcmk_cpg_membership: Node NODE1[1] - corosync-cpg is now offline Aug 27 08:46:18 [1329] NODE2 attrd: notice: crm_update_peer_state_iter: Node NODE1 state is now lost | nodeid=1 previous=member source=crm_update_peer_proc Aug 27 08:46:18 [1329] NODE2 attrd: notice: attrd_peer_remove: Removing all NODE1 attributes for peer loss Aug 27 08:46:18 [1329] NODE2 attrd: info: crm_reap_dead_member: Removing node with name NODE1 and id 1 from membership cache Aug 27 08:46:18 [1329] NODE2 attrd: notice: reap_crm_member: Purged 1 peer with id=1 and/or uname=NODE1 from the membership cache Aug 27 08:46:18 [1329] NODE2 attrd: info: pcmk_cpg_membership: Group attrd event 13: NODE2 (node 2 pid 1329) is member Aug 27 08:46:18 [1327] NODE2 stonith-ng: info: pcmk_cpg_membership: Group stonith-ng event 13: NODE1 (node 1 pid 1257) left for unknown reason Aug 27 08:46:18 [1327] NODE2 stonith-ng: info: crm_update_peer_proc: pcmk_cpg_membership: Node NODE1[1] - corosync-cpg is now offline Aug 27 08:46:18 [1327] NODE2 stonith-ng: notice: crm_update_peer_state_iter: Node NODE1 state is now lost | nodeid=1 previous=member source=crm_update_peer_proc Aug 27 08:46:18 [1327] NODE2 stonith-ng: info: crm_reap_dead_member: Removing node with name NODE1 and id 1 from membership cache Aug 27 08:46:18 [1327] NODE2 stonith-ng: notice: reap_crm_member: Purged 1 peer with id=1 and/or uname=NODE1 from the membership cache Aug 27 08:46:18 [1327] NODE2 stonith-ng: info: pcmk_cpg_membership: Group stonith-ng event 13: NODE2 (node 2 pid 1327) is member Aug 27 08:46:18 [1326] NODE2 cib: info: cib_process_shutdown_req: Shutdown REQ from NODE1 Aug 27 08:46:18 [1326] NODE2 cib: info: pcmk_cpg_membership: Group cib event 13: NODE1 (node 1 pid 1256) left via cpg_leave Aug 27 08:46:18 [1326] NODE2 cib: info: crm_update_peer_proc: pcmk_cpg_membership: Node NODE1[1] - corosync-cpg is now offline Aug 27 08:46:18 [1326] NODE2 cib: notice: crm_update_peer_state_iter: Node NODE1 state is now lost | nodeid=1 previous=member source=crm_update_peer_proc Aug 27 08:46:18 [1326] NODE2 cib: info: crm_reap_dead_member: Removing node with name NODE1 and id 1 from membership cache Aug 27 08:46:18 [1326] NODE2 cib: notice: reap_crm_member: Purged 1 peer with id=1 and/or uname=NODE1 from the membership cache Aug 27 
08:46:18 [1326] NODE2 cib: info: pcmk_cpg_membership: Group cib event 13: NODE2 (node 2 pid 1326) is member Aug 27 08:46:18 [1240] NODE2 pacemakerd: info: pcmk_cpg_membership: Group pacemakerd event 13: NODE1 (node 1 pid 1247) left via cpg_leave Aug 27 08:46:18 [1240] NODE2 pacemakerd: info: crm_update_peer_proc: pcmk_cpg_membership: Node NODE1[1] - corosync-cpg is now offline Aug 27 08:46:18 [1240] NODE2 pacemakerd: info: pcmk_cpg_membership: Group pacemakerd event 13: NODE2 (node 2 pid 1240) is member Aug 27 08:46:18 [1240] NODE2 pacemakerd: info: mcp_cpg_deliver: Ignoring process list sent by peer for local node [1160] NODE2 corosyncnotice [TOTEM ] A new membership (10.130.10.2:208639) was formed. Members left: 1 [1160] NODE2 corosyncwarning [CPG ] downlist left_list: 1 received [1160] NODE2 corosyncnotice [QUORUM] Members[1]: 2 [1160] NODE2 corosyncnotice [MAIN ] Completed service synchronization, ready to provide service. Aug 27 08:46:19 [1240] NODE2 pacemakerd: info: pcmk_quorum_notification: Quorum retained | membership=208639 members=1 Aug 27 08:46:19 [1240] NODE2 pacemakerd: notice: crm_update_peer_state_iter: Node NODE1 state is now lost | nodeid=1 previous=member source=crm_reap_unseen_nodes Aug 27 08:46:19 [1331] NODE2 crmd: info: pcmk_quorum_notification: Quorum retained | membership=208639 members=1 Aug 27 08:46:19 [1331] NODE2 crmd: notice: crm_update_peer_state_iter: Node NODE1 state is now lost | nodeid=1 previous=member source=crm_reap_unseen_nodes Aug 27 08:46:19 [1331] NODE2 crmd: info: peer_update_callback: Cluster node NODE1 is now lost (was member) Aug 27 08:46:19 [1331] NODE2 crmd: notice: peer_update_callback: do_shutdown of peer NODE1 is complete | op=68 Aug 27 08:46:19 [1326] NODE2 cib: info: cib_process_request: Forwarding cib_modify operation for section status to all (origin=local/crmd/1359) Aug 27 08:46:19 [1326] NODE2 cib: info: cib_process_request: Forwarding cib_modify operation for section nodes to all (origin=local/crmd/1362) Aug 27 08:46:19 [1326] NODE2 cib: info: cib_process_request: Forwarding cib_modify operation for section status to all (origin=local/crmd/1363) Aug 27 08:46:19 [1326] NODE2 cib: info: cib_process_request: Completed cib_modify operation for section status: OK (rc=0, origin=NODE2/crmd/1359, version=0.434.9) Aug 27 08:46:19 [1326] NODE2 cib: info: cib_process_request: Completed cib_modify operation for section nodes: OK (rc=0, origin=NODE2/crmd/1362, version=0.434.9) Aug 27 08:46:19 [1326] NODE2 cib: info: cib_perform_op: Diff: --- 0.434.9 2 Aug 27 08:46:19 [1326] NODE2 cib: info: cib_perform_op: Diff: +++ 0.434.10 (null) Aug 27 08:46:19 [1326] NODE2 cib: info: cib_perform_op: + /cib: @num_updates=10 Aug 27 08:46:19 [1326] NODE2 cib: info: cib_perform_op: + /cib/status/node_state[@id='1']: @in_ccm=false, @crm-debug-origin=post_cache_update Aug 27 08:46:19 [1326] NODE2 cib: info: cib_perform_op: + /cib/status/node_state[@id='2']: @crm-debug-origin=post_cache_update Aug 27 08:46:19 [1326] NODE2 cib: info: cib_process_request: Completed cib_modify operation for section status: OK (rc=0, origin=NODE2/crmd/1363, version=0.434.10) Aug 27 08:46:24 [1326] NODE2 cib: info: cib_process_ping: Reporting our current digest to NODE2: 6dfb70e00efc6d1de9f85e9eac251ba6 for 0.434.10 (0x563b74317400 0) ***** Stack: corosync Current DC: NODE2 (version 1.1.21-4.el7-f14e36fd43) - partition with quorum Last updated: Thu Aug 27 08:46:21 2020 Last change: Thu Aug 27 08:42:18 2020 by hacluster via crmd on NODE1 2 nodes configured 19 resources configured Online: 
[ NODE2 ] OFFLINE: [ NODE1 ] Active resources: Resource Group: IPV VIRTUALIP (ocf::heartbeat:IPaddr2): Started NODE2 SOURCEIP (ocf::heartbeat:IPsrcaddr): Started NODE2 SERVICE1 (systemd:service1): Started NODE2 SERVICE2 (systemd:service2): Started NODE2 SERVICE3 (systemd:service3): Started NODE2 SERVICE4 (systemd:service4): Started NODE2 SERVICE5 (systemd:service5): Started NODE2 SERVICE6 (systemd:service6): Started NODE2 SERVICE7 (systemd:service7): Started NODE2 SERVICE8 (systemd:service8): Started NODE2 SERVICE9 (systemd:service9): Started NODE2 SERVICE10 (systemd:service10): Started NODE2 SERVICE11 (systemd:service11): Started NODE2 SERVICE12 (systemd:service12): Started NODE2 SERVICE13 (systemd:service13): Started NODE2 Clone Set: SERVICE14-clone [SERVICE14] Started: [ NODE2 ] Clone Set: SERVICE15-clone [SERVICE15] Started: [ NODE2 ] ***** ## WAITING FOR NODE1 TO COME BACK ***** Stack: corosync Current DC: NODE2 (version 1.1.21-4.el7-f14e36fd43) - partition with quorum Last updated: Thu Aug 27 08:47:02 2020 Last change: Thu Aug 27 08:42:18 2020 by hacluster via crmd on NODE1 2 nodes configured 19 resources configured Online: [ NODE1 NODE2 ] Active resources: Resource Group: IPV VIRTUALIP (ocf::heartbeat:IPaddr2): Started NODE2 SOURCEIP (ocf::heartbeat:IPsrcaddr): Started NODE2 SERVICE1 (systemd:service1): Stopping[ NODE1 NODE2 ] # <-- !!!!!!!!!! SERVICE2 (systemd:service2): Started NODE2 SERVICE3 (systemd:service3): Started NODE2 SERVICE4 (systemd:service4): Started NODE2 SERVICE5 (systemd:service5): Started NODE2 SERVICE6 (systemd:service6): Started NODE2 SERVICE7 (systemd:service7): Started NODE2 SERVICE8 (systemd:service8): Started NODE2 SERVICE9 (systemd:service9): Started NODE2 SERVICE10 (systemd:service10): Started NODE2 SERVICE11 (systemd:service11): Started NODE2 SERVICE12 (systemd:service12): Started NODE2 SERVICE13 (systemd:service13): Started NODE2 Clone Set: SERVICE14-clone [SERVICE14] Started: [ NODE2 ] Clone Set: SERVICE15-clone [SERVICE15] Started: [ NODE2 ] ***** # /var/log/cluster/corosync.log [1160] NODE2 corosyncnotice [TOTEM ] A new membership (10.130.10.1:208644) was formed. Members joined: 1 [1160] NODE2 corosyncwarning [CPG ] downlist left_list: 0 received [1160] NODE2 corosyncwarning [CPG ] downlist left_list: 0 received [1160] NODE2 corosyncnotice [QUORUM] Members[2]: 1 2 [1160] NODE2 corosyncnotice [MAIN ] Completed service synchronization, ready to provide service. 
Aug 27 08:46:54 [1240] NODE2 pacemakerd: info: pcmk_quorum_notification: Quorum retained | membership=208644 members=2 Aug 27 08:46:54 [1240] NODE2 pacemakerd: notice: crm_update_peer_state_iter: Node NODE1 state is now member | nodeid=1 previous=lost source=pcmk_quorum_notification Aug 27 08:46:54 [1331] NODE2 crmd: info: pcmk_quorum_notification: Quorum retained | membership=208644 members=2 Aug 27 08:46:54 [1331] NODE2 crmd: notice: crm_update_peer_state_iter: Node NODE1 state is now member | nodeid=1 previous=lost source=pcmk_quorum_notification Aug 27 08:46:54 [1331] NODE2 crmd: info: peer_update_callback: Cluster node NODE1 is now member (was lost) Aug 27 08:46:54 [1331] NODE2 crmd: notice: peer_update_callback: do_shutdown of peer NODE1 is complete | op=68 Aug 27 08:46:54 [1326] NODE2 cib: info: cib_process_request: Forwarding cib_modify operation for section status to all (origin=local/crmd/1364) Aug 27 08:46:54 [1331] NODE2 crmd: notice: do_state_transition: State transition S_IDLE -> S_INTEGRATION | input=I_NODE_JOIN cause=C_FSA_INTERNAL origin=check_join_state Aug 27 08:46:54 [1331] NODE2 crmd: info: do_dc_join_offer_one: Making join-1 offers to any unconfirmed nodes because an unknown node joined Aug 27 08:46:54 [1331] NODE2 crmd: info: join_make_offer: Making join-1 offers based on membership event 208644 Aug 27 08:46:54 [1331] NODE2 crmd: info: join_make_offer: Not making join-1 offer to already known node NODE2 (confirmed) Aug 27 08:46:54 [1331] NODE2 crmd: info: join_make_offer: Not making join-1 offer to inactive node NODE1 Aug 27 08:46:54 [1331] NODE2 crmd: info: abort_transition_graph: Transition aborted: Peer Halt | source=do_te_invoke:150 complete=true Aug 27 08:46:54 [1331] NODE2 crmd: info: do_state_transition: State transition S_INTEGRATION -> S_FINALIZE_JOIN | input=I_INTEGRATED cause=C_FSA_INTERNAL origin=check_join_state Aug 27 08:46:54 [1331] NODE2 crmd: info: do_state_transition: State transition S_FINALIZE_JOIN -> S_POLICY_ENGINE | input=I_FINALIZED cause=C_FSA_INTERNAL origin=check_join_state Aug 27 08:46:54 [1326] NODE2 cib: info: cib_process_request: Forwarding cib_modify operation for section nodes to all (origin=local/crmd/1367) Aug 27 08:46:54 [1326] NODE2 cib: info: cib_process_request: Forwarding cib_modify operation for section status to all (origin=local/crmd/1368) Aug 27 08:46:54 [1331] NODE2 crmd: info: abort_transition_graph: Transition aborted: Peer Cancelled | source=do_te_invoke:143 complete=true Aug 27 08:46:54 [1326] NODE2 cib: info: cib_process_request: Forwarding cib_modify operation for section nodes to all (origin=local/crmd/1371) Aug 27 08:46:54 [1326] NODE2 cib: info: cib_process_request: Forwarding cib_modify operation for section status to all (origin=local/crmd/1372) Aug 27 08:46:54 [1326] NODE2 cib: info: cib_process_request: Forwarding cib_modify operation for section cib to all (origin=local/crmd/1373) Aug 27 08:46:54 [1326] NODE2 cib: info: cib_perform_op: Diff: --- 0.434.10 2 Aug 27 08:46:54 [1326] NODE2 cib: info: cib_perform_op: Diff: +++ 0.434.11 (null) Aug 27 08:46:54 [1326] NODE2 cib: info: cib_perform_op: + /cib: @num_updates=11 Aug 27 08:46:54 [1326] NODE2 cib: info: cib_perform_op: + /cib/status/node_state[@id='1']: @crm-debug-origin=peer_update_callback Aug 27 08:46:54 [1326] NODE2 cib: info: cib_process_request: Completed cib_modify operation for section status: OK (rc=0, origin=NODE2/crmd/1364, version=0.434.11) Aug 27 08:46:54 [1326] NODE2 cib: info: cib_process_request: Completed cib_modify operation for section 
nodes: OK (rc=0, origin=NODE2/crmd/1367, version=0.434.11) Aug 27 08:46:54 [1326] NODE2 cib: info: cib_perform_op: Diff: --- 0.434.11 2 Aug 27 08:46:54 [1326] NODE2 cib: info: cib_perform_op: Diff: +++ 0.434.12 (null) Aug 27 08:46:54 [1326] NODE2 cib: info: cib_perform_op: + /cib: @num_updates=12 Aug 27 08:46:54 [1326] NODE2 cib: info: cib_perform_op: + /cib/status/node_state[@id='1']: @in_ccm=true, @crm-debug-origin=post_cache_update Aug 27 08:46:54 [1326] NODE2 cib: info: cib_process_request: Completed cib_modify operation for section status: OK (rc=0, origin=NODE2/crmd/1368, version=0.434.12) Aug 27 08:46:54 [1326] NODE2 cib: info: cib_process_request: Completed cib_modify operation for section nodes: OK (rc=0, origin=NODE2/crmd/1371, version=0.434.12) Aug 27 08:46:54 [1326] NODE2 cib: info: cib_perform_op: Diff: --- 0.434.12 2 Aug 27 08:46:54 [1326] NODE2 cib: info: cib_perform_op: Diff: +++ 0.434.13 (null) Aug 27 08:46:54 [1326] NODE2 cib: info: cib_perform_op: + /cib: @num_updates=13 Aug 27 08:46:54 [1326] NODE2 cib: info: cib_perform_op: + /cib/status/node_state[@id='1']: @crm-debug-origin=do_state_transition Aug 27 08:46:54 [1326] NODE2 cib: info: cib_perform_op: + /cib/status/node_state[@id='2']: @crm-debug-origin=do_state_transition Aug 27 08:46:54 [1326] NODE2 cib: info: cib_process_request: Completed cib_modify operation for section status: OK (rc=0, origin=NODE2/crmd/1372, version=0.434.13) Aug 27 08:46:54 [1326] NODE2 cib: info: cib_process_request: Completed cib_modify operation for section cib: OK (rc=0, origin=NODE2/crmd/1373, version=0.434.13) Aug 27 08:46:55 [1330] NODE2 pengine: notice: unpack_config: On loss of CCM Quorum: Ignore Aug 27 08:46:55 [1330] NODE2 pengine: info: determine_online_status: Node NODE2 is online Aug 27 08:46:55 [1330] NODE2 pengine: info: determine_op_status: Operation monitor found resource SERVICE4 active on NODE2 Aug 27 08:46:55 [1330] NODE2 pengine: info: determine_op_status: Operation monitor found resource SERVICE1 active on NODE2 Aug 27 08:46:55 [1330] NODE2 pengine: info: unpack_node_loop: Node 2 is already processed Aug 27 08:46:55 [1330] NODE2 pengine: info: unpack_node_loop: Node 2 is already processed Aug 27 08:46:55 [1330] NODE2 pengine: info: group_print: Resource Group: IPV Aug 27 08:46:55 [1330] NODE2 pengine: info: common_print: VIRTUALIP (ocf::heartbeat:IPaddr2): Started NODE2 Aug 27 08:46:55 [1330] NODE2 pengine: info: common_print: SOURCEIP (ocf::heartbeat:IPsrcaddr): Started NODE2 Aug 27 08:46:55 [1330] NODE2 pengine: info: common_print: SERVICE1 (systemd:service1): Started NODE2 Aug 27 08:46:55 [1330] NODE2 pengine: info: common_print: SERVICE2 (systemd:service2): Started NODE2 Aug 27 08:46:55 [1330] NODE2 pengine: info: common_print: SERVICE3 (systemd:service3): Started NODE2 Aug 27 08:46:55 [1330] NODE2 pengine: info: common_print: SERVICE4 (systemd:service4): Started NODE2 Aug 27 08:46:55 [1330] NODE2 pengine: info: common_print: SERVICE5 (systemd:service5): Started NODE2 Aug 27 08:46:55 [1330] NODE2 pengine: info: common_print: SERVICE6 (systemd:service6): Started NODE2 Aug 27 08:46:55 [1330] NODE2 pengine: info: common_print: SERVICE7 (systemd:service7): Started NODE2 Aug 27 08:46:55 [1330] NODE2 pengine: info: common_print: SERVICE8 (systemd:service8): Started NODE2 Aug 27 08:46:55 [1330] NODE2 pengine: info: common_print: SERVICE9 (systemd:service9): Started NODE2 Aug 27 08:46:55 [1330] NODE2 pengine: info: common_print: SERVICE10 (systemd:service10): Started NODE2 Aug 27 08:46:55 [1330] NODE2 pengine: info: 
common_print: SERVICE11 (systemd:service11): Started NODE2 Aug 27 08:46:55 [1330] NODE2 pengine: info: common_print: SERVICE12 (systemd:service12): Started NODE2 Aug 27 08:46:55 [1330] NODE2 pengine: info: common_print: SERVICE13 (systemd:service13): Started NODE2 Aug 27 08:46:55 [1330] NODE2 pengine: info: clone_print: Clone Set: SERVICE14-clone [SERVICE14] Aug 27 08:46:55 [1330] NODE2 pengine: info: short_print: Started: [ NODE2 ] Aug 27 08:46:55 [1330] NODE2 pengine: info: short_print: Stopped: [ NODE1 ] Aug 27 08:46:55 [1330] NODE2 pengine: info: clone_print: Clone Set: SERVICE15-clone [SERVICE15] Aug 27 08:46:55 [1330] NODE2 pengine: info: short_print: Started: [ NODE2 ] Aug 27 08:46:55 [1330] NODE2 pengine: info: short_print: Stopped: [ NODE1 ] Aug 27 08:46:55 [1330] NODE2 pengine: info: native_color: Resource SERVICE14:1 cannot run anywhere Aug 27 08:46:55 [1330] NODE2 pengine: info: native_color: Resource SERVICE15:1 cannot run anywhere Aug 27 08:46:55 [1330] NODE2 pengine: info: LogActions: Leave VIRTUALIP (Started NODE2) Aug 27 08:46:55 [1330] NODE2 pengine: info: LogActions: Leave SOURCEIP (Started NODE2) Aug 27 08:46:55 [1330] NODE2 pengine: info: LogActions: Leave SERVICE1 (Started NODE2) Aug 27 08:46:55 [1330] NODE2 pengine: info: LogActions: Leave SERVICE2 (Started NODE2) Aug 27 08:46:55 [1330] NODE2 pengine: info: LogActions: Leave SERVICE3 (Started NODE2) Aug 27 08:46:55 [1330] NODE2 pengine: info: LogActions: Leave SERVICE4 (Started NODE2) Aug 27 08:46:55 [1330] NODE2 pengine: info: LogActions: Leave SERVICE5 (Started NODE2) Aug 27 08:46:55 [1330] NODE2 pengine: info: LogActions: Leave SERVICE6 (Started NODE2) Aug 27 08:46:55 [1330] NODE2 pengine: info: LogActions: Leave SERVICE7 (Started NODE2) Aug 27 08:46:55 [1330] NODE2 pengine: info: LogActions: Leave SERVICE8 (Started NODE2) Aug 27 08:46:55 [1330] NODE2 pengine: info: LogActions: Leave SERVICE9 (Started NODE2) Aug 27 08:46:55 [1330] NODE2 pengine: info: LogActions: Leave SERVICE10 (Started NODE2) Aug 27 08:46:55 [1330] NODE2 pengine: info: LogActions: Leave SERVICE11 (Started NODE2) Aug 27 08:46:55 [1330] NODE2 pengine: info: LogActions: Leave SERVICE12 (Started NODE2) Aug 27 08:46:55 [1330] NODE2 pengine: info: LogActions: Leave SERVICE13 (Started NODE2) Aug 27 08:46:55 [1330] NODE2 pengine: info: LogActions: Leave SERVICE14:0 (Started NODE2) Aug 27 08:46:55 [1330] NODE2 pengine: info: LogActions: Leave SERVICE14:1 (Stopped) Aug 27 08:46:55 [1330] NODE2 pengine: info: LogActions: Leave SERVICE15:0 (Started NODE2) Aug 27 08:46:55 [1330] NODE2 pengine: info: LogActions: Leave SERVICE15:1 (Stopped) Aug 27 08:46:55 [1330] NODE2 pengine: notice: process_pe_message: Calculated transition 332, saving inputs in /var/lib/pacemaker/pengine/pe-input-1509.bz2 Aug 27 08:46:55 [1331] NODE2 crmd: info: do_state_transition: State transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE | input=I_PE_SUCCESS cause=C_IPC_MESSAGE origin=handle_response Aug 27 08:46:55 [1331] NODE2 crmd: info: do_te_invoke: Processing graph 332 (ref=pe_calc-dc-1598510815-968) derived from /var/lib/pacemaker/pengine/pe-input-1509.bz2 Aug 27 08:46:55 [1331] NODE2 crmd: notice: run_graph: Transition 332 (Complete=0, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=/var/lib/pacemaker/pengine/pe-input-1509.bz2): Complete Aug 27 08:46:55 [1331] NODE2 crmd: info: do_log: Input I_TE_SUCCESS received in state S_TRANSITION_ENGINE from notify_crmd Aug 27 08:46:55 [1331] NODE2 crmd: notice: do_state_transition: State transition S_TRANSITION_ENGINE -> S_IDLE | 
input=I_TE_SUCCESS cause=C_FSA_INTERNAL origin=notify_crmd Aug 27 08:46:56 [1240] NODE2 pacemakerd: info: pcmk_cpg_membership: Group pacemakerd event 14: node 1 pid 1254 joined via cpg_join Aug 27 08:46:56 [1240] NODE2 pacemakerd: info: pcmk_cpg_membership: Group pacemakerd event 14: NODE1 (node 1 pid 1254) is member Aug 27 08:46:56 [1240] NODE2 pacemakerd: info: crm_update_peer_proc: pcmk_cpg_membership: Node NODE1[1] - corosync-cpg is now online Aug 27 08:46:56 [1240] NODE2 pacemakerd: info: pcmk_cpg_membership: Group pacemakerd event 14: NODE2 (node 2 pid 1240) is member Aug 27 08:46:56 [1240] NODE2 pacemakerd: info: mcp_cpg_deliver: Ignoring process list sent by peer for local node Aug 27 08:46:56 [1329] NODE2 attrd: info: pcmk_cpg_membership: Group attrd event 14: node 1 pid 1262 joined via cpg_join Aug 27 08:46:56 [1329] NODE2 attrd: info: crm_get_peer: Created entry 90dd23fb-b65a-41cb-99d6-45b858fa9f67/0x562bfd3a2510 for node NODE1/1 (2 total) Aug 27 08:46:56 [1329] NODE2 attrd: info: crm_get_peer: Node 1 is now known as NODE1 Aug 27 08:46:56 [1329] NODE2 attrd: warning: crm_update_peer_uname: Node names with capitals are discouraged, consider changing 'NODE1' Aug 27 08:46:56 [1329] NODE2 attrd: info: crm_get_peer: Node 1 has uuid 1 Aug 27 08:46:56 [1329] NODE2 attrd: info: pcmk_cpg_membership: Group attrd event 14: NODE1 (node 1 pid 1262) is member Aug 27 08:46:56 [1329] NODE2 attrd: info: crm_update_peer_proc: pcmk_cpg_membership: Node NODE1[1] - corosync-cpg is now online Aug 27 08:46:56 [1329] NODE2 attrd: notice: crm_update_peer_state_iter: Node NODE1 state is now member | nodeid=1 previous=unknown source=crm_update_peer_proc Aug 27 08:46:56 [1329] NODE2 attrd: info: pcmk_cpg_membership: Group attrd event 14: NODE2 (node 2 pid 1329) is member Aug 27 08:46:56 [1327] NODE2 stonith-ng: info: pcmk_cpg_membership: Group stonith-ng event 14: node 1 pid 1260 joined via cpg_join Aug 27 08:46:56 [1327] NODE2 stonith-ng: info: crm_get_peer: Created entry 7da09b35-a6ad-4694-ab1f-f7572dec96be/0x5592dc152db0 for node NODE1/1 (2 total) Aug 27 08:46:56 [1327] NODE2 stonith-ng: info: crm_get_peer: Node 1 is now known as NODE1 Aug 27 08:46:56 [1327] NODE2 stonith-ng: warning: crm_update_peer_uname: Node names with capitals are discouraged, consider changing 'NODE1' Aug 27 08:46:56 [1327] NODE2 stonith-ng: info: crm_get_peer: Node 1 has uuid 1 Aug 27 08:46:56 [1327] NODE2 stonith-ng: info: pcmk_cpg_membership: Group stonith-ng event 14: NODE1 (node 1 pid 1260) is member Aug 27 08:46:56 [1327] NODE2 stonith-ng: info: crm_update_peer_proc: pcmk_cpg_membership: Node NODE1[1] - corosync-cpg is now online Aug 27 08:46:56 [1327] NODE2 stonith-ng: notice: crm_update_peer_state_iter: Node NODE1 state is now member | nodeid=1 previous=unknown source=crm_update_peer_proc Aug 27 08:46:56 [1327] NODE2 stonith-ng: info: pcmk_cpg_membership: Group stonith-ng event 14: NODE2 (node 2 pid 1327) is member Aug 27 08:46:57 [1326] NODE2 cib: info: pcmk_cpg_membership: Group cib event 14: node 1 pid 1259 joined via cpg_join Aug 27 08:46:57 [1326] NODE2 cib: info: crm_get_peer: Created entry 9b5a2951-192e-49bd-b5eb-eab2bd4873db/0x563b74302580 for node NODE1/1 (2 total) Aug 27 08:46:57 [1326] NODE2 cib: info: crm_get_peer: Node 1 is now known as NODE1 Aug 27 08:46:57 [1326] NODE2 cib: warning: crm_update_peer_uname: Node names with capitals are discouraged, consider changing 'NODE1' Aug 27 08:46:57 [1326] NODE2 cib: info: crm_get_peer: Node 1 has uuid 1 Aug 27 08:46:57 [1326] NODE2 cib: info: pcmk_cpg_membership: Group 
cib event 14: NODE1 (node 1 pid 1259) is member Aug 27 08:46:57 [1326] NODE2 cib: info: crm_update_peer_proc: pcmk_cpg_membership: Node NODE1[1] - corosync-cpg is now online Aug 27 08:46:57 [1326] NODE2 cib: notice: crm_update_peer_state_iter: Node NODE1 state is now member | nodeid=1 previous=unknown source=crm_update_peer_proc Aug 27 08:46:57 [1326] NODE2 cib: info: pcmk_cpg_membership: Group cib event 14: NODE2 (node 2 pid 1326) is member Aug 27 08:46:57 [1331] NODE2 crmd: info: pcmk_cpg_membership: Group crmd event 14: node 1 pid 1264 joined via cpg_join Aug 27 08:46:57 [1331] NODE2 crmd: info: pcmk_cpg_membership: Group crmd event 14: NODE1 (node 1 pid 1264) is member Aug 27 08:46:57 [1331] NODE2 crmd: info: crm_update_peer_proc: pcmk_cpg_membership: Node NODE1[1] - corosync-cpg is now online Aug 27 08:46:57 [1331] NODE2 crmd: info: peer_update_callback: Client NODE1/peer now has status [online] (DC=true, changed=4000000) Aug 27 08:46:57 [1331] NODE2 crmd: info: te_trigger_stonith_history_sync: Fence history will be synchronized cluster-wide within 5 seconds Aug 27 08:46:57 [1331] NODE2 crmd: info: pcmk_cpg_membership: Group crmd event 14: NODE2 (node 2 pid 1331) is member Aug 27 08:46:57 [1331] NODE2 crmd: notice: do_state_transition: State transition S_IDLE -> S_INTEGRATION | input=I_NODE_JOIN cause=C_FSA_INTERNAL origin=peer_update_callback Aug 27 08:46:57 [1331] NODE2 crmd: info: do_dc_join_offer_one: Making join-1 offers to any unconfirmed nodes because an unknown node joined Aug 27 08:46:57 [1331] NODE2 crmd: info: join_make_offer: Not making join-1 offer to already known node NODE2 (confirmed) Aug 27 08:46:57 [1331] NODE2 crmd: info: join_make_offer: Sending join-1 offer to NODE1 Aug 27 08:46:57 [1326] NODE2 cib: info: cib_process_request: Forwarding cib_modify operation for section status to all (origin=local/crmd/1376) Aug 27 08:46:57 [1331] NODE2 crmd: info: abort_transition_graph: Transition aborted: Peer Halt | source=do_te_invoke:150 complete=true Aug 27 08:46:57 [1326] NODE2 cib: info: cib_perform_op: Diff: --- 0.434.13 2 Aug 27 08:46:57 [1326] NODE2 cib: info: cib_perform_op: Diff: +++ 0.434.14 (null) Aug 27 08:46:57 [1326] NODE2 cib: info: cib_perform_op: + /cib: @num_updates=14 Aug 27 08:46:57 [1326] NODE2 cib: info: cib_perform_op: + /cib/status/node_state[@id='1']: @crmd=online, @crm-debug-origin=peer_update_callback Aug 27 08:46:57 [1326] NODE2 cib: info: cib_process_request: Completed cib_modify operation for section status: OK (rc=0, origin=NODE2/crmd/1376, version=0.434.14) Aug 27 08:46:57 [1326] NODE2 cib: info: cib_process_request: Completed cib_delete operation for section //node_state[@uname='NODE1']/transient_attributes: OK (rc=0, origin=NODE1/attrd/2, version=0.434.14) Aug 27 08:46:57 [1329] NODE2 attrd: info: election_count_vote: election-attrd round 1 (owner node ID 1) pass: vote from NODE1 (Uptime) Aug 27 08:46:57 [1329] NODE2 attrd: info: attrd_peer_update: Setting #attrd-protocol[NODE1]: (null) -> 2 from NODE1 Aug 27 08:46:57 [1329] NODE2 attrd: info: election_check: election-attrd won by local node Aug 27 08:46:57 [1329] NODE2 attrd: notice: attrd_declare_winner: Recorded local node as attribute writer (was unset) Aug 27 08:46:57 [1329] NODE2 attrd: info: write_attribute: Sent CIB request 168 with 1 change for fail-count-SERVICE1#start_0 (id n/a, set n/a) Aug 27 08:46:57 [1329] NODE2 attrd: info: write_attribute: Processed 2 private changes for #attrd-protocol, id=n/a, set=n/a Aug 27 08:46:57 [1329] NODE2 attrd: info: write_attribute: Sent CIB 
request 169 with 1 change for fail-count-SERVICE4#start_0 (id n/a, set n/a) Aug 27 08:46:57 [1329] NODE2 attrd: info: write_attribute: Sent CIB request 170 with 1 change for last-failure-SERVICE1#start_0 (id n/a, set n/a) Aug 27 08:46:57 [1329] NODE2 attrd: info: write_attribute: Sent CIB request 171 with 1 change for last-failure-SERVICE4#start_0 (id n/a, set n/a) Aug 27 08:46:57 [1329] NODE2 attrd: info: write_attribute: Sent CIB request 172 with 1 change for last-failure-SERVICE1#stop_0 (id n/a, set n/a) Aug 27 08:46:57 [1326] NODE2 cib: info: cib_process_request: Forwarding cib_modify operation for section status to all (origin=local/attrd/168) Aug 27 08:46:57 [1326] NODE2 cib: info: cib_process_request: Forwarding cib_modify operation for section status to all (origin=local/attrd/169) Aug 27 08:46:57 [1326] NODE2 cib: info: cib_process_request: Forwarding cib_modify operation for section status to all (origin=local/attrd/170) Aug 27 08:46:57 [1326] NODE2 cib: info: cib_process_request: Forwarding cib_modify operation for section status to all (origin=local/attrd/171) Aug 27 08:46:57 [1326] NODE2 cib: info: cib_process_request: Forwarding cib_modify operation for section status to all (origin=local/attrd/172) Aug 27 08:46:57 [1326] NODE2 cib: info: cib_perform_op: Diff: --- 0.434.14 2 Aug 27 08:46:57 [1326] NODE2 cib: info: cib_perform_op: Diff: +++ 0.434.15 (null) Aug 27 08:46:57 [1326] NODE2 cib: info: cib_perform_op: + /cib: @num_updates=15 Aug 27 08:46:57 [1326] NODE2 cib: info: cib_process_request: Completed cib_modify operation for section status: OK (rc=0, origin=NODE2/attrd/168, version=0.434.15) Aug 27 08:46:57 [1329] NODE2 attrd: info: attrd_cib_callback: CIB update 168 result for fail-count-SERVICE1#start_0: OK | rc=0 Aug 27 08:46:57 [1329] NODE2 attrd: info: attrd_cib_callback: * fail-count-SERVICE1#start_0[NODE2]=(null) Aug 27 08:46:57 [1326] NODE2 cib: info: cib_perform_op: Diff: --- 0.434.15 2 Aug 27 08:46:57 [1326] NODE2 cib: info: cib_perform_op: Diff: +++ 0.434.16 (null) Aug 27 08:46:57 [1326] NODE2 cib: info: cib_perform_op: + /cib: @num_updates=16 Aug 27 08:46:57 [1326] NODE2 cib: info: cib_process_request: Completed cib_modify operation for section status: OK (rc=0, origin=NODE2/attrd/169, version=0.434.16) Aug 27 08:46:57 [1329] NODE2 attrd: info: attrd_cib_callback: CIB update 169 result for fail-count-SERVICE4#start_0: OK | rc=0 Aug 27 08:46:57 [1329] NODE2 attrd: info: attrd_cib_callback: * fail-count-SERVICE4#start_0[NODE2]=(null) Aug 27 08:46:57 [1326] NODE2 cib: info: cib_perform_op: Diff: --- 0.434.16 2 Aug 27 08:46:57 [1326] NODE2 cib: info: cib_perform_op: Diff: +++ 0.434.17 (null) Aug 27 08:46:57 [1326] NODE2 cib: info: cib_perform_op: + /cib: @num_updates=17 Aug 27 08:46:57 [1326] NODE2 cib: info: cib_process_request: Completed cib_modify operation for section status: OK (rc=0, origin=NODE2/attrd/170, version=0.434.17) Aug 27 08:46:57 [1329] NODE2 attrd: info: attrd_cib_callback: CIB update 170 result for last-failure-SERVICE1#start_0: OK | rc=0 Aug 27 08:46:57 [1329] NODE2 attrd: info: attrd_cib_callback: * last-failure-SERVICE1#start_0[NODE2]=(null) Aug 27 08:46:57 [1326] NODE2 cib: info: cib_perform_op: Diff: --- 0.434.17 2 Aug 27 08:46:57 [1326] NODE2 cib: info: cib_perform_op: Diff: +++ 0.434.18 (null) Aug 27 08:46:57 [1326] NODE2 cib: info: cib_perform_op: + /cib: @num_updates=18 Aug 27 08:46:57 [1326] NODE2 cib: info: cib_process_request: Completed cib_modify operation for section status: OK (rc=0, origin=NODE2/attrd/171, version=0.434.18) Aug 27 
08:46:57 [1329] NODE2 attrd: info: attrd_cib_callback: CIB update 171 result for last-failure-SERVICE4#start_0: OK | rc=0 Aug 27 08:46:57 [1329] NODE2 attrd: info: attrd_cib_callback: * last-failure-SERVICE4#start_0[NODE2]=(null) Aug 27 08:46:57 [1326] NODE2 cib: info: cib_perform_op: Diff: --- 0.434.18 2 Aug 27 08:46:57 [1326] NODE2 cib: info: cib_perform_op: Diff: +++ 0.434.19 (null) Aug 27 08:46:57 [1326] NODE2 cib: info: cib_perform_op: + /cib: @num_updates=19 Aug 27 08:46:57 [1326] NODE2 cib: info: cib_process_request: Completed cib_modify operation for section status: OK (rc=0, origin=NODE2/attrd/172, version=0.434.19) Aug 27 08:46:57 [1329] NODE2 attrd: info: attrd_cib_callback: CIB update 172 result for last-failure-SERVICE1#stop_0: OK | rc=0 Aug 27 08:46:57 [1329] NODE2 attrd: info: attrd_cib_callback: * last-failure-SERVICE1#stop_0[NODE2]=(null) Aug 27 08:46:57 [1326] NODE2 cib: info: cib_process_request: Completed cib_modify operation for section nodes: OK (rc=0, origin=NODE1/crmd/5, version=0.434.19) Aug 27 08:46:58 [1331] NODE2 crmd: info: join_make_offer: Sending join-1 offer to NODE1 Aug 27 08:46:58 [1331] NODE2 crmd: info: join_make_offer: Sending join-1 offer to NODE2 Aug 27 08:46:58 [1331] NODE2 crmd: info: abort_transition_graph: Transition aborted: Node join | source=do_dc_join_offer_one:269 complete=true Aug 27 08:46:58 [1331] NODE2 crmd: info: do_dc_join_offer_one: Waiting on join-1 requests from 2 outstanding nodes Aug 27 08:46:59 [1331] NODE2 crmd: info: crm_update_peer_expected: do_dc_join_filter_offer: Node NODE1[1] - expected state is now member (was down) Aug 27 08:46:59 [1331] NODE2 crmd: info: do_state_transition: State transition S_INTEGRATION -> S_FINALIZE_JOIN | input=I_INTEGRATED cause=C_FSA_INTERNAL origin=check_join_state Aug 27 08:46:59 [1326] NODE2 cib: info: cib_process_request: Forwarding cib_modify operation for section nodes to all (origin=local/crmd/1379) Aug 27 08:46:59 [1326] NODE2 cib: info: cib_process_request: Forwarding cib_modify operation for section nodes to all (origin=local/crmd/1380) Aug 27 08:46:59 [1326] NODE2 cib: info: cib_process_replace: Digest matched on replace from NODE2: a73e34b494d0ff4a4eadead9620ea5b4 Aug 27 08:46:59 [1326] NODE2 cib: info: cib_process_replace: Replaced 0.434.19 with 0.434.19 from NODE2 Aug 27 08:46:59 [1326] NODE2 cib: info: cib_process_request: Completed cib_replace operation for section 'all': OK (rc=0, origin=NODE2/crmd/1378, version=0.434.19) Aug 27 08:46:59 [1331] NODE2 crmd: info: do_state_transition: State transition S_FINALIZE_JOIN -> S_INTEGRATION | input=I_JOIN_REQUEST cause=C_HA_MESSAGE origin=route_message Aug 27 08:46:59 [1331] NODE2 crmd: info: join_make_offer: Sending join-1 offer to NODE1 Aug 27 08:46:59 [1331] NODE2 crmd: info: join_make_offer: Sending join-1 offer to NODE2 Aug 27 08:46:59 [1331] NODE2 crmd: info: abort_transition_graph: Transition aborted: Node join | source=do_dc_join_offer_one:269 complete=true Aug 27 08:46:59 [1331] NODE2 crmd: info: do_dc_join_offer_one: Waiting on join-1 requests from 2 outstanding nodes Aug 27 08:46:59 [1326] NODE2 cib: info: cib_process_request: Completed cib_modify operation for section nodes: OK (rc=0, origin=NODE2/crmd/1379, version=0.434.19) Aug 27 08:46:59 [1326] NODE2 cib: info: cib_process_request: Completed cib_modify operation for section nodes: OK (rc=0, origin=NODE2/crmd/1380, version=0.434.19) Aug 27 08:46:59 [1331] NODE2 crmd: info: do_log: Input I_JOIN_RESULT received in state S_INTEGRATION from route_message Aug 27 08:46:59 [1331] 
NODE2 crmd: info: do_dc_join_ack: Ignoring out-of-sequence join-1 confirmation from NODE2 (currently welcomed not finalized) Aug 27 08:46:59 [1331] NODE2 crmd: info: do_log: Input I_JOIN_RESULT received in state S_INTEGRATION from route_message Aug 27 08:46:59 [1331] NODE2 crmd: info: do_dc_join_ack: Ignoring out-of-sequence join-1 confirmation from NODE1 (currently welcomed not finalized) Aug 27 08:46:59 [1331] NODE2 crmd: info: do_state_transition: State transition S_INTEGRATION -> S_FINALIZE_JOIN | input=I_INTEGRATED cause=C_FSA_INTERNAL origin=check_join_state Aug 27 08:46:59 [1326] NODE2 cib: info: cib_process_request: Forwarding cib_modify operation for section nodes to all (origin=local/crmd/1383) Aug 27 08:46:59 [1326] NODE2 cib: info: cib_process_request: Forwarding cib_modify operation for section nodes to all (origin=local/crmd/1384) Aug 27 08:46:59 [1326] NODE2 cib: info: cib_process_replace: Digest matched on replace from NODE2: a73e34b494d0ff4a4eadead9620ea5b4 Aug 27 08:46:59 [1326] NODE2 cib: info: cib_process_replace: Replaced 0.434.19 with 0.434.19 from NODE2 Aug 27 08:46:59 [1326] NODE2 cib: info: cib_process_request: Completed cib_replace operation for section 'all': OK (rc=0, origin=NODE2/crmd/1382, version=0.434.19) Aug 27 08:46:59 [1326] NODE2 cib: info: cib_process_request: Completed cib_modify operation for section nodes: OK (rc=0, origin=NODE2/crmd/1383, version=0.434.19) Aug 27 08:46:59 [1326] NODE2 cib: info: cib_process_request: Completed cib_modify operation for section nodes: OK (rc=0, origin=NODE2/crmd/1384, version=0.434.19) Aug 27 08:46:59 [1331] NODE2 crmd: info: controld_delete_node_state: Deleting resource history for node NODE1 (via CIB call 1385) | xpath=//node_state[@uname='NODE1']/lrm Aug 27 08:46:59 [1326] NODE2 cib: info: cib_process_request: Forwarding cib_delete operation for section //node_state[@uname='NODE1']/lrm to all (origin=local/crmd/1385) Aug 27 08:46:59 [1326] NODE2 cib: info: cib_process_request: Forwarding cib_modify operation for section status to all (origin=local/crmd/1386) Aug 27 08:46:59 [1331] NODE2 crmd: info: controld_delete_node_state: Deleting resource history for node NODE2 (via CIB call 1387) | xpath=//node_state[@uname='NODE2']/lrm Aug 27 08:46:59 [1326] NODE2 cib: info: cib_process_request: Forwarding cib_delete operation for section //node_state[@uname='NODE2']/lrm to all (origin=local/crmd/1387) Aug 27 08:46:59 [1326] NODE2 cib: info: cib_process_request: Forwarding cib_modify operation for section status to all (origin=local/crmd/1388) Aug 27 08:46:59 [1326] NODE2 cib: info: cib_perform_op: Diff: --- 0.434.19 2 Aug 27 08:46:59 [1326] NODE2 cib: info: cib_perform_op: Diff: +++ 0.434.20 (null) Aug 27 08:46:59 [1326] NODE2 cib: info: cib_perform_op: -- /cib/status/node_state[@id='1']/lrm[@id='1'] Aug 27 08:46:59 [1326] NODE2 cib: info: cib_perform_op: + /cib: @num_updates=20 Aug 27 08:46:59 [1326] NODE2 cib: info: cib_process_request: Completed cib_delete operation for section //node_state[@uname='NODE1']/lrm: OK (rc=0, origin=NODE2/crmd/1385, version=0.434.20) Aug 27 08:46:59 [1326] NODE2 cib: info: cib_perform_op: Diff: --- 0.434.20 2 Aug 27 08:46:59 [1326] NODE2 cib: info: cib_perform_op: Diff: +++ 0.434.21 (null) Aug 27 08:46:59 [1326] NODE2 cib: info: cib_perform_op: + /cib: @num_updates=21 Aug 27 08:46:59 [1326] NODE2 cib: info: cib_perform_op: + /cib/status/node_state[@id='1']: @crm-debug-origin=do_lrm_query_internal, @join=member, @expected=member Aug 27 08:46:59 [1326] NODE2 cib: info: cib_perform_op: ++ 
/cib/status/node_state[@id='1']: Aug 27 08:46:59 [1326] NODE2 cib: info: cib_perform_op: ++ Aug 27 08:46:59 [1326] NODE2 cib: info: cib_perform_op: ++ Aug 27 08:46:59 [1326] NODE2 cib: info: cib_process_request: Completed cib_modify operation for section status: OK (rc=0, origin=NODE2/crmd/1386, version=0.434.21) Aug 27 08:46:59 [1331] NODE2 crmd: info: do_state_transition: State transition S_FINALIZE_JOIN -> S_POLICY_ENGINE | input=I_FINALIZED cause=C_FSA_INTERNAL origin=check_join_state Aug 27 08:46:59 [1331] NODE2 crmd: info: abort_transition_graph: Transition aborted: Peer Cancelled | source=do_te_invoke:143 complete=true Aug 27 08:46:59 [1326] NODE2 cib: info: cib_perform_op: Diff: --- 0.434.21 2 Aug 27 08:46:59 [1326] NODE2 cib: info: cib_perform_op: Diff: +++ 0.434.22 (null) Aug 27 08:46:59 [1326] NODE2 cib: info: cib_perform_op: -- /cib/status/node_state[@id='2']/lrm[@id='2'] Aug 27 08:46:59 [1326] NODE2 cib: info: cib_perform_op: + /cib: @num_updates=22 Aug 27 08:46:59 [1326] NODE2 cib: info: cib_process_request: Completed cib_delete operation for section //node_state[@uname='NODE2']/lrm: OK (rc=0, origin=NODE2/crmd/1387, version=0.434.22) Aug 27 08:46:59 [1331] NODE2 crmd: info: abort_transition_graph: Transition aborted by deletion of lrm[@id='2']: Resource state removal | cib=0.434.22 source=abort_unless_down:370 path=/cib/status/node_state[@id='2']/lrm[@id='2'] complete=true Aug 27 08:46:59 [1326] NODE2 cib: info: cib_perform_op: Diff: --- 0.434.22 2 Aug 27 08:46:59 [1326] NODE2 cib: info: cib_perform_op: Diff: +++ 0.434.23 (null) Aug 27 08:46:59 [1326] NODE2 cib: info: cib_perform_op: + /cib: @num_updates=23 Aug 27 08:46:59 [1326] NODE2 cib: info: cib_perform_op: + /cib/status/node_state[@id='2']: @crm-debug-origin=do_lrm_query_internal Aug 27 08:46:59 [1326] NODE2 cib: info: cib_perform_op: ++ /cib/status/node_state[@id='2']: Aug 27 08:46:59 [1326] NODE2 cib: info: cib_perform_op: ++ Aug 27 08:46:59 [1326] NODE2 cib: info: cib_perform_op: ++ Aug 27 08:46:59 [1326] NODE2 cib: info: cib_perform_op: ++ Aug 27 08:46:59 [1326] NODE2 cib: info: cib_perform_op: ++ Aug 27 08:46:59 [1326] NODE2 cib: info: cib_perform_op: ++ Aug 27 08:46:59 [1326] NODE2 cib: info: cib_perform_op: ++ Aug 27 08:46:59 [1326] NODE2 cib: info: cib_perform_op: ++ Aug 27 08:46:59 [1326] NODE2 cib: info: cib_perform_op: ++ Aug 27 08:46:59 [1326] NODE2 cib: info: cib_perform_op: ++ Aug 27 08:46:59 [1326] NODE2 cib: info: cib_perform_op: ++ Aug 27 08:46:59 [1326] NODE2 cib: info: cib_perform_op: ++ Aug 27 08:46:59 [1326] NODE2 cib: info: cib_perform_op: ++ Aug 27 08:46:59 [1326] NODE2 cib: info: cib_perform_op: ++ Aug 27 08:46:59 [1326] NODE2 cib: info: cib_perform_op: ++ Aug 27 08:46:59 [1326] NODE2 cib: info: cib_perform_op: ++ Aug 27 08:46:59 [1326] NODE2 cib: info: cib_perform_op: ++ Aug 27 08:46:59 [1326] NODE2 cib: info: cib_perform_op: ++ Aug 27 08:46:59 [1326] NODE2 cib: info: cib_perform_op: ++ Aug 27 08:46:59 [1326] NODE2 cib: info: cib_perform_op: ++ Aug 27 08:46:59 [1326] NODE2 cib: info: cib_perform_op: ++ Aug 27 08:46:59 [1326] NODE2 cib: info: cib_perform_op: ++ Aug 27 08:46:59 [1326] NODE2 cib: info: cib_perform_op: ++ Aug 27 08:46:59 [1326] NODE2 cib: info: cib_perform_op: ++ Aug 27 08:46:59 [1326] NODE2 cib: info: cib_perform_op: ++ Aug 27 08:46:59 [1326] NODE2 cib: info: cib_perform_op: ++ Aug 27 08:46:59 [1326] NODE2 cib: info: cib_perform_op: ++ Aug 27 08:46:59 [1326] NODE2 cib: info: cib_perform_op: ++ Aug 27 08:46:59 [1326] NODE2 cib: info: cib_perform_op: ++ Aug 27 08:46:59 [1326] NODE2 
cib: info: cib_perform_op: ++ Aug 27 08:46:59 [1326] NODE2 cib: info: cib_process_request: Completed cib_modify operation for section status: OK (rc=0, origin=NODE2/crmd/1388, version=0.434.23) Aug 27 08:46:59 [1331] NODE2 crmd: info: abort_transition_graph: Transition aborted: LRM Refresh | source=process_resource_updates:294 complete=true Aug 27 08:46:59 [1326] NODE2 cib: info: cib_process_request: Forwarding cib_modify operation for section nodes to all (origin=local/crmd/1391) Aug 27 08:46:59 [1326] NODE2 cib: info: cib_process_request: Forwarding cib_modify operation for section status to all (origin=local/crmd/1392) Aug 27 08:46:59 [1326] NODE2 cib: info: cib_process_request: Forwarding cib_modify operation for section cib to all (origin=local/crmd/1393) Aug 27 08:46:59 [1326] NODE2 cib: info: cib_file_backup: Archived previous version as /var/lib/pacemaker/cib/cib-54.raw Aug 27 08:47:00 [1326] NODE2 cib: info: cib_process_request: Completed cib_modify operation for section nodes: OK (rc=0, origin=NODE2/crmd/1391, version=0.434.23) Aug 27 08:47:00 [1326] NODE2 cib: info: cib_perform_op: Diff: --- 0.434.23 2 Aug 27 08:47:00 [1326] NODE2 cib: info: cib_perform_op: Diff: +++ 0.434.24 (null) Aug 27 08:47:00 [1326] NODE2 cib: info: cib_perform_op: + /cib: @num_updates=24 Aug 27 08:47:00 [1326] NODE2 cib: info: cib_perform_op: + /cib/status/node_state[@id='1']: @crm-debug-origin=do_state_transition Aug 27 08:47:00 [1326] NODE2 cib: info: cib_perform_op: + /cib/status/node_state[@id='2']: @crm-debug-origin=do_state_transition Aug 27 08:47:00 [1326] NODE2 cib: info: cib_process_request: Completed cib_modify operation for section status: OK (rc=0, origin=NODE2/crmd/1392, version=0.434.24) Aug 27 08:47:00 [1326] NODE2 cib: info: cib_process_request: Completed cib_modify operation for section cib: OK (rc=0, origin=NODE2/crmd/1393, version=0.434.24) Aug 27 08:47:00 [1326] NODE2 cib: info: cib_file_write_with_digest: Wrote version 0.434.0 of the CIB to disk (digest: 5698b0502258efe142e4750fe055e120) Aug 27 08:47:00 [1326] NODE2 cib: info: cib_file_write_with_digest: Reading cluster configuration file /var/lib/pacemaker/cib/cib.xnvdSF (digest: /var/lib/pacemaker/cib/cib.m8Q2QM) Aug 27 08:47:00 [1326] NODE2 cib: info: cib_file_backup: Archived previous version as /var/lib/pacemaker/cib/cib-55.raw Aug 27 08:47:00 [1326] NODE2 cib: info: cib_file_write_with_digest: Wrote version 0.434.0 of the CIB to disk (digest: 6230f319a9bccbd7c4b84b80c2299632) Aug 27 08:47:00 [1326] NODE2 cib: info: cib_file_write_with_digest: Reading cluster configuration file /var/lib/pacemaker/cib/cib.AaHAsG (digest: /var/lib/pacemaker/cib/cib.Zq949N) Aug 27 08:47:01 [1330] NODE2 pengine: notice: unpack_config: On loss of CCM Quorum: Ignore Aug 27 08:47:01 [1330] NODE2 pengine: info: determine_online_status: Node NODE1 is online Aug 27 08:47:01 [1330] NODE2 pengine: info: determine_online_status: Node NODE2 is online Aug 27 08:47:01 [1330] NODE2 pengine: info: determine_op_status: Operation monitor found resource SERVICE4 active on NODE2 Aug 27 08:47:01 [1330] NODE2 pengine: info: determine_op_status: Operation monitor found resource SERVICE1 active on NODE2 Aug 27 08:47:01 [1330] NODE2 pengine: info: unpack_node_loop: Node 1 is already processed Aug 27 08:47:01 [1330] NODE2 pengine: info: unpack_node_loop: Node 2 is already processed Aug 27 08:47:01 [1330] NODE2 pengine: info: unpack_node_loop: Node 1 is already processed Aug 27 08:47:01 [1330] NODE2 pengine: info: unpack_node_loop: Node 2 is already processed Aug 27 08:47:01 
[1330] NODE2 pengine: info: group_print: Resource Group: IPV Aug 27 08:47:01 [1330] NODE2 pengine: info: common_print: VIRTUALIP (ocf::heartbeat:IPaddr2): Started NODE2 Aug 27 08:47:01 [1330] NODE2 pengine: info: common_print: SOURCEIP (ocf::heartbeat:IPsrcaddr): Started NODE2 Aug 27 08:47:01 [1330] NODE2 pengine: info: common_print: SERVICE1 (systemd:service1): Started NODE2 Aug 27 08:47:01 [1330] NODE2 pengine: info: common_print: SERVICE2 (systemd:service2): Started NODE2 Aug 27 08:47:01 [1330] NODE2 pengine: info: common_print: SERVICE3 (systemd:service3): Started NODE2 Aug 27 08:47:01 [1330] NODE2 pengine: info: common_print: SERVICE4 (systemd:service4): Started NODE2 Aug 27 08:47:01 [1330] NODE2 pengine: info: common_print: SERVICE5 (systemd:service5): Started NODE2 Aug 27 08:47:01 [1330] NODE2 pengine: info: common_print: SERVICE6 (systemd:service6): Started NODE2 Aug 27 08:47:01 [1330] NODE2 pengine: info: common_print: SERVICE7 (systemd:service7): Started NODE2 Aug 27 08:47:01 [1330] NODE2 pengine: info: common_print: SERVICE8 (systemd:service8): Started NODE2 Aug 27 08:47:01 [1330] NODE2 pengine: info: common_print: SERVICE9 (systemd:service9): Started NODE2 Aug 27 08:47:01 [1330] NODE2 pengine: info: common_print: SERVICE10 (systemd:service10): Started NODE2 Aug 27 08:47:01 [1330] NODE2 pengine: info: common_print: SERVICE11 (systemd:service11): Started NODE2 Aug 27 08:47:01 [1330] NODE2 pengine: info: common_print: SERVICE12 (systemd:service12): Started NODE2 Aug 27 08:47:01 [1330] NODE2 pengine: info: common_print: SERVICE13 (systemd:service13): Started NODE2 Aug 27 08:47:01 [1330] NODE2 pengine: info: clone_print: Clone Set: SERVICE14-clone [SERVICE14] Aug 27 08:47:01 [1330] NODE2 pengine: info: short_print: Started: [ NODE2 ] Aug 27 08:47:01 [1330] NODE2 pengine: info: short_print: Stopped: [ NODE1 ] Aug 27 08:47:01 [1330] NODE2 pengine: info: clone_print: Clone Set: SERVICE15-clone [SERVICE15] Aug 27 08:47:01 [1330] NODE2 pengine: info: short_print: Started: [ NODE2 ] Aug 27 08:47:01 [1330] NODE2 pengine: info: short_print: Stopped: [ NODE1 ] Aug 27 08:47:01 [1330] NODE2 pengine: info: RecurringOp: Start recurring monitor (10s) for SERVICE14:1 on NODE1 Aug 27 08:47:01 [1330] NODE2 pengine: info: RecurringOp: Start recurring monitor (10s) for SERVICE15:1 on NODE1 Aug 27 08:47:01 [1330] NODE2 pengine: info: LogActions: Leave VIRTUALIP (Started NODE2) Aug 27 08:47:01 [1330] NODE2 pengine: info: LogActions: Leave SOURCEIP (Started NODE2) Aug 27 08:47:01 [1330] NODE2 pengine: info: LogActions: Leave SERVICE1 (Started NODE2) Aug 27 08:47:01 [1330] NODE2 pengine: info: LogActions: Leave SERVICE2 (Started NODE2) Aug 27 08:47:01 [1330] NODE2 pengine: info: LogActions: Leave SERVICE3 (Started NODE2) Aug 27 08:47:01 [1330] NODE2 pengine: info: LogActions: Leave SERVICE4 (Started NODE2) Aug 27 08:47:01 [1330] NODE2 pengine: info: LogActions: Leave SERVICE5 (Started NODE2) Aug 27 08:47:01 [1330] NODE2 pengine: info: LogActions: Leave SERVICE6 (Started NODE2) Aug 27 08:47:01 [1330] NODE2 pengine: info: LogActions: Leave SERVICE7 (Started NODE2) Aug 27 08:47:01 [1330] NODE2 pengine: info: LogActions: Leave SERVICE8 (Started NODE2) Aug 27 08:47:01 [1330] NODE2 pengine: info: LogActions: Leave SERVICE9 (Started NODE2) Aug 27 08:47:01 [1330] NODE2 pengine: info: LogActions: Leave SERVICE10 (Started NODE2) Aug 27 08:47:01 [1330] NODE2 pengine: info: LogActions: Leave SERVICE11 (Started NODE2) Aug 27 08:47:01 [1330] NODE2 pengine: info: LogActions: Leave SERVICE12 (Started NODE2) Aug 27 
08:47:01 [1330] NODE2 pengine: info: LogActions: Leave SERVICE13 (Started NODE2) Aug 27 08:47:01 [1330] NODE2 pengine: info: LogActions: Leave SERVICE14:0 (Started NODE2) Aug 27 08:47:01 [1330] NODE2 pengine: notice: LogAction: * Start SERVICE14:1 ( NODE1 ) Aug 27 08:47:01 [1330] NODE2 pengine: info: LogActions: Leave SERVICE15:0 (Started NODE2) Aug 27 08:47:01 [1330] NODE2 pengine: notice: LogAction: * Start SERVICE15:1 ( NODE1 ) Aug 27 08:47:01 [1330] NODE2 pengine: notice: process_pe_message: Calculated transition 333, saving inputs in /var/lib/pacemaker/pengine/pe-input-1510.bz2 Aug 27 08:47:01 [1331] NODE2 crmd: info: do_state_transition: State transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE | input=I_PE_SUCCESS cause=C_IPC_MESSAGE origin=handle_response Aug 27 08:47:01 [1331] NODE2 crmd: info: do_te_invoke: Processing graph 333 (ref=pe_calc-dc-1598510821-982) derived from /var/lib/pacemaker/pengine/pe-input-1510.bz2 Aug 27 08:47:01 [1331] NODE2 crmd: notice: te_rsc_command: Initiating monitor operation VIRTUALIP_monitor_0 on NODE1 | action 18 Aug 27 08:47:01 [1331] NODE2 crmd: notice: te_rsc_command: Initiating monitor operation SOURCEIP_monitor_0 on NODE1 | action 19 Aug 27 08:47:01 [1331] NODE2 crmd: notice: te_rsc_command: Initiating monitor operation SERVICE1_monitor_0 on NODE1 | action 20 Aug 27 08:47:01 [1331] NODE2 crmd: notice: te_rsc_command: Initiating monitor operation SERVICE2_monitor_0 on NODE1 | action 21 Aug 27 08:47:01 [1331] NODE2 crmd: notice: te_rsc_command: Initiating monitor operation SERVICE3_monitor_0 on NODE1 | action 22 Aug 27 08:47:01 [1331] NODE2 crmd: notice: te_rsc_command: Initiating monitor operation SERVICE4_monitor_0 on NODE1 | action 23 Aug 27 08:47:01 [1331] NODE2 crmd: notice: te_rsc_command: Initiating monitor operation SERVICE5_monitor_0 on NODE1 | action 24 Aug 27 08:47:01 [1331] NODE2 crmd: notice: te_rsc_command: Initiating monitor operation SERVICE6_monitor_0 on NODE1 | action 25 Aug 27 08:47:01 [1326] NODE2 cib: info: cib_perform_op: Diff: --- 0.434.24 2 Aug 27 08:47:01 [1326] NODE2 cib: info: cib_perform_op: Diff: +++ 0.434.25 (null) Aug 27 08:47:01 [1326] NODE2 cib: info: cib_perform_op: + /cib: @num_updates=25 Aug 27 08:47:01 [1326] NODE2 cib: info: cib_perform_op: + /cib/status/node_state[@id='1']: @crm-debug-origin=do_update_resource Aug 27 08:47:01 [1326] NODE2 cib: info: cib_perform_op: ++ /cib/status/node_state[@id='1']/lrm[@id='1']/lrm_resources: Aug 27 08:47:01 [1326] NODE2 cib: info: cib_perform_op: ++ Aug 27 08:47:01 [1326] NODE2 cib: info: cib_process_request: Completed cib_modify operation for section status: OK (rc=0, origin=NODE1/crmd/12, version=0.434.25) Aug 27 08:47:01 [1326] NODE2 cib: info: cib_perform_op: Diff: --- 0.434.25 2 Aug 27 08:47:01 [1326] NODE2 cib: info: cib_perform_op: Diff: +++ 0.434.26 (null) Aug 27 08:47:01 [1326] NODE2 cib: info: cib_perform_op: + /cib: @num_updates=26 Aug 27 08:47:01 [1326] NODE2 cib: info: cib_perform_op: ++ /cib/status/node_state[@id='1']/lrm[@id='1']/lrm_resources: Aug 27 08:47:01 [1326] NODE2 cib: info: cib_perform_op: ++ Aug 27 08:47:01 [1326] NODE2 cib: info: cib_process_request: Completed cib_modify operation for section status: OK (rc=0, origin=NODE1/crmd/13, version=0.434.26) Aug 27 08:47:01 [1326] NODE2 cib: info: cib_perform_op: Diff: --- 0.434.26 2 Aug 27 08:47:01 [1326] NODE2 cib: info: cib_perform_op: Diff: +++ 0.434.27 (null) Aug 27 08:47:01 [1326] NODE2 cib: info: cib_perform_op: + /cib: @num_updates=27 Aug 27 08:47:01 [1326] NODE2 cib: info: cib_perform_op: ++ 
/cib/status/node_state[@id='1']/lrm[@id='1']/lrm_resources: Aug 27 08:47:01 [1326] NODE2 cib: info: cib_perform_op: ++ Aug 27 08:47:01 [1326] NODE2 cib: info: cib_process_request: Completed cib_modify operation for section status: OK (rc=0, origin=NODE1/crmd/14, version=0.434.27) Aug 27 08:47:01 [1326] NODE2 cib: info: cib_perform_op: Diff: --- 0.434.27 2 Aug 27 08:47:01 [1326] NODE2 cib: info: cib_perform_op: Diff: +++ 0.434.28 (null) Aug 27 08:47:01 [1326] NODE2 cib: info: cib_perform_op: + /cib: @num_updates=28 Aug 27 08:47:01 [1326] NODE2 cib: info: cib_perform_op: ++ /cib/status/node_state[@id='1']/lrm[@id='1']/lrm_resources: Aug 27 08:47:01 [1326] NODE2 cib: info: cib_perform_op: ++ Aug 27 08:47:01 [1326] NODE2 cib: info: cib_process_request: Completed cib_modify operation for section status: OK (rc=0, origin=NODE1/crmd/15, version=0.434.28) Aug 27 08:47:01 [1326] NODE2 cib: info: cib_perform_op: Diff: --- 0.434.28 2 Aug 27 08:47:01 [1326] NODE2 cib: info: cib_perform_op: Diff: +++ 0.434.29 (null) Aug 27 08:47:01 [1326] NODE2 cib: info: cib_perform_op: + /cib: @num_updates=29 Aug 27 08:47:01 [1326] NODE2 cib: info: cib_perform_op: ++ /cib/status/node_state[@id='1']/lrm[@id='1']/lrm_resources: Aug 27 08:47:01 [1326] NODE2 cib: info: cib_perform_op: ++ Aug 27 08:47:01 [1326] NODE2 cib: info: cib_perform_op: ++ Aug 27 08:47:01 [1326] NODE2 cib: info: cib_perform_op: ++ Aug 27 08:47:01 [1326] NODE2 cib: info: cib_perform_op: ++ Aug 27 08:47:01 [1326] NODE2 cib: info: cib_process_request: Completed cib_modify operation for section status: OK (rc=0, origin=NODE1/crmd/20, version=0.434.33) Aug 27 08:47:01 [1331] NODE2 crmd: warning: status_from_rc: Action 20 (SERVICE1_monitor_0) on NODE1 failed (target: 7 vs. rc: 0): Error Aug 27 08:47:01 [1331] NODE2 crmd: notice: abort_transition_graph: Transition aborted by operation SERVICE1_monitor_0 'modify' on NODE1: Event failed | magic=0:0;20:333:7:d10dd5e7-af4d-4bba-a226-516824f8f60e cib=0.434.32 source=match_graph_event:299 complete=false Aug 27 08:47:01 [1331] NODE2 crmd: info: match_graph_event: Action SERVICE1_monitor_0 (20) confirmed on NODE1 (rc=0) Aug 27 08:47:01 [1331] NODE2 crmd: info: process_graph_event: Detected action (333.20) SERVICE1_monitor_0.13=ok: failed Aug 27 08:47:01 [1326] NODE2 cib: info: cib_perform_op: Diff: --- 0.434.33 2 Aug 27 08:47:01 [1326] NODE2 cib: info: cib_perform_op: Diff: +++ 0.434.34 (null) Aug 27 08:47:01 [1326] NODE2 cib: info: cib_perform_op: + /cib: @num_updates=34 Aug 27 08:47:01 [1326] NODE2 cib: info: cib_perform_op: + /cib/status/node_state[@id='1']/lrm[@id='1']/lrm_resources/lrm_resource[@id='SOURCEIP']/lrm_rsc_op[@id='SOURCEIP_last_0']: @transition-magic=0:7;19:333:7:d10dd5e7-af4d-4bba-a226-516824f8f60e, @call-id=9, @rc-code=7, @op-status=0, @last-run=1598510844, @last-rc-change=1598510844, @exec-time=40 Aug 27 08:47:01 [1326] NODE2 cib: info: cib_process_request: Completed cib_modify operation for section status: OK (rc=0, origin=NODE1/crmd/21, version=0.434.34) Aug 27 08:47:01 [1331] NODE2 crmd: info: match_graph_event: Action SOURCEIP_monitor_0 (19) confirmed on NODE1 (rc=7) Aug 27 08:47:02 [1326] NODE2 cib: info: cib_perform_op: Diff: --- 0.434.34 2 Aug 27 08:47:02 [1326] NODE2 cib: info: cib_perform_op: Diff: +++ 0.434.35 (null) Aug 27 08:47:02 [1326] NODE2 cib: info: cib_perform_op: + /cib: @num_updates=35 Aug 27 08:47:02 [1326] NODE2 cib: info: cib_perform_op: + /cib/status/node_state[@id='1']/lrm[@id='1']/lrm_resources/lrm_resource[@id='VIRTUALIP']/lrm_rsc_op[@id='VIRTUALIP_last_0']: 
@transition-magic=0:7;18:333:7:d10dd5e7-af4d-4bba-a226-516824f8f60e, @call-id=5, @rc-code=7, @op-status=0, @exec-time=441 Aug 27 08:47:02 [1326] NODE2 cib: info: cib_process_request: Completed cib_modify operation for section status: OK (rc=0, origin=NODE1/crmd/22, version=0.434.35) Aug 27 08:47:02 [1331] NODE2 crmd: info: match_graph_event: Action VIRTUALIP_monitor_0 (18) confirmed on NODE1 (rc=7) Aug 27 08:47:02 [1326] NODE2 cib: info: cib_perform_op: Diff: --- 0.434.35 2 Aug 27 08:47:02 [1326] NODE2 cib: info: cib_perform_op: Diff: +++ 0.434.36 (null) Aug 27 08:47:02 [1326] NODE2 cib: info: cib_perform_op: + /cib: @num_updates=36 Aug 27 08:47:02 [1326] NODE2 cib: info: cib_perform_op: + /cib/status/node_state[@id='1']/lrm[@id='1']/lrm_resources/lrm_resource[@id='SERVICE2']/lrm_rsc_op[@id='SERVICE2_last_0']: @transition-magic=0:7;21:333:7:d10dd5e7-af4d-4bba-a226-516824f8f60e, @call-id=17, @rc-code=7, @op-status=0, @exec-time=239 Aug 27 08:47:02 [1331] NODE2 crmd: info: match_graph_event: Action SERVICE2_monitor_0 (21) confirmed on NODE1 (rc=7) Aug 27 08:47:02 [1326] NODE2 cib: info: cib_process_request: Completed cib_modify operation for section status: OK (rc=0, origin=NODE1/crmd/23, version=0.434.36) Aug 27 08:47:02 [1326] NODE2 cib: info: cib_perform_op: Diff: --- 0.434.36 2 Aug 27 08:47:02 [1326] NODE2 cib: info: cib_perform_op: Diff: +++ 0.434.37 (null) Aug 27 08:47:02 [1326] NODE2 cib: info: cib_perform_op: + /cib: @num_updates=37 Aug 27 08:47:02 [1326] NODE2 cib: info: cib_perform_op: + /cib/status/node_state[@id='1']/lrm[@id='1']/lrm_resources/lrm_resource[@id='SERVICE3']/lrm_rsc_op[@id='SERVICE3_last_0']: @transition-magic=0:7;22:333:7:d10dd5e7-af4d-4bba-a226-516824f8f60e, @call-id=21, @rc-code=7, @op-status=0, @exec-time=240 Aug 27 08:47:02 [1326] NODE2 cib: info: cib_process_request: Completed cib_modify operation for section status: OK (rc=0, origin=NODE1/crmd/24, version=0.434.37) Aug 27 08:47:02 [1331] NODE2 crmd: info: match_graph_event: Action SERVICE3_monitor_0 (22) confirmed on NODE1 (rc=7) Aug 27 08:47:02 [1326] NODE2 cib: info: cib_perform_op: Diff: --- 0.434.37 2 Aug 27 08:47:02 [1326] NODE2 cib: info: cib_perform_op: Diff: +++ 0.434.38 (null) Aug 27 08:47:02 [1326] NODE2 cib: info: cib_perform_op: + /cib: @num_updates=38 Aug 27 08:47:02 [1326] NODE2 cib: info: cib_perform_op: + /cib/status/node_state[@id='1']/lrm[@id='1']/lrm_resources/lrm_resource[@id='SERVICE4']/lrm_rsc_op[@id='SERVICE4_last_0']: @transition-magic=0:7;23:333:7:d10dd5e7-af4d-4bba-a226-516824f8f60e, @call-id=25, @rc-code=7, @op-status=0, @exec-time=241 Aug 27 08:47:02 [1326] NODE2 cib: info: cib_process_request: Completed cib_modify operation for section status: OK (rc=0, origin=NODE1/crmd/25, version=0.434.38) Aug 27 08:47:02 [1331] NODE2 crmd: info: match_graph_event: Action SERVICE4_monitor_0 (23) confirmed on NODE1 (rc=7) Aug 27 08:47:02 [1326] NODE2 cib: info: cib_perform_op: Diff: --- 0.434.38 2 Aug 27 08:47:02 [1326] NODE2 cib: info: cib_perform_op: Diff: +++ 0.434.39 (null) Aug 27 08:47:02 [1326] NODE2 cib: info: cib_perform_op: + /cib: @num_updates=39 Aug 27 08:47:02 [1326] NODE2 cib: info: cib_perform_op: + /cib/status/node_state[@id='1']/lrm[@id='1']/lrm_resources/lrm_resource[@id='SERVICE5']/lrm_rsc_op[@id='SERVICE5_last_0']: @transition-magic=0:7;24:333:7:d10dd5e7-af4d-4bba-a226-516824f8f60e, @call-id=29, @rc-code=7, @op-status=0, @exec-time=241 Aug 27 08:47:02 [1326] NODE2 cib: info: cib_process_request: Completed cib_modify operation for section status: OK (rc=0, origin=NODE1/crmd/26, 
version=0.434.39) Aug 27 08:47:02 [1331] NODE2 crmd: info: match_graph_event: Action SERVICE5_monitor_0 (24) confirmed on NODE1 (rc=7) Aug 27 08:47:02 [1326] NODE2 cib: info: cib_perform_op: Diff: --- 0.434.39 2 Aug 27 08:47:02 [1326] NODE2 cib: info: cib_perform_op: Diff: +++ 0.434.40 (null) Aug 27 08:47:02 [1326] NODE2 cib: info: cib_perform_op: + /cib: @num_updates=40 Aug 27 08:47:02 [1326] NODE2 cib: info: cib_perform_op: + /cib/status/node_state[@id='1']/lrm[@id='1']/lrm_resources/lrm_resource[@id='SERVICE6']/lrm_rsc_op[@id='SERVICE6_last_0']: @transition-magic=0:7;25:333:7:d10dd5e7-af4d-4bba-a226-516824f8f60e, @call-id=33, @rc-code=7, @op-status=0, @exec-time=242 Aug 27 08:47:02 [1326] NODE2 cib: info: cib_process_request: Completed cib_modify operation for section status: OK (rc=0, origin=NODE1/crmd/27, version=0.434.40) Aug 27 08:47:02 [1331] NODE2 crmd: info: match_graph_event: Action SERVICE6_monitor_0 (25) confirmed on NODE1 (rc=7) Aug 27 08:47:02 [1331] NODE2 crmd: notice: run_graph: Transition 333 (Complete=8, Pending=0, Fired=0, Skipped=9, Incomplete=17, Source=/var/lib/pacemaker/pengine/pe-input-1510.bz2): Stopped Aug 27 08:47:02 [1331] NODE2 crmd: info: do_state_transition: State transition S_TRANSITION_ENGINE -> S_POLICY_ENGINE | input=I_PE_CALC cause=C_FSA_INTERNAL origin=notify_crmd Aug 27 08:47:02 [1330] NODE2 pengine: notice: unpack_config: On loss of CCM Quorum: Ignore Aug 27 08:47:02 [1330] NODE2 pengine: info: determine_online_status: Node NODE1 is online Aug 27 08:47:02 [1330] NODE2 pengine: info: determine_online_status: Node NODE2 is online Aug 27 08:47:02 [1330] NODE2 pengine: info: determine_op_status: Operation monitor found resource SERVICE1 active on NODE1 Aug 27 08:47:02 [1330] NODE2 pengine: info: determine_op_status: Operation monitor found resource SERVICE1 active on NODE1 Aug 27 08:47:02 [1330] NODE2 pengine: info: determine_op_status: Operation monitor found resource SERVICE4 active on NODE2 Aug 27 08:47:02 [1330] NODE2 pengine: info: determine_op_status: Operation monitor found resource SERVICE1 active on NODE2 Aug 27 08:47:02 [1330] NODE2 pengine: info: unpack_node_loop: Node 1 is already processed Aug 27 08:47:02 [1330] NODE2 pengine: info: unpack_node_loop: Node 2 is already processed Aug 27 08:47:02 [1330] NODE2 pengine: info: unpack_node_loop: Node 1 is already processed Aug 27 08:47:02 [1330] NODE2 pengine: info: unpack_node_loop: Node 2 is already processed Aug 27 08:47:02 [1330] NODE2 pengine: info: group_print: Resource Group: IPV Aug 27 08:47:02 [1330] NODE2 pengine: info: common_print: VIRTUALIP (ocf::heartbeat:IPaddr2): Started NODE2 Aug 27 08:47:02 [1330] NODE2 pengine: info: common_print: SOURCEIP (ocf::heartbeat:IPsrcaddr): Started NODE2 Aug 27 08:47:02 [1330] NODE2 pengine: info: common_print: SERVICE1 (systemd:service1): Started Aug 27 08:47:02 [1330] NODE2 pengine: info: common_print: 1 : NODE1 Aug 27 08:47:02 [1330] NODE2 pengine: info: common_print: 2 : NODE2 Aug 27 08:47:02 [1330] NODE2 pengine: info: common_print: SERVICE2 (systemd:service2): Started NODE2 Aug 27 08:47:02 [1330] NODE2 pengine: info: common_print: SERVICE3 (systemd:service3): Started NODE2 Aug 27 08:47:02 [1330] NODE2 pengine: info: common_print: SERVICE4 (systemd:service4): Started NODE2 Aug 27 08:47:02 [1330] NODE2 pengine: info: common_print: SERVICE5 (systemd:service5): Started NODE2 Aug 27 08:47:02 [1330] NODE2 pengine: info: common_print: SERVICE6 (systemd:service6): Started NODE2 Aug 27 08:47:02 [1330] NODE2 pengine: info: common_print: SERVICE7 
(systemd:service7): Started NODE2
Aug 27 08:47:02 [1330] NODE2 pengine: info: common_print: SERVICE8 (systemd:service8): Started NODE2
Aug 27 08:47:02 [1330] NODE2 pengine: info: common_print: SERVICE9 (systemd:service9): Started NODE2
Aug 27 08:47:02 [1330] NODE2 pengine: info: common_print: SERVICE10 (systemd:service10): Started NODE2
Aug 27 08:47:02 [1330] NODE2 pengine: info: common_print: SERVICE11 (systemd:service11): Started NODE2
Aug 27 08:47:02 [1330] NODE2 pengine: info: common_print: SERVICE12 (systemd:service12): Started NODE2
Aug 27 08:47:02 [1330] NODE2 pengine: info: common_print: SERVICE13 (systemd:service13): Started NODE2
Aug 27 08:47:02 [1330] NODE2 pengine: info: clone_print: Clone Set: SERVICE14-clone [SERVICE14]
Aug 27 08:47:02 [1330] NODE2 pengine: info: short_print: Started: [ NODE2 ]
Aug 27 08:47:02 [1330] NODE2 pengine: info: short_print: Stopped: [ NODE1 ]
Aug 27 08:47:02 [1330] NODE2 pengine: info: clone_print: Clone Set: SERVICE15-clone [SERVICE15]
Aug 27 08:47:02 [1330] NODE2 pengine: info: short_print: Started: [ NODE2 ]
Aug 27 08:47:02 [1330] NODE2 pengine: info: short_print: Stopped: [ NODE1 ]
Aug 27 08:47:02 [1330] NODE2 pengine: error: native_create_actions: Resource SERVICE1 is active on 2 nodes (attempting recovery)
Aug 27 08:47:02 [1330] NODE2 pengine: notice: native_create_actions: See https://wiki.clusterlabs.org/wiki/FAQ#Resource_is_Too_Active for more information
Aug 27 08:47:02 [1330] NODE2 pengine: info: RecurringOp: Start recurring monitor (5s) for SERVICE1 on NODE2
Aug 27 08:47:02 [1330] NODE2 pengine: info: RecurringOp: Start recurring monitor (10s) for SERVICE14:1 on NODE1
Aug 27 08:47:02 [1330] NODE2 pengine: info: RecurringOp: Start recurring monitor (10s) for SERVICE15:1 on NODE1
Aug 27 08:47:02 [1330] NODE2 pengine: info: LogActions: Leave VIRTUALIP (Started NODE2)
Aug 27 08:47:02 [1330] NODE2 pengine: info: LogActions: Leave SOURCEIP (Started NODE2)
Aug 27 08:47:02 [1330] NODE2 pengine: notice: LogAction: * Move SERVICE1 ( NODE1 -> NODE2 )
Aug 27 08:47:02 [1330] NODE2 pengine: info: LogActions: Leave SERVICE2 (Started NODE2)
Aug 27 08:47:02 [1330] NODE2 pengine: info: LogActions: Leave SERVICE3 (Started NODE2)
Aug 27 08:47:02 [1330] NODE2 pengine: info: LogActions: Leave SERVICE4 (Started NODE2)
Aug 27 08:47:02 [1330] NODE2 pengine: info: LogActions: Leave SERVICE5 (Started NODE2)
Aug 27 08:47:02 [1330] NODE2 pengine: info: LogActions: Leave SERVICE6 (Started NODE2)
Aug 27 08:47:02 [1330] NODE2 pengine: info: LogActions: Leave SERVICE7 (Started NODE2)
Aug 27 08:47:02 [1330] NODE2 pengine: info: LogActions: Leave SERVICE8 (Started NODE2)
Aug 27 08:47:02 [1330] NODE2 pengine: info: LogActions: Leave SERVICE9 (Started NODE2)
Aug 27 08:47:02 [1330] NODE2 pengine: info: LogActions: Leave SERVICE10 (Started NODE2)
Aug 27 08:47:02 [1330] NODE2 pengine: info: LogActions: Leave SERVICE11 (Started NODE2)
Aug 27 08:47:02 [1330] NODE2 pengine: info: LogActions: Leave SERVICE12 (Started NODE2)
Aug 27 08:47:02 [1330] NODE2 pengine: info: LogActions: Leave SERVICE13 (Started NODE2)
Aug 27 08:47:02 [1330] NODE2 pengine: info: LogActions: Leave SERVICE14:0 (Started NODE2)
Aug 27 08:47:02 [1330] NODE2 pengine: notice: LogAction: * Start SERVICE14:1 ( NODE1 )
Aug 27 08:47:02 [1330] NODE2 pengine: info: LogActions: Leave SERVICE15:0 (Started NODE2)
Aug 27 08:47:02 [1330] NODE2 pengine: notice: LogAction: * Start SERVICE15:1 ( NODE1 )
Aug 27 08:47:02 [1330] NODE2 pengine: error: process_pe_message: Calculated transition 334 (with errors), saving inputs in
/var/lib/pacemaker/pengine/pe-error-171.bz2 Aug 27 08:47:02 [1331] NODE2 crmd: info: do_state_transition: State transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE | input=I_PE_SUCCESS cause=C_IPC_MESSAGE origin=handle_response Aug 27 08:47:02 [1331] NODE2 crmd: info: do_te_invoke: Processing graph 334 (ref=pe_calc-dc-1598510822-991) derived from /var/lib/pacemaker/pengine/pe-error-171.bz2 Aug 27 08:47:02 [1331] NODE2 crmd: notice: te_rsc_command: Initiating stop operation SERVICE1_stop_0 locally on NODE2 | action 36 Aug 27 08:47:02 [1328] NODE2 lrmd: info: cancel_recurring_action: Cancelling systemd operation SERVICE1_status_5000 Aug 27 08:47:02 [1331] NODE2 crmd: info: do_lrm_rsc_op: Performing key=36:334:0:d10dd5e7-af4d-4bba-a226-516824f8f60e op=SERVICE1_stop_0 Aug 27 08:47:02 [1328] NODE2 lrmd: info: log_execute: executing - rsc:SERVICE1 action:stop call_id:428 Aug 27 08:47:02 [1326] NODE2 cib: info: cib_process_request: Forwarding cib_modify operation for section status to all (origin=local/crmd/1398) Aug 27 08:47:02 [1331] NODE2 crmd: notice: te_rsc_command: Initiating stop operation SERVICE1_stop_0 on NODE1 | action 35 Aug 27 08:47:02 [1331] NODE2 crmd: notice: te_rsc_command: Initiating monitor operation SERVICE7_monitor_0 on NODE1 | action 18 Aug 27 08:47:02 [1331] NODE2 crmd: notice: te_rsc_command: Initiating monitor operation SERVICE8_monitor_0 on NODE1 | action 19 Aug 27 08:47:02 [1331] NODE2 crmd: notice: te_rsc_command: Initiating monitor operation SERVICE9_monitor_0 on NODE1 | action 20 Aug 27 08:47:02 [1331] NODE2 crmd: notice: te_rsc_command: Initiating monitor operation SERVICE10_monitor_0 on NODE1 | action 21 Aug 27 08:47:02 [1331] NODE2 crmd: notice: te_rsc_command: Initiating monitor operation SERVICE11_monitor_0 on NODE1 | action 22 Aug 27 08:47:02 [1326] NODE2 cib: info: cib_perform_op: Diff: --- 0.434.40 2 Aug 27 08:47:02 [1326] NODE2 cib: info: cib_perform_op: Diff: +++ 0.434.41 (null) Aug 27 08:47:02 [1326] NODE2 cib: info: cib_perform_op: + /cib: @num_updates=41 Aug 27 08:47:02 [1326] NODE2 cib: info: cib_perform_op: + /cib/status/node_state[@id='2']: @crm-debug-origin=do_update_resource Aug 27 08:47:02 [1326] NODE2 cib: info: cib_perform_op: + /cib/status/node_state[@id='2']/lrm[@id='2']/lrm_resources/lrm_resource[@id='SERVICE1']/lrm_rsc_op[@id='SERVICE1_last_0']: @operation_key=SERVICE1_stop_0, @operation=stop, @crm-debug-origin=do_update_resource, @transition-key=36:334:0:d10dd5e7-af4d-4bba-a226-516824f8f60e, @transition-magic=-1:193;36:334:0:d10dd5e7-af4d-4bba-a226-516824f8f60e, @call-id=-1, @rc-code=193, @op-status=-1, @last-run=1598510822, @last-rc-change=1598510822, @ Aug 27 08:47:02 [1326] NODE2 cib: info: cib_process_request: Completed cib_modify operation for section status: OK (rc=0, origin=NODE2/crmd/1398, version=0.434.41) Aug 27 08:47:02 [1331] NODE2 crmd: notice: te_rsc_command: Initiating monitor operation SERVICE12_monitor_0 on NODE1 | action 23 Aug 27 08:47:02 [1331] NODE2 crmd: notice: te_rsc_command: Initiating monitor operation SERVICE13_monitor_0 on NODE1 | action 24 Aug 27 08:47:02 [1331] NODE2 crmd: info: process_lrm_event: Result of monitor operation for SERVICE1 on NODE2: Cancelled | call=426 key=SERVICE1_monitor_5000 confirmed=true Aug 27 08:47:02 [1326] NODE2 cib: info: cib_perform_op: Diff: --- 0.434.41 2 Aug 27 08:47:02 [1326] NODE2 cib: info: cib_perform_op: Diff: +++ 0.434.42 (null) Aug 27 08:47:02 [1326] NODE2 cib: info: cib_perform_op: + /cib: @num_updates=42 Aug 27 08:47:02 [1326] NODE2 cib: info: cib_perform_op: + 
/cib/status/node_state[@id='1']/lrm[@id='1']/lrm_resources/lrm_resource[@id='SERVICE1']/lrm_rsc_op[@id='SERVICE1_last_0']: @operation_key=SERVICE1_stop_0, @operation=stop, @transition-key=35:334:0:d10dd5e7-af4d-4bba-a226-516824f8f60e, @transition-magic=-1:193;35:334:0:d10dd5e7-af4d-4bba-a226-516824f8f60e, @call-id=-1, @rc-code=193, @op-status=-1, @exec-time=0 Aug 27 08:47:02 [1326] NODE2 cib: info: cib_process_request: Completed cib_modify operation for section status: OK (rc=0, origin=NODE1/crmd/28, version=0.434.42) Aug 27 08:47:02 [1326] NODE2 cib: info: cib_perform_op: Diff: --- 0.434.42 2 Aug 27 08:47:02 [1326] NODE2 cib: info: cib_perform_op: Diff: +++ 0.434.43 (null) Aug 27 08:47:02 [1326] NODE2 cib: info: cib_perform_op: + /cib: @num_updates=43 Aug 27 08:47:02 [1326] NODE2 cib: info: cib_perform_op: ++ /cib/status/node_state[@id='1']/lrm[@id='1']/lrm_resources: Aug 27 08:47:02 [1326] NODE2 cib: info: cib_perform_op: ++ Aug 27 08:47:02 [1326] NODE2 cib: info: cib_process_request: Completed cib_modify operation for section status: OK (rc=0, origin=NODE1/crmd/29, version=0.434.43) Aug 27 08:47:02 [1326] NODE2 cib: info: cib_perform_op: Diff: --- 0.434.43 2 Aug 27 08:47:02 [1326] NODE2 cib: info: cib_perform_op: Diff: +++ 0.434.44 (null) Aug 27 08:47:02 [1326] NODE2 cib: info: cib_perform_op: + /cib: @num_updates=44 Aug 27 08:47:02 [1326] NODE2 cib: info: cib_perform_op: ++ /cib/status/node_state[@id='1']/lrm[@id='1']/lrm_resources: Aug 27 08:47:02 [1326] NODE2 cib: info: cib_perform_op: ++ Aug 27 08:47:02 [1326] NODE2 cib: info: cib_process_request: Completed cib_modify operation for section status: OK (rc=0, origin=NODE1/crmd/30, version=0.434.44) Aug 27 08:47:02 [1326] NODE2 cib: info: cib_perform_op: Diff: --- 0.434.44 2 Aug 27 08:47:02 [1326] NODE2 cib: info: cib_perform_op: Diff: +++ 0.434.45 (null) Aug 27 08:47:02 [1326] NODE2 cib: info: cib_perform_op: + /cib: @num_updates=45 Aug 27 08:47:02 [1326] NODE2 cib: info: cib_perform_op: ++ /cib/status/node_state[@id='1']/lrm[@id='1']/lrm_resources: Aug 27 08:47:02 [1326] NODE2 cib: info: cib_perform_op: ++ Aug 27 08:47:02 [1326] NODE2 cib: info: cib_process_request: Completed cib_modify operation for section status: OK (rc=0, origin=NODE1/crmd/31, version=0.434.45) Aug 27 08:47:02 [1326] NODE2 cib: info: cib_perform_op: Diff: --- 0.434.45 2 Aug 27 08:47:02 [1326] NODE2 cib: info: cib_perform_op: Diff: +++ 0.434.46 (null) Aug 27 08:47:02 [1326] NODE2 cib: info: cib_perform_op: + /cib: @num_updates=46 Aug 27 08:47:02 [1326] NODE2 cib: info: cib_perform_op: ++ /cib/status/node_state[@id='1']/lrm[@id='1']/lrm_resources: Aug 27 08:47:02 [1326] NODE2 cib: info: cib_perform_op: ++ Aug 27 08:47:02 [1326] NODE2 cib: info: cib_process_request: Completed cib_modify operation for section status: OK (rc=0, origin=NODE1/crmd/32, version=0.434.46) Aug 27 08:47:02 [1326] NODE2 cib: info: cib_perform_op: Diff: --- 0.434.46 2 Aug 27 08:47:02 [1326] NODE2 cib: info: cib_perform_op: Diff: +++ 0.434.47 (null) Aug 27 08:47:02 [1326] NODE2 cib: info: cib_perform_op: + /cib: @num_updates=47 Aug 27 08:47:02 [1326] NODE2 cib: info: cib_perform_op: ++ /cib/status/node_state[@id='1']/lrm[@id='1']/lrm_resources: Aug 27 08:47:02 [1326] NODE2 cib: info: cib_perform_op: ++ Aug 27 08:47:02 [1326] NODE2 cib: info: cib_process_request: Completed cib_modify operation for section status: OK (rc=0, origin=NODE1/crmd/33, version=0.434.47) Aug 27 08:47:02 [1326] NODE2 cib: info: cib_perform_op: Diff: --- 0.434.47 2 Aug 27 08:47:02 [1326] NODE2 cib: info: cib_perform_op: 
Diff: +++ 0.434.48 (null) Aug 27 08:47:02 [1326] NODE2 cib: info: cib_perform_op: + /cib: @num_updates=48 Aug 27 08:47:02 [1326] NODE2 cib: info: cib_perform_op: ++ /cib/status/node_state[@id='1']/lrm[@id='1']/lrm_resources: Aug 27 08:47:02 [1326] NODE2 cib: info: cib_perform_op: ++ Aug 27 08:47:02 [1326] NODE2 cib: info: cib_process_request: Completed cib_modify operation for section status: OK (rc=0, origin=NODE1/crmd/34, version=0.434.48) Aug 27 08:47:02 [1326] NODE2 cib: info: cib_perform_op: Diff: --- 0.434.48 2 Aug 27 08:47:02 [1326] NODE2 cib: info: cib_perform_op: Diff: +++ 0.434.49 (null) Aug 27 08:47:02 [1326] NODE2 cib: info: cib_perform_op: + /cib: @num_updates=49 Aug 27 08:47:02 [1326] NODE2 cib: info: cib_perform_op: ++ /cib/status/node_state[@id='1']/lrm[@id='1']/lrm_resources: Aug 27 08:47:02 [1326] NODE2 cib: info: cib_perform_op: ++ Aug 27 08:47:02 [1326] NODE2 cib: info: cib_process_request: Completed cib_modify operation for section status: OK (rc=0, origin=NODE1/crmd/35, version=0.434.49) Aug 27 08:47:02 [1328] NODE2 lrmd: info: systemd_exec_result: Call to stop passed: /org/freedesktop/systemd1/job/42670 Aug 27 08:47:02 [1326] NODE2 cib: info: cib_perform_op: Diff: --- 0.434.49 2 Aug 27 08:47:02 [1326] NODE2 cib: info: cib_perform_op: Diff: +++ 0.434.50 (null) Aug 27 08:47:02 [1326] NODE2 cib: info: cib_perform_op: + /cib: @num_updates=50 Aug 27 08:47:02 [1326] NODE2 cib: info: cib_perform_op: + /cib/status/node_state[@id='1']/lrm[@id='1']/lrm_resources/lrm_resource[@id='SERVICE7']/lrm_rsc_op[@id='SERVICE7_last_0']: @transition-magic=0:7;18:334:7:d10dd5e7-af4d-4bba-a226-516824f8f60e, @call-id=38, @rc-code=7, @op-status=0, @exec-time=125 Aug 27 08:47:02 [1326] NODE2 cib: info: cib_process_request: Completed cib_modify operation for section status: OK (rc=0, origin=NODE1/crmd/36, version=0.434.50) Aug 27 08:47:02 [1331] NODE2 crmd: info: match_graph_event: Action SERVICE7_monitor_0 (18) confirmed on NODE1 (rc=7) Aug 27 08:47:02 [1331] NODE2 crmd: notice: te_rsc_command: Initiating monitor operation SERVICE14:1_monitor_0 on NODE1 | action 25 Aug 27 08:47:02 [1326] NODE2 cib: info: cib_perform_op: Diff: --- 0.434.50 2 Aug 27 08:47:02 [1326] NODE2 cib: info: cib_perform_op: Diff: +++ 0.434.51 (null) Aug 27 08:47:02 [1326] NODE2 cib: info: cib_perform_op: + /cib: @num_updates=51 Aug 27 08:47:02 [1326] NODE2 cib: info: cib_perform_op: + /cib/status/node_state[@id='1']/lrm[@id='1']/lrm_resources/lrm_resource[@id='SERVICE8']/lrm_rsc_op[@id='SERVICE8_last_0']: @transition-magic=0:7;19:334:7:d10dd5e7-af4d-4bba-a226-516824f8f60e, @call-id=42, @rc-code=7, @op-status=0, @exec-time=123, @queue-time=1 Aug 27 08:47:02 [1326] NODE2 cib: info: cib_process_request: Completed cib_modify operation for section status: OK (rc=0, origin=NODE1/crmd/37, version=0.434.51) Aug 27 08:47:02 [1326] NODE2 cib: info: cib_perform_op: Diff: --- 0.434.51 2 Aug 27 08:47:02 [1326] NODE2 cib: info: cib_perform_op: Diff: +++ 0.434.52 (null) Aug 27 08:47:02 [1326] NODE2 cib: info: cib_perform_op: + /cib: @num_updates=52 Aug 27 08:47:02 [1326] NODE2 cib: info: cib_perform_op: + /cib/status/node_state[@id='1']/lrm[@id='1']/lrm_resources/lrm_resource[@id='SERVICE9']/lrm_rsc_op[@id='SERVICE9_last_0']: @transition-magic=0:0;20:334:7:d10dd5e7-af4d-4bba-a226-516824f8f60e, @call-id=46, @rc-code=0, @op-status=0, @exec-time=122 Aug 27 08:47:02 [1326] NODE2 cib: info: cib_perform_op: ++ /cib/status/node_state[@id='1']/lrm[@id='1']/lrm_resources/lrm_resource[@id='SERVICE9']: Aug 27 08:47:02 [1326] NODE2 cib: info: 
cib_perform_op: ++ Aug 27 08:47:02 [1326] NODE2 cib: info: cib_process_request: Completed cib_modify operation for section status: OK (rc=0, origin=NODE1/crmd/43, version=0.434.57) Aug 27 08:47:02 [1326] NODE2 cib: info: cib_perform_op: Diff: --- 0.434.57 2 Aug 27 08:47:02 [1326] NODE2 cib: info: cib_perform_op: Diff: +++ 0.434.58 (null) Aug 27 08:47:02 [1326] NODE2 cib: info: cib_perform_op: + /cib: @num_updates=58 Aug 27 08:47:02 [1326] NODE2 cib: info: cib_perform_op: + /cib/status/node_state[@id='1']/lrm[@id='1']/lrm_resources/lrm_resource[@id='SERVICE14']/lrm_rsc_op[@id='SERVICE14_last_0']: @transition-magic=0:7;25:334:7:d10dd5e7-af4d-4bba-a226-516824f8f60e, @call-id=67, @rc-code=7, @op-status=0, @exec-time=46 Aug 27 08:47:02 [1326] NODE2 cib: info: cib_process_request: Completed cib_modify operation for section status: OK (rc=0, origin=NODE1/crmd/44, version=0.434.58) Aug 27 08:47:02 [1331] NODE2 crmd: info: match_graph_event: Action SERVICE14_monitor_0 (25) confirmed on NODE1 (rc=7) Aug 27 08:47:04 [1331] NODE2 crmd: notice: process_lrm_event: Result of stop operation for SERVICE1 on NODE2: 0 (ok) | call=428 key=SERVICE1_stop_0 confirmed=true cib-update=1399 Aug 27 08:47:04 [1326] NODE2 cib: info: cib_process_request: Forwarding cib_modify operation for section status to all (origin=local/crmd/1399) Aug 27 08:47:04 [1326] NODE2 cib: info: cib_perform_op: Diff: --- 0.434.58 2 Aug 27 08:47:04 [1326] NODE2 cib: info: cib_perform_op: Diff: +++ 0.434.59 (null) Aug 27 08:47:04 [1326] NODE2 cib: info: cib_perform_op: + /cib: @num_updates=59 Aug 27 08:47:04 [1326] NODE2 cib: info: cib_perform_op: + /cib/status/node_state[@id='2']/lrm[@id='2']/lrm_resources/lrm_resource[@id='SERVICE1']/lrm_rsc_op[@id='SERVICE1_last_0']: @transition-magic=0:0;36:334:0:d10dd5e7-af4d-4bba-a226-516824f8f60e, @call-id=428, @rc-code=0, @op-status=0, @exec-time=2072 Aug 27 08:47:04 [1326] NODE2 cib: info: cib_process_request: Completed cib_modify operation for section status: OK (rc=0, origin=NODE2/crmd/1399, version=0.434.59) Aug 27 08:47:04 [1331] NODE2 crmd: info: match_graph_event: Action SERVICE1_stop_0 (36) confirmed on NODE2 (rc=0) Aug 27 08:47:09 [1326] NODE2 cib: info: cib_process_ping: Reporting our current digest to NODE2: dff1c3d3f4cfcb4d59dd341bb0268506 for 0.434.59 (0x563b7430a810 0) Aug 27 08:47:22 [1326] NODE2 cib: info: cib_perform_op: Diff: --- 0.434.59 2 Aug 27 08:47:22 [1326] NODE2 cib: info: cib_perform_op: Diff: +++ 0.434.60 (null) Aug 27 08:47:22 [1326] NODE2 cib: info: cib_perform_op: + /cib: @num_updates=60 Aug 27 08:47:22 [1326] NODE2 cib: info: cib_perform_op: + /cib/status/node_state[@id='1']/lrm[@id='1']/lrm_resources/lrm_resource[@id='SERVICE1']/lrm_rsc_op[@id='SERVICE1_last_0']: @transition-magic=2:198;35:334:0:d10dd5e7-af4d-4bba-a226-516824f8f60e, @call-id=34, @rc-code=198, @op-status=2, @exec-time=19986, @queue-time=1 Aug 27 08:47:22 [1326] NODE2 cib: info: cib_perform_op: + /cib/status/node_state[@id='1']/lrm[@id='1']/lrm_resources/lrm_resource[@id='SERVICE1']/lrm_rsc_op[@id='SERVICE1_last_failure_0']: @operation_key=SERVICE1_stop_0, @operation=stop, @transition-key=35:334:0:d10dd5e7-af4d-4bba-a226-516824f8f60e, @transition-magic=2:198;35:334:0:d10dd5e7-af4d-4bba-a226-516824f8f60e, @call-id=34, @rc-code=198, @op-status=2, @exec-time=19986, @queue-time=1 Aug 27 08:47:22 [1326] NODE2 cib: info: cib_process_request: Completed cib_modify operation for section status: OK (rc=0, origin=NODE1/crmd/45, version=0.434.60) Aug 27 08:47:22 [1331] NODE2 crmd: warning: status_from_rc: Action 35 
(SERVICE1_stop_0) on NODE1 failed (target: 0 vs. rc: 198): Error Aug 27 08:47:22 [1331] NODE2 crmd: info: match_graph_event: Action SERVICE1_stop_0 (35) confirmed on NODE1 (rc=198, ignoring failure) Aug 27 08:47:22 [1331] NODE2 crmd: info: update_failcount: Updating last failure for SERVICE1 on NODE1 after failed stop: rc=198 (update=INFINITY, time=1598510842) Aug 27 08:47:22 [1331] NODE2 crmd: info: process_graph_event: Detected action (334.35) SERVICE1_stop_0.34=OCF_TIMEOUT: failed Aug 27 08:47:22 [1331] NODE2 crmd: notice: run_graph: Transition 334 (Complete=11, Pending=0, Fired=0, Skipped=3, Incomplete=10, Source=/var/lib/pacemaker/pengine/pe-error-171.bz2): Stopped Aug 27 08:47:22 [1331] NODE2 crmd: info: do_state_transition: State transition S_TRANSITION_ENGINE -> S_POLICY_ENGINE | input=I_PE_CALC cause=C_FSA_INTERNAL origin=notify_crmd Aug 27 08:47:22 [1329] NODE2 attrd: info: attrd_peer_update: Setting last-failure-SERVICE1#stop_0[NODE1]: (null) -> 1598510842 from NODE2 Aug 27 08:47:22 [1329] NODE2 attrd: info: write_attribute: Sent CIB request 173 with 2 changes for last-failure-SERVICE1#stop_0 (id n/a, set n/a) Aug 27 08:47:22 [1326] NODE2 cib: info: cib_process_request: Forwarding cib_modify operation for section status to all (origin=local/attrd/173) Aug 27 08:47:22 [1326] NODE2 cib: info: cib_perform_op: Diff: --- 0.434.60 2 Aug 27 08:47:22 [1326] NODE2 cib: info: cib_perform_op: Diff: +++ 0.434.61 (null) Aug 27 08:47:22 [1326] NODE2 cib: info: cib_perform_op: + /cib: @num_updates=61 Aug 27 08:47:22 [1326] NODE2 cib: info: cib_perform_op: ++ /cib/status/node_state[@id='1']: Aug 27 08:47:22 [1326] NODE2 cib: info: cib_perform_op: ++ Aug 27 08:47:22 [1326] NODE2 cib: info: cib_perform_op: ++ Aug 27 08:47:22 [1326] NODE2 cib: info: cib_perform_op: ++ Aug 27 08:47:22 [1326] NODE2 cib: info: cib_perform_op: ++ Aug 27 08:47:22 [1326] NODE2 cib: info: cib_process_request: Completed cib_modify operation for section status: OK (rc=0, origin=NODE2/attrd/173, version=0.434.61) Aug 27 08:47:22 [1329] NODE2 attrd: info: attrd_cib_callback: CIB update 173 result for last-failure-SERVICE1#stop_0: OK | rc=0 Aug 27 08:47:22 [1329] NODE2 attrd: info: attrd_cib_callback: * last-failure-SERVICE1#stop_0[NODE2]=(null) Aug 27 08:47:22 [1329] NODE2 attrd: info: attrd_cib_callback: * last-failure-SERVICE1#stop_0[NODE1]=1598510842 Aug 27 08:47:22 [1330] NODE2 pengine: notice: unpack_config: On loss of CCM Quorum: Ignore Aug 27 08:47:22 [1331] NODE2 crmd: info: abort_transition_graph: Transition aborted by transient_attributes.1 'create': Transient attribute change | cib=0.434.61 source=abort_unless_down:356 path=/cib/status/node_state[@id='1'] complete=true Aug 27 08:47:22 [1330] NODE2 pengine: info: determine_online_status: Node NODE1 is online Aug 27 08:47:22 [1330] NODE2 pengine: info: determine_online_status: Node NODE2 is online Aug 27 08:47:22 [1330] NODE2 pengine: warning: unpack_rsc_op: Pretending the failure of SERVICE1_stop_0 (rc=198) on NODE1 succeeded Aug 27 08:47:22 [1330] NODE2 pengine: warning: unpack_rsc_op: Pretending the failure of SERVICE1_stop_0 (rc=198) on NODE1 succeeded Aug 27 08:47:22 [1330] NODE2 pengine: info: determine_op_status: Operation monitor found resource SERVICE9 active on NODE1 Aug 27 08:47:22 [1330] NODE2 pengine: info: determine_op_status: Operation monitor found resource SERVICE9 active on NODE1 Aug 27 08:47:22 [1330] NODE2 pengine: info: determine_op_status: Operation monitor found resource SERVICE4 active on NODE2 Aug 27 08:47:22 [1330] NODE2 pengine: info: 
determine_op_status: Operation monitor found resource SERVICE1 active on NODE2 Aug 27 08:47:22 [1330] NODE2 pengine: info: unpack_node_loop: Node 1 is already processed Aug 27 08:47:22 [1330] NODE2 pengine: info: unpack_node_loop: Node 2 is already processed Aug 27 08:47:22 [1330] NODE2 pengine: info: unpack_node_loop: Node 1 is already processed Aug 27 08:47:22 [1330] NODE2 pengine: info: unpack_node_loop: Node 2 is already processed Aug 27 08:47:22 [1330] NODE2 pengine: info: group_print: Resource Group: IPV Aug 27 08:47:22 [1330] NODE2 pengine: info: common_print: VIRTUALIP (ocf::heartbeat:IPaddr2): Started NODE2 Aug 27 08:47:22 [1330] NODE2 pengine: info: common_print: SOURCEIP (ocf::heartbeat:IPsrcaddr): Started NODE2 Aug 27 08:47:22 [1330] NODE2 pengine: info: common_print: SERVICE1 (systemd:service1): Stopped (failure ignored) Aug 27 08:47:22 [1330] NODE2 pengine: info: common_print: SERVICE2 (systemd:service2): Started NODE2 Aug 27 08:47:22 [1330] NODE2 pengine: info: common_print: SERVICE3 (systemd:service3): Started NODE2 Aug 27 08:47:22 [1330] NODE2 pengine: info: common_print: SERVICE4 (systemd:service4): Started NODE2 Aug 27 08:47:22 [1330] NODE2 pengine: info: common_print: SERVICE5 (systemd:service5): Started NODE2 Aug 27 08:47:22 [1330] NODE2 pengine: info: common_print: SERVICE6 (systemd:service6): Started NODE2 Aug 27 08:47:22 [1330] NODE2 pengine: info: common_print: SERVICE7 (systemd:service7): Started NODE2 Aug 27 08:47:22 [1330] NODE2 pengine: info: common_print: SERVICE8 (systemd:service8): Started NODE2 Aug 27 08:47:22 [1330] NODE2 pengine: info: common_print: SERVICE9 (systemd:service9): Started Aug 27 08:47:22 [1330] NODE2 pengine: info: common_print: 1 : NODE1 Aug 27 08:47:22 [1330] NODE2 pengine: info: common_print: 2 : NODE2 Aug 27 08:47:22 [1330] NODE2 pengine: info: common_print: SERVICE10 (systemd:service10): Started NODE2 Aug 27 08:47:22 [1330] NODE2 pengine: info: common_print: SERVICE11 (systemd:service11): Started NODE2 Aug 27 08:47:22 [1330] NODE2 pengine: info: common_print: SERVICE12 (systemd:service12): Started NODE2 Aug 27 08:47:22 [1330] NODE2 pengine: info: common_print: SERVICE13 (systemd:service13): Started NODE2 Aug 27 08:47:22 [1330] NODE2 pengine: info: clone_print: Clone Set: SERVICE14-clone [SERVICE14] Aug 27 08:47:22 [1330] NODE2 pengine: info: short_print: Started: [ NODE2 ] Aug 27 08:47:22 [1330] NODE2 pengine: info: short_print: Stopped: [ NODE1 ] Aug 27 08:47:22 [1330] NODE2 pengine: info: clone_print: Clone Set: SERVICE15-clone [SERVICE15] Aug 27 08:47:22 [1330] NODE2 pengine: info: short_print: Started: [ NODE2 ] Aug 27 08:47:22 [1330] NODE2 pengine: info: short_print: Stopped: [ NODE1 ] Aug 27 08:47:22 [1330] NODE2 pengine: info: RecurringOp: Start recurring monitor (5s) for SERVICE1 on NODE2 Aug 27 08:47:22 [1330] NODE2 pengine: error: native_create_actions: Resource SERVICE9 is active on 2 nodes (attempting recovery) Aug 27 08:47:22 [1330] NODE2 pengine: notice: native_create_actions: See https://wiki.clusterlabs.org/wiki/FAQ#Resource_is_Too_Active for more information Aug 27 08:47:22 [1330] NODE2 pengine: info: RecurringOp: Start recurring monitor (5s) for SERVICE9 on NODE2 Aug 27 08:47:22 [1330] NODE2 pengine: info: RecurringOp: Start recurring monitor (10s) for SERVICE14:1 on NODE1 Aug 27 08:47:22 [1330] NODE2 pengine: info: RecurringOp: Start recurring monitor (10s) for SERVICE15:1 on NODE1 Aug 27 08:47:22 [1330] NODE2 pengine: info: LogActions: Leave VIRTUALIP (Started NODE2) Aug 27 08:47:22 [1330] NODE2 pengine: info: 
LogActions: Leave SOURCEIP (Started NODE2) Aug 27 08:47:22 [1330] NODE2 pengine: notice: LogAction: * Start SERVICE1 ( NODE2 ) Aug 27 08:47:22 [1330] NODE2 pengine: info: LogActions: Leave SERVICE2 (Started NODE2) Aug 27 08:47:22 [1330] NODE2 pengine: info: LogActions: Leave SERVICE3 (Started NODE2) Aug 27 08:47:22 [1330] NODE2 pengine: info: LogActions: Leave SERVICE4 (Started NODE2) Aug 27 08:47:22 [1330] NODE2 pengine: info: LogActions: Leave SERVICE5 (Started NODE2) Aug 27 08:47:22 [1330] NODE2 pengine: info: LogActions: Leave SERVICE6 (Started NODE2) Aug 27 08:47:22 [1330] NODE2 pengine: info: LogActions: Leave SERVICE7 (Started NODE2) Aug 27 08:47:22 [1330] NODE2 pengine: info: LogActions: Leave SERVICE8 (Started NODE2) Aug 27 08:47:22 [1330] NODE2 pengine: notice: LogAction: * Move SERVICE9 ( NODE1 -> NODE2 ) Aug 27 08:47:22 [1330] NODE2 pengine: info: LogActions: Leave SERVICE10 (Started NODE2) Aug 27 08:47:22 [1330] NODE2 pengine: info: LogActions: Leave SERVICE11 (Started NODE2) Aug 27 08:47:22 [1330] NODE2 pengine: info: LogActions: Leave SERVICE12 (Started NODE2) Aug 27 08:47:22 [1330] NODE2 pengine: info: LogActions: Leave SERVICE13 (Started NODE2) Aug 27 08:47:22 [1330] NODE2 pengine: info: LogActions: Leave SERVICE14:0 (Started NODE2) Aug 27 08:47:22 [1330] NODE2 pengine: notice: LogAction: * Start SERVICE14:1 ( NODE1 ) Aug 27 08:47:22 [1330] NODE2 pengine: info: LogActions: Leave SERVICE15:0 (Started NODE2) Aug 27 08:47:22 [1330] NODE2 pengine: notice: LogAction: * Start SERVICE15:1 ( NODE1 ) Aug 27 08:47:22 [1330] NODE2 pengine: error: process_pe_message: Calculated transition 335 (with errors), saving inputs in /var/lib/pacemaker/pengine/pe-error-172.bz2 Aug 27 08:47:22 [1331] NODE2 crmd: info: handle_response: pe_calc calculation pe_calc-dc-1598510842-1002 is obsolete Aug 27 08:47:22 [1330] NODE2 pengine: notice: unpack_config: On loss of CCM Quorum: Ignore Aug 27 08:47:22 [1330] NODE2 pengine: info: determine_online_status: Node NODE1 is online Aug 27 08:47:22 [1330] NODE2 pengine: info: determine_online_status: Node NODE2 is online Aug 27 08:47:22 [1330] NODE2 pengine: warning: unpack_rsc_op: Pretending the failure of SERVICE1_stop_0 (rc=198) on NODE1 succeeded Aug 27 08:47:22 [1330] NODE2 pengine: warning: unpack_rsc_op: Pretending the failure of SERVICE1_stop_0 (rc=198) on NODE1 succeeded Aug 27 08:47:22 [1330] NODE2 pengine: info: determine_op_status: Operation monitor found resource SERVICE9 active on NODE1 Aug 27 08:47:22 [1330] NODE2 pengine: info: determine_op_status: Operation monitor found resource SERVICE9 active on NODE1 Aug 27 08:47:22 [1330] NODE2 pengine: info: determine_op_status: Operation monitor found resource SERVICE4 active on NODE2 Aug 27 08:47:22 [1330] NODE2 pengine: info: determine_op_status: Operation monitor found resource SERVICE1 active on NODE2 Aug 27 08:47:22 [1330] NODE2 pengine: info: unpack_node_loop: Node 1 is already processed Aug 27 08:47:22 [1330] NODE2 pengine: info: unpack_node_loop: Node 2 is already processed Aug 27 08:47:22 [1330] NODE2 pengine: info: unpack_node_loop: Node 1 is already processed Aug 27 08:47:22 [1330] NODE2 pengine: info: unpack_node_loop: Node 2 is already processed Aug 27 08:47:22 [1330] NODE2 pengine: info: group_print: Resource Group: IPV Aug 27 08:47:22 [1330] NODE2 pengine: info: common_print: VIRTUALIP (ocf::heartbeat:IPaddr2): Started NODE2 Aug 27 08:47:22 [1330] NODE2 pengine: info: common_print: SOURCEIP (ocf::heartbeat:IPsrcaddr): Started NODE2 Aug 27 08:47:22 [1330] NODE2 pengine: info: 
common_print: SERVICE1 (systemd:service1): Stopped (failure ignored) Aug 27 08:47:22 [1330] NODE2 pengine: info: common_print: SERVICE2 (systemd:service2): Started NODE2 Aug 27 08:47:22 [1330] NODE2 pengine: info: common_print: SERVICE3 (systemd:service3): Started NODE2 Aug 27 08:47:22 [1330] NODE2 pengine: info: common_print: SERVICE4 (systemd:service4): Started NODE2 Aug 27 08:47:22 [1330] NODE2 pengine: info: common_print: SERVICE5 (systemd:service5): Started NODE2 Aug 27 08:47:22 [1330] NODE2 pengine: info: common_print: SERVICE6 (systemd:service6): Started NODE2 Aug 27 08:47:22 [1330] NODE2 pengine: info: common_print: SERVICE7 (systemd:service7): Started NODE2 Aug 27 08:47:22 [1330] NODE2 pengine: info: common_print: SERVICE8 (systemd:service8): Started NODE2 Aug 27 08:47:22 [1330] NODE2 pengine: info: common_print: SERVICE9 (systemd:service9): Started Aug 27 08:47:22 [1330] NODE2 pengine: info: common_print: 1 : NODE1 Aug 27 08:47:22 [1330] NODE2 pengine: info: common_print: 2 : NODE2 Aug 27 08:47:22 [1330] NODE2 pengine: info: common_print: SERVICE10 (systemd:service10): Started NODE2 Aug 27 08:47:22 [1330] NODE2 pengine: info: common_print: SERVICE11 (systemd:service11): Started NODE2 Aug 27 08:47:22 [1330] NODE2 pengine: info: common_print: SERVICE12 (systemd:service12): Started NODE2 Aug 27 08:47:22 [1330] NODE2 pengine: info: common_print: SERVICE13 (systemd:service13): Started NODE2 Aug 27 08:47:22 [1330] NODE2 pengine: info: clone_print: Clone Set: SERVICE14-clone [SERVICE14] Aug 27 08:47:22 [1330] NODE2 pengine: info: short_print: Started: [ NODE2 ] Aug 27 08:47:22 [1330] NODE2 pengine: info: short_print: Stopped: [ NODE1 ] Aug 27 08:47:22 [1330] NODE2 pengine: info: clone_print: Clone Set: SERVICE15-clone [SERVICE15] Aug 27 08:47:22 [1330] NODE2 pengine: info: short_print: Started: [ NODE2 ] Aug 27 08:47:22 [1330] NODE2 pengine: info: short_print: Stopped: [ NODE1 ] Aug 27 08:47:22 [1330] NODE2 pengine: info: RecurringOp: Start recurring monitor (5s) for SERVICE1 on NODE2 Aug 27 08:47:22 [1330] NODE2 pengine: error: native_create_actions: Resource SERVICE9 is active on 2 nodes (attempting recovery) Aug 27 08:47:22 [1330] NODE2 pengine: notice: native_create_actions: See https://wiki.clusterlabs.org/wiki/FAQ#Resource_is_Too_Active for more information Aug 27 08:47:22 [1330] NODE2 pengine: info: RecurringOp: Start recurring monitor (5s) for SERVICE9 on NODE2 Aug 27 08:47:22 [1330] NODE2 pengine: info: RecurringOp: Start recurring monitor (10s) for SERVICE14:1 on NODE1 Aug 27 08:47:22 [1330] NODE2 pengine: info: RecurringOp: Start recurring monitor (10s) for SERVICE15:1 on NODE1 Aug 27 08:47:22 [1330] NODE2 pengine: info: LogActions: Leave VIRTUALIP (Started NODE2) Aug 27 08:47:22 [1330] NODE2 pengine: info: LogActions: Leave SOURCEIP (Started NODE2) Aug 27 08:47:22 [1330] NODE2 pengine: notice: LogAction: * Start SERVICE1 ( NODE2 ) Aug 27 08:47:22 [1330] NODE2 pengine: info: LogActions: Leave SERVICE2 (Started NODE2) Aug 27 08:47:22 [1330] NODE2 pengine: info: LogActions: Leave SERVICE3 (Started NODE2) Aug 27 08:47:22 [1330] NODE2 pengine: info: LogActions: Leave SERVICE4 (Started NODE2) Aug 27 08:47:22 [1330] NODE2 pengine: info: LogActions: Leave SERVICE5 (Started NODE2) Aug 27 08:47:22 [1330] NODE2 pengine: info: LogActions: Leave SERVICE6 (Started NODE2) Aug 27 08:47:22 [1330] NODE2 pengine: info: LogActions: Leave SERVICE7 (Started NODE2) Aug 27 08:47:22 [1330] NODE2 pengine: info: LogActions: Leave SERVICE8 (Started NODE2) Aug 27 08:47:22 [1330] NODE2 pengine: notice: 
LogAction: * Move SERVICE9 ( NODE1 -> NODE2 ) Aug 27 08:47:22 [1330] NODE2 pengine: info: LogActions: Leave SERVICE10 (Started NODE2) Aug 27 08:47:22 [1330] NODE2 pengine: info: LogActions: Leave SERVICE11 (Started NODE2) Aug 27 08:47:22 [1330] NODE2 pengine: info: LogActions: Leave SERVICE12 (Started NODE2) Aug 27 08:47:22 [1330] NODE2 pengine: info: LogActions: Leave SERVICE13 (Started NODE2) Aug 27 08:47:22 [1330] NODE2 pengine: info: LogActions: Leave SERVICE14:0 (Started NODE2) Aug 27 08:47:22 [1330] NODE2 pengine: notice: LogAction: * Start SERVICE14:1 ( NODE1 ) Aug 27 08:47:22 [1330] NODE2 pengine: info: LogActions: Leave SERVICE15:0 (Started NODE2) Aug 27 08:47:22 [1330] NODE2 pengine: notice: LogAction: * Start SERVICE15:1 ( NODE1 ) Aug 27 08:47:22 [1330] NODE2 pengine: error: process_pe_message: Calculated transition 336 (with errors), saving inputs in /var/lib/pacemaker/pengine/pe-error-173.bz2 Aug 27 08:47:22 [1331] NODE2 crmd: info: do_state_transition: State transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE | input=I_PE_SUCCESS cause=C_IPC_MESSAGE origin=handle_response Aug 27 08:47:22 [1331] NODE2 crmd: info: do_te_invoke: Processing graph 336 (ref=pe_calc-dc-1598510842-1003) derived from /var/lib/pacemaker/pengine/pe-error-173.bz2 Aug 27 08:47:22 [1331] NODE2 crmd: notice: te_rsc_command: Initiating start operation SERVICE1_start_0 locally on NODE2 | action 26 Aug 27 08:47:22 [1331] NODE2 crmd: info: do_lrm_rsc_op: Performing key=26:336:0:d10dd5e7-af4d-4bba-a226-516824f8f60e op=SERVICE1_start_0 Aug 27 08:47:22 [1326] NODE2 cib: info: cib_process_request: Forwarding cib_modify operation for section status to all (origin=local/crmd/1402) Aug 27 08:47:22 [1328] NODE2 lrmd: info: log_execute: executing - rsc:SERVICE1 action:start call_id:429 Aug 27 08:47:22 [1331] NODE2 crmd: notice: te_rsc_command: Initiating stop operation SERVICE9_stop_0 locally on NODE2 | action 43 Aug 27 08:47:22 [1328] NODE2 lrmd: info: cancel_recurring_action: Cancelling systemd operation SERVICE9_status_5000 Aug 27 08:47:22 [1331] NODE2 crmd: info: do_lrm_rsc_op: Performing key=43:336:0:d10dd5e7-af4d-4bba-a226-516824f8f60e op=SERVICE9_stop_0 Aug 27 08:47:22 [1326] NODE2 cib: info: cib_process_request: Forwarding cib_modify operation for section status to all (origin=local/crmd/1403) Aug 27 08:47:22 [1328] NODE2 lrmd: info: log_execute: executing - rsc:SERVICE9 action:stop call_id:431 Aug 27 08:47:22 [1331] NODE2 crmd: notice: te_rsc_command: Initiating stop operation SERVICE9_stop_0 on NODE1 | action 42 Aug 27 08:47:22 [1331] NODE2 crmd: notice: te_rsc_command: Initiating monitor operation SERVICE15:1_monitor_0 on NODE1 | action 17 Aug 27 08:47:22 [1331] NODE2 crmd: info: process_lrm_event: Result of monitor operation for SERVICE9 on NODE2: Cancelled | call=400 key=SERVICE9_monitor_5000 confirmed=true Aug 27 08:47:22 [1331] NODE2 crmd: notice: te_rsc_command: Initiating start operation SERVICE14:1_start_0 on NODE1 | action 55 Aug 27 08:47:22 [1326] NODE2 cib: info: cib_perform_op: Diff: --- 0.434.61 2 Aug 27 08:47:22 [1326] NODE2 cib: info: cib_perform_op: Diff: +++ 0.434.62 (null) Aug 27 08:47:22 [1326] NODE2 cib: info: cib_perform_op: + /cib: @num_updates=62 Aug 27 08:47:22 [1326] NODE2 cib: info: cib_perform_op: + /cib/status/node_state[@id='2']/lrm[@id='2']/lrm_resources/lrm_resource[@id='SERVICE1']/lrm_rsc_op[@id='SERVICE1_last_0']: @operation_key=SERVICE1_start_0, @operation=start, @transition-key=26:336:0:d10dd5e7-af4d-4bba-a226-516824f8f60e, 
@transition-magic=-1:193;26:336:0:d10dd5e7-af4d-4bba-a226-516824f8f60e, @call-id=-1, @rc-code=193, @op-status=-1, @last-run=1598510842, @last-rc-change=1598510842, @exec-time=0 Aug 27 08:47:22 [1326] NODE2 cib: info: cib_process_request: Completed cib_modify operation for section status: OK (rc=0, origin=NODE2/crmd/1402, version=0.434.62) Aug 27 08:47:22 [1326] NODE2 cib: info: cib_perform_op: Diff: --- 0.434.62 2 Aug 27 08:47:22 [1326] NODE2 cib: info: cib_perform_op: Diff: +++ 0.434.63 (null) Aug 27 08:47:22 [1326] NODE2 cib: info: cib_perform_op: + /cib: @num_updates=63 Aug 27 08:47:22 [1326] NODE2 cib: info: cib_perform_op: + /cib/status/node_state[@id='2']/lrm[@id='2']/lrm_resources/lrm_resource[@id='SERVICE9']/lrm_rsc_op[@id='SERVICE9_last_0']: @operation_key=SERVICE9_stop_0, @operation=stop, @crm-debug-origin=do_update_resource, @transition-key=43:336:0:d10dd5e7-af4d-4bba-a226-516824f8f60e, @transition-magic=-1:193;43:336:0:d10dd5e7-af4d-4bba-a226-516824f8f60e, @call-id=-1, @rc-code=193, @op-status=-1, @last-run=1598510842, @last-rc-change=1598510842, @ Aug 27 08:47:22 [1326] NODE2 cib: info: cib_process_request: Completed cib_modify operation for section status: OK (rc=0, origin=NODE2/crmd/1403, version=0.434.63) Aug 27 08:47:22 [1326] NODE2 cib: info: cib_perform_op: Diff: --- 0.434.63 2 Aug 27 08:47:22 [1326] NODE2 cib: info: cib_perform_op: Diff: +++ 0.434.64 (null) Aug 27 08:47:22 [1326] NODE2 cib: info: cib_perform_op: + /cib: @num_updates=64 Aug 27 08:47:22 [1326] NODE2 cib: info: cib_perform_op: + /cib/status/node_state[@id='1']/lrm[@id='1']/lrm_resources/lrm_resource[@id='SERVICE9']/lrm_rsc_op[@id='SERVICE9_last_0']: @operation_key=SERVICE9_stop_0, @operation=stop, @transition-key=42:336:0:d10dd5e7-af4d-4bba-a226-516824f8f60e, @transition-magic=-1:193;42:336:0:d10dd5e7-af4d-4bba-a226-516824f8f60e, @call-id=-1, @rc-code=193, @op-status=-1, @last-run=1598510864, @last-rc-change=1598510864, @exec-time=0 Aug 27 08:47:22 [1326] NODE2 cib: info: cib_process_request: Completed cib_modify operation for section status: OK (rc=0, origin=NODE1/crmd/46, version=0.434.64) Aug 27 08:47:22 [1326] NODE2 cib: info: cib_perform_op: Diff: --- 0.434.64 2 Aug 27 08:47:22 [1326] NODE2 cib: info: cib_perform_op: Diff: +++ 0.434.65 (null) Aug 27 08:47:22 [1326] NODE2 cib: info: cib_perform_op: + /cib: @num_updates=65 Aug 27 08:47:22 [1326] NODE2 cib: info: cib_perform_op: ++ /cib/status/node_state[@id='1']/lrm[@id='1']/lrm_resources: Aug 27 08:47:22 [1326] NODE2 cib: info: cib_perform_op: ++ Aug 27 08:47:22 [1326] NODE2 cib: info: cib_process_request: Completed cib_modify operation for section status: OK (rc=0, origin=NODE1/crmd/47, version=0.434.65) Aug 27 08:47:22 [1326] NODE2 cib: info: cib_perform_op: Diff: --- 0.434.65 2 Aug 27 08:47:22 [1326] NODE2 cib: info: cib_perform_op: Diff: +++ 0.434.66 (null) Aug 27 08:47:22 [1326] NODE2 cib: info: cib_perform_op: + /cib: @num_updates=66 Aug 27 08:47:22 [1326] NODE2 cib: info: cib_perform_op: + /cib/status/node_state[@id='1']/lrm[@id='1']/lrm_resources/lrm_resource[@id='SERVICE14']/lrm_rsc_op[@id='SERVICE14_last_0']: @operation_key=SERVICE14_start_0, @operation=start, @transition-key=55:336:0:d10dd5e7-af4d-4bba-a226-516824f8f60e, @transition-magic=-1:193;55:336:0:d10dd5e7-af4d-4bba-a226-516824f8f60e, @call-id=-1, @rc-code=193, @op-status=-1, @last-run=1598510864, @last-rc-change=1598510864, @exec-time=0 Aug 27 08:47:22 [1326] NODE2 cib: info: cib_process_request: Completed cib_modify operation for section status: OK (rc=0, origin=NODE1/crmd/48, 
version=0.434.66) Aug 27 08:47:22 [1326] NODE2 cib: info: cib_perform_op: Diff: --- 0.434.66 2 Aug 27 08:47:22 [1326] NODE2 cib: info: cib_perform_op: Diff: +++ 0.434.67 (null) Aug 27 08:47:22 [1326] NODE2 cib: info: cib_perform_op: + /cib: @num_updates=67 Aug 27 08:47:22 [1326] NODE2 cib: info: cib_perform_op: + /cib/status/node_state[@id='1']/lrm[@id='1']/lrm_resources/lrm_resource[@id='SERVICE15']/lrm_rsc_op[@id='SERVICE15_last_0']: @transition-magic=0:7;17:336:7:d10dd5e7-af4d-4bba-a226-516824f8f60e, @call-id=73, @rc-code=7, @op-status=0, @exec-time=34 Aug 27 08:47:22 [1326] NODE2 cib: info: cib_process_request: Completed cib_modify operation for section status: OK (rc=0, origin=NODE1/crmd/49, version=0.434.67) Aug 27 08:47:22 [1331] NODE2 crmd: info: match_graph_event: Action SERVICE15_monitor_0 (17) confirmed on NODE1 (rc=7) Aug 27 08:47:22 [1331] NODE2 crmd: notice: te_rsc_command: Initiating start operation SERVICE15:1_start_0 on NODE1 | action 63 Aug 27 08:47:22 [1326] NODE2 cib: info: cib_perform_op: Diff: --- 0.434.67 2 Aug 27 08:47:22 [1326] NODE2 cib: info: cib_perform_op: Diff: +++ 0.434.68 (null) Aug 27 08:47:22 [1326] NODE2 cib: info: cib_perform_op: + /cib: @num_updates=68 Aug 27 08:47:22 [1326] NODE2 cib: info: cib_perform_op: + /cib/status/node_state[@id='1']/lrm[@id='1']/lrm_resources/lrm_resource[@id='SERVICE15']/lrm_rsc_op[@id='SERVICE15_last_0']: @operation_key=SERVICE15_start_0, @operation=start, @transition-key=63:336:0:d10dd5e7-af4d-4bba-a226-516824f8f60e, @transition-magic=-1:193;63:336:0:d10dd5e7-af4d-4bba-a226-516824f8f60e, @call-id=-1, @rc-code=193, @op-status=-1, @exec-time=0 Aug 27 08:47:22 [1326] NODE2 cib: info: cib_process_request: Completed cib_modify operation for section status: OK (rc=0, origin=NODE1/crmd/50, version=0.434.68) Aug 27 08:47:22 [1328] NODE2 lrmd: info: systemd_exec_result: Call to start passed: /org/freedesktop/systemd1/job/42671 Aug 27 08:47:22 [1328] NODE2 lrmd: info: systemd_exec_result: Call to stop passed: /org/freedesktop/systemd1/job/42812 Aug 27 08:47:24 [1326] NODE2 cib: info: cib_perform_op: Diff: --- 0.434.68 2 Aug 27 08:47:24 [1326] NODE2 cib: info: cib_perform_op: Diff: +++ 0.434.69 (null) Aug 27 08:47:24 [1326] NODE2 cib: info: cib_perform_op: + /cib: @num_updates=69 Aug 27 08:47:24 [1326] NODE2 cib: info: cib_perform_op: + /cib/status/node_state[@id='1']/lrm[@id='1']/lrm_resources/lrm_resource[@id='SERVICE9']/lrm_rsc_op[@id='SERVICE9_last_0']: @transition-magic=0:0;42:336:0:d10dd5e7-af4d-4bba-a226-516824f8f60e, @call-id=68, @rc-code=0, @op-status=0, @exec-time=2006 Aug 27 08:47:24 [1326] NODE2 cib: info: cib_process_request: Completed cib_modify operation for section status: OK (rc=0, origin=NODE1/crmd/51, version=0.434.69) Aug 27 08:47:24 [1331] NODE2 crmd: info: match_graph_event: Action SERVICE9_stop_0 (42) confirmed on NODE1 (rc=0) Aug 27 08:47:24 [1331] NODE2 crmd: notice: process_lrm_event: Result of start operation for SERVICE1 on NODE2: 0 (ok) | call=429 key=SERVICE1_start_0 confirmed=true cib-update=1404 Aug 27 08:47:24 [1326] NODE2 cib: info: cib_process_request: Forwarding cib_modify operation for section status to all (origin=local/crmd/1404) Aug 27 08:47:24 [1326] NODE2 cib: info: cib_perform_op: Diff: --- 0.434.69 2 Aug 27 08:47:24 [1326] NODE2 cib: info: cib_perform_op: Diff: +++ 0.434.70 (null) Aug 27 08:47:24 [1326] NODE2 cib: info: cib_perform_op: + /cib: @num_updates=70 Aug 27 08:47:24 [1326] NODE2 cib: info: cib_perform_op: + 
/cib/status/node_state[@id='2']/lrm[@id='2']/lrm_resources/lrm_resource[@id='SERVICE1']/lrm_rsc_op[@id='SERVICE1_last_0']: @transition-magic=0:0;26:336:0:d10dd5e7-af4d-4bba-a226-516824f8f60e, @call-id=429, @rc-code=0, @op-status=0, @exec-time=2063 Aug 27 08:47:24 [1326] NODE2 cib: info: cib_process_request: Completed cib_modify operation for section status: OK (rc=0, origin=NODE2/crmd/1404, version=0.434.70) Aug 27 08:47:24 [1331] NODE2 crmd: info: match_graph_event: Action SERVICE1_start_0 (26) confirmed on NODE2 (rc=0) Aug 27 08:47:24 [1331] NODE2 crmd: notice: te_rsc_command: Initiating monitor operation SERVICE1_monitor_5000 locally on NODE2 | action 27 Aug 27 08:47:24 [1331] NODE2 crmd: info: do_lrm_rsc_op: Performing key=27:336:0:d10dd5e7-af4d-4bba-a226-516824f8f60e op=SERVICE1_monitor_5000 Aug 27 08:47:24 [1326] NODE2 cib: info: cib_process_request: Forwarding cib_modify operation for section status to all (origin=local/crmd/1405) Aug 27 08:47:24 [1326] NODE2 cib: info: cib_perform_op: Diff: --- 0.434.70 2 Aug 27 08:47:24 [1326] NODE2 cib: info: cib_perform_op: Diff: +++ 0.434.71 (null) Aug 27 08:47:24 [1326] NODE2 cib: info: cib_perform_op: + /cib: @num_updates=71 Aug 27 08:47:24 [1326] NODE2 cib: info: cib_perform_op: + /cib/status/node_state[@id='2']/lrm[@id='2']/lrm_resources/lrm_resource[@id='SERVICE1']/lrm_rsc_op[@id='SERVICE1_monitor_5000']: @crm-debug-origin=do_update_resource, @transition-key=27:336:0:d10dd5e7-af4d-4bba-a226-516824f8f60e, @transition-magic=-1:193;27:336:0:d10dd5e7-af4d-4bba-a226-516824f8f60e, @call-id=-1, @rc-code=193, @op-status=-1, @last-rc-change=1598510844, @exec-time=0, @queue-time=0 Aug 27 08:47:24 [1326] NODE2 cib: info: cib_process_request: Completed cib_modify operation for section status: OK (rc=0, origin=NODE2/crmd/1405, version=0.434.71) Aug 27 08:47:24 [1331] NODE2 crmd: info: process_lrm_event: Result of monitor operation for SERVICE1 on NODE2: 0 (ok) | call=432 key=SERVICE1_monitor_5000 confirmed=false cib-update=1406 Aug 27 08:47:24 [1326] NODE2 cib: info: cib_process_request: Forwarding cib_modify operation for section status to all (origin=local/crmd/1406) Aug 27 08:47:24 [1326] NODE2 cib: info: cib_perform_op: Diff: --- 0.434.71 2 Aug 27 08:47:24 [1326] NODE2 cib: info: cib_perform_op: Diff: +++ 0.434.72 (null) Aug 27 08:47:24 [1326] NODE2 cib: info: cib_perform_op: + /cib: @num_updates=72 Aug 27 08:47:24 [1326] NODE2 cib: info: cib_perform_op: + /cib/status/node_state[@id='2']/lrm[@id='2']/lrm_resources/lrm_resource[@id='SERVICE1']/lrm_rsc_op[@id='SERVICE1_monitor_5000']: @transition-magic=0:0;27:336:0:d10dd5e7-af4d-4bba-a226-516824f8f60e, @call-id=432, @rc-code=0, @op-status=0, @exec-time=2 Aug 27 08:47:24 [1326] NODE2 cib: info: cib_process_request: Completed cib_modify operation for section status: OK (rc=0, origin=NODE2/crmd/1406, version=0.434.72) Aug 27 08:47:24 [1331] NODE2 crmd: info: match_graph_event: Action SERVICE1_monitor_5000 (27) confirmed on NODE2 (rc=0) Aug 27 08:47:24 [1331] NODE2 crmd: notice: process_lrm_event: Result of stop operation for SERVICE9 on NODE2: 0 (ok) | call=431 key=SERVICE9_stop_0 confirmed=true cib-update=1407 Aug 27 08:47:24 [1326] NODE2 cib: info: cib_process_request: Forwarding cib_modify operation for section status to all (origin=local/crmd/1407) Aug 27 08:47:24 [1326] NODE2 cib: info: cib_perform_op: Diff: --- 0.434.72 2 Aug 27 08:47:24 [1326] NODE2 cib: info: cib_perform_op: Diff: +++ 0.434.73 (null) Aug 27 08:47:24 [1326] NODE2 cib: info: cib_perform_op: + /cib: @num_updates=73 Aug 27 08:47:24 
[1326] NODE2 cib: info: cib_perform_op: + /cib/status/node_state[@id='2']/lrm[@id='2']/lrm_resources/lrm_resource[@id='SERVICE9']/lrm_rsc_op[@id='SERVICE9_last_0']: @transition-magic=0:0;43:336:0:d10dd5e7-af4d-4bba-a226-516824f8f60e, @call-id=431, @rc-code=0, @op-status=0, @exec-time=2194 Aug 27 08:47:24 [1326] NODE2 cib: info: cib_process_request: Completed cib_modify operation for section status: OK (rc=0, origin=NODE2/crmd/1407, version=0.434.73) Aug 27 08:47:24 [1331] NODE2 crmd: info: match_graph_event: Action SERVICE9_stop_0 (43) confirmed on NODE2 (rc=0) Aug 27 08:47:24 [1331] NODE2 crmd: notice: te_rsc_command: Initiating start operation SERVICE9_start_0 locally on NODE2 | action 44 Aug 27 08:47:24 [1331] NODE2 crmd: info: do_lrm_rsc_op: Performing key=44:336:0:d10dd5e7-af4d-4bba-a226-516824f8f60e op=SERVICE9_start_0 Aug 27 08:47:24 [1328] NODE2 lrmd: info: log_execute: executing - rsc:SERVICE9 action:start call_id:433 Aug 27 08:47:24 [1326] NODE2 cib: info: cib_process_request: Forwarding cib_modify operation for section status to all (origin=local/crmd/1408) Aug 27 08:47:24 [1326] NODE2 cib: info: cib_perform_op: Diff: --- 0.434.73 2 Aug 27 08:47:24 [1326] NODE2 cib: info: cib_perform_op: Diff: +++ 0.434.74 (null) Aug 27 08:47:24 [1326] NODE2 cib: info: cib_perform_op: + /cib: @num_updates=74 Aug 27 08:47:24 [1326] NODE2 cib: info: cib_perform_op: + /cib/status/node_state[@id='2']/lrm[@id='2']/lrm_resources/lrm_resource[@id='SERVICE9']/lrm_rsc_op[@id='SERVICE9_last_0']: @operation_key=SERVICE9_start_0, @operation=start, @transition-key=44:336:0:d10dd5e7-af4d-4bba-a226-516824f8f60e, @transition-magic=-1:193;44:336:0:d10dd5e7-af4d-4bba-a226-516824f8f60e, @call-id=-1, @rc-code=193, @op-status=-1, @last-run=1598510844, @last-rc-change=1598510844, @exec-time=0 Aug 27 08:47:24 [1326] NODE2 cib: info: cib_process_request: Completed cib_modify operation for section status: OK (rc=0, origin=NODE2/crmd/1408, version=0.434.74) Aug 27 08:47:24 [1326] NODE2 cib: info: cib_perform_op: Diff: --- 0.434.74 2 Aug 27 08:47:24 [1326] NODE2 cib: info: cib_perform_op: Diff: +++ 0.434.75 (null) Aug 27 08:47:24 [1326] NODE2 cib: info: cib_perform_op: + /cib: @num_updates=75 Aug 27 08:47:24 [1326] NODE2 cib: info: cib_perform_op: + /cib/status/node_state[@id='1']/lrm[@id='1']/lrm_resources/lrm_resource[@id='SERVICE14']/lrm_rsc_op[@id='SERVICE14_last_0']: @transition-magic=0:0;55:336:0:d10dd5e7-af4d-4bba-a226-516824f8f60e, @call-id=74, @rc-code=0, @op-status=0, @exec-time=2209 Aug 27 08:47:24 [1326] NODE2 cib: info: cib_process_request: Completed cib_modify operation for section status: OK (rc=0, origin=NODE1/crmd/52, version=0.434.75) Aug 27 08:47:24 [1331] NODE2 crmd: info: match_graph_event: Action SERVICE14_start_0 (55) confirmed on NODE1 (rc=0) Aug 27 08:47:24 [1331] NODE2 crmd: notice: te_rsc_command: Initiating monitor operation SERVICE14:1_monitor_10000 on NODE1 | action 56 Aug 27 08:47:24 [1326] NODE2 cib: info: cib_perform_op: Diff: --- 0.434.75 2 Aug 27 08:47:24 [1326] NODE2 cib: info: cib_perform_op: Diff: +++ 0.434.76 (null) Aug 27 08:47:24 [1326] NODE2 cib: info: cib_perform_op: + /cib: @num_updates=76 Aug 27 08:47:24 [1326] NODE2 cib: info: cib_perform_op: ++ /cib/status/node_state[@id='1']/lrm[@id='1']/lrm_resources/lrm_resource[@id='SERVICE14']: S_IDLE | input=I_TE_SUCCESS cause=C_FSA_INTERNAL origin=notify_crmd Aug 27 08:47:27 [1329] NODE2 attrd: info: attrd_peer_update: Setting last-failure-SERVICE1#stop_0[NODE1]: 1598510842 -> (null) from NODE2 Aug 27 08:47:27 [1329] NODE2 attrd: info: 
write_attribute: Sent CIB request 174 with 2 changes for last-failure-SERVICE1#stop_0 (id n/a, set n/a) Aug 27 08:47:27 [1326] NODE2 cib: info: cib_process_request: Forwarding cib_modify operation for section status to all (origin=local/attrd/174) Aug 27 08:47:27 [1326] NODE2 cib: info: cib_perform_op: Diff: --- 0.434.83 2 Aug 27 08:47:27 [1326] NODE2 cib: info: cib_perform_op: Diff: +++ 0.434.84 b548502a016f55db1c8a902a6483c308 Aug 27 08:47:27 [1326] NODE2 cib: info: cib_perform_op: -- /cib/status/node_state[@id='1']/lrm[@id='1']/lrm_resources/lrm_resource[@id='SERVICE1'] Aug 27 08:47:27 [1326] NODE2 cib: info: cib_perform_op: + /cib: @num_updates=84 Aug 27 08:47:27 [1326] NODE2 cib: info: cib_process_request: Completed cib_delete operation for section //node_state[@uname='NODE1'] /lrm/lrm_resources/lrm_resource[@id='SERVICE1']: OK (rc=0, origin=NODE1/crmd/58, version=0.434.83) Aug 27 08:47:27 [1326] NODE2 cib: info: cib_perform_op: Diff: --- 0.434.83 2 Aug 27 08:47:27 [1326] NODE2 cib: info: cib_perform_op: Diff: +++ 0.434.84 (null) Aug 27 08:47:27 [1326] NODE2 cib: info: cib_perform_op: -- /cib/status/node_state[@id='1']/transient_attributes[@id='1']/instance_attributes[@id='status-1']/nvpair[@id='status-1-last-failure-SERVICE1.stop_0'] Aug 27 08:47:27 [1326] NODE2 cib: info: cib_perform_op: + /cib: @num_updates=84 Aug 27 08:47:27 [1326] NODE2 cib: info: cib_process_request: Completed cib_modify operation for section status: OK (rc=0, origin=NODE2/attrd/174, version=0.434.84) Aug 27 08:47:27 [1329] NODE2 attrd: info: attrd_cib_callback: CIB update 174 result for last-failure-SERVICE1#stop_0: OK | rc=0 Aug 27 08:47:27 [1329] NODE2 attrd: info: attrd_cib_callback: * last-failure-SERVICE1#stop_0[NODE2]=(null) Aug 27 08:47:27 [1329] NODE2 attrd: info: attrd_cib_callback: * last-failure-SERVICE1#stop_0[NODE1]=(null) Aug 27 08:47:27 [1331] NODE2 crmd: info: abort_transition_graph: Transition aborted by deletion of nvpair[@id='status-1-last-failure-SERVICE1.stop_0']: Transient attribute change | cib=0.434.84 source=abort_unless_down:370 path=/cib/status/node_state[@id='1']/transient_attributes[@id='1']/instance_attributes[@id='status-1']/nvpair[@id='status-1-last-failure-SERVICE1.stop_0'] complete=true Aug 27 08:47:27 [1331] NODE2 crmd: notice: do_state_transition: State transition S_IDLE -> S_POLICY_ENGINE | input=I_PE_CALC cause=C_FSA_INTERNAL origin=abort_transition_graph Aug 27 08:47:27 [1326] NODE2 cib: info: cib_perform_op: Diff: --- 0.434.84 2 Aug 27 08:47:27 [1326] NODE2 cib: info: cib_perform_op: Diff: +++ 0.434.85 (null) Aug 27 08:47:27 [1326] NODE2 cib: info: cib_perform_op: -- /cib/status/node_state[@id='1']/lrm[@id='1']/lrm_resources/lrm_resource[@id='SERVICE1'] Aug 27 08:47:27 [1326] NODE2 cib: info: cib_perform_op: + /cib: @num_updates=85 Aug 27 08:47:27 [1326] NODE2 cib: info: cib_process_request: Completed cib_delete operation for section //node_state[@uname='NODE1'] /lrm/lrm_resources/lrm_resource[@id='SERVICE1']: OK (rc=0, origin=NODE1/crmd/59, version=0.434.85) Aug 27 08:47:27 [1326] NODE2 cib: info: cib_perform_op: Diff: --- 0.434.85 2 Aug 27 08:47:27 [1326] NODE2 cib: info: cib_perform_op: Diff: +++ 0.435.0 (null) Aug 27 08:47:27 [1326] NODE2 cib: info: cib_perform_op: + /cib: @epoch=435, @num_updates=0 Aug 27 08:47:27 [1326] NODE2 cib: info: cib_perform_op: + /cib/configuration/crm_config/cluster_property_set[@id='cib-bootstrap-options']/nvpair[@id='cib-bootstrap-options-last-lrm-refresh']: @value=1598510869 Aug 27 08:47:27 [1326] NODE2 cib: info: cib_process_request: 
Completed cib_modify operation for section crm_config: OK (rc=0, origin=NODE1/crmd/61, version=0.435.0) Aug 27 08:47:27 [1331] NODE2 crmd: info: abort_transition_graph: Transition aborted by deletion of lrm_resource[@id='SERVICE1']: Resource state removal | cib=0.434.85 source=abort_unless_down:370 path=/cib/status/node_state[@id='1']/lrm[@id='1']/lrm_resources/lrm_resource[@id='SERVICE1'] complete=true Aug 27 08:47:27 [1331] NODE2 crmd: info: abort_transition_graph: Transition aborted by cib-bootstrap-options-last-lrm-refresh doing modify last-lrm-refresh=1598510869: Configuration change | cib=0.435.0 source=te_update_diff_v2:522 path=/cib/configuration/crm_config/cluster_property_set[@id='cib-bootstrap-options']/nvpair[@id='cib-bootstrap-options-last-lrm-refresh'] complete=true Aug 27 08:47:27 [1330] NODE2 pengine: notice: unpack_config: On loss of CCM Quorum: Ignore Aug 27 08:47:27 [1330] NODE2 pengine: info: determine_online_status: Node NODE1 is online Aug 27 08:47:27 [1330] NODE2 pengine: info: determine_online_status: Node NODE2 is online Aug 27 08:47:27 [1330] NODE2 pengine: warning: unpack_rsc_op: Pretending the failure of SERVICE1_stop_0 (rc=198) on NODE1 succeeded Aug 27 08:47:27 [1330] NODE2 pengine: warning: unpack_rsc_op: Pretending the failure of SERVICE1_stop_0 (rc=198) on NODE1 succeeded Aug 27 08:47:27 [1330] NODE2 pengine: info: determine_op_status: Operation monitor found resource SERVICE9 active on NODE1 Aug 27 08:47:27 [1330] NODE2 pengine: info: determine_op_status: Operation monitor found resource SERVICE4 active on NODE2 Aug 27 08:47:27 [1330] NODE2 pengine: info: determine_op_status: Operation monitor found resource SERVICE1 active on NODE2 Aug 27 08:47:27 [1330] NODE2 pengine: info: unpack_node_loop: Node 1 is already processed Aug 27 08:47:27 [1330] NODE2 pengine: info: unpack_node_loop: Node 2 is already processed Aug 27 08:47:27 [1330] NODE2 pengine: info: unpack_node_loop: Node 1 is already processed Aug 27 08:47:27 [1330] NODE2 pengine: info: unpack_node_loop: Node 2 is already processed Aug 27 08:47:27 [1330] NODE2 pengine: info: group_print: Resource Group: IPV Aug 27 08:47:27 [1330] NODE2 pengine: info: common_print: VIRTUALIP (ocf::heartbeat:IPaddr2): Started NODE2 Aug 27 08:47:27 [1330] NODE2 pengine: info: common_print: SOURCEIP (ocf::heartbeat:IPsrcaddr): Started NODE2 Aug 27 08:47:27 [1330] NODE2 pengine: info: common_print: SERVICE1 (systemd:service1): Started NODE2 (failure ignored) Aug 27 08:47:27 [1330] NODE2 pengine: info: common_print: SERVICE2 (systemd:service2): Started NODE2 Aug 27 08:47:27 [1330] NODE2 pengine: info: common_print: SERVICE3 (systemd:service3): Started NODE2 Aug 27 08:47:27 [1330] NODE2 pengine: info: common_print: SERVICE4 (systemd:service4): Started NODE2 Aug 27 08:47:27 [1330] NODE2 pengine: info: common_print: SERVICE5 (systemd:service5): Started NODE2 Aug 27 08:47:27 [1330] NODE2 pengine: info: common_print: SERVICE6 (systemd:service6): Started NODE2 Aug 27 08:47:27 [1330] NODE2 pengine: info: common_print: SERVICE7 (systemd:service7): Started NODE2 Aug 27 08:47:27 [1330] NODE2 pengine: info: common_print: SERVICE8 (systemd:service8): Started NODE2 Aug 27 08:47:27 [1330] NODE2 pengine: info: common_print: SERVICE9 (systemd:service9): Started NODE2 Aug 27 08:47:27 [1330] NODE2 pengine: info: common_print: SERVICE10 (systemd:service10): Started NODE2 Aug 27 08:47:27 [1330] NODE2 pengine: info: common_print: SERVICE11 (systemd:service11): Started NODE2 Aug 27 08:47:27 [1330] NODE2 pengine: info: common_print: SERVICE12 
(systemd:service12): Started NODE2 Aug 27 08:47:27 [1330] NODE2 pengine: info: common_print: SERVICE13 (systemd:service13): Started NODE2 Aug 27 08:47:27 [1330] NODE2 pengine: info: clone_print: Clone Set: SERVICE14-clone [SERVICE14] Aug 27 08:47:27 [1330] NODE2 pengine: info: short_print: Started: [ NODE1 NODE2 ] Aug 27 08:47:27 [1330] NODE2 pengine: info: clone_print: Clone Set: SERVICE15-clone [SERVICE15] Aug 27 08:47:27 [1330] NODE2 pengine: info: short_print: Started: [ NODE1 NODE2 ] Aug 27 08:47:27 [1330] NODE2 pengine: info: LogActions: Leave VIRTUALIP (Started NODE2) Aug 27 08:47:27 [1330] NODE2 pengine: info: LogActions: Leave SOURCEIP (Started NODE2) Aug 27 08:47:27 [1330] NODE2 pengine: info: LogActions: Leave SERVICE1 (Started NODE2) Aug 27 08:47:27 [1330] NODE2 pengine: info: LogActions: Leave SERVICE2 (Started NODE2) Aug 27 08:47:27 [1330] NODE2 pengine: info: LogActions: Leave SERVICE3 (Started NODE2) Aug 27 08:47:27 [1330] NODE2 pengine: info: LogActions: Leave SERVICE4 (Started NODE2) Aug 27 08:47:27 [1330] NODE2 pengine: info: LogActions: Leave SERVICE5 (Started NODE2) Aug 27 08:47:27 [1330] NODE2 pengine: info: LogActions: Leave SERVICE6 (Started NODE2) Aug 27 08:47:27 [1330] NODE2 pengine: info: LogActions: Leave SERVICE7 (Started NODE2) Aug 27 08:47:27 [1330] NODE2 pengine: info: LogActions: Leave SERVICE8 (Started NODE2) Aug 27 08:47:27 [1330] NODE2 pengine: info: LogActions: Leave SERVICE9 (Started NODE2) Aug 27 08:47:27 [1330] NODE2 pengine: info: LogActions: Leave SERVICE10 (Started NODE2) Aug 27 08:47:27 [1330] NODE2 pengine: info: LogActions: Leave SERVICE11 (Started NODE2) Aug 27 08:47:27 [1330] NODE2 pengine: info: LogActions: Leave SERVICE12 (Started NODE2) Aug 27 08:47:27 [1330] NODE2 pengine: info: LogActions: Leave SERVICE13 (Started NODE2) Aug 27 08:47:27 [1330] NODE2 pengine: info: LogActions: Leave SERVICE14:0 (Started NODE1) Aug 27 08:47:27 [1330] NODE2 pengine: info: LogActions: Leave SERVICE14:1 (Started NODE2) Aug 27 08:47:27 [1330] NODE2 pengine: info: LogActions: Leave SERVICE15:0 (Started NODE1) Aug 27 08:47:27 [1330] NODE2 pengine: info: LogActions: Leave SERVICE15:1 (Started NODE2) From hunter86_bg at yahoo.com Thu Aug 27 10:47:11 2020 From: hunter86_bg at yahoo.com (Strahil Nikolov) Date: Thu, 27 Aug 2020 14:47:11 +0000 (UTC) Subject: [ClusterLabs] Resources restart when a node joins in In-Reply-To: References: Message-ID: <1794918337.6238676.1598539631434@mail.yahoo.com> Hi?Quentin, in order to get help it will be easier if you provide both corosync and pacemaker configuration. Best Regards, Strahil Nikolov ? ?????????, 27 ?????? 2020 ?., 17:10:01 ???????+3, Citron Vert ??????: Hi, Sorry for using this email adress, my name is Quentin. Thank you for your reply. I have already tried the stickiness solution (with the deprecated? value). I tried the one you gave me, and it does not change anything. Resources don't seem to move from node to node (i don't see the changes with crm_mon command). In the logs i found this line "error: native_create_actions:???? Resource SERVICE1 is active on 2 nodes" Which led me to contact you to understand and learn a little more about this cluster. And why there are running resources on the passive node. You will find attached the logs during the reboot of the passive node and my cluster configuration. I think I'm missing out on something in the configuration / logs that I don't understand.. Thank you in advance for your help, Quentin Le 26/08/2020 ? 20:16, Reid Wahl a ?crit?: >?? >?? > Hi, Citron. 
> > > > > Based on your description, it sounds like some resources **might** be moving from node 1 to node 2, failing on node 2, and then moving back to node 1. If that's what's happening (and even if it's not), then it's probably smart to set some resource stickiness as a resource default. The below command sets a resource stickiness score of 1. > > > > > > ??? # pcs resource defaults resource-stickiness=1 > > > > > > Also note that the "default-resource-stickiness" cluster property is deprecated and should not be used. > > > > > Finally, an explicit default resource stickiness score of 0 can interfere with the placement of cloned resource instances. If you don't want any stickiness, then it's better to leave stickiness unset. That way, primitives will have a stickiness of 0, but clone instances will have a stickiness of 1. > > > > > > If adding stickiness does not resolve the issue, can you share your cluster configuration and some logs that show the issue happening? Off the top of my head I'm not sure why resources would start and stop on node 2 without moving away from node1, unless they're clone instances that are starting and then failing a monitor operation on node 2. > > > >?? > On Wed, Aug 26, 2020 at 8:42 AM Citron Vert wrote: > > >>?? >>?? >> Hello, >> I am contacting you because I have a problem with my cluster and I cannot find (nor understand) any information that can help me. >> >> I have a 2 nodes cluster (pacemaker, corosync, pcs) installed on CentOS 7 with a set of configuration. >> Everything seems to works fine, but here is what happens: >> >> ????* Node1 and Node2 are running well with Node1 as primary >> ????* I reboot Node2 wich is passive (no changes on Node1) >> ????* Node2 comes back in the cluster as passive >> ????* corosync logs shows resources getting started then stopped on Node2 >> ????* "crm_mon" command shows some ressources on Node1 getting restarted >> >> >> I don't understand how it should work. >> If a node comes back, and becomes passive (since Node1 is running primary), there is no reason for the resources to be started then stopped on the new passive node ? >> >> >> One of my resources becomes unstable because it gets started and then stoped too quickly on Node2, wich seems to make it restart on Node1 without a failover. >> >> I tried several things and solution proposed by different sites and forums but without success. >> >> >> >> >> Is there a way so that the node, which joins the cluster as passive, does not start its own resources ? >> >> >> >> >> thanks in advance >> >> >> >> >> Here are some information just in case : >> >> $?rpm?-qa?|?grep?-E?"corosync|pacemaker|pcs" >>??corosync-2.4.5-4.el7.x86_64 >>??pacemaker-cli-1.1.21-4.el7.x86_64 >>??pacemaker-1.1.21-4.el7.x86_64 >>??pcs-0.9.168-4.el7.centos.x86_64 >>??corosynclib-2.4.5-4.el7.x86_64 >>??pacemaker-libs-1.1.21-4.el7.x86_64 >>??pacemaker-cluster-libs-1.1.21-4.el7.x86_64 >> >> >> >> >> ???????? >> ???????? >> ???????? >> ???????? >> ???????? >> ???????? >> ???????? >> ???????? >> ???????? >> >> >> >> >> >> >> >> >> >> >> _______________________________________________ >> Manage your subscription: >> https://lists.clusterlabs.org/mailman/listinfo/users >> >> ClusterLabs home: https://www.clusterlabs.org/ >> > > > > -- > >?? >?? >?? >?? >?? >?? >?? >?? >?? >?? >?? >?? >?? >?? >?? 
> Regards,
>
> Reid Wahl, RHCA
>
> Software Maintenance Engineer, Red Hat
>
> CEE - Platform Support Delivery - ClusterHA
>

_______________________________________________
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/

From Ulrich.Windl at rz.uni-regensburg.de  Fri Aug 28 02:07:11 2020
From: Ulrich.Windl at rz.uni-regensburg.de (Ulrich Windl)
Date: Fri, 28 Aug 2020 08:07:11 +0200
Subject: [ClusterLabs] Antw: [EXT] Re: Resources restart when a node joins in
In-Reply-To: 
References: 
Message-ID: <5F489F0F020000A10003AF3E@gwsmtp.uni-regensburg.de>

>>> Citron Vert schrieb am 27.08.2020 um 09:46 in Nachricht
> Hi,
>
> Sorry for using this email adress, my name is Quentin. Thank you for
> your reply.
>
> I have already tried the stickiness solution (with the deprecated
> value). I tried the one you gave me, and it does not change anything.

Hi!

I see all your resources are systemd-based. Personally I think systemd and the cluster do not play well with each other, as systemd also has its own recovery mechanisms that may interfere with the cluster. Also, did you make sure to "disable" all the systemd services that the cluster controls?

There's an inconsistency:

Aug 27 08:46:16 [1330] NODE2 pengine: info: determine_op_status: Operation monitor found resource SERVICE9 active on NODE1

SERVICE9 isn't listed as running before.

This should be checked:

Aug 27 08:46:55 [1330] NODE2 pengine: info: native_color: Resource SERVICE14:1 cannot run anywhere
Aug 27 08:46:55 [1330] NODE2 pengine: info: native_color: Resource SERVICE15:1 cannot run anywhere

This sounds odd, too:

Aug 27 08:46:59 [1326] NODE2 cib: info: cib_process_replace: Replaced 0.434.19 with 0.434.19 from NODE2
Aug 27 08:46:59 [1326] NODE2 cib: info: cib_process_replace: Replaced 0.434.19 with 0.434.19 from NODE2

You should check why this failed:

Aug 27 08:47:01 [1331] NODE2 crmd: warning: status_from_rc: Action 20 (SERVICE1_monitor_0) on NODE1 failed (target: 7 vs. rc: 0): Error
Aug 27 08:47:02 [1330] NODE2 pengine: info: common_print: SERVICE1 (systemd:service1): Started
Aug 27 08:47:02 [1330] NODE2 pengine: info: common_print: 1 : NODE1
Aug 27 08:47:02 [1330] NODE2 pengine: info: common_print: 2 : NODE2
Aug 27 08:47:02 [1330] NODE2 pengine: error: native_create_actions: Resource SERVICE1 is active on 2 nodes (attempting recovery)

Then:

Aug 27 08:47:02 [1330] NODE2 pengine: notice: LogAction: * Move SERVICE1 ( NODE1 -> NODE2 )
Aug 27 08:47:02 [1331] NODE2 crmd: warning: status_from_rc: Action 20 (SERVICE9_monitor_0) on NODE1 failed (target: 7 vs. rc: 0): Error
Aug 27 08:47:04 [1331] NODE2 crmd: info: match_graph_event: Action SERVICE1_stop_0 (36) confirmed on NODE2 (rc=0)
Aug 27 08:47:22 [1331] NODE2 crmd: warning: status_from_rc: Action 35 (SERVICE1_stop_0) on NODE1 failed (target: 0 vs. rc: 198): Error
Aug 27 08:47:22 [1331] NODE2 crmd: info: match_graph_event: Action SERVICE1_stop_0 (35) confirmed on NODE1 (rc=198, ignoring failure)

The above sounds very odd to me. You should also include the corresponding systemd logs.

The other thing I noticed is that you have colocation, but no ordering. I guess your VIP should be started before your dependencies, right?

Regards,
Ulrich

> > Resources don't seem to move from node to node (i don't see the changes
> with crm_mon command).
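
For reference, an ordering constraint of the kind Ulrich describes above could look roughly like the following with pcs. This is only a sketch: the resource names VIRTUALIP and SERVICE1 are taken from the configuration shown in this thread, and whether every service really needs the VIP up first is an assumption.

    # Start the virtual IP before a service that binds to it; stops happen
    # in the reverse order, since ordering constraints are symmetrical by default.
    pcs constraint order start VIRTUALIP then start SERVICE1

    # List the ordering constraints that are now configured
    pcs constraint order show
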
> > > In the logs i found this line /"error: native_create_actions: > Resource SERVICE1 is active on 2 nodes/" > > Which led me to contact you to understand and learn a little more about > this cluster. And why there are running resources on the passive node. > > > You will find attached the logs during the reboot of the passive node > and my cluster configuration. > > I think I'm missing out on something in the configuration / logs that I > don't understand.. > > > Thank you in advance for your help, > > Quentin > > > Le 26/08/2020 ? 20:16, Reid Wahl a ?crit : >> Hi, Citron. >> >> Based on your description, it sounds like some resources **might** be >> moving from node 1 to node 2, failing on node 2, and then moving back >> to node 1. If that's what's happening (and even if it's not), then >> it's probably smart to set some resource stickiness as a resource >> default. The below command sets a resource stickiness score of 1. >> >> # pcs resource defaults resource-stickiness=1 >> >> Also note that the "default-resource-stickiness" cluster property is >> deprecated and should not be used. >> >> Finally, an explicit default resource stickiness score of 0 can >> interfere with the placement of cloned resource instances. If you >> don't want any stickiness, then it's better to leave stickiness unset. >> That way, primitives will have a stickiness of 0, but clone instances >> will have a stickiness of 1. >> >> If adding stickiness does not resolve the issue, can you share your >> cluster configuration and some logs that show the issue happening? Off >> the top of my head I'm not sure why resources would start and stop on >> node 2 without moving away from node1, unless they're clone instances >> that are starting and then failing a monitor operation on node 2. >> >> On Wed, Aug 26, 2020 at 8:42 AM Citron Vert > > wrote: >> >> Hello, >> I am contacting you because I have a problem with my cluster and I >> cannot find (nor understand) any information that can help me. >> >> I have a 2 nodes cluster (pacemaker, corosync, pcs) installed on >> CentOS 7 with a set of configuration. >> Everything seems to works fine, but here is what happens: >> >> * Node1 and Node2 are running well with Node1 as primary >> * I reboot Node2 wich is passive (no changes on Node1) >> * Node2 comes back in the cluster as passive >> * corosync logs shows resources getting started then stopped on >> Node2 >> * "crm_mon" command shows some ressources on Node1 getting >> restarted >> >> I don't understand how it should work. >> If a node comes back, and becomes passive (since Node1 is running >> primary), there is no reason for the resources to be started then >> stopped on the new passive node ? >> >> One of my resources becomes unstable because it gets started and >> then stoped too quickly on Node2, wich seems to make it restart on >> Node1 without a failover. >> >> I tried several things and solution proposed by different sites >> and forums but without success. >> >> >> Is there a way so that the node, which joins the cluster as >> passive, does not start its own resources ? 
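
As the rest of the thread works out, the usual answer is that a rejoining node should not be starting those services itself at all: the systemd units behind cluster-managed resources should stay disabled so that only Pacemaker ever starts them. A minimal bash sketch for auditing this on each node, assuming the unit names service1 through service15 used in this thread:

    # Requires bash for the brace expansion; run as root on every node.
    for svc in service{1..15}; do
        state=$(systemctl is-enabled "$svc" 2>/dev/null)
        if [ "$state" = "enabled" ]; then
            echo "$svc is enabled at boot -- disabling so only Pacemaker starts it"
            systemctl disable "$svc"
        fi
    done
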
>> >> >> thanks in advance >> >> >> Here are some information just in case : >> >> $ rpm -qa | grep -E "corosync|pacemaker|pcs" >> corosync-2.4.5-4.el7.x86_64 >> pacemaker-cli-1.1.21-4.el7.x86_64 >> pacemaker-1.1.21-4.el7.x86_64 >> pcs-0.9.168-4.el7.centos.x86_64 >> corosynclib-2.4.5-4.el7.x86_64 >> pacemaker-libs-1.1.21-4.el7.x86_64 >> pacemaker-cluster-libs-1.1.21-4.el7.x86_64 >> >> >> name="stonith-enabled" value="false"/> >> name="no-quorum-policy" value="ignore"/> >> value="120s"/> >> name="have-watchdog" value="false"/> >> value="1.1.21-4.el7-f14e36fd43"/> >> name="cluster-infrastructure" value="corosync"/> >> name="cluster-name" value="CLUSTER"/> >> name="last-lrm-refresh" value="1598446314"/> >> name="default-resource-stickiness" value="0"/> >> >> >> >> >> _______________________________________________ >> Manage your subscription: >> https://lists.clusterlabs.org/mailman/listinfo/users >> >> ClusterLabs home: https://www.clusterlabs.org/ >> >> >> >> -- >> Regards, >> >> Reid Wahl, RHCA >> Software Maintenance Engineer, Red Hat >> CEE - Platform Support Delivery - ClusterHA From nwahl at redhat.com Fri Aug 28 04:27:12 2020 From: nwahl at redhat.com (Reid Wahl) Date: Fri, 28 Aug 2020 01:27:12 -0700 Subject: [ClusterLabs] Resources restart when a node joins in In-Reply-To: References: Message-ID: No problem! That's what we're here for. I'm glad it's sorted out :) On Fri, Aug 28, 2020 at 12:27 AM Citron Vert wrote: > Hi, > > You are right, the problems seem to come from some services that are > started at startup. > > My installation script disables all startup options for all services we > use, that's why I didn't focus on this possibility. > > But after a quick investigation, a colleague had the good idea to make a > "security" script that monitors and starts certain services. > > > Sorry to have contacted you for this little mistake, > > Thank you for the help, it was effective > > Quentin > > > > Le 27/08/2020 ? 09:56, Reid Wahl a ?crit : > > Hi, Quentin. Thanks for the logs! > > I see you highlighted the fact that SERVICE1 was in "Stopping" state on > both node 1 and node 2 when node 1 was rejoining the cluster. I also noted > the following later in the logs, as well as some similar messages earlier: > > Aug 27 08:47:02 [1330] NODE2 pengine: info: determine_op_status: Operation monitor found resource SERVICE1 active on NODE1 > Aug 27 08:47:02 [1330] NODE2 pengine: info: determine_op_status: Operation monitor found resource SERVICE1 active on NODE1 > Aug 27 08:47:02 [1330] NODE2 pengine: info: determine_op_status: Operation monitor found resource SERVICE4 active on NODE2 > Aug 27 08:47:02 [1330] NODE2 pengine: info: determine_op_status: Operation monitor found resource SERVICE1 active on NODE2 > ... > Aug 27 08:47:02 [1330] NODE2 pengine: info: common_print: 1 : NODE1 > Aug 27 08:47:02 [1330] NODE2 pengine: info: common_print: 2 : NODE2 > ... > Aug 27 08:47:02 [1330] NODE2 pengine: error: native_create_actions: Resource SERVICE1 is active on 2 nodes (attempting recovery) > Aug 27 08:47:02 [1330] NODE2 pengine: notice: native_create_actions: See https://wiki.clusterlabs.org/wiki/FAQ#Resource_is_Too_Active for more information > > > Can you make sure that all the cluster-managed systemd services are disabled from starting at boot (i.e., `systemctl is-enabled service1`, and the same for all the others) on both nodes? If they are enabled, disable them. > > > On Thu, Aug 27, 2020 at 12:46 AM Citron Vert > wrote: > >> Hi, >> >> Sorry for using this email adress, my name is Quentin. 
Thank you for your >> reply. >> >> I have already tried the stickiness solution (with the deprecated >> value). I tried the one you gave me, and it does not change anything. >> >> Resources don't seem to move from node to node (i don't see the changes >> with crm_mon command). >> >> >> In the logs i found this line *"error: native_create_actions: >> Resource SERVICE1 is active on 2 nodes*" >> >> Which led me to contact you to understand and learn a little more about >> this cluster. And why there are running resources on the passive node. >> >> >> You will find attached the logs during the reboot of the passive node and >> my cluster configuration. >> >> I think I'm missing out on something in the configuration / logs that I >> don't understand.. >> >> >> Thank you in advance for your help, >> >> Quentin >> >> >> Le 26/08/2020 ? 20:16, Reid Wahl a ?crit : >> >> Hi, Citron. >> >> Based on your description, it sounds like some resources **might** be >> moving from node 1 to node 2, failing on node 2, and then moving back to >> node 1. If that's what's happening (and even if it's not), then it's >> probably smart to set some resource stickiness as a resource default. The >> below command sets a resource stickiness score of 1. >> >> # pcs resource defaults resource-stickiness=1 >> >> Also note that the "default-resource-stickiness" cluster property is >> deprecated and should not be used. >> >> Finally, an explicit default resource stickiness score of 0 can interfere >> with the placement of cloned resource instances. If you don't want any >> stickiness, then it's better to leave stickiness unset. That way, >> primitives will have a stickiness of 0, but clone instances will have a >> stickiness of 1. >> >> If adding stickiness does not resolve the issue, can you share your >> cluster configuration and some logs that show the issue happening? Off the >> top of my head I'm not sure why resources would start and stop on node 2 >> without moving away from node1, unless they're clone instances that are >> starting and then failing a monitor operation on node 2. >> >> On Wed, Aug 26, 2020 at 8:42 AM Citron Vert >> wrote: >> >>> Hello, >>> I am contacting you because I have a problem with my cluster and I >>> cannot find (nor understand) any information that can help me. >>> >>> I have a 2 nodes cluster (pacemaker, corosync, pcs) installed on CentOS >>> 7 with a set of configuration. >>> Everything seems to works fine, but here is what happens: >>> >>> - Node1 and Node2 are running well with Node1 as primary >>> - I reboot Node2 wich is passive (no changes on Node1) >>> - Node2 comes back in the cluster as passive >>> - corosync logs shows resources getting started then stopped on Node2 >>> - "crm_mon" command shows some ressources on Node1 getting restarted >>> >>> I don't understand how it should work. >>> If a node comes back, and becomes passive (since Node1 is running >>> primary), there is no reason for the resources to be started then stopped >>> on the new passive node ? >>> >>> One of my resources becomes unstable because it gets started and then >>> stoped too quickly on Node2, wich seems to make it restart on Node1 without >>> a failover. >>> >>> I tried several things and solution proposed by different sites and >>> forums but without success. >>> >>> >>> Is there a way so that the node, which joins the cluster as passive, >>> does not start its own resources ? 
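
For completeness, the stickiness default discussed above can be set and verified like this. A sketch only: the score of 1 is just the example value used in the thread, and the last command removes the deprecated cluster-wide property if it is still present in the configuration.

    # Give resources a small preference for staying where they are
    pcs resource defaults resource-stickiness=1

    # Show the resource defaults that are currently configured
    pcs resource defaults

    # Remove the deprecated cluster property mentioned earlier, if set
    pcs property unset default-resource-stickiness
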
>>
>> On Wed, Aug 26, 2020 at 8:42 AM Citron Vert wrote:
>>
>>> Hello,
>>> I am contacting you because I have a problem with my cluster and I
>>> cannot find (nor understand) any information that can help me.
>>>
>>> I have a 2-node cluster (pacemaker, corosync, pcs) installed on CentOS 7
>>> with a set of configuration.
>>> Everything seems to work fine, but here is what happens:
>>>
>>> - Node1 and Node2 are running well with Node1 as primary
>>> - I reboot Node2, which is passive (no changes on Node1)
>>> - Node2 comes back in the cluster as passive
>>> - corosync logs show resources getting started then stopped on Node2
>>> - the "crm_mon" command shows some resources on Node1 getting restarted
>>>
>>> I don't understand how it should work.
>>> If a node comes back and becomes passive (since Node1 is running
>>> primary), there is no reason for the resources to be started then
>>> stopped on the new passive node?
>>>
>>> One of my resources becomes unstable because it gets started and then
>>> stopped too quickly on Node2, which seems to make it restart on Node1
>>> without a failover.
>>>
>>> I tried several things and solutions proposed by different sites and
>>> forums, but without success.
>>>
>>> Is there a way so that the node, which joins the cluster as passive,
>>> does not start its own resources?
>>>
>>> thanks in advance
>>>
>>> Here are some information just in case:
>>>
>>> $ rpm -qa | grep -E "corosync|pacemaker|pcs"
>>> corosync-2.4.5-4.el7.x86_64
>>> pacemaker-cli-1.1.21-4.el7.x86_64
>>> pacemaker-1.1.21-4.el7.x86_64
>>> pcs-0.9.168-4.el7.centos.x86_64
>>> corosynclib-2.4.5-4.el7.x86_64
>>> pacemaker-libs-1.1.21-4.el7.x86_64
>>> pacemaker-cluster-libs-1.1.21-4.el7.x86_64
>>>
>>> <nvpair name="stonith-enabled" value="false"/>
>>> <nvpair name="no-quorum-policy" value="ignore"/>
>>> <nvpair name="dc-deadtime" value="120s"/>
>>> <nvpair name="have-watchdog" value="false"/>
>>> <nvpair name="dc-version" value="1.1.21-4.el7-f14e36fd43"/>
>>> <nvpair name="cluster-infrastructure" value="corosync"/>
>>> <nvpair name="cluster-name" value="CLUSTER"/>
>>> <nvpair name="last-lrm-refresh" value="1598446314"/>
>>> <nvpair name="default-resource-stickiness" value="0"/>
>>
>> --
>> Regards,
>>
>> Reid Wahl, RHCA
>> Software Maintenance Engineer, Red Hat
>> CEE - Platform Support Delivery - ClusterHA
>
> --
> Regards,
>
> Reid Wahl, RHCA
> Software Maintenance Engineer, Red Hat
> CEE - Platform Support Delivery - ClusterHA

--
Regards,

Reid Wahl, RHCA
Software Maintenance Engineer, Red Hat
CEE - Platform Support Delivery - ClusterHA

From citron_vert at hotmail.com  Fri Aug 28 03:26:41 2020
From: citron_vert at hotmail.com (Citron Vert)
Date: Fri, 28 Aug 2020 09:26:41 +0200
Subject: [ClusterLabs] Resources restart when a node joins in
In-Reply-To:
References:
Message-ID:

Hi,

You are right, the problems seem to come from some services that are
started at startup.

My installation script disables all startup options for all services we
use, that's why I didn't focus on this possibility.

But after a quick investigation, a colleague had the good idea to make a
"security" script that monitors and starts certain services.

Sorry to have contacted you for this little mistake.

Thank you for the help, it was effective.

Quentin

On 27/08/2020 at 09:56, Reid Wahl wrote:
> Hi, Quentin. Thanks for the logs!
>
> I see you highlighted the fact that SERVICE1 was in "Stopping" state
> on both node 1 and node 2 when node 1 was rejoining the cluster. I
> also noted the following later in the logs, as well as some similar
> messages earlier:
>
> Aug 27 08:47:02 [1330] NODE2 pengine: info: determine_op_status: Operation monitor found resource SERVICE1 active on NODE1
> Aug 27 08:47:02 [1330] NODE2 pengine: info: determine_op_status: Operation monitor found resource SERVICE1 active on NODE1
> Aug 27 08:47:02 [1330] NODE2 pengine: info: determine_op_status: Operation monitor found resource SERVICE4 active on NODE2
> Aug 27 08:47:02 [1330] NODE2 pengine: info: determine_op_status: Operation monitor found resource SERVICE1 active on NODE2
> ...
> Aug 27 08:47:02 [1330] NODE2 pengine: info: common_print: 1 : NODE1
> Aug 27 08:47:02 [1330] NODE2 pengine: info: common_print: 2 : NODE2
> ...
> Aug 27 08:47:02 [1330] NODE2 pengine: error: native_create_actions: Resource SERVICE1 is active on 2 nodes (attempting recovery)
> Aug 27 08:47:02 [1330] NODE2 pengine: notice: native_create_actions: See https://wiki.clusterlabs.org/wiki/FAQ#Resource_is_Too_Active for more information
>
> Can you make sure that all the cluster-managed systemd services are
> disabled from starting at boot (i.e., `systemctl is-enabled service1`,
> and the same for all the others) on both nodes? If they are enabled,
> disable them.
>
> On Thu, Aug 27, 2020 at 12:46 AM Citron Vert wrote:
>
> Hi,
>
> Sorry for using this email address, my name is Quentin. Thank you
> for your reply.
>
> I have already tried the stickiness solution (with the deprecated
> value). I tried the one you gave me, and it does not change anything.
>
> Resources don't seem to move from node to node (I don't see the
> changes with the crm_mon command).
>
> In the logs I found this line: "error: native_create_actions:
> Resource SERVICE1 is active on 2 nodes"
>
> Which led me to contact you, to understand and learn a little more
> about this cluster, and why there are resources running on the
> passive node.
>
> You will find attached the logs during the reboot of the passive
> node and my cluster configuration.
>
> I think I'm missing out on something in the configuration / logs
> that I don't understand.
>
> Thank you in advance for your help,
>
> Quentin
>
> On 26/08/2020 at 20:16, Reid Wahl wrote:
>> Hi, Citron.
>>
>> Based on your description, it sounds like some resources
>> **might** be moving from node 1 to node 2, failing on node 2, and
>> then moving back to node 1. If that's what's happening (and even
>> if it's not), then it's probably smart to set some resource
>> stickiness as a resource default. The below command sets a
>> resource stickiness score of 1.
>>
>>     # pcs resource defaults resource-stickiness=1
>>
>> Also note that the "default-resource-stickiness" cluster property
>> is deprecated and should not be used.
>>
>> Finally, an explicit default resource stickiness score of 0 can
>> interfere with the placement of cloned resource instances. If you
>> don't want any stickiness, then it's better to leave stickiness
>> unset. That way, primitives will have a stickiness of 0, but
>> clone instances will have a stickiness of 1.
>>
>> If adding stickiness does not resolve the issue, can you share
>> your cluster configuration and some logs that show the issue
>> happening? Off the top of my head I'm not sure why resources
>> would start and stop on node 2 without moving away from node 1,
>> unless they're clone instances that are starting and then failing
>> a monitor operation on node 2.
>>
>> On Wed, Aug 26, 2020 at 8:42 AM Citron Vert wrote:
>>
>> Hello,
>> I am contacting you because I have a problem with my cluster
>> and I cannot find (nor understand) any information that can
>> help me.
>>
>> I have a 2-node cluster (pacemaker, corosync, pcs) installed
>> on CentOS 7 with a set of configuration.
>> Everything seems to work fine, but here is what happens:
>>
>> * Node1 and Node2 are running well with Node1 as primary
>> * I reboot Node2, which is passive (no changes on Node1)
>> * Node2 comes back in the cluster as passive
>> * corosync logs show resources getting started then
>>   stopped on Node2
>> * the "crm_mon" command shows some resources on Node1 getting
>>   restarted
>>
>> I don't understand how it should work.
>> If a node comes back and becomes passive (since Node1 is
>> running primary), there is no reason for the resources to be
>> started then stopped on the new passive node?
>>
>> One of my resources becomes unstable because it gets started
>> and then stopped too quickly on Node2, which seems to make it
>> restart on Node1 without a failover.
>>
>> I tried several things and solutions proposed by different
>> sites and forums, but without success.
>>
>> Is there a way so that the node, which joins the cluster as
>> passive, does not start its own resources?
>>
>> thanks in advance
>>
>> Here are some information just in case:
>>
>> $ rpm -qa | grep -E "corosync|pacemaker|pcs"
>> corosync-2.4.5-4.el7.x86_64
>> pacemaker-cli-1.1.21-4.el7.x86_64
>> pacemaker-1.1.21-4.el7.x86_64
>> pcs-0.9.168-4.el7.centos.x86_64
>> corosynclib-2.4.5-4.el7.x86_64
>> pacemaker-libs-1.1.21-4.el7.x86_64
>> pacemaker-cluster-libs-1.1.21-4.el7.x86_64
>>
>> <nvpair name="stonith-enabled" value="false"/>
>> <nvpair name="no-quorum-policy" value="ignore"/>
>> <nvpair name="dc-deadtime" value="120s"/>
>> <nvpair name="have-watchdog" value="false"/>
>> <nvpair name="dc-version" value="1.1.21-4.el7-f14e36fd43"/>
>> <nvpair name="cluster-infrastructure" value="corosync"/>
>> <nvpair name="cluster-name" value="CLUSTER"/>
>> <nvpair name="last-lrm-refresh" value="1598446314"/>
>> <nvpair name="default-resource-stickiness" value="0"/>
>>
>> --
>> Regards,
>>
>> Reid Wahl, RHCA
>> Software Maintenance Engineer, Red Hat
>> CEE - Platform Support Delivery - ClusterHA
>
> --
> Regards,
>
> Reid Wahl, RHCA
> Software Maintenance Engineer, Red Hat
> CEE - Platform Support Delivery - ClusterHA
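The troubleshooting in this thread relied on watching what crm_mon reported
while the passive node rejoined. A minimal way to repeat that observation from
the surviving node, using only the pacemaker-cli tools already listed above:

    # One-shot snapshot of resource state, including inactive resources.
    crm_mon -1r

    # Or keep it running and watch the transitions while the passive node
    # reboots and rejoins.
    crm_mon -r

If a managed service has been started outside Pacemaker on the rejoining node,
this is where it can show up as active on more than one node, matching the
"Resource SERVICE1 is active on 2 nodes" scheduler error quoted earlier.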