[ClusterLabs] shutdown and restart of complete cluster due to power outage with UPS
Lentes, Bernd
bernd.lentes at helmholtz-muenchen.de
Tue Jan 22 10:52:54 EST 2019
Hi,
we have a new UPS which has enough charge to supply our 2-node cluster and its periphery (SAN, switches ...) for a reasonable time.
I'm currently thinking about the shutdown and restart procedure for the complete cluster when the power is lost and does not come back soon.
The cluster then runs on the UPS, but that does not last forever, so I have to shut down the complete cluster.
I have the possibility to run scripts on each node which are triggered by the UPS.
My shutdown procedure is:
crm -w node standby node1
    -> resources are migrated to node2
systemctl stop pacemaker
    -> also stops corosync
    -> node is not fenced! (because of standby?)
systemctl poweroff
    -> clean shutdown of node1
crm -w node standby node2
    -> clean stop of resources
systemctl stop pacemaker
systemctl poweroff
The scripts would be executed from node2, via ssh for node1.
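A minimal sketch of that sequence as one script run on node2. The variable names, the DRY_RUN switch, the helper functions and passwordless root ssh to node1 are assumptions for illustration; only the crm and systemctl commands are taken from the procedure above. By default it only prints what it would do.

```shell
#!/bin/sh
# Sketch of the UPS-triggered shutdown, executed on node2.
# Assumed (not part of the procedure above): NODE1/NODE2/DRY_RUN variables,
# the run() helper, and passwordless root ssh from node2 to node1.

NODE1="${NODE1:-node1}"
NODE2="${NODE2:-node2}"
DRY_RUN="${DRY_RUN:-1}"   # safe default: print the commands, do not run them

# Run a command, or only print it when DRY_RUN=1.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

shutdown_cluster() {
    # node1: put it in standby so its resources move to node2 ...
    run crm -w node standby "$NODE1" || return 1
    # ... then stop the cluster stack and power it off via ssh.
    # The ssh session may drop during poweroff, so its exit status is ignored.
    run ssh "root@$NODE1" "systemctl stop pacemaker && systemctl poweroff" || true

    # node2: standby stops the remaining resources cleanly, then power off.
    run crm -w node standby "$NODE2" || return 1
    run systemctl stop pacemaker || return 1
    run systemctl poweroff
}
```

Calling `shutdown_cluster` with DRY_RUN=1 lists the five steps in order; setting DRY_RUN=0 would execute them for real. Error handling is deliberately minimal here and would need more thought before wiring it to the UPS.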
What do you think about it ?
Now the restart, which is giving me trouble.
Currently I want to restart the cluster manually, because I'm not completely familiar with pacemaker and a bit afraid of running into situations
caused by automation that I didn't think of beforehand.
I can do that from anywhere because both nodes have ILO-cards.
I start e.g. node1 with the power button, then run:
systemctl start corosync
systemctl start pacemaker
corosync and pacemaker don't start automatically at boot; I have read that several times as a recommendation.
Now my first problem. Let's assume the other node is broken, but I still want to get the
resources running. My no-quorum-policy is ignore, so that should be fine. But I have exactly this setup now and the resources don't start automatically.
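A minimal sketch of that manual start as a helper, assuming it is run as root on the freshly booted node. The DRY_RUN switch and the function names are hypothetical; `crm_mon -1` is added only to print the cluster status once after the start.

```shell
#!/bin/sh
# Sketch of the manual cluster start on one node after power returns.
# Assumed (not part of the steps above): DRY_RUN, run(), start_cluster_stack().

DRY_RUN="${DRY_RUN:-1}"   # safe default: print the commands, do not run them

# Run a command, or only print it when DRY_RUN=1.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

start_cluster_stack() {
    # corosync first, then pacemaker; stop at the first failure.
    run systemctl start corosync || return 1
    run systemctl start pacemaker || return 1
    # One-shot status output to check which resources came up.
    run crm_mon -1
}
```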
crm_mon says:
========================================================================
Stack: corosync
Current DC: ha-idg-1 (version 1.1.19+20180928.0d2680780-1.8-1.1.19+20180928.0d2680780) - partition WITHOUT quorum
Last updated: Tue Jan 22 15:34:19 2019
Last change: Tue Jan 22 13:39:14 2019 by root via crm_attribute on ha-idg-1
2 nodes configured
13 resources configured
Node ha-idg-1: online
Node ha-idg-2: UNCLEAN (offline)
Inactive resources:
fence_ha-idg-2 (stonith:fence_ilo2): Stopped
fence_ha-idg-1 (stonith:fence_ilo4): Stopped
Clone Set: cl_share [gr_share]
Stopped: [ ha-idg-1 ha-idg-2 ]
vm_mausdb (ocf::heartbeat:VirtualDomain): Stopped
vm_sim (ocf::heartbeat:VirtualDomain): Stopped
vm_geneious (ocf::heartbeat:VirtualDomain): Stopped
Clone Set: cl_SNMP [SNMP]
Stopped: [ ha-idg-1 ha-idg-2 ]
Node Attributes:
* Node ha-idg-1:
+ maintenance : off
Migration Summary:
* Node ha-idg-1:
Failed Fencing Actions:
* Off of ha-idg-2 failed: delegate=, client=crmd.9938, origin=ha-idg-1,
last-failed='Tue Jan 22 15:34:17 2019'
Negative Location Constraints:
loc_fence_ha-idg-1 prevents fence_ha-idg-1 from running on ha-idg-1
loc_fence_ha-idg-2 prevents fence_ha-idg-2 from running on ha-idg-2
=====================================================================
The cluster does not have quorum, but that shouldn't be a problem. corosync and pacemaker are started.
Why don't the resources start automatically? All target-roles are set to "started".
Is it because the fencing didn't succeed, so the status of ha-idg-2 isn't clear to the cluster?
If yes, what can I do?
Bernd
--
Bernd Lentes
System Administration
Institut für Entwicklungsgenetik
Building 35.34 - Room 208
HelmholtzZentrum münchen
bernd.lentes at helmholtz-muenchen.de
phone: +49 89 3187 1241
fax: +49 89 3187 2294
http://www.helmholtz-muenchen.de/idg
Those who make mistakes can learn something;
those who do nothing can learn nothing either.