[ClusterLabs] [Linux-HA] fence_ec2 agent
Dejan Muhamedagic
dejanmm at fastmail.fm
Thu Sep 24 15:04:35 UTC 2015
Hi Kazuhiko-san,
On Wed, Mar 25, 2015 at 10:47:01AM +0900, 東一彦 wrote:
> Hi Markus,
>
> I implemented it as a trial.
>
> [diff from http://hg.linux-ha.org/glue/rev/9da0680bc9c0 ]
> 50d49
> < port_default=""
> 60c59
> < ec2_tag=${tag}
> ---
> > [ -n "$tag" ] && ec2_tag="$tag"
> 63d61
> < : ${port=${port_default}}
> 97c95
> < <parameter name="port" unique="1" required="1">
> ---
> > <parameter name="port" unique="1" required="0">
> 105c103
> < <parameter name="tag" unique="0" required="1">
> ---
> > <parameter name="tag" unique="0" required="0">
> 132c130
> < <parameter name="port" unique="1" required="1">
> ---
> > <parameter name="port" unique="1" required="0">
> 142c140
> < <parameter name="tag" unique="0" required="1">
> ---
> > <parameter name="tag" unique="0" required="0">
> 221a220,224
> > function monitor()
> > {
> > # Is the device ok?
> > aws ec2 describe-instances $options | grep INSTANCES &> /dev/null
> > }
> 267a271
> > [ -n "$2" ] && node_to_fence=$2
> 326a331,334
> > if [ -z "$port" ]; then
> > port="$node_to_fence"
> > fi
> >
> 379,380c387
> < # Is the device ok?
> < aws ec2 describe-instances $options | grep INSTANCES &> /dev/null
> ---
> > monitor
> 391c398
> < instance_status $instance > /dev/null
> ---
> > monitor
>
>
>
> It works fine in my environment with the two configuration patterns below.
>
> [pattern No.1]
> Without the "port" and "tag" parameters.
> The instances have a "Name=<uname>" tag.
>
> ----
> primitive prmStonith1-2 stonith:external/ec2 \
> params \
> pcmk_off_timeout="120s" \
> op start interval="0s" timeout="60s" \
> op monitor interval="3600s" timeout="60s" \
> op stop interval="0s" timeout="60s"
> ----
>
>
> [pattern No.2]
> With only the "tag" parameter (without the "port" parameter).
> The 1st instance (node01) has a "Cluster1=node01" tag.
> The 2nd instance (node02) has a "Cluster1=node02" tag.
>
> ----
> primitive prmStonith1-2 stonith:external/ec2 \
> params \
> pcmk_off_timeout="120s" \
> tag="Cluster1" \
> op start interval="0s" timeout="60s" \
> op monitor interval="3600s" timeout="60s" \
> op stop interval="0s" timeout="60s"
> ----
Sounds good. Sorry for the delay. Would it be possible for you to
provide a patch as a unified diff or similar, so that we can
apply it?
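
For reference, one way to produce such a patch is `diff -u`. This is only a sketch: `ec2.orig` and `ec2` are placeholder names for the unmodified and the modified copy of the agent, and `make_patch` is a helper name made up for this example.

```shell
# make_patch OLD NEW: print a unified diff of the two files on stdout.
# diff exits with status 1 when the files differ, which is the expected
# case here, so only an exit status above 1 is treated as an error.
make_patch() {
    rc=0
    diff -u "$1" "$2" || rc=$?
    [ "$rc" -le 1 ]
}

# Example (placeholder file names):
#   make_patch ec2.orig ec2 > ec2-optional-port.patch
```

In a Mercurial checkout of the glue repository, `hg diff` should give output in the same format.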
Cheers,
Dejan
>
> Regards,
> Kazuhiko Higashi
>
>
> On 2015/03/24 20:48, 東一彦 wrote:
> >Hi Markus,
> >
> >Thank you for the comment.
> >
> > > Would it be possible, to implement this idea as an additional configuration method to the fence_ec2 agent?
> >I think that your idea is good.
> >
> >So, I will try to implement it.
> >I'm going to change fence_ec2 (ec2) in the following points.
> >
> > - the "tag" and the "port" options will no longer be required.
> >
> > - if the "port" option is not set, the 2nd argument of ec2 will be used as the "port".
> > - the 2nd argument of ec2 is the "node to fence".
> >
> > - the "stat" and "status" actions will behave the same as the "monitor" action.
> > (so that the "stat" action does not use the "port" parameter.)
> >
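
The fallback rules listed above can be sketched roughly as follows. The helper names `resolve_tag` and `resolve_port` are made up for this example and do not appear in the agent; the default tag "Name" is taken from the posted script.

```shell
# Default tag name, as in the posted script (ec2_tag_default="Name").
ec2_tag_default="Name"

# resolve_tag TAG: print TAG, or the default tag name when TAG is empty.
resolve_tag() {
    printf '%s\n' "${1:-$ec2_tag_default}"
}

# resolve_port PORT NODE_TO_FENCE: print PORT, or NODE_TO_FENCE when PORT
# is empty (the node name arrives as the 2nd argument of the agent).
resolve_port() {
    printf '%s\n' "${1:-$2}"
}

# Example: no "port" configured, so the node to fence is used instead.
#   resolve_port "" node01
```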
> >
> >With the above modifications, if the uname is written in the Name tag,
> >the "tag" and "port" parameters no longer need to be set.
> >
> >----
> >primitive prmStonith1-2 stonith:external/ec2 \
> > params \
> > pcmk_off_timeout="120s" \
> > op start interval="0s" timeout="60s" \
> > op monitor interval="3600s" timeout="60s" \
> > op stop interval="0s" timeout="60s"
> >----
> >
> >
> >You can use the "tag" parameter like your "Clustername" tag.
> >If the cluster nodes (instances) have a "Cluster1" tag, and the uname is written in that tag,
> >it works just as you expect.
> >
> >----
> >primitive prmStonith1-2 stonith:external/ec2 \
> > params \
> > pcmk_off_timeout="120s" \
> > tag="Cluster1" \
> > op start interval="0s" timeout="60s" \
> > op monitor interval="3600s" timeout="60s" \
> > op stop interval="0s" timeout="60s"
> >----
> >
> >The 1st instance has a "Cluster1=node01" tag.
> >The 2nd instance has a "Cluster1=node02" tag.
> >The 3rd instance has a "Cluster1=node03" tag.
> >...
> >prmStonith1-2 can then fence node01, node02 and node03.
> >
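
A rough sketch of the tag lookup described above. It assumes the tab-separated layout `TAGS <key> <resource-id> <resource-type> <value>` that the agent's own grep pattern for `aws ec2 describe-tags --output text` implies. The function name `instance_for_uname`, the instance IDs and the sample data are invented for illustration.

```shell
# instance_for_uname TAG UNAME: read describe-tags style text lines on
# stdin and print the ID of the instance whose TAG value equals UNAME.
# Assumed columns: TAGS, key, resource ID, resource type, value.
instance_for_uname() {
    awk -F '\t' -v key="$1" -v uname="$2" '
        $1 == "TAGS" && $2 == key && $4 == "instance" && $5 == uname { print $3 }
    '
}

# Invented sample data: three nodes tagged Cluster1=<uname>.
sample_tags() {
    printf 'TAGS\tCluster1\ti-0000000a\tinstance\tnode01\n'
    printf 'TAGS\tCluster1\ti-0000000b\tinstance\tnode02\n'
    printf 'TAGS\tCluster1\ti-0000000c\tinstance\tnode03\n'
}

# Look up the instance that carries Cluster1=node02.
sample_tags | instance_for_uname Cluster1 node02
```

With real data, the input would come from `aws ec2 describe-tags $options` instead of `sample_tags`.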
> >
> >If you like the above, I will implement it.
> >
> >
> >Regards,
> >Kazuhiko Higashi
> >
> >
> >On 2015/03/19 1:03, Markus Guertler wrote:
> >>Hi Kazuhiko, Dejan,
> >>
> >>the new resource agent is very good. Since there were a couple of days between my original question and the answer from
> >>Kazuhiko, I also have written a stonith agent proof of concept (attached to this email) in order to continue in my
> >>project. However, I think that your fence_ec2 agent is better from a development perspective and it doesn't make sense
> >>to have two different agents for the same use case.
> >>
> >>Nevertheless, I've implemented an idea, that is very useful in EC2 environments with clusters that have more than two
> >>nodes: All EC2 instances that belong to a cluster get a unique cluster name via an EC2 instance tag. The agent uses this
> >>tag to determine all cluster nodes that belong to its own cluster.
> >>
> >>--- SNIP ---
> >> gethosts)
> >> # List of hostnames of this cluster
> >> init_agent
> >> ec2-describe-instances --filter "tag-key=Clustername" --filter "tag-value=$clustername" | grep "^TAG" | grep "Hostname" | awk '{ print $5 }' | sort -u
> >>--- SNIP ---
> >>
> >>The advantage of this method is that you need just one configuration snippet for all nodes. This makes it possible to dynamically
> >>add or remove EC2 instances / cluster nodes to/from a cluster without having to touch the cluster configuration.
> >>Dynamically adding or removing nodes (compute instances) is a very common scenario in a cloud.
> >>
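
The membership idea above could look roughly like this when applied to AWS CLI text output. This is a sketch and not the attached proof of concept: it assumes the tab-separated `TAGS <key> <resource-id> <resource-type> <value>` layout, and the function name `cluster_hosts`, the cluster names and the instance IDs are invented.

```shell
# cluster_hosts CLUSTER: read describe-tags style text lines on stdin and
# print the Hostname tag of every instance whose Clustername tag is CLUSTER.
cluster_hosts() {
    awk -F '\t' -v cluster="$1" '
        $1 == "TAGS" && $2 == "Clustername" && $5 == cluster { member[$3] = 1 }
        $1 == "TAGS" && $2 == "Hostname" { host[$3] = $5 }
        END { for (id in member) if (id in host) print host[id] }
    ' | sort -u
}

# Invented sample data: two nodes in clusterA, one node in clusterB.
sample_cluster_tags() {
    printf 'TAGS\tClustername\ti-0000000a\tinstance\tclusterA\n'
    printf 'TAGS\tHostname\ti-0000000a\tinstance\tnode01\n'
    printf 'TAGS\tClustername\ti-0000000b\tinstance\tclusterA\n'
    printf 'TAGS\tHostname\ti-0000000b\tinstance\tnode02\n'
    printf 'TAGS\tClustername\ti-0000000c\tinstance\tclusterB\n'
    printf 'TAGS\tHostname\ti-0000000c\tinstance\tnode09\n'
}

# List the hostnames that belong to clusterA.
sample_cluster_tags | cluster_hosts clusterA
```

Because the host list is derived from the tags at run time, adding or removing a tagged instance changes the result without any change to the cluster configuration.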
> >>Would it be possible, to implement this idea as an additional configuration method to the fence_ec2 agent?
> >>
> >>Cheers,
> >>Markus
> >>
> >>>>>東一彦 <higashi.kazuhiko at lab.ntt.co.jp> 3/12/2015 10:44 AM >>>
> >>Hi Dejan
> >>
> >>Thank you for adding it and for fixing some issues!
> >>
> >>
> >> > I was not able to test it, hope it works :)
> >>I confirmed that it works fine in my AWS environment :)
> >>
> >>
> >>Regards,
> >>Kazuhiko Higashi
> >>
> >>On 2015/03/11 21:27, Dejan Muhamedagic wrote:
> >>>Hi Kazuhiko-san,
> >>>
> >>>On Wed, Mar 11, 2015 at 02:36:43PM +0900, 東一彦 wrote:
> >>>>Hi, Dejan
> >>>>
> >>>>Thank you for the comment.
> >>>>
> >>>>I'd like to contribute it as glue stonith agents.
> >>>>
> >>>>So, I rename it to just "ec2".
> >>>>
> >>>>Would you please add it to glue repository (http://hg.linux-ha.org/glue/) ?
> >>>
> >>>I just added your stonith agent. There was this change in the
> >>>initial changeset:
> >>>
> >>>- replaced '-' which is not allowed in identifiers with '_' in
> >>> function getinfo_xml().
> >>>
> >>>There were other smaller changes. You can find them in the
> >>>repository.
> >>>
> >>>I was not able to test it, hope it works :)
> >>>
> >>>Many thanks for the contribution.
> >>>
> >>>Cheers,
> >>>
> >>>Dejan
> >>>
> >>>>Regards,
> >>>>Kazuhiko Higashi
> >>>>
> >>>>On 2015/03/06 2:38, Dejan Muhamedagic wrote:
> >>>>>Hi,
> >>>>>
> >>>>>On Tue, Mar 03, 2015 at 05:13:49PM +0900, 東一彦 wrote:
> >>>>>>Dear Markus,
> >>>>>>
> >>>>>>I was also thinking the same thing.
> >>>>>>So I have already created a new one.
> >>>>>
> >>>>>Perhaps you'd like to then contribute it upstream? Either to
> >>>>>glue stonith agents or RHT fencing agents. It appears that the
> >>>>>agent is using the stonith interface, but the name reflects the
> >>>>>fencing agents naming scheme.
> >>>>>
> >>>>>Cheers,
> >>>>>
> >>>>>Dejan
> >>>>>
> >>>>>>[ChangeSet]
> >>>>>>- The API used was changed from "Amazon EC2 CLI" to "AWS CLI".
> >>>>>> -- "AWS CLI" is Python-based, so CPU load might be reduced.
> >>>>>>
> >>>>>>- The "--private-key" and "--cert" options are deprecated in AWS CLI.
> >>>>>> So I added a new option, "--profile", which selects a specific profile from the credential file.
> >>>>>> The default is "".
> >>>>>>
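
A small sketch of how the profile option turns into AWS CLI arguments. It mirrors what the posted script does, where an empty profile selects the profile named "default". The function name `build_options` and the profile name `myprofile` are made up for this example.

```shell
# build_options PROFILE: print the common AWS CLI options used by the agent.
# An empty PROFILE falls back to the profile named "default".
build_options() {
    printf '%s\n' "--output text --profile ${1:-default}"
}

# Example (needs configured AWS credentials, so it is left commented out):
#   aws ec2 describe-instances $(build_options myprofile)
```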
> >>>>>>
> >>>>>>[How to use]
> >>>>>>- Please install the "AWS CLI".
> >>>>>> http://docs.aws.amazon.com/cli/latest/userguide/cli-chap-getting-started.html
> >>>>>>
> >>>>>>- Please copy fence_ec2 to /usr/lib64/stonith/plugins/external/
> >>>>>> and set its permissions to 755.
> >>>>>>
> >>>>>>- Please configure crm as in this example.
> >>>>>> - The instance whose "Name" tag is set to "node01" is fenced.
> >>>>>> ------
> >>>>>> primitive prmStonith1-2 stonith:external/fence_ec2 \
> >>>>>> params \
> >>>>>> pcmk_off_timeout="300s" \
> >>>>>> port="node01" \
> >>>>>> tag="Name" \
> >>>>>> op start interval="0s" timeout="60s" \
> >>>>>> op monitor interval="3600s" timeout="60s" \
> >>>>>> op stop interval="0s" timeout="60s"
> >>>>>> ------
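
The copy and permission steps from the instructions above can be done in one go with `install`. This is a sketch only; `install_agent` is a helper name made up for this example, and the target directory is passed in so that it can be tried out without root.

```shell
# install_agent SRC DEST_DIR: copy the agent into DEST_DIR with mode 755.
install_agent() {
    install -m 755 "$1" "$2/"
}

# Example (as root; directory as named in the instructions above):
#   install_agent fence_ec2 /usr/lib64/stonith/plugins/external
```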
> >>>>>>
> >>>>>>
> >>>>>>Regards,
> >>>>>>Kazuhiko Higashi
> >>>>>>
> >>>>>>On 2015/02/25 7:22, Markus Guertler wrote:
> >>>>>>>Dear list,
> >>>>>>>I was just trying to configure the fence_ec2 stonith agent from 2012, written by Andrew Beekhof. It looks like
> >>>>>>>this one is not working anymore with newer stonith / cluster versions. Is there any other EC2 agent that is still
> >>>>>>>maintained?
> >>>>>>>
> >>>>>>>If not, I'll write one myself. However, I'd like to check all options first.
> >>>>>>>
> >>>>>>>Cheers,
> >>>>>>>Markus
> >>>>>>>
> >>>>>>>_______________________________________________
> >>>>>>>Linux-HA mailing list
> >>>>>>>Linux-HA at lists.linux-ha.org
> >>>>>>>http://lists.linux-ha.org/mailman/listinfo/linux-ha
> >>>>>>>See also: http://linux-ha.org/ReportingProblems
> >>>>>>>
> >>>>>>
> >>>>>>
> >>>
> >>>>#!/bin/bash
> >>>>
> >>>>description="
> >>>>fence_ec2 is an I/O Fencing agent which can be used with Amazon EC2 instances.
> >>>>
> >>>>API functions used by this agent:
> >>>>- aws ec2 describe-tags
> >>>>- aws ec2 describe-instances
> >>>>- aws ec2 stop-instances
> >>>>- aws ec2 start-instances
> >>>>- aws ec2 reboot-instances
> >>>>
> >>>>If the uname used by the cluster node is any of:
> >>>> - Public DNS name (or part there of),
> >>>> - Private DNS name (or part there of),
> >>>> - Instance ID (eg. i-4f15a839)
> >>>> - Contents of tag associated with the instance
> >>>>then the agent should be able to automatically discover the instances it can control.
> >>>>
> >>>>If the tag containing the uname is not [Name], then it will need to be specified using the [tag] option.
> >>>>"
> >>>>
> >>>>#
> >>>># Copyright (c) 2011-2013 Andrew Beekhof
> >>>># Copyright (c) 2014 NIPPON TELEGRAPH AND TELEPHONE CORPORATION
> >>>># All Rights Reserved.
> >>>>#
> >>>># This program is free software; you can redistribute it and/or modify
> >>>># it under the terms of version 2 of the GNU General Public License as
> >>>># published by the Free Software Foundation.
> >>>>#
> >>>># This program is distributed in the hope that it would be useful, but
> >>>># WITHOUT ANY WARRANTY; without even the implied warranty of
> >>>># MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
> >>>>#
> >>>># Further, this software is distributed without any warranty that it is
> >>>># free of the rightful claim of any third person regarding infringement
> >>>># or the like. Any license provided herein, whether implied or
> >>>># otherwise, applies only to this software file. Patent licenses, if
> >>>># any, provided herein do not apply to combinations of this program with
> >>>># other software, or any other product whatsoever.
> >>>>#
> >>>># You should have received a copy of the GNU General Public License
> >>>># along with this program; if not, write the Free Software Foundation,
> >>>># Inc., 59 Temple Place - Suite 330, Boston MA 02111-1307, USA.
> >>>>#
> >>>>#######################################################################
> >>>>
> >>>>quiet=0
> >>>>port_default=""
> >>>>
> >>>>instance_not_found=0
> >>>>unknown_are_stopped=0
> >>>>
> >>>>action_default="reset" # Default fence action
> >>>>ec2_tag_default="Name" # EC2 Tag containing the instance's uname
> >>>>
> >>>>sleep_time="1"
> >>>>
> >>>>ec2_tag=${tag}
> >>>>
> >>>>: ${ec2_tag=${ec2_tag_default}}
> >>>>: ${port=${port_default}}
> >>>>
> >>>>function usage()
> >>>>{
> >>>>cat <<EOF
> >>>>`basename $0` - A fencing agent for Amazon EC2 instances
> >>>>
> >>>>$description
> >>>>
> >>>>Usage: `basename $0` -o|--action [-n|--port] [options]
> >>>>Options:
> >>>> -h, --help This text
> >>>> -V, --version Version information
> >>>> -q, --quiet Reduced output mode
> >>>>
> >>>>Commands:
> >>>> -o, --action Action to perform: on|off|reboot|status|monitor
> >>>> -n, --port The name of a machine/instance to control/check
> >>>>
> >>>>Additional Options:
> >>>> -p, --profile Use a specific profile from your credential file.
> >>>> -t, --tag Name of the tag containing the instance's uname
> >>>>
> >>>>Dangerous options:
> >>>> -U, --unknown-are-stopped Assume any unknown instance is safely stopped
> >>>>
> >>>>EOF
> >>>>
> >>>> exit 0;
> >>>>}
> >>>>
> >>>>function getinfo-xml()
> >>>>{
> >>>> cat <<EOF
> >>>><parameters>
> >>>> <parameter name="port" unique="1" required="1">
> >>>> <content type="string" />
> >>>> <shortdesc lang="en">The name/id/tag of a instance to control/check</shortdesc>
> >>>> </parameter>
> >>>> <parameter name="profile" unique="0" required="0">
> >>>> <content type="string" default="default" />
> >>>> <shortdesc lang="en">Use a specific profile from your credential file.</shortdesc>
> >>>> </parameter>
> >>>> <parameter name="tag" unique="0" required="1">
> >>>> <content type="string" default="Name" />
> >>>> <shortdesc lang="en">Name of the tag containing the instances uname</shortdesc>
> >>>> </parameter>
> >>>> <parameter name="unknown_are_stopped" unique="0" required="0">
> >>>> <content type="string" default="false" />
> >>>> <shortdesc lang="en">DANGER: Assume any unknown instance is safely stopped</shortdesc>
> >>>> </parameter>
> >>>></parameters>
> >>>>EOF
> >>>> exit 0;
> >>>>}
> >>>>
> >>>>function metadata()
> >>>>{
> >>>> cat <<EOF
> >>>><?xml version="1.0" ?>
> >>>><resource-agent name="fence_ec2" shortdesc="Fencing agent for Amazon EC2 instances" >
> >>>> <longdesc>
> >>>>$description
> >>>> </longdesc>
> >>>> <parameters>
> >>>> <parameter name="action" unique="0" required="1">
> >>>> <getopt mixed="-o, --action=[action]" />
> >>>> <content type="string" default="reboot" />
> >>>> <shortdesc lang="en">Fencing Action</shortdesc>
> >>>> </parameter>
> >>>> <parameter name="port" unique="1" required="1">
> >>>> <getopt mixed="-n, --port=[port]" />
> >>>> <content type="string" />
> >>>> <shortdesc lang="en">The name/id/tag of a instance to control/check</shortdesc>
> >>>> </parameter>
> >>>> <parameter name="profile" unique="0" required="0">
> >>>> <getopt mixed="-p, --profile=[profile]" />
> >>>> <content type="string" default="default" />
> >>>> <shortdesc lang="en">Use a specific profile from your credential file.</shortdesc>
> >>>> </parameter>
> >>>> <parameter name="tag" unique="0" required="1">
> >>>> <getopt mixed="-t, --tag=[tag]" />
> >>>> <content type="string" default="Name" />
> >>>> <shortdesc lang="en">Name of the tag containing the instances uname</shortdesc>
> >>>> </parameter>
> >>>> <parameter name="unknown-are-stopped" unique="0" required="0">
> >>>> <getopt mixed="-U, --unknown-are-stopped" />
> >>>> <content type="string" default="false" />
> >>>> <shortdesc lang="en">DANGER: Assume any unknown instance is safely stopped</shortdesc>
> >>>> </parameter>
> >>>> </parameters>
> >>>> <actions>
> >>>> <action name="on" />
> >>>> <action name="off" />
> >>>> <action name="reboot" />
> >>>> <action name="status" />
> >>>> <action name="list" />
> >>>> <action name="monitor" />
> >>>> <action name="metadata" />
> >>>> </actions>
> >>>></resource-agent>
> >>>>EOF
> >>>> exit 0;
> >>>>}
> >>>>
> >>>>function instance_for_port()
> >>>>{
> >>>> local port=$1
> >>>> local instance=""
> >>>>
> >>>> # Look for port name -n in the INSTANCE data
> >>>> instance=`aws ec2 describe-instances $options | grep "^INSTANCES[[:space:]].*[[:space:]]$port[[:space:]]" | awk '{print $8}'`
> >>>> if [ -z "$instance" ]; then
> >>>> # Look for port name -n in the Name TAG
> >>>> instance=`aws ec2 describe-tags $options | grep "^TAGS[[:space:]]$ec2_tag[[:space:]].*[[:space:]]instance[[:space:]]$port$" | awk '{print $3}'`
> >>>> fi
> >>>>
> >>>> if [ -z "$instance" ]; then
> >>>> instance_not_found=1
> >>>> instance=$port
> >>>> fi
> >>>>
> >>>> echo $instance
> >>>>}
> >>>>
> >>>>function instance_on()
> >>>>{
> >>>> aws ec2 start-instances $options --instance-ids $instance
> >>>>}
> >>>>
> >>>>function instance_off()
> >>>>{
> >>>> if [ $unknown_are_stopped = 1 -a $instance_not_found ]; then
> >>>> : nothing to do
> >>>> ha_log.sh info "Assuming unknown instance $instance is already off"
> >>>> else
> >>>> aws ec2 stop-instances $options --instance-ids $instance --force
> >>>> fi
> >>>>}
> >>>>
> >>>>function instance_status()
> >>>>{
> >>>> local instance=$1
> >>>> local status="unknown"
> >>>> local rc=1
> >>>>
> >>>> # List of instances and their current status
> >>>> if [ $unknown_are_stopped = 1 -a $instance_not_found ]; then
> >>>> ha_log.sh info "$instance stopped (unknown)"
> >>>> else
> >>>> status=`aws ec2 describe-instances $options --instance-ids $instance | awk '{
> >>>> if (/^STATE\t/) { printf "%s", $3 }
> >>>> }'`
> >>>> rc=$?
> >>>> fi
> >>>> ha_log.sh info "status check for $instance is $status"
> >>>> echo $status
> >>>> return $rc
> >>>>}
> >>>>
> >>>>
> >>>>TEMP=`getopt -o qVho:e:p:n:t:U --long version,help,action:,port:,option:,profile:,tag:,quiet,unknown-are-stopped \
> >>>> -n 'fence_ec2' -- "$@"`
> >>>>
> >>>>if [ $? != 0 ];then
> >>>> usage
> >>>> exit 1
> >>>>fi
> >>>>
> >>>># Note the quotes around `$TEMP': they are essential!
> >>>>eval set -- "$TEMP"
> >>>>
> >>>>if [ -z $1 ]; then
> >>>> # If there are no command line args, look for options from stdin
> >>>> while read line; do
> >>>> case $line in
> >>>> option=*|action=*) action=`echo $line | sed s/.*=//`;;
> >>>> port=*) port=`echo $line | sed s/.*=//`;;
> >>>> profile=*) ec2_profile=`echo $line | sed s/.*=//`;;
> >>>> tag=*) ec2_tag=`echo $line | sed s/.*=//`;;
> >>>> quiet*) quiet=1;;
> >>>> unknown-are-stopped*) unknown_are_stopped=1;;
> >>>> --);;
> >>>> *) ha_log.sh err "Invalid command: $line";;
> >>>> esac
> >>>> done
> >>>>fi
> >>>>
> >>>>while true ; do
> >>>> case "$1" in
> >>>> -o|--action|--option) action=$2; shift; shift;;
> >>>> -n|--port) port=$2; shift; shift;;
> >>>> -p|--profile) ec2_profile=$2; shift; shift;;
> >>>> -t|--tag) ec2_tag=$2; shift; shift;;
> >>>> -U|--unknown-are-stopped) unknown_are_stopped=1; shift;;
> >>>> -q|--quiet) quiet=1; shift;;
> >>>> -V|--version) echo "1.0.0"; exit 0;;
> >>>> --help|-h)
> >>>> usage;
> >>>> exit 0;;
> >>>> --) shift ; break ;;
> >>>> *) ha_log.sh err "Unknown option: $1. See --help for details."; exit 1;;
> >>>> esac
> >>>>done
> >>>>
> >>>>[ -n "$1" ] && action=$1
> >>>>
> >>>>if [ -z "$ec2_profile" ]; then
> >>>> options="--output text --profile default"
> >>>>else
> >>>> options="--output text --profile $ec2_profile "
> >>>>fi
> >>>>
> >>>>action=`echo $action | tr 'A-Z' 'a-z'`
> >>>>
> >>>>case $action in
> >>>> metadata)
> >>>> metadata
> >>>> ;;
> >>>> getinfo-xml)
> >>>> getinfo-xml
> >>>> ;;
> >>>> getconfignames)
> >>>> for i in profile port tag
> >>>> do
> >>>> echo $i
> >>>> done
> >>>> exit 0
> >>>> ;;
> >>>> getinfo-devid)
> >>>> echo "EC2 STONITH device"
> >>>> exit 0
> >>>> ;;
> >>>> getinfo-devname)
> >>>> echo "EC2 STONITH external device"
> >>>> exit 0
> >>>> ;;
> >>>> getinfo-devdescr)
> >>>> echo "fence_ec2 is an I/O Fencing agent which can be used with Amazon EC2 instances."
> >>>> exit 0
> >>>> ;;
> >>>> getinfo-devurl)
> >>>> echo ""
> >>>> exit 0
> >>>> ;;
> >>>>esac
> >>>>
> >>>># get my instance id
> >>>>myinstance=`curl http://169.254.169.254/latest/meta-data/instance-id`
> >>>>
> >>>># check my status.
> >>>># When an EC2 instance is stopped by "aws ec2 stop-instances", the OS runs its shutdown sequence.
> >>>># While the OS is shutting down, Pacemaker can still execute STONITH operations.
> >>>># So if my status is not "running", I assume that I have already been fenced. To prevent nodes from
> >>>># fencing each other in a split-brain situation, I do not fence the other node.
> >>>>if [ -z "$myinstance" ]; then
> >>>> ha_log.sh err "Failed to get My Instance ID. so can not check my status."
> >>>> exit 1
> >>>>fi
> >>>>mystatus=`instance_status $myinstance`
> >>>>if [ "$mystatus" != "running" ]; then #do not fence
> >>>> ha_log.sh warn "I was already fenced (My instance status=$mystatus). I don't fence other node."
> >>>> exit 1
> >>>>fi
> >>>>
> >>>># get target's instance id
> >>>>instance=""
> >>>>if [ ! -z "$port" ]; then
> >>>> instance=`instance_for_port $port $options`
> >>>>fi
> >>>>
> >>>>case $action in
> >>>> reboot|reset)
> >>>> status=`instance_status $instance`
> >>>> if [ "$status" != "stopped" ]; then
> >>>> instance_off
> >>>> fi
> >>>> while true;
> >>>> do
> >>>> status=`instance_status $instance`
> >>>> if [ "$status" = "stopped" ]; then
> >>>> break
> >>>> fi
> >>>> sleep $sleep_time
> >>>> done
> >>>> instance_on
> >>>> while true;
> >>>> do
> >>>> status=`instance_status $instance`
> >>>> if [ "$status" = "running" ]; then
> >>>> break
> >>>> fi
> >>>> sleep $sleep_time
> >>>> done
> >>>> ;;
> >>>> poweron|on)
> >>>> instance_on
> >>>> while true;
> >>>> do
> >>>> status=`instance_status $instance`
> >>>> if [ "$status" = "running" ]; then
> >>>> break
> >>>> fi
> >>>> sleep $sleep_time
> >>>> done
> >>>> ;;
> >>>> poweroff|off)
> >>>> instance_off
> >>>> while true;
> >>>> do
> >>>> status=`instance_status $instance`
> >>>> if [ "$status" = "stopped" ]; then
> >>>> break
> >>>> fi
> >>>> sleep $sleep_time
> >>>> done
> >>>> ;;
> >>>> monitor)
> >>>> # Is the device ok?
> >>>> aws ec2 describe-instances $options | grep INSTANCES &> /dev/null
> >>>> ;;
> >>>> gethosts|hostlist|list)
> >>>> # List of names we know about
> >>>> a=`aws ec2 describe-instances $options | awk -v tag_pat="^TAGS\t$ec2_tag\t" -F '\t' '{
> >>>> if (/^INSTANCES/) { printf "%s\n", $8 }
> >>>> else if ( $1"\t"$2"\t" ~ tag_pat ) { printf "%s\n", $3 }
> >>>> }' | sort -u`
> >>>> echo $a
> >>>> ;;
> >>>> stat|status)
> >>>> instance_status $instance > /dev/null
> >>>> ;;
> >>>> *) ha_log.sh err "Unknown action: $action"; exit 1;;
> >>>>esac
> >>>>
> >>>>status=$?
> >>>>
> >>>>if [ $quiet -eq 1 ]; then
> >>>> : nothing
> >>>>elif [ $status -eq 0 ]; then
> >>>> ha_log.sh info "Operation $action passed"
> >>>>else
> >>>> ha_log.sh err "Operation $action failed: $status"
> >>>>fi
> >>>>exit $status
> >>>
> >>>
> >>>
> >>
> >>
> >>
> >>
> >>_______________________________________________
> >>Users mailing list: Users at clusterlabs.org
> >>http://clusterlabs.org/mailman/listinfo/users
> >>
> >>Project Home: http://www.clusterlabs.org
> >>Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> >>Bugs: http://bugs.clusterlabs.org
> >>
> >
> >
>
>
> #!/bin/bash
>
> description="
> fence_ec2 is an I/O Fencing agent which can be used with Amazon EC2 instances.
>
> API functions used by this agent:
> - aws ec2 describe-tags
> - aws ec2 describe-instances
> - aws ec2 stop-instances
> - aws ec2 start-instances
> - aws ec2 reboot-instances
>
> If the uname used by the cluster node is any of:
> - Public DNS name (or part there of),
> - Private DNS name (or part there of),
> - Instance ID (eg. i-4f15a839)
> - Contents of tag associated with the instance
> then the agent should be able to automatically discover the instances it can control.
>
> If the tag containing the uname is not [Name], then it will need to be specified using the [tag] option.
> "
>
> #
> # Copyright (c) 2011-2013 Andrew Beekhof
> # Copyright (c) 2014 NIPPON TELEGRAPH AND TELEPHONE CORPORATION
> # All Rights Reserved.
> #
> # This program is free software; you can redistribute it and/or modify
> # it under the terms of version 2 of the GNU General Public License as
> # published by the Free Software Foundation.
> #
> # This program is distributed in the hope that it would be useful, but
> # WITHOUT ANY WARRANTY; without even the implied warranty of
> # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
> #
> # Further, this software is distributed without any warranty that it is
> # free of the rightful claim of any third person regarding infringement
> # or the like. Any license provided herein, whether implied or
> # otherwise, applies only to this software file. Patent licenses, if
> # any, provided herein do not apply to combinations of this program with
> # other software, or any other product whatsoever.
> #
> # You should have received a copy of the GNU General Public License
> # along with this program; if not, write the Free Software Foundation,
> # Inc., 59 Temple Place - Suite 330, Boston MA 02111-1307, USA.
> #
> #######################################################################
>
> quiet=0
>
> instance_not_found=0
> unknown_are_stopped=0
>
> action_default="reset" # Default fence action
> ec2_tag_default="Name" # EC2 Tag containing the instance's uname
>
> sleep_time="1"
>
> [ -n "$tag" ] && ec2_tag="$tag"
>
> : ${ec2_tag=${ec2_tag_default}}
>
> function usage()
> {
> cat <<EOF
> `basename $0` - A fencing agent for Amazon EC2 instances
>
> $description
>
> Usage: `basename $0` -o|--action [-n|--port] [options]
> Options:
> -h, --help This text
> -V, --version Version information
> -q, --quiet Reduced output mode
>
> Commands:
> -o, --action Action to perform: on|off|reboot|status|monitor
> -n, --port The name of a machine/instance to control/check
>
> Additional Options:
> -p, --profile Use a specific profile from your credential file.
> -t, --tag Name of the tag containing the instance's uname
>
> Dangerous options:
> -U, --unknown-are-stopped Assume any unknown instance is safely stopped
>
> EOF
> exit 0;
> }
>
> function getinfo_xml()
> {
> cat <<EOF
> <parameters>
> <parameter name="port" unique="1" required="0">
> <content type="string" />
> <shortdesc lang="en">The name/id/tag of a instance to control/check</shortdesc>
> </parameter>
> <parameter name="profile" unique="0" required="0">
> <content type="string" default="default" />
> <shortdesc lang="en">Use a specific profile from your credential file.</shortdesc>
> </parameter>
> <parameter name="tag" unique="0" required="0">
> <content type="string" default="Name" />
> <shortdesc lang="en">Name of the tag containing the instances uname</shortdesc>
> </parameter>
> <parameter name="unknown_are_stopped" unique="0" required="0">
> <content type="string" default="false" />
> <shortdesc lang="en">DANGER: Assume any unknown instance is safely stopped</shortdesc>
> </parameter>
> </parameters>
> EOF
> exit 0;
> }
>
> function metadata()
> {
> cat <<EOF
> <?xml version="1.0" ?>
> <resource-agent name="fence_ec2" shortdesc="Fencing agent for Amazon EC2 instances" >
> <longdesc>
> $description
> </longdesc>
> <parameters>
> <parameter name="action" unique="0" required="1">
> <getopt mixed="-o, --action=[action]" />
> <content type="string" default="reboot" />
> <shortdesc lang="en">Fencing Action</shortdesc>
> </parameter>
> <parameter name="port" unique="1" required="0">
> <getopt mixed="-n, --port=[port]" />
> <content type="string" />
> <shortdesc lang="en">The name/id/tag of a instance to control/check</shortdesc>
> </parameter>
> <parameter name="profile" unique="0" required="0">
> <getopt mixed="-p, --profile=[profile]" />
> <content type="string" default="default" />
> <shortdesc lang="en">Use a specific profile from your credential file.</shortdesc>
> </parameter>
> <parameter name="tag" unique="0" required="0">
> <getopt mixed="-t, --tag=[tag]" />
> <content type="string" default="Name" />
> <shortdesc lang="en">Name of the tag containing the instances uname</shortdesc>
> </parameter>
> <parameter name="unknown-are-stopped" unique="0" required="0">
> <getopt mixed="-U, --unknown-are-stopped" />
> <content type="string" default="false" />
> <shortdesc lang="en">DANGER: Assume any unknown instance is safely stopped</shortdesc>
> </parameter>
> </parameters>
> <actions>
> <action name="on" />
> <action name="off" />
> <action name="reboot" />
> <action name="status" />
> <action name="list" />
> <action name="monitor" />
> <action name="metadata" />
> </actions>
> </resource-agent>
> EOF
> exit 0;
> }
>
> function instance_for_port()
> {
> local port=$1
> local instance=""
>
> # Look for port name -n in the INSTANCE data
> instance=`aws ec2 describe-instances $options | grep "^INSTANCES[[:space:]].*[[:space:]]$port[[:space:]]" | awk '{print $8}'`
> if [ -z "$instance" ]; then
> # Look for port name -n in the Name TAG
> instance=`aws ec2 describe-tags $options | grep "^TAGS[[:space:]]$ec2_tag[[:space:]].*[[:space:]]instance[[:space:]]$port$" | awk '{print $3}'`
> fi
>
> if [ -z "$instance" ]; then
> instance_not_found=1
> instance=$port
> fi
>
> echo $instance
> }
>
> function instance_on()
> {
> aws ec2 start-instances $options --instance-ids $instance
> }
>
> function instance_off()
> {
> if [ "$unknown_are_stopped" = 1 -a $instance_not_found ]; then
> : nothing to do
> ha_log.sh info "Assuming unknown instance $instance is already off"
> else
> aws ec2 stop-instances $options --instance-ids $instance --force
> fi
> }
>
> function instance_status()
> {
> local instance=$1
> local status="unknown"
> local rc=1
>
> # List of instances and their current status
> if [ "$unknown_are_stopped" = 1 -a $instance_not_found ]; then
> ha_log.sh info "$instance stopped (unknown)"
> else
> status=`aws ec2 describe-instances $options --instance-ids $instance | awk '{
> if (/^STATE\t/) { printf "%s", $3 }
> }'`
> rc=$?
> fi
> ha_log.sh info "status check for $instance is $status"
> echo $status
> return $rc
> }
>
> function monitor()
> {
> # Is the device ok?
> aws ec2 describe-instances $options | grep INSTANCES &> /dev/null
> }
>
> TEMP=`getopt -o qVho:e:p:n:t:U --long version,help,action:,port:,option:,profile:,tag:,quiet,unknown-are-stopped \
> -n 'fence_ec2' -- "$@"`
>
> if [ $? != 0 ];then
> usage
> exit 1
> fi
>
> # Note the quotes around `$TEMP': they are essential!
> eval set -- "$TEMP"
>
> if [ -z $1 ]; then
> # If there are no command line args, look for options from stdin
> while read line; do
> case $line in
> option=*|action=*) action=`echo $line | sed s/.*=//`;;
> port=*) port=`echo $line | sed s/.*=//`;;
> profile=*) ec2_profile=`echo $line | sed s/.*=//`;;
> tag=*) ec2_tag=`echo $line | sed s/.*=//`;;
> quiet*) quiet=1;;
> unknown-are-stopped*) unknown_are_stopped=1;;
> --);;
> *) ha_log.sh err "Invalid command: $line";;
> esac
> done
> fi
>
> while true ; do
> case "$1" in
> -o|--action|--option) action=$2; shift; shift;;
> -n|--port) port=$2; shift; shift;;
> -p|--profile) ec2_profile=$2; shift; shift;;
> -t|--tag) ec2_tag=$2; shift; shift;;
> -U|--unknown-are-stopped) unknown_are_stopped=1; shift;;
> -q|--quiet) quiet=1; shift;;
> -V|--version) echo "1.0.0"; exit 0;;
> --help|-h)
> usage;
> exit 0;;
> --) shift ; break ;;
> *) ha_log.sh err "Unknown option: $1. See --help for details."; exit 1;;
> esac
> done
>
> [ -n "$1" ] && action=$1
> [ -n "$2" ] && node_to_fence=$2
>
> if [ -z "$ec2_profile" ]; then
> options="--output text --profile default"
> else
> options="--output text --profile $ec2_profile "
> fi
>
> action=`echo $action | tr 'A-Z' 'a-z'`
>
> case $action in
> metadata)
> metadata
> ;;
> getinfo-xml)
> getinfo_xml
> ;;
> getconfignames)
> for i in profile port tag unknown_are_stopped
> do
> echo $i
> done
> exit 0
> ;;
> getinfo-devid)
> echo "EC2 STONITH device"
> exit 0
> ;;
> getinfo-devname)
> echo "EC2 STONITH external device"
> exit 0
> ;;
> getinfo-devdescr)
> echo "ec2 is an I/O Fencing agent which can be used with Amazon EC2 instances."
> exit 0
> ;;
> getinfo-devurl)
> echo ""
> exit 0
> ;;
> esac
>
> # get my instance id
> myinstance=`curl http://169.254.169.254/latest/meta-data/instance-id`
>
> # check my status.
> # When an EC2 instance is stopped by "aws ec2 stop-instances", the OS runs its shutdown sequence.
> # While the OS is shutting down, Pacemaker can still execute STONITH operations.
> # So if my status is not "running", I assume that I have already been fenced. To prevent nodes from
> # fencing each other in a split-brain situation, I do not fence the other node.
> if [ -z "$myinstance" ]; then
> ha_log.sh err "Failed to get My Instance ID. so can not check my status."
> exit 1
> fi
> mystatus=`instance_status $myinstance`
> if [ "$mystatus" != "running" ]; then #do not fence
> ha_log.sh warn "I was already fenced (My instance status=$mystatus). I don't fence other node."
> exit 1
> fi
>
> if [ -z "$port" ]; then
> port="$node_to_fence"
> fi
>
> # get target's instance id
> instance=""
> if [ ! -z "$port" ]; then
> instance=`instance_for_port $port $options`
> fi
>
> case $action in
> reboot|reset)
> status=`instance_status $instance`
> if [ "$status" != "stopped" ]; then
> instance_off
> fi
> while true;
> do
> status=`instance_status $instance`
> if [ "$status" = "stopped" ]; then
> break
> fi
> sleep $sleep_time
> done
> instance_on
> while true;
> do
> status=`instance_status $instance`
> if [ "$status" = "running" ]; then
> break
> fi
> sleep $sleep_time
> done
> ;;
> poweron|on)
> instance_on
> while true;
> do
> status=`instance_status $instance`
> if [ "$status" = "running" ]; then
> break
> fi
> sleep $sleep_time
> done
> ;;
> poweroff|off)
> instance_off
> while true;
> do
> status=`instance_status $instance`
> if [ "$status" = "stopped" ]; then
> break
> fi
> sleep $sleep_time
> done
> ;;
> monitor)
> monitor
> ;;
> gethosts|hostlist|list)
> # List of names we know about
> a=`aws ec2 describe-instances $options | awk -v tag_pat="^TAGS\t$ec2_tag\t" -F '\t' '{
> if (/^INSTANCES/) { printf "%s\n", $8 }
> else if ( $1"\t"$2"\t" ~ tag_pat ) { printf "%s\n", $3 }
> }' | sort -u`
> echo $a
> ;;
> stat|status)
> monitor
> ;;
> *) ha_log.sh err "Unknown action: $action"; exit 1;;
> esac
>
> status=$?
>
> if [ $quiet -eq 1 ]; then
> : nothing
> elif [ $status -eq 0 ]; then
> ha_log.sh info "Operation $action passed"
> else
> ha_log.sh err "Operation $action failed: $status"
> fi
> exit $status