[ClusterLabs] [Linux-HA] fence_ec2 agent
東一彦
higashi.kazuhiko at lab.ntt.co.jp
Wed Mar 25 01:47:01 UTC 2015
Hi Markus,
I implemented it as a trial.
[diff from http://hg.linux-ha.org/glue/rev/9da0680bc9c0 ]
50d49
< port_default=""
60c59
< ec2_tag=${tag}
---
> [ -n "$tag" ] && ec2_tag="$tag"
63d61
< : ${port=${port_default}}
97c95
< <parameter name="port" unique="1" required="1">
---
> <parameter name="port" unique="1" required="0">
105c103
< <parameter name="tag" unique="0" required="1">
---
> <parameter name="tag" unique="0" required="0">
132c130
< <parameter name="port" unique="1" required="1">
---
> <parameter name="port" unique="1" required="0">
142c140
< <parameter name="tag" unique="0" required="1">
---
> <parameter name="tag" unique="0" required="0">
221a220,224
> function monitor()
> {
> # Is the device ok?
> aws ec2 describe-instances $options | grep INSTANCES &> /dev/null
> }
267a271
> [ -n "$2" ] && node_to_fence=$2
326a331,334
> if [ -z "$port" ]; then
> port="$node_to_fence"
> fi
>
379,380c387
< # Is the device ok?
< aws ec2 describe-instances $options | grep INSTANCES &> /dev/null
---
> monitor
391c398
< instance_status $instance > /dev/null
---
> monitor
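To make the effect of the diff easier to see, here is a minimal sketch of how the parameters resolve after the change. `resolve_params` is a hypothetical helper that is not part of the agent; it only mirrors the changed lines (the `tag` default, and `port` falling back to the second positional argument, the node to fence). It assumes `tag` and `port` arrive as environment variables, which is how the script reads them.

```shell
#!/bin/bash
# Hypothetical helper (not in the agent): mirrors the parameter handling
# introduced by the diff above.
#   $1 = action, $2 = node to fence (positional arguments of the agent)
#   "tag" and "port" are read from the environment, if set.
resolve_params()
{
    local node_to_fence=""
    local ec2_tag="${tag:-Name}"   # "tag" is optional; default to the Name tag
    local target="${port:-}"       # "port" is optional as well

    # The second positional argument is the node to fence
    [ -n "$2" ] && node_to_fence="$2"

    # Without a configured "port", use the node to fence instead
    if [ -z "$target" ]; then
        target="$node_to_fence"
    fi

    echo "tag=$ec2_tag port=$target"
}

unset tag port
resolve_params off node01                 # prints: tag=Name port=node01
tag=Cluster1 resolve_params off node02    # prints: tag=Cluster1 port=node02
```

The two calls correspond to pattern No.1 and pattern No.2 below.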
It works fine in my environment with the two configuration patterns below.
[pattern No.1]
Without the "port" and "tag" parameters.
The instances have a "Name=<uname>" tag.
----
primitive prmStonith1-2 stonith:external/ec2 \
params \
pcmk_off_timeout="120s" \
op start interval="0s" timeout="60s" \
op monitor interval="3600s" timeout="60s" \
op stop interval="0s" timeout="60s"
----
[pattern No.2]
With only the "tag" parameter (without the "port" parameter).
The 1st instance (node01) has a "Cluster1=node01" tag.
The 2nd instance (node02) has a "Cluster1=node02" tag.
----
primitive prmStonith1-2 stonith:external/ec2 \
params \
pcmk_off_timeout="120s" \
tag="Cluster1" \
op start interval="0s" timeout="60s" \
op monitor interval="3600s" timeout="60s" \
op stop interval="0s" timeout="60s"
----
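The lookup behind pattern No.2 can be tried offline with the sketch below. The sample text stands in for the output of `aws ec2 describe-tags --output text`; its column layout is assumed from the agent's own grep pattern, and the instance IDs and the `sample_tags` / `lookup_instance` helpers are made up for illustration. The grep/awk pipeline is the one used in `instance_for_port`.

```shell
#!/bin/bash
# Hypothetical stand-in for "aws ec2 describe-tags --output text"
# (tab-separated: TAGS, tag key, resource id, resource type, tag value).
# The instance IDs are made up.
sample_tags()
{
    printf 'TAGS\tCluster1\ti-11111111\tinstance\tnode01\n'
    printf 'TAGS\tCluster1\ti-22222222\tinstance\tnode02\n'
    printf 'TAGS\tName\ti-22222222\tinstance\tweb\n'
}

# Hypothetical helper: same grep/awk as instance_for_port, fed from the sample.
#   $1 = tag key (the "tag" parameter), $2 = uname of the node to fence
lookup_instance()
{
    local ec2_tag="$1" port="$2"
    sample_tags \
        | grep "^TAGS[[:space:]]$ec2_tag[[:space:]].*[[:space:]]instance[[:space:]]$port$" \
        | awk '{print $3}'
}

lookup_instance Cluster1 node02    # prints: i-22222222
```

A uname that is not present in the chosen tag gives empty output, which is the case where the agent falls back to using the "port" value itself as the instance ID.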
Regards,
Kazuhiko Higashi
On 2015/03/24 20:48, 東一彦 wrote:
> Hi Markus,
>
> Thank you for the comment.
>
> > Would it be possible, to implement this idea as an additional configuration method to the fence_ec2 agent?
> I think that your idea is good.
>
> So, I will try to implement it.
> I'm going to change fence_ec2 (ec2) in the following points.
>
> - the "tag" and "port" options will no longer be required.
>
> - if the "port" option is not set, the 2nd argument of ec2 will be used as the "port".
> - the 2nd argument of ec2 is the "node to fence".
>
> - the "stat" and "status" actions will behave the same as the "monitor" action.
> (so that the "stat" action does not use the "port" parameter.)
>
>
> With the above modifications, if the uname is set in the Name tag,
> the "tag" and "port" parameters no longer need to be configured.
>
> ----
> primitive prmStonith1-2 stonith:external/ec2 \
> params \
> pcmk_off_timeout="120s" \
> op start interval="0s" timeout="60s" \
> op monitor interval="3600s" timeout="60s" \
> op stop interval="0s" timeout="60s"
> ----
>
>
> You can use the "tag" parameter like your "Clustername" tag.
> If the cluster nodes (instances) have a "Cluster1" tag, and the uname is set in that tag,
> it works just as you expect.
>
> ----
> primitive prmStonith1-2 stonith:external/ec2 \
> params \
> pcmk_off_timeout="120s" \
> tag="Cluster1" \
> op start interval="0s" timeout="60s" \
> op monitor interval="3600s" timeout="60s" \
> op stop interval="0s" timeout="60s"
> ----
>
> The 1st instance has the "Cluster1=node01" tag.
> The 2nd instance has the "Cluster1=node02" tag.
> The 3rd instance has the "Cluster1=node03" tag.
> ...
> The prmStonith1-2 can fence node01, node02 and node03.
>
>
> If you like the above, I will implement it.
>
>
> Regards,
> Kazuhiko Higashi
>
>
> On 2015/03/19 1:03, Markus Guertler wrote:
>> Hi Kazuhiko, Dejan,
>>
>> the new resource agent is very good. Since there were a couple of days between my original question and the answer from
>> Kazuhiko, I also have written a stonith agent proof of concept (attached to this email) in order to continue in my
>> project. However, I think that your fence_ec2 agent is better from a development perspective and it doesn't make sense
>> to have two different agents for the same use case.
>>
>> Nevertheless, I've implemented an idea, that is very useful in EC2 environments with clusters that have more than two
>> nodes: All EC2 instances that belong to a cluster get a unique cluster name via an EC2 instance tag. The agent uses this
>> tag to determine all cluster nodes that belong to its own cluster.
>>
>> --- SNIP ---
>> gethosts)
>> # List of hostnames of this cluster
>> init_agent
>> ec2-describe-instances --filter "tag-key=Clustername" --filter "tag-value=$clustername" | grep "^TAG" | grep "Hostname" | awk '{ print $5 }' | sort -u
>> --- SNIP ---
>>
>> The advantage of this method is that you need just one configuration snippet for all nodes. This makes it possible to dynamically
>> add or remove EC2 instances / cluster nodes to/from a cluster without having to touch the cluster configuration.
>> Dynamically adding or removing nodes (compute instances) is a very common scenario in a cloud.
>>
>> Would it be possible, to implement this idea as an additional configuration method to the fence_ec2 agent?
>>
>> Cheers,
>> Markus
>>
>>>>> 東一彦 <higashi.kazuhiko at lab.ntt.co.jp> 3/12/2015 10:44 AM >>>
>> Hi Dejan
>>
>> Thank you for adding it and fixing some issues!
>>
>>
>> > I was not able to test it, hope it works :)
>> I confirmed that it works fine in my AWS environment :)
>>
>>
>> Regards,
>> Kazuhiko Higashi
>>
>> On 2015/03/11 21:27, Dejan Muhamedagic wrote:
>>> Hi Kazuhiko-san,
>>>
>>> On Wed, Mar 11, 2015 at 02:36:43PM +0900, 東一彦 wrote:
>>>> Hi, Dejan
>>>>
>>>> Thank you for the comment.
>>>>
>>>> I'd like to contribute it as glue stonith agents.
>>>>
>>>> So, I rename it to just "ec2".
>>>>
>>>> Would you please add it to glue repository (http://hg.linux-ha.org/glue/) ?
>>>
>>> I just added your stonith agent. There was this change in the
>>> initial changeset:
>>>
>>> - replaced '-' which is not allowed in identifiers with '_' in
>>> function getinfo_xml().
>>>
>>> There were other smaller changes. You can find them in the
>>> repository.
>>>
>>> I was not able to test it, hope it works :)
>>>
>>> Many thanks for the contribution.
>>>
>>> Cheers,
>>>
>>> Dejan
>>>
>>>> Regards,
>>>> Kazuhiko Higashi
>>>>
>>>> On 2015/03/06 2:38, Dejan Muhamedagic wrote:
>>>>> Hi,
>>>>>
>>>>> On Tue, Mar 03, 2015 at 05:13:49PM +0900, 東一彦 wrote:
>>>>>> Dear Markus,
>>>>>>
>>>>>> I was also thinking the same thing.
>>>>>> So I've already created a new one.
>>>>>
>>>>> Perhaps you'd like to then contribute it upstream? Either to
>>>>> glue stonith agents or RHT fencing agents. It appears that the
>>>>> agent is using the stonith interface, but the name reflects the
>>>>> fencing agents naming scheme.
>>>>>
>>>>> Cheers,
>>>>>
>>>>> Dejan
>>>>>
>>>>>> [ChangeSet]
>>>>>> - The API used was changed from "Amazon EC2 CLI" to "AWS CLI".
>>>>>> -- "AWS CLI" is Python-based, so CPU load might be reduced.
>>>>>>
>>>>>> - The "--private-key" and "--cert" options are deprecated in AWS CLI.
>>>>>> So I added a new option, "--profile", which uses a specific profile from the credential file.
>>>>>> The default is "".
>>>>>>
>>>>>>
>>>>>> [How to use]
>>>>>> - Please install the "AWS CLI".
>>>>>> http://docs.aws.amazon.com/cli/latest/userguide/cli-chap-getting-started.html
>>>>>>
>>>>>> - Please copy fence_ec2 into /usr/lib64/stonith/plugins/external/
>>>>>> and set its permissions to 755.
>>>>>>
>>>>>> - Please configure crm as in this example.
>>>>>> - The instance that has "node01" set in its "Name" tag is fenced.
>>>>>> ------
>>>>>> primitive prmStonith1-2 stonith:external/fence_ec2 \
>>>>>> params \
>>>>>> pcmk_off_timeout="300s" \
>>>>>> port="node01" \
>>>>>> tag="Name" \
>>>>>> op start interval="0s" timeout="60s" \
>>>>>> op monitor interval="3600s" timeout="60s" \
>>>>>> op stop interval="0s" timeout="60s"
>>>>>> ------
>>>>>>
>>>>>>
>>>>>> Regards,
>>>>>> Kazuhiko Higashi
>>>>>>
>>>>>> On 2015/02/25 7:22, Markus Guertler wrote:
>>>>>>> Dear list,
>>>>>>> I was just trying to configure the fence_ec2 stonith agent from 2012, written by Andrew Beekhof. It looks like
>>>>>>> this one is not working anymore with newer stonith / cluster versions. Is there any other EC2 agent that is still
>>>>>>> maintained?
>>>>>>>
>>>>>>> If not, I'll write one myself. However, I'd like to check all options first.
>>>>>>>
>>>>>>> Cheers,
>>>>>>> Markus
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> Linux-HA mailing list
>>>>>>> Linux-HA at lists.linux-ha.org
>>>>>>> http://lists.linux-ha.org/mailman/listinfo/linux-ha
>>>>>>> See also: http://linux-ha.org/ReportingProblems
>>>>>>>
>>>>>>
>>>>>>
>>>
>>>> #!/bin/bash
>>>>
>>>> description="
>>>> fence_ec2 is an I/O Fencing agent which can be used with Amazon EC2 instances.
>>>>
>>>> API functions used by this agent:
>>>> - aws ec2 describe-tags
>>>> - aws ec2 describe-instances
>>>> - aws ec2 stop-instances
>>>> - aws ec2 start-instances
>>>> - aws ec2 reboot-instances
>>>>
>>>> If the uname used by the cluster node is any of:
>>>> - Public DNS name (or part there of),
>>>> - Private DNS name (or part there of),
>>>> - Instance ID (eg. i-4f15a839)
>>>> - Contents of tag associated with the instance
>>>> then the agent should be able to automatically discover the instances it can control.
>>>>
>>>> If the tag containing the uname is not [Name], then it will need to be specified using the [tag] option.
>>>> "
>>>>
>>>> #
>>>> # Copyright (c) 2011-2013 Andrew Beekhof
>>>> # Copyright (c) 2014 NIPPON TELEGRAPH AND TELEPHONE CORPORATION
>>>> # All Rights Reserved.
>>>> #
>>>> # This program is free software; you can redistribute it and/or modify
>>>> # it under the terms of version 2 of the GNU General Public License as
>>>> # published by the Free Software Foundation.
>>>> #
>>>> # This program is distributed in the hope that it would be useful, but
>>>> # WITHOUT ANY WARRANTY; without even the implied warranty of
>>>> # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
>>>> #
>>>> # Further, this software is distributed without any warranty that it is
>>>> # free of the rightful claim of any third person regarding infringement
>>>> # or the like. Any license provided herein, whether implied or
>>>> # otherwise, applies only to this software file. Patent licenses, if
>>>> # any, provided herein do not apply to combinations of this program with
>>>> # other software, or any other product whatsoever.
>>>> #
>>>> # You should have received a copy of the GNU General Public License
>>>> # along with this program; if not, write the Free Software Foundation,
>>>> # Inc., 59 Temple Place - Suite 330, Boston MA 02111-1307, USA.
>>>> #
>>>> #######################################################################
>>>>
>>>> quiet=0
>>>> port_default=""
>>>>
>>>> instance_not_found=0
>>>> unknown_are_stopped=0
>>>>
>>>> action_default="reset" # Default fence action
>>>> ec2_tag_default="Name" # EC2 Tag containing the instance's uname
>>>>
>>>> sleep_time="1"
>>>>
>>>> ec2_tag=${tag}
>>>>
>>>> : ${ec2_tag=${ec2_tag_default}}
>>>> : ${port=${port_default}}
>>>>
>>>> function usage()
>>>> {
>>>> cat <<EOF
>>>> `basename $0` - A fencing agent for Amazon EC2 instances
>>>>
>>>> $description
>>>>
>>>> Usage: `basename $0` -o|--action [-n|--port] [options]
>>>> Options:
>>>> -h, --help This text
>>>> -V, --version Version information
>>>> -q, --quiet Reduced output mode
>>>>
>>>> Commands:
>>>> -o, --action Action to perform: on|off|reboot|status|monitor
>>>> -n, --port The name of a machine/instance to control/check
>>>>
>>>> Additional Options:
>>>> -p, --profile Use a specific profile from your credential file.
>>>> -t, --tag Name of the tag containing the instance's uname
>>>>
>>>> Dangerous options:
>>>> -U, --unknown-are-stopped Assume any unknown instance is safely stopped
>>>>
>>>> EOF
>>>>
>>>> exit 0;
>>>> }
>>>>
>>>> function getinfo-xml()
>>>> {
>>>> cat <<EOF
>>>> <parameters>
>>>> <parameter name="port" unique="1" required="1">
>>>> <content type="string" />
>>>> <shortdesc lang="en">The name/id/tag of a instance to control/check</shortdesc>
>>>> </parameter>
>>>> <parameter name="profile" unique="0" required="0">
>>>> <content type="string" default="default" />
>>>> <shortdesc lang="en">Use a specific profile from your credential file.</shortdesc>
>>>> </parameter>
>>>> <parameter name="tag" unique="0" required="1">
>>>> <content type="string" default="Name" />
>>>> <shortdesc lang="en">Name of the tag containing the instances uname</shortdesc>
>>>> </parameter>
>>>> <parameter name="unknown_are_stopped" unique="0" required="0">
>>>> <content type="string" default="false" />
>>>> <shortdesc lang="en">DANGER: Assume any unknown instance is safely stopped</shortdesc>
>>>> </parameter>
>>>> </parameters>
>>>> EOF
>>>> exit 0;
>>>> }
>>>>
>>>> function metadata()
>>>> {
>>>> cat <<EOF
>>>> <?xml version="1.0" ?>
>>>> <resource-agent name="fence_ec2" shortdesc="Fencing agent for Amazon EC2 instances" >
>>>> <longdesc>
>>>> $description
>>>> </longdesc>
>>>> <parameters>
>>>> <parameter name="action" unique="0" required="1">
>>>> <getopt mixed="-o, --action=[action]" />
>>>> <content type="string" default="reboot" />
>>>> <shortdesc lang="en">Fencing Action</shortdesc>
>>>> </parameter>
>>>> <parameter name="port" unique="1" required="1">
>>>> <getopt mixed="-n, --port=[port]" />
>>>> <content type="string" />
>>>> <shortdesc lang="en">The name/id/tag of a instance to control/check</shortdesc>
>>>> </parameter>
>>>> <parameter name="profile" unique="0" required="0">
>>>> <getopt mixed="-p, --profile=[profile]" />
>>>> <content type="string" default="default" />
>>>> <shortdesc lang="en">Use a specific profile from your credential file.</shortdesc>
>>>> </parameter>
>>>> <parameter name="tag" unique="0" required="1">
>>>> <getopt mixed="-t, --tag=[tag]" />
>>>> <content type="string" default="Name" />
>>>> <shortdesc lang="en">Name of the tag containing the instances uname</shortdesc>
>>>> </parameter>
>>>> <parameter name="unknown-are-stopped" unique="0" required="0">
>>>> <getopt mixed="-U, --unknown-are-stopped" />
>>>> <content type="string" default="false" />
>>>> <shortdesc lang="en">DANGER: Assume any unknown instance is safely stopped</shortdesc>
>>>> </parameter>
>>>> </parameters>
>>>> <actions>
>>>> <action name="on" />
>>>> <action name="off" />
>>>> <action name="reboot" />
>>>> <action name="status" />
>>>> <action name="list" />
>>>> <action name="monitor" />
>>>> <action name="metadata" />
>>>> </actions>
>>>> </resource-agent>
>>>> EOF
>>>> exit 0;
>>>> }
>>>>
>>>> function instance_for_port()
>>>> {
>>>> local port=$1
>>>> local instance=""
>>>>
>>>> # Look for port name -n in the INSTANCE data
>>>> instance=`aws ec2 describe-instances $options | grep "^INSTANCES[[:space:]].*[[:space:]]$port[[:space:]]" | awk '{print $8}'`
>>>> if [ -z $instance ]; then
>>>> # Look for port name -n in the Name TAG
>>>> instance=`aws ec2 describe-tags $options | grep "^TAGS[[:space:]]$ec2_tag[[:space:]].*[[:space:]]instance[[:space:]]$port$" | awk '{print $3}'`
>>>> fi
>>>>
>>>> if [ -z $instance ]; then
>>>> instance_not_found=1
>>>> instance=$port
>>>> fi
>>>>
>>>> echo $instance
>>>> }
>>>>
>>>> function instance_on()
>>>> {
>>>> aws ec2 start-instances $options --instance-ids $instance
>>>> }
>>>>
>>>> function instance_off()
>>>> {
>>>> if [ $unknown_are_stopped = 1 -a $instance_not_found ]; then
>>>> : nothing to do
>>>> ha_log.sh info "Assuming unknown instance $instance is already off"
>>>> else
>>>> aws ec2 stop-instances $options --instance-ids $instance --force
>>>> fi
>>>> }
>>>>
>>>> function instance_status()
>>>> {
>>>> local instance=$1
>>>> local status="unknown"
>>>> local rc=1
>>>>
>>>> # List of instances and their current status
>>>> if [ $unknown_are_stopped = 1 -a $instance_not_found ]; then
>>>> ha_log.sh info "$instance stopped (unknown)"
>>>> else
>>>> status=`aws ec2 describe-instances $options --instance-ids $instance | awk '{
>>>> if (/^STATE\t/) { printf "%s", $3 }
>>>> }'`
>>>> rc=$?
>>>> fi
>>>> ha_log.sh info "status check for $instance is $status"
>>>> echo $status
>>>> return $rc
>>>> }
>>>>
>>>>
>>>> TEMP=`getopt -o qVho:e:p:n:t:U --long version,help,action:,port:,option:,profile:,tag:,quiet,unknown-are-stopped \
>>>> -n 'fence_ec2' -- "$@"`
>>>>
>>>> if [ $? != 0 ];then
>>>> usage
>>>> exit 1
>>>> fi
>>>>
>>>> # Note the quotes around `$TEMP': they are essential!
>>>> eval set -- "$TEMP"
>>>>
>>>> if [ -z $1 ]; then
>>>> # If there are no command line args, look for options from stdin
>>>> while read line; do
>>>> case $line in
>>>> option=*|action=*) action=`echo $line | sed s/.*=//`;;
>>>> port=*) port=`echo $line | sed s/.*=//`;;
>>>> profile=*) ec2_profile=`echo $line | sed s/.*=//`;;
>>>> tag=*) ec2_tag=`echo $line | sed s/.*=//`;;
>>>> quiet*) quiet=1;;
>>>> unknown-are-stopped*) unknown_are_stopped=1;;
>>>> --);;
>>>> *) ha_log.sh err "Invalid command: $line";;
>>>> esac
>>>> done
>>>> fi
>>>>
>>>> while true ; do
>>>> case "$1" in
>>>> -o|--action|--option) action=$2; shift; shift;;
>>>> -n|--port) port=$2; shift; shift;;
>>>> -p|--profile) ec2_profile=$2; shift; shift;;
>>>> -t|--tag) ec2_tag=$2; shift; shift;;
>>>> -U|--unknown-are-stopped) unknown_are_stopped=1; shift;;
>>>> -q|--quiet) quiet=1; shift;;
>>>> -V|--version) echo "1.0.0"; exit 0;;
>>>> --help|-h)
>>>> usage;
>>>> exit 0;;
>>>> --) shift ; break ;;
>>>> *) ha_log.sh err "Unknown option: $1. See --help for details."; exit 1;;
>>>> esac
>>>> done
>>>>
>>>> [ -n "$1" ] && action=$1
>>>>
>>>> if [ -z "$ec2_profile"]; then
>>>> options="--output text --profile default"
>>>> else
>>>> options="--output text --profile $ec2_profile "
>>>> fi
>>>>
>>>> action=`echo $action | tr 'A-Z' 'a-z'`
>>>>
>>>> case $action in
>>>> metadata)
>>>> metadata
>>>> ;;
>>>> getinfo-xml)
>>>> getinfo-xml
>>>> ;;
>>>> getconfignames)
>>>> for i in profile port tag
>>>> do
>>>> echo $i
>>>> done
>>>> exit 0
>>>> ;;
>>>> getinfo-devid)
>>>> echo "EC2 STONITH device"
>>>> exit 0
>>>> ;;
>>>> getinfo-devname)
>>>> echo "EC2 STONITH external device"
>>>> exit 0
>>>> ;;
>>>> getinfo-devdescr)
>>>> echo "fence_ec2 is an I/O Fencing agent which can be used with Amazon EC2 instances."
>>>> exit 0
>>>> ;;
>>>> getinfo-devurl)
>>>> echo ""
>>>> exit 0
>>>> ;;
>>>> esac
>>>>
>>>> # get my instance id
>>>> myinstance=`curl http://169.254.169.254/latest/meta-data/instance-id`
>>>>
>>>> # check my status.
>>>> # When the EC2 instance be stopped by the "aws ec2 stop-instances" , the stop processing of the OS is executed.
>>>> # While the OS stop processing, Pacemaker can execute the STONITH processing.
>>>> # So, If my status is not "running", it determined that I was already fenced. And to prevent fencing each other
>>>> # in split-brain, I don't fence other node.
>>>> if [ -z "$myinstance" ]; then
>>>> ha_log.sh err "Failed to get My Instance ID. so can not check my status."
>>>> exit 1
>>>> fi
>>>> mystatus=`instance_status $myinstance`
>>>> if [ "$mystatus" != "running" ]; then #do not fence
>>>> ha_log.sh warn "I was already fenced (My instance status=$mystatus). I don't fence other node."
>>>> exit 1
>>>> fi
>>>>
>>>> # get target's instance id
>>>> instance=""
>>>> if [ ! -z "$port" ]; then
>>>> instance=`instance_for_port $port $options`
>>>> fi
>>>>
>>>> case $action in
>>>> reboot|reset)
>>>> status=`instance_status $instance`
>>>> if [ "$status" != "stopped" ]; then
>>>> instance_off
>>>> fi
>>>> while true;
>>>> do
>>>> status=`instance_status $instance`
>>>> if [ "$status" = "stopped" ]; then
>>>> break
>>>> fi
>>>> sleep $sleep_time
>>>> done
>>>> instance_on
>>>> while true;
>>>> do
>>>> status=`instance_status $instance`
>>>> if [ "$status" = "running" ]; then
>>>> break
>>>> fi
>>>> sleep $sleep_time
>>>> done
>>>> ;;
>>>> poweron|on)
>>>> instance_on
>>>> while true;
>>>> do
>>>> status=`instance_status $instance`
>>>> if [ "$status" = "running" ]; then
>>>> break
>>>> fi
>>>> done
>>>> ;;
>>>> poweroff|off)
>>>> instance_off
>>>> while true;
>>>> do
>>>> status=`instance_status $instance`
>>>> if [ "$status" = "stopped" ]; then
>>>> break
>>>> fi
>>>> sleep $sleep_time
>>>> done
>>>> ;;
>>>> monitor)
>>>> # Is the device ok?
>>>> aws ec2 describe-instances $options | grep INSTANCES &> /dev/null
>>>> ;;
>>>> gethosts|hostlist|list)
>>>> # List of names we know about
>>>> a=`aws ec2 describe-instances $options | awk -v tag_pat="^TAGS\t$ec2_tag\t" -F '\t' '{
>>>> if (/^INSTANCES/) { printf "%s\n", $8 }
>>>> else if ( $1"\t"$2"\t" ~ tag_pat ) { printf "%s\n", $3 }
>>>> }' | sort -u`
>>>> echo $a
>>>> ;;
>>>> stat|status)
>>>> instance_status $instance > /dev/null
>>>> ;;
>>>> *) ha_log.sh err "Unknown action: $action"; exit 1;;
>>>> esac
>>>>
>>>> status=$?
>>>>
>>>> if [ $quiet -eq 1 ]; then
>>>> : nothing
>>>> elif [ $status -eq 0 ]; then
>>>> ha_log.sh info "Operation $action passed"
>>>> else
>>>> ha_log.sh err "Operation $action failed: $status"
>>>> fi
>>>> exit $status
>>>
>>>
>>>
>>
>>
>>
>>
>> _______________________________________________
>> Users mailing list: Users at clusterlabs.org
>> http://clusterlabs.org/mailman/listinfo/users
>>
>> Project Home: http://www.clusterlabs.org
>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>> Bugs: http://bugs.clusterlabs.org
>>
>
>
-------------- next part --------------
#!/bin/bash
description="
fence_ec2 is an I/O Fencing agent which can be used with Amazon EC2 instances.
API functions used by this agent:
- aws ec2 describe-tags
- aws ec2 describe-instances
- aws ec2 stop-instances
- aws ec2 start-instances
- aws ec2 reboot-instances
If the uname used by the cluster node is any of:
- Public DNS name (or part there of),
- Private DNS name (or part there of),
- Instance ID (eg. i-4f15a839)
- Contents of tag associated with the instance
then the agent should be able to automatically discover the instances it can control.
If the tag containing the uname is not [Name], then it will need to be specified using the [tag] option.
"
#
# Copyright (c) 2011-2013 Andrew Beekhof
# Copyright (c) 2014 NIPPON TELEGRAPH AND TELEPHONE CORPORATION
# All Rights Reserved.
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of version 2 of the GNU General Public License as
# published by the Free Software Foundation.
#
# This program is distributed in the hope that it would be useful, but
# WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
#
# Further, this software is distributed without any warranty that it is
# free of the rightful claim of any third person regarding infringement
# or the like. Any license provided herein, whether implied or
# otherwise, applies only to this software file. Patent licenses, if
# any, provided herein do not apply to combinations of this program with
# other software, or any other product whatsoever.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write the Free Software Foundation,
# Inc., 59 Temple Place - Suite 330, Boston MA 02111-1307, USA.
#
#######################################################################
quiet=0
instance_not_found=0
unknown_are_stopped=0
action_default="reset" # Default fence action
ec2_tag_default="Name" # EC2 Tag containing the instance's uname
sleep_time="1"
[ -n "$tag" ] && ec2_tag="$tag"
: ${ec2_tag=${ec2_tag_default}}
function usage()
{
cat <<EOF
`basename $0` - A fencing agent for Amazon EC2 instances
$description
Usage: `basename $0` -o|--action [-n|--port] [options]
Options:
-h, --help This text
-V, --version Version information
-q, --quiet Reduced output mode
Commands:
-o, --action Action to perform: on|off|reboot|status|monitor
-n, --port The name of a machine/instance to control/check
Additional Options:
-p, --profile Use a specific profile from your credential file.
-t, --tag Name of the tag containing the instance's uname
Dangerous options:
-U, --unknown-are-stopped Assume any unknown instance is safely stopped
EOF
exit 0;
}
function getinfo_xml()
{
cat <<EOF
<parameters>
<parameter name="port" unique="1" required="0">
<content type="string" />
<shortdesc lang="en">The name/id/tag of a instance to control/check</shortdesc>
</parameter>
<parameter name="profile" unique="0" required="0">
<content type="string" default="default" />
<shortdesc lang="en">Use a specific profile from your credential file.</shortdesc>
</parameter>
<parameter name="tag" unique="0" required="0">
<content type="string" default="Name" />
<shortdesc lang="en">Name of the tag containing the instances uname</shortdesc>
</parameter>
<parameter name="unknown_are_stopped" unique="0" required="0">
<content type="string" default="false" />
<shortdesc lang="en">DANGER: Assume any unknown instance is safely stopped</shortdesc>
</parameter>
</parameters>
EOF
exit 0;
}
function metadata()
{
cat <<EOF
<?xml version="1.0" ?>
<resource-agent name="fence_ec2" shortdesc="Fencing agent for Amazon EC2 instances" >
<longdesc>
$description
</longdesc>
<parameters>
<parameter name="action" unique="0" required="1">
<getopt mixed="-o, --action=[action]" />
<content type="string" default="reboot" />
<shortdesc lang="en">Fencing Action</shortdesc>
</parameter>
<parameter name="port" unique="1" required="0">
<getopt mixed="-n, --port=[port]" />
<content type="string" />
<shortdesc lang="en">The name/id/tag of a instance to control/check</shortdesc>
</parameter>
<parameter name="profile" unique="0" required="0">
<getopt mixed="-p, --profile=[profile]" />
<content type="string" default="default" />
<shortdesc lang="en">Use a specific profile from your credential file.</shortdesc>
</parameter>
<parameter name="tag" unique="0" required="0">
<getopt mixed="-t, --tag=[tag]" />
<content type="string" default="Name" />
<shortdesc lang="en">Name of the tag containing the instances uname</shortdesc>
</parameter>
<parameter name="unknown-are-stopped" unique="0" required="0">
<getopt mixed="-U, --unknown-are-stopped" />
<content type="string" default="false" />
<shortdesc lang="en">DANGER: Assume any unknown instance is safely stopped</shortdesc>
</parameter>
</parameters>
<actions>
<action name="on" />
<action name="off" />
<action name="reboot" />
<action name="status" />
<action name="list" />
<action name="monitor" />
<action name="metadata" />
</actions>
</resource-agent>
EOF
exit 0;
}
function instance_for_port()
{
local port=$1
local instance=""
# Look for port name -n in the INSTANCE data
instance=`aws ec2 describe-instances $options | grep "^INSTANCES[[:space:]].*[[:space:]]$port[[:space:]]" | awk '{print $8}'`
if [ -z "$instance" ]; then
# Look for port name -n in the Name TAG
instance=`aws ec2 describe-tags $options | grep "^TAGS[[:space:]]$ec2_tag[[:space:]].*[[:space:]]instance[[:space:]]$port$" | awk '{print $3}'`
fi
if [ -z "$instance" ]; then
instance_not_found=1
instance=$port
fi
echo $instance
}
function instance_on()
{
aws ec2 start-instances $options --instance-ids $instance
}
function instance_off()
{
if [ "$unknown_are_stopped" = 1 ] && [ "$instance_not_found" = 1 ]; then
: nothing to do
ha_log.sh info "Assuming unknown instance $instance is already off"
else
aws ec2 stop-instances $options --instance-ids $instance --force
fi
}
function instance_status()
{
local instance=$1
local status="unknown"
local rc=1
# List of instances and their current status
if [ "$unknown_are_stopped" = 1 ] && [ "$instance_not_found" = 1 ]; then
ha_log.sh info "$instance stopped (unknown)"
else
status=`aws ec2 describe-instances $options --instance-ids $instance | awk '{
if (/^STATE\t/) { printf "%s", $3 }
}'`
rc=$?
fi
ha_log.sh info "status check for $instance is $status"
echo $status
return $rc
}
function monitor()
{
# Is the device ok?
aws ec2 describe-instances $options | grep INSTANCES &> /dev/null
}
TEMP=`getopt -o qVho:e:p:n:t:U --long version,help,action:,port:,option:,profile:,tag:,quiet,unknown-are-stopped \
-n 'fence_ec2' -- "$@"`
if [ $? != 0 ];then
usage
exit 1
fi
# Note the quotes around `$TEMP': they are essential!
eval set -- "$TEMP"
if [ -z $1 ]; then
# If there are no command line args, look for options from stdin
while read line; do
case $line in
option=*|action=*) action=`echo $line | sed s/.*=//`;;
port=*) port=`echo $line | sed s/.*=//`;;
profile=*) ec2_profile=`echo $line | sed s/.*=//`;;
tag=*) ec2_tag=`echo $line | sed s/.*=//`;;
quiet*) quiet=1;;
unknown-are-stopped*) unknown_are_stopped=1;;
--);;
*) ha_log.sh err "Invalid command: $line";;
esac
done
fi
while true ; do
case "$1" in
-o|--action|--option) action=$2; shift; shift;;
-n|--port) port=$2; shift; shift;;
-p|--profile) ec2_profile=$2; shift; shift;;
-t|--tag) ec2_tag=$2; shift; shift;;
-U|--unknown-are-stopped) unknown_are_stopped=1; shift;;
-q|--quiet) quiet=1; shift;;
-V|--version) echo "1.0.0"; exit 0;;
--help|-h)
usage;
exit 0;;
--) shift ; break ;;
*) ha_log.sh err "Unknown option: $1. See --help for details."; exit 1;;
esac
done
[ -n "$1" ] && action=$1
[ -n "$2" ] && node_to_fence=$2
if [ -z "$ec2_profile" ]; then
options="--output text --profile default"
else
options="--output text --profile $ec2_profile "
fi
action=`echo $action | tr 'A-Z' 'a-z'`
case $action in
metadata)
metadata
;;
getinfo-xml)
getinfo_xml
;;
getconfignames)
for i in profile port tag unknown_are_stopped
do
echo $i
done
exit 0
;;
getinfo-devid)
echo "EC2 STONITH device"
exit 0
;;
getinfo-devname)
echo "EC2 STONITH external device"
exit 0
;;
getinfo-devdescr)
echo "ec2 is an I/O Fencing agent which can be used with Amazon EC2 instances."
exit 0
;;
getinfo-devurl)
echo ""
exit 0
;;
esac
# get my instance id
myinstance=`curl http://169.254.169.254/latest/meta-data/instance-id`
# Check my own status.
# When an EC2 instance is stopped by "aws ec2 stop-instances", the OS shutdown processing runs.
# While the OS is shutting down, Pacemaker can still execute STONITH processing.
# So if my status is not "running", I conclude that I have already been fenced, and to prevent
# the nodes from fencing each other in split-brain, I do not fence the other node.
if [ -z "$myinstance" ]; then
ha_log.sh err "Failed to get My Instance ID. so can not check my status."
exit 1
fi
mystatus=`instance_status $myinstance`
if [ "$mystatus" != "running" ]; then #do not fence
ha_log.sh warn "I was already fenced (My instance status=$mystatus). I don't fence other node."
exit 1
fi
if [ -z "$port" ]; then
port="$node_to_fence"
fi
# get target's instance id
instance=""
if [ ! -z "$port" ]; then
instance=`instance_for_port $port $options`
fi
case $action in
reboot|reset)
status=`instance_status $instance`
if [ "$status" != "stopped" ]; then
instance_off
fi
while true;
do
status=`instance_status $instance`
if [ "$status" = "stopped" ]; then
break
fi
sleep $sleep_time
done
instance_on
while true;
do
status=`instance_status $instance`
if [ "$status" = "running" ]; then
break
fi
sleep $sleep_time
done
;;
poweron|on)
instance_on
while true;
do
status=`instance_status $instance`
if [ "$status" = "running" ]; then
break
fi
sleep $sleep_time
done
;;
poweroff|off)
instance_off
while true;
do
status=`instance_status $instance`
if [ "$status" = "stopped" ]; then
break
fi
sleep $sleep_time
done
;;
monitor)
monitor
;;
gethosts|hostlist|list)
# List of names we know about
a=`aws ec2 describe-instances $options | awk -v tag_pat="^TAGS\t$ec2_tag\t" -F '\t' '{
if (/^INSTANCES/) { printf "%s\n", $8 }
else if ( $1"\t"$2"\t" ~ tag_pat ) { printf "%s\n", $3 }
}' | sort -u`
echo $a
;;
stat|status)
monitor
;;
*) ha_log.sh err "Unknown action: $action"; exit 1;;
esac
status=$?
if [ $quiet -eq 1 ]; then
: nothing
elif [ $status -eq 0 ]; then
ha_log.sh info "Operation $action passed"
else
ha_log.sh err "Operation $action failed: $status"
fi
exit $status