[ClusterLabs] [Linux-HA] fence_ec2 agent

Dejan Muhamedagic dejanmm at fastmail.fm
Thu Sep 24 11:04:35 EDT 2015


Hi Kazuhiko-san,

On Wed, Mar 25, 2015 at 10:47:01AM +0900, 東一彦 wrote:
> Hi Markus,
> 
> I implemented it as a trial.
> 
> [diff from http://hg.linux-ha.org/glue/rev/9da0680bc9c0 ]
> 50d49
> < port_default=""
> 60c59
> < ec2_tag=${tag}
> ---
> > [ -n "$tag" ] && ec2_tag="$tag"
> 63d61
> < : ${port=${port_default}}
> 97c95
> <       <parameter name="port" unique="1" required="1">
> ---
> >       <parameter name="port" unique="1" required="0">
> 105c103
> <       <parameter name="tag" unique="0" required="1">
> ---
> >       <parameter name="tag" unique="0" required="0">
> 132c130
> <       <parameter name="port" unique="1" required="1">
> ---
> >       <parameter name="port" unique="1" required="0">
> 142c140
> <       <parameter name="tag" unique="0" required="1">
> ---
> >       <parameter name="tag" unique="0" required="0">
> 221a220,224
> > function monitor()
> > {
> >               # Is the device ok?
> >               aws ec2 describe-instances $options | grep INSTANCES &> /dev/null
> > }
> 267a271
> > [ -n "$2" ] && node_to_fence=$2
> 326a331,334
> > if [ -z "$port" ]; then
> >       port="$node_to_fence"
> > fi
> >
> 379,380c387
> <               # Is the device ok?
> <               aws ec2 describe-instances $options | grep INSTANCES &> /dev/null
> ---
> >               monitor
> 391c398
> <               instance_status $instance > /dev/null
> ---
> >               monitor
> 
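For readers skimming the diff: the "port" fallback from hunk 326a331,334 can be sketched in isolation as below. This is only an illustration. resolve_port is a made-up helper name that does not exist in the agent, which assigns to $port directly.

```shell
#!/bin/bash
# Hypothetical helper (not part of the agent): choose the fencing target.
# An explicitly configured "port" wins; otherwise fall back to the node
# name that the cluster passes as the second positional argument.
resolve_port() {
    local port=$1 node_to_fence=$2
    if [ -z "$port" ]; then
        port=$node_to_fence
    fi
    printf '%s\n' "$port"
}

resolve_port "" node01        # no port configured: prints node01
resolve_port node02 node01    # port configured:    prints node02
```

With this shape, a single primitive without "port" can serve every node, because the target is taken from whatever node name the cluster asks to fence.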
> 
> 
> It works fine in my environment with the two configuration patterns below.
> 
> [pattern No.1]
> Without the "port" and "tag" parameters.
> Each instance has a "Name=<uname>" tag.
> 
> ----
> primitive prmStonith1-2 stonith:external/ec2 \
>          params \
>                  pcmk_off_timeout="120s" \
>          op start interval="0s" timeout="60s" \
>          op monitor interval="3600s" timeout="60s" \
>          op stop interval="0s" timeout="60s"
> ----
> 
> 
> [pattern No.2]
> With only the "tag" parameter (without the "port" parameter).
> The 1st instance (node01) has the "Cluster1=node01" tag.
> The 2nd instance (node02) has the "Cluster1=node02" tag.
> 
> ----
> primitive prmStonith1-2 stonith:external/ec2 \
>          params \
>                  pcmk_off_timeout="120s" \
>                  tag="Cluster1" \
>          op start interval="0s" timeout="60s" \
>          op monitor interval="3600s" timeout="60s" \
>          op stop interval="0s" timeout="60s"
> ----
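As a rough illustration of how pattern No.2 could resolve a node name, here is a sketch that mimics the tag lookup on canned data. It assumes tab-separated rows in the column order that the agent's own grep pattern expects (TAGS, key, resource ID, resource type, value). instance_for_tag, sample_tags and the instance IDs are invented for the example, and no AWS call is made.

```shell
#!/bin/bash
# Hypothetical helper: read describe-tags style rows on stdin and print the
# resource ID of the instance whose tag <key> has the value <uname>.
instance_for_tag() {
    local key=$1 uname=$2
    awk -F '\t' -v key="$key" -v uname="$uname" \
        '$1 == "TAGS" && $2 == key && $4 == "instance" && $5 == uname { print $3 }'
}

# Made-up sample rows standing in for real AWS CLI text output.
sample_tags() {
    printf 'TAGS\tCluster1\ti-0aaa1111\tinstance\tnode01\n'
    printf 'TAGS\tCluster1\ti-0bbb2222\tinstance\tnode02\n'
    printf 'TAGS\tName\ti-0ccc3333\tinstance\tnode01\n'
}

sample_tags | instance_for_tag Cluster1 node02   # prints i-0bbb2222
```

Comparing whole fields rather than using a regular expression avoids a partial match such as node01 against node010, which may or may not matter for a given naming scheme.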

Sounds good. Sorry for the delay, but could you provide the
patch as a unified diff (or similar) so that we can
apply it?
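
In case it helps, a unified diff can be produced with plain diff -u, roughly as sketched here. The file names and the one-line file contents are placeholders, not the real agent. In a Mercurial checkout of glue, hg diff should give output in the same format.

```shell
#!/bin/bash
# Scratch-directory demo; all file names below are placeholders.
workdir=$(mktemp -d)
old="$workdir/ec2.orig"
new="$workdir/ec2"
patch_file="$workdir/ec2-optional-port.patch"

# One-line stand-ins for the agent before and after the change.
printf '%s\n' 'ec2_tag=${tag}' > "$old"
printf '%s\n' '[ -n "$tag" ] && ec2_tag="$tag"' > "$new"

# diff exits with status 1 when the files differ, so treat 1 as success.
diff -u "$old" "$new" > "$patch_file" || [ $? -eq 1 ]

cat "$patch_file"
```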

Cheers,

Dejan

> 
> Regards,
> Kazuhiko Higashi
> 
> 
> On 2015/03/24 20:48, 東一彦 wrote:
> >Hi Markus,
> >
> >Thank you for the comment.
> >
> > > Would it be possible, to implement this idea as an additional configuration method to the fence_ec2 agent?
> >I think that your idea is good.
> >
> >So I will try to implement it.
> >I'm going to change fence_ec2 (ec2) in the following ways:
> >
> >  - the "tag" and "port" options will no longer be required.
> >
> >  - if the "port" option is not set, the 2nd argument of ec2 will be used as the "port".
> >    - the 2nd argument of ec2 is the "node to fence".
> >
> >  - the "stat" and "status" actions will behave the same as the "monitor" action
> >    (so that the "stat" action does not need the "port" parameter).
> >
> >
> >With the above modifications, if the uname is stored in the Name tag,
> >the "tag" and "port" parameters no longer need to be set.
> >
> >----
> >primitive prmStonith1-2 stonith:external/ec2 \
> >         params \
> >                 pcmk_off_timeout="120s" \
> >         op start interval="0s" timeout="60s" \
> >         op monitor interval="3600s" timeout="60s" \
> >         op stop interval="0s" timeout="60s"
> >----
> >
> >
> >You can use the "tag" parameter like your "Clustername" tag.
> >If the cluster nodes (instances) have a "Cluster1" tag, and the uname is stored in that tag,
> >it works just as you would expect.
> >
> >----
> >primitive prmStonith1-2 stonith:external/ec2 \
> >         params \
> >                 pcmk_off_timeout="120s" \
> >                 tag="Cluster1" \
> >         op start interval="0s" timeout="60s" \
> >         op monitor interval="3600s" timeout="60s" \
> >         op stop interval="0s" timeout="60s"
> >----
> >
> >The 1st instance has the "Cluster1=node01" tag.
> >The 2nd instance has the "Cluster1=node02" tag.
> >The 3rd instance has the "Cluster1=node03" tag.
> >...
> >prmStonith1-2 can then fence node01, node02 and node03.
> >
> >
> >If you like the above, I will implement it.
> >
> >
> >Regards,
> >Kazuhiko Higashi
> >
> >
> >On 2015/03/19 1:03, Markus Guertler wrote:
> >>Hi Kazuhiko, Dejan,
> >>
> >>the new resource agent is very good. Since there were a couple of days between my original question and the answer from
> >>Kazuhiko, I also have written a stonith agent proof of concept (attached to this email) in order to continue in my
> >>project. However, I think that your fence_ec2 agent is better from a development perspective and it doesn't make sense
> >>to have two different agents for the same use case.
> >>
> >>Nevertheless, I've implemented an idea, that is very useful in EC2 environments with clusters that have more than two
> >>nodes: All EC2 instances that belong to a cluster get a unique cluster name via an EC2 instance tag. The agent uses this
> >>tag to determine all cluster nodes that belong to its own cluster:
> >>
> >>--- SNIP ---
> >>     gethosts)
> >>         # List of hostnames of this cluster
> >>         init_agent
> >>         ec2-describe-instances --filter "tag-key=Clustername" --filter "tag-value=$clustername" | grep "^TAG" | grep "Hostname" | awk '{ print $5 }' | sort -u
> >>--- SNIP ---
> >>
> >>The advantage of this method is that you need just one configuration snippet for all nodes. This allows you to dynamically
> >>add or remove EC2 instances / cluster nodes to/from a cluster without having to touch the cluster configuration.
> >>Dynamically adding or removing nodes (compute instances) is a very common scenario in a cloud.
> >>
> >>Would it be possible, to implement this idea as an additional configuration method to the fence_ec2 agent?
> >>
> >>Cheers,
> >>Markus
> >>
> >>>>>東一彦 <higashi.kazuhiko at lab.ntt.co.jp> 3/12/2015 10:44 AM >>>
> >>Hi Dejan
> >>
> >>Thank you for adding it and fixing some issues!
> >>
> >>
> >>  > I was not able to test it, hope it works :)
> >>I confirmed that it works fine in my AWS environment :)
> >>
> >>
> >>Regards,
> >>Kazuhiko Higashi
> >>
> >>On 2015/03/11 21:27, Dejan Muhamedagic wrote:
> >>>Hi Kazuhiko-san,
> >>>
> >>>On Wed, Mar 11, 2015 at 02:36:43PM +0900, 東一彦 wrote:
> >>>>Hi, Dejan
> >>>>
> >>>>Thank you for the comment.
> >>>>
> >>>>I'd like to contribute it as glue stonith agents.
> >>>>
> >>>>So I renamed it to just "ec2".
> >>>>
> >>>>Would you please add it to glue repository (http://hg.linux-ha.org/glue/) ?
> >>>
> >>>I just added your stonith agent. There was this change in the
> >>>initial changeset:
> >>>
> >>>- replaced '-' which is not allowed in identifiers with '_' in
> >>>    function getinfo_xml().
> >>>
> >>>There were other smaller changes. You can find them in the
> >>>repository.
> >>>
> >>>I was not able to test it, hope it works :)
> >>>
> >>>Many thanks for the contribution.
> >>>
> >>>Cheers,
> >>>
> >>>Dejan
> >>>
> >>>>Regards,
> >>>>Kazuhiko Higashi
> >>>>
> >>>>On 2015/03/06 2:38, Dejan Muhamedagic wrote:
> >>>>>Hi,
> >>>>>
> >>>>>On Tue, Mar 03, 2015 at 05:13:49PM +0900, 東一彦 wrote:
> >>>>>>Dear Markus,
> >>>>>>
> >>>>>>I was also thinking the same thing.
> >>>>>>So I've already created a new one.
> >>>>>
> >>>>>Perhaps you'd like to then contribute it upstream? Either to
> >>>>>glue stonith agents or RHT fencing agents. It appears that the
> >>>>>agent is using the stonith interface, but the name reflects the
> >>>>>fencing agents naming scheme.
> >>>>>
> >>>>>Cheers,
> >>>>>
> >>>>>Dejan
> >>>>>
> >>>>>>[ChangeSet]
> >>>>>>- The API used was changed from "Amazon EC2 CLI" to "AWS CLI".
> >>>>>>    -- "AWS CLI" is Python-based, so CPU load might be reduced.
> >>>>>>
> >>>>>>- The "--private-key" and "--cert" options are deprecated in AWS CLI.
> >>>>>>    So I added a new option, "--profile", which selects a specific profile from the credential file.
> >>>>>>    The default is "".
> >>>>>>
> >>>>>>
> >>>>>>[How to use]
> >>>>>>- Please install the "AWS CLI".
> >>>>>>    http://docs.aws.amazon.com/cli/latest/userguide/cli-chap-getting-started.html
> >>>>>>
> >>>>>>- Please copy fence_ec2 into /usr/lib64/stonith/plugins/external/
> >>>>>>    and set its permissions to 755.
> >>>>>>
> >>>>>>- Please configure crm as in this example.
> >>>>>>    - The instance whose "Name" tag is set to "node01" will be fenced.
> >>>>>>    ------
> >>>>>>    primitive prmStonith1-2 stonith:external/fence_ec2 \
> >>>>>>    params \
> >>>>>>        pcmk_off_timeout="300s" \
> >>>>>>        port="node01" \
> >>>>>>        tag="Name" \
> >>>>>>    op start interval="0s" timeout="60s" \
> >>>>>>    op monitor interval="3600s" timeout="60s" \
> >>>>>>    op stop interval="0s" timeout="60s"
> >>>>>>    ------
> >>>>>>
> >>>>>>
> >>>>>>Regards,
> >>>>>>Kazuhiko Higashi
> >>>>>>
> >>>>>>On 2015/02/25 7:22, Markus Guertler wrote:
> >>>>>>>Dear list,
> >>>>>>>I was just trying to configure the fence_ec2 stonith agent from 2012, written by Andrew Beekhof. It looks like
> >>>>>>>this one is not working anymore with newer stonith / cluster versions. Is there any other EC2 agent that is still
> >>>>>>>maintained?
> >>>>>>>
> >>>>>>>If not, I'll write one myself. However, I'd like to check all options first.
> >>>>>>>
> >>>>>>>Cheers,
> >>>>>>>Markus
> >>>>>>>
> >>>>>>>_______________________________________________
> >>>>>>>Linux-HA mailing list
> >>>>>>>Linux-HA at lists.linux-ha.org
> >>>>>>>http://lists.linux-ha.org/mailman/listinfo/linux-ha
> >>>>>>>See also: http://linux-ha.org/ReportingProblems
> >>>>>>>
> >>>>>>
> >>>>>>
> >>>
> >>>>#!/bin/bash
> >>>>
> >>>>description="
> >>>>fence_ec2 is an I/O Fencing agent which can be used with Amazon EC2 instances.
> >>>>
> >>>>API functions used by this agent:
> >>>>- aws ec2 describe-tags
> >>>>- aws ec2 describe-instances
> >>>>- aws ec2 stop-instances
> >>>>- aws ec2 start-instances
> >>>>- aws ec2 reboot-instances
> >>>>
> >>>>If the uname used by the cluster node is any of:
> >>>>   - Public DNS name (or part there of),
> >>>>   - Private DNS name (or part there of),
> >>>>   - Instance ID (eg. i-4f15a839)
> >>>>   - Contents of tag associated with the instance
> >>>>then the agent should be able to automatically discover the instances it can control.
> >>>>
> >>>>If the tag containing the uname is not [Name], then it will need to be specified using the [tag] option.
> >>>>"
> >>>>
> >>>>#
> >>>># Copyright (c) 2011-2013 Andrew Beekhof
> >>>># Copyright (c) 2014 NIPPON TELEGRAPH AND TELEPHONE CORPORATION
> >>>>#                    All Rights Reserved.
> >>>>#
> >>>># This program is free software; you can redistribute it and/or modify
> >>>># it under the terms of version 2 of the GNU General Public License as
> >>>># published by the Free Software Foundation.
> >>>>#
> >>>># This program is distributed in the hope that it would be useful, but
> >>>># WITHOUT ANY WARRANTY; without even the implied warranty of
> >>>># MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
> >>>>#
> >>>># Further, this software is distributed without any warranty that it is
> >>>># free of the rightful claim of any third person regarding infringement
> >>>># or the like.  Any license provided herein, whether implied or
> >>>># otherwise, applies only to this software file.  Patent licenses, if
> >>>># any, provided herein do not apply to combinations of this program with
> >>>># other software, or any other product whatsoever.
> >>>>#
> >>>># You should have received a copy of the GNU General Public License
> >>>># along with this program; if not, write the Free Software Foundation,
> >>>># Inc., 59 Temple Place - Suite 330, Boston MA 02111-1307, USA.
> >>>>#
> >>>>#######################################################################
> >>>>
> >>>>quiet=0
> >>>>port_default=""
> >>>>
> >>>>instance_not_found=0
> >>>>unknown_are_stopped=0
> >>>>
> >>>>action_default="reset"         # Default fence action
> >>>>ec2_tag_default="Name"           # EC2 Tag containing the instance's uname
> >>>>
> >>>>sleep_time="1"
> >>>>
> >>>>ec2_tag=${tag}
> >>>>
> >>>>: ${ec2_tag=${ec2_tag_default}}
> >>>>: ${port=${port_default}}
> >>>>
> >>>>function usage()
> >>>>{
> >>>>cat <<EOF
> >>>>`basename $0` - A fencing agent for Amazon EC2 instances
> >>>>
> >>>>$description
> >>>>
> >>>>Usage: `basename $0` -o|--action [-n|--port] [options]
> >>>>Options:
> >>>>   -h, --help         This text
> >>>>   -V, --version        Version information
> >>>>   -q, --quiet         Reduced output mode
> >>>>
> >>>>Commands:
> >>>>   -o, --action        Action to perform: on|off|reboot|status|monitor
> >>>>   -n, --port         The name of a machine/instance to control/check
> >>>>
> >>>>Additional Options:
> >>>>   -p, --profile        Use a specific profile from your credential file.
> >>>>   -t, --tag         Name of the tag containing the instance's uname
> >>>>
> >>>>Dangerous options:
> >>>>   -U, --unknown-are-stopped     Assume any unknown instance is safely stopped
> >>>>
> >>>>EOF
> >>>>
> >>>>    exit 0;
> >>>>}
> >>>>
> >>>>function getinfo-xml()
> >>>>{
> >>>>    cat <<EOF
> >>>><parameters>
> >>>>    <parameter name="port" unique="1" required="1">
> >>>>        <content type="string" />
> >>>>        <shortdesc lang="en">The name/id/tag of a instance to control/check</shortdesc>
> >>>>    </parameter>
> >>>>    <parameter name="profile" unique="0" required="0">
> >>>>        <content type="string" default="default" />
> >>>>        <shortdesc lang="en">Use a specific profile from your credential file.</shortdesc>
> >>>>    </parameter>
> >>>>    <parameter name="tag" unique="0" required="1">
> >>>>        <content type="string" default="Name" />
> >>>>        <shortdesc lang="en">Name of the tag containing the instances uname</shortdesc>
> >>>>    </parameter>
> >>>>    <parameter name="unknown_are_stopped" unique="0" required="0">
> >>>>        <content type="string" default="false" />
> >>>>        <shortdesc lang="en">DANGER: Assume any unknown instance is safely stopped</shortdesc>
> >>>>    </parameter>
> >>>></parameters>
> >>>>EOF
> >>>>    exit 0;
> >>>>}
> >>>>
> >>>>function metadata()
> >>>>{
> >>>>    cat <<EOF
> >>>><?xml version="1.0" ?>
> >>>><resource-agent name="fence_ec2" shortdesc="Fencing agent for Amazon EC2 instances" >
> >>>>    <longdesc>
> >>>>$description
> >>>>    </longdesc>
> >>>>    <parameters>
> >>>>    <parameter name="action" unique="0" required="1">
> >>>>        <getopt mixed="-o, --action=[action]" />
> >>>>        <content type="string" default="reboot" />
> >>>>        <shortdesc lang="en">Fencing Action</shortdesc>
> >>>>    </parameter>
> >>>>    <parameter name="port" unique="1" required="1">
> >>>>        <getopt mixed="-n, --port=[port]" />
> >>>>        <content type="string" />
> >>>>        <shortdesc lang="en">The name/id/tag of a instance to control/check</shortdesc>
> >>>>    </parameter>
> >>>>    <parameter name="profile" unique="0" required="0">
> >>>>        <getopt mixed="-p, --profile=[profile]" />
> >>>>        <content type="string" default="default" />
> >>>>        <shortdesc lang="en">Use a specific profile from your credential file.</shortdesc>
> >>>>    </parameter>
> >>>>    <parameter name="tag" unique="0" required="1">
> >>>>        <getopt mixed="-t, --tag=[tag]" />
> >>>>        <content type="string" default="Name" />
> >>>>        <shortdesc lang="en">Name of the tag containing the instances uname</shortdesc>
> >>>>    </parameter>
> >>>>    <parameter name="unknown-are-stopped" unique="0" required="0">
> >>>>        <getopt mixed="-U, --unknown-are-stopped" />
> >>>>        <content type="string" default="false" />
> >>>>        <shortdesc lang="en">DANGER: Assume any unknown instance is safely stopped</shortdesc>
> >>>>    </parameter>
> >>>>    </parameters>
> >>>>    <actions>
> >>>>    <action name="on" />
> >>>>    <action name="off" />
> >>>>    <action name="reboot" />
> >>>>    <action name="status" />
> >>>>    <action name="list" />
> >>>>    <action name="monitor" />
> >>>>    <action name="metadata" />
> >>>>    </actions>
> >>>></resource-agent>
> >>>>EOF
> >>>>    exit 0;
> >>>>}
> >>>>
> >>>>function instance_for_port()
> >>>>{
> >>>>    local port=$1
> >>>>    local instance=""
> >>>>
> >>>>    # Look for port name -n in the INSTANCE data
> >>>>    instance=`aws ec2 describe-instances $options | grep "^INSTANCES[[:space:]].*[[:space:]]$port[[:space:]]" | awk '{print $8}'`
> >>>>    if [ -z "$instance" ]; then
> >>>>        # Look for port name -n in the Name TAG
> >>>>        instance=`aws ec2 describe-tags $options | grep "^TAGS[[:space:]]$ec2_tag[[:space:]].*[[:space:]]instance[[:space:]]$port$" | awk '{print $3}'`
> >>>>    fi
> >>>>
> >>>>    if [ -z "$instance" ]; then
> >>>>        instance_not_found=1
> >>>>        instance=$port
> >>>>    fi
> >>>>
> >>>>    echo $instance
> >>>>}
> >>>>
> >>>>function instance_on()
> >>>>{
> >>>>    aws ec2 start-instances $options --instance-ids $instance
> >>>>}
> >>>>
> >>>>function instance_off()
> >>>>{
> >>>>    if [ "$unknown_are_stopped" = 1 -a "$instance_not_found" = 1 ]; then
> >>>>        : nothing to do
> >>>>        ha_log.sh info "Assuming unknown instance $instance is already off"
> >>>>    else
> >>>>        aws ec2 stop-instances $options --instance-ids $instance --force
> >>>>    fi
> >>>>}
> >>>>
> >>>>function instance_status()
> >>>>{
> >>>>    local instance=$1
> >>>>    local status="unknown"
> >>>>    local rc=1
> >>>>
> >>>>    # List of instances and their current status
> >>>>    if [ "$unknown_are_stopped" = 1 -a "$instance_not_found" = 1 ]; then
> >>>>        ha_log.sh info "$instance stopped (unknown)"
> >>>>    else
> >>>>        status=`aws ec2 describe-instances $options --instance-ids $instance | awk '{
> >>>>            if (/^STATE\t/) { printf "%s", $3 }
> >>>>            }'`
> >>>>        rc=$?
> >>>>    fi
> >>>>    ha_log.sh info "status check for $instance is $status"
> >>>>    echo $status
> >>>>    return $rc
> >>>>}
> >>>>
> >>>>
> >>>>TEMP=`getopt -o qVho:e:p:n:t:U --long version,help,action:,port:,option:,profile:,tag:,quiet,unknown-are-stopped \
> >>>>       -n 'fence_ec2' -- "$@"`
> >>>>
> >>>>if [ $? != 0 ];then
> >>>>      usage
> >>>>      exit 1
> >>>>fi
> >>>>
> >>>># Note the quotes around `$TEMP': they are essential!
> >>>>eval set -- "$TEMP"
> >>>>
> >>>>if [ -z $1 ]; then
> >>>>    # If there are no command line args, look for options from stdin
> >>>>    while read line; do
> >>>>        case $line in
> >>>>            option=*|action=*) action=`echo $line | sed s/.*=//`;;
> >>>>            port=*)        port=`echo $line | sed s/.*=//`;;
> >>>>            profile=*)     ec2_profile=`echo $line | sed s/.*=//`;;
> >>>>            tag=*)         ec2_tag=`echo $line | sed s/.*=//`;;
> >>>>            quiet*)        quiet=1;;
> >>>>            unknown-are-stopped*) unknown_are_stopped=1;;
> >>>>            --);;
> >>>>            *) ha_log.sh err "Invalid command: $line";;
> >>>>        esac
> >>>>    done
> >>>>fi
> >>>>
> >>>>while true ; do
> >>>>    case "$1" in
> >>>>        -o|--action|--option) action=$2;   shift; shift;;
> >>>>        -n|--port)            port=$2;     shift; shift;;
> >>>>        -p|--profile)         ec2_profile=$2; shift; shift;;
> >>>>        -t|--tag)          ec2_tag=$2; shift; shift;;
> >>>>        -U|--unknown-are-stopped) unknown_are_stopped=1; shift;;
> >>>>        -q|--quiet) quiet=1; shift;;
> >>>>        -V|--version) echo "1.0.0"; exit 0;;
> >>>>        --help|-h)
> >>>>            usage;
> >>>>            exit 0;;
> >>>>        --) shift ; break ;;
> >>>>        *) ha_log.sh err "Unknown option: $1. See --help for details."; exit 1;;
> >>>>    esac
> >>>>done
> >>>>
> >>>>[ -n "$1" ] && action=$1
> >>>>
> >>>>if [ -z "$ec2_profile" ]; then
> >>>>    options="--output text --profile default"
> >>>>else
> >>>>    options="--output text --profile $ec2_profile "
> >>>>fi
> >>>>
> >>>>action=`echo $action | tr 'A-Z' 'a-z'`
> >>>>
> >>>>case $action in
> >>>>    metadata)
> >>>>        metadata
> >>>>    ;;
> >>>>    getinfo-xml)
> >>>>        getinfo-xml
> >>>>    ;;
> >>>>    getconfignames)
> >>>>        for i in profile port tag
> >>>>        do
> >>>>            echo $i
> >>>>        done
> >>>>        exit 0
> >>>>    ;;
> >>>>    getinfo-devid)
> >>>>        echo "EC2 STONITH device"
> >>>>        exit 0
> >>>>    ;;
> >>>>    getinfo-devname)
> >>>>        echo "EC2 STONITH external device"
> >>>>        exit 0
> >>>>    ;;
> >>>>    getinfo-devdescr)
> >>>>        echo "fence_ec2 is an I/O Fencing agent which can be used with Amazon EC2 instances."
> >>>>        exit 0
> >>>>    ;;
> >>>>    getinfo-devurl)
> >>>>        echo ""
> >>>>        exit 0
> >>>>    ;;
> >>>>esac
> >>>>
> >>>># get my instance id
> >>>>myinstance=`curl http://169.254.169.254/latest/meta-data/instance-id`
> >>>>
> >>>># check my status.
> >>>># When an EC2 instance is stopped by "aws ec2 stop-instances", the OS shutdown sequence runs.
> >>>># While the OS is shutting down, Pacemaker can still execute STONITH operations.
> >>>># So if my status is not "running", I assume that I have already been fenced, and to prevent
> >>>># nodes fencing each other in a split-brain, I do not fence the other node.
> >>>>if [ -z "$myinstance" ]; then
> >>>>    ha_log.sh err "Failed to get My Instance ID. so can not check my status."
> >>>>    exit 1
> >>>>fi
> >>>>mystatus=`instance_status $myinstance`
> >>>>if [ "$mystatus" != "running" ]; then #do not fence
> >>>>    ha_log.sh warn "I was already fenced (My instance status=$mystatus). I don't fence other node."
> >>>>    exit 1
> >>>>fi
> >>>>
> >>>># get target's instance id
> >>>>instance=""
> >>>>if [ ! -z "$port" ]; then
> >>>>    instance=`instance_for_port $port $options`
> >>>>fi
> >>>>
> >>>>case $action in
> >>>>    reboot|reset)
> >>>>        status=`instance_status $instance`
> >>>>        if [ "$status" != "stopped" ]; then
> >>>>            instance_off
> >>>>        fi
> >>>>        while true;
> >>>>        do
> >>>>            status=`instance_status $instance`
> >>>>            if [ "$status" = "stopped" ]; then
> >>>>                break
> >>>>            fi
> >>>>            sleep $sleep_time
> >>>>        done
> >>>>        instance_on
> >>>>        while true;
> >>>>        do
> >>>>            status=`instance_status $instance`
> >>>>            if [ "$status" = "running" ]; then
> >>>>                break
> >>>>            fi
> >>>>            sleep $sleep_time
> >>>>        done
> >>>>    ;;
> >>>>    poweron|on)
> >>>>        instance_on
> >>>>        while true;
> >>>>        do
> >>>>            status=`instance_status $instance`
> >>>>            if [ "$status" = "running" ]; then
> >>>>                break
> >>>>            fi
> >>>>            sleep $sleep_time
> >>>>        done
> >>>>    ;;
> >>>>    poweroff|off)
> >>>>        instance_off
> >>>>        while true;
> >>>>        do
> >>>>            status=`instance_status $instance`
> >>>>            if [ "$status" = "stopped" ]; then
> >>>>                break
> >>>>            fi
> >>>>            sleep $sleep_time
> >>>>        done
> >>>>    ;;
> >>>>    monitor)
> >>>>        # Is the device ok?
> >>>>        aws ec2 describe-instances $options | grep INSTANCES &> /dev/null
> >>>>    ;;
> >>>>    gethosts|hostlist|list)
> >>>>        # List of names we know about
> >>>>        a=`aws ec2 describe-instances $options | awk -v tag_pat="^TAGS\t$ec2_tag\t" -F '\t' '{
> >>>>            if (/^INSTANCES/) { printf "%s\n", $8 }
> >>>>            else if ( $1"\t"$2"\t" ~ tag_pat ) { printf "%s\n", $3 }
> >>>>            }' | sort -u`
> >>>>        echo $a
> >>>>    ;;
> >>>>    stat|status)
> >>>>        instance_status $instance > /dev/null
> >>>>    ;;
> >>>>    *) ha_log.sh err "Unknown action: $action"; exit 1;;
> >>>>esac
> >>>>
> >>>>status=$?
> >>>>
> >>>>if [ $quiet -eq 1 ]; then
> >>>>    : nothing
> >>>>elif [ $status -eq 0 ]; then
> >>>>    ha_log.sh info "Operation $action passed"
> >>>>else
> >>>>    ha_log.sh err "Operation $action failed: $status"
> >>>>fi
> >>>>exit $status
> >>>
> >>>
> >>>
> >>
> >>
> >>
> >>
> >>_______________________________________________
> >>Users mailing list: Users at clusterlabs.org
> >>http://clusterlabs.org/mailman/listinfo/users
> >>
> >>Project Home: http://www.clusterlabs.org
> >>Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> >>Bugs: http://bugs.clusterlabs.org
> >>
> >
> >
> 
> 

> #!/bin/bash
> 
> description="
> fence_ec2 is an I/O Fencing agent which can be used with Amazon EC2 instances.
> 
> API functions used by this agent:
> - aws ec2 describe-tags
> - aws ec2 describe-instances
> - aws ec2 stop-instances
> - aws ec2 start-instances
> - aws ec2 reboot-instances
> 
> If the uname used by the cluster node is any of:
>  - Public DNS name (or part there of),
>  - Private DNS name (or part there of),
>  - Instance ID (eg. i-4f15a839)
>  - Contents of tag associated with the instance
> then the agent should be able to automatically discover the instances it can control.
> 
> If the tag containing the uname is not [Name], then it will need to be specified using the [tag] option.
> "
> 
> #
> # Copyright (c) 2011-2013 Andrew Beekhof
> # Copyright (c) 2014 NIPPON TELEGRAPH AND TELEPHONE CORPORATION
> #                    All Rights Reserved.
> #
> # This program is free software; you can redistribute it and/or modify
> # it under the terms of version 2 of the GNU General Public License as
> # published by the Free Software Foundation.
> #
> # This program is distributed in the hope that it would be useful, but
> # WITHOUT ANY WARRANTY; without even the implied warranty of
> # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
> #
> # Further, this software is distributed without any warranty that it is
> # free of the rightful claim of any third person regarding infringement
> # or the like.  Any license provided herein, whether implied or
> # otherwise, applies only to this software file.  Patent licenses, if
> # any, provided herein do not apply to combinations of this program with
> # other software, or any other product whatsoever.
> #
> # You should have received a copy of the GNU General Public License
> # along with this program; if not, write the Free Software Foundation,
> # Inc., 59 Temple Place - Suite 330, Boston MA 02111-1307, USA.
> #
> #######################################################################
> 
> quiet=0
> 
> instance_not_found=0
> unknown_are_stopped=0
> 
> action_default="reset"         # Default fence action
> ec2_tag_default="Name"	       # EC2 Tag containing the instance's uname
> 
> sleep_time="1"
> 
> [ -n "$tag" ] && ec2_tag="$tag"
> 
> : ${ec2_tag=${ec2_tag_default}}
> 
> function usage()
> {
> cat <<EOF
> `basename $0` - A fencing agent for Amazon EC2 instances
>  
> $description
>  
> Usage: `basename $0` -o|--action [-n|--port] [options]
> Options:
>  -h, --help 		This text
>  -V, --version		Version information
>  -q, --quiet 		Reduced output mode
>  
> Commands:
>  -o, --action		Action to perform: on|off|reboot|status|monitor
>  -n, --port 		The name of a machine/instance to control/check
> 
> Additional Options:
>  -p, --profile		Use a specific profile from your credential file.
>  -t, --tag 		Name of the tag containing the instance's uname
> 
> Dangerous options:
>  -U, --unknown-are-stopped 	Assume any unknown instance is safely stopped
> 
> EOF
>     exit 0;
> }
> 
> function getinfo_xml()
> {
> 	cat <<EOF
> <parameters>
> 	<parameter name="port" unique="1" required="0">
> 		<content type="string" />
> 		<shortdesc lang="en">The name/id/tag of a instance to control/check</shortdesc>
> 	</parameter>
> 	<parameter name="profile" unique="0" required="0">
> 		<content type="string" default="default" />
> 		<shortdesc lang="en">Use a specific profile from your credential file.</shortdesc>
> 	</parameter>
> 	<parameter name="tag" unique="0" required="0">
> 		<content type="string" default="Name" />
> 		<shortdesc lang="en">Name of the tag containing the instances uname</shortdesc>
> 	</parameter>
> 	<parameter name="unknown_are_stopped" unique="0" required="0">
> 		<content type="string" default="false" />
> 		<shortdesc lang="en">DANGER: Assume any unknown instance is safely stopped</shortdesc>
> 	</parameter>
> </parameters>
> EOF
> 	exit 0;
> }
> 
> function metadata()
> {
> 	cat <<EOF
> <?xml version="1.0" ?>
> <resource-agent name="fence_ec2" shortdesc="Fencing agent for Amazon EC2 instances" >
> 	<longdesc>
> $description
> 	</longdesc>
> 	<parameters>
> 	<parameter name="action" unique="0" required="1">
> 		<getopt mixed="-o, --action=[action]" />
> 		<content type="string" default="reboot" />
> 		<shortdesc lang="en">Fencing Action</shortdesc>
> 	</parameter>
> 	<parameter name="port" unique="1" required="0">
> 		<getopt mixed="-n, --port=[port]" />
> 		<content type="string" />
> 		<shortdesc lang="en">The name/id/tag of a instance to control/check</shortdesc>
> 	</parameter>
> 	<parameter name="profile" unique="0" required="0">
> 		<getopt mixed="-p, --profile=[profile]" />
> 		<content type="string" default="default" />
> 		<shortdesc lang="en">Use a specific profile from your credential file.</shortdesc>
> 	</parameter>
> 	<parameter name="tag" unique="0" required="0">
> 		<getopt mixed="-t, --tag=[tag]" />
> 		<content type="string" default="Name" />
> 		<shortdesc lang="en">Name of the tag containing the instances uname</shortdesc>
> 	</parameter>
> 	<parameter name="unknown-are-stopped" unique="0" required="0">
> 		<getopt mixed="-U, --unknown-are-stopped" />
> 		<content type="string" default="false" />
> 		<shortdesc lang="en">DANGER: Assume any unknown instance is safely stopped</shortdesc>
> 	</parameter>
> 	</parameters>
> 	<actions>
> 	<action name="on" />
> 	<action name="off" />
> 	<action name="reboot" />
> 	<action name="status" />
> 	<action name="list" />
> 	<action name="monitor" />
> 	<action name="metadata" />
> 	</actions>
> </resource-agent>
> EOF
> 	exit 0;
> }
> 
> function instance_for_port()
> {
> 	local port=$1
> 	local instance=""
> 
> 	# Look for port name -n in the INSTANCE data
> 	instance=`aws ec2 describe-instances $options | grep "^INSTANCES[[:space:]].*[[:space:]]$port[[:space:]]" | awk '{print $8}'`
> 	if [ -z "$instance" ]; then
> 		# Look for port name -n in the Name TAG
> 		instance=`aws ec2 describe-tags $options | grep "^TAGS[[:space:]]$ec2_tag[[:space:]].*[[:space:]]instance[[:space:]]$port$" | awk '{print $3}'`
> 	fi
> 
> 	if [ -z "$instance" ]; then
> 		instance_not_found=1
> 		instance=$port
> 	fi
> 
> 	echo $instance
> }
> 
> function instance_on()
> {
> 	aws ec2 start-instances $options --instance-ids $instance
> }
> 
> function instance_off()
> {
> 	if [ "$unknown_are_stopped" = 1 -a "$instance_not_found" = 1 ]; then
> 		: nothing to do
> 		ha_log.sh info "Assuming unknown instance $instance is already off"
> 	else
> 		aws ec2 stop-instances $options --instance-ids $instance --force
> 	fi
> }
> 
> function instance_status()
> {
> 	local instance=$1
> 	local status="unknown"
> 	local rc=1
> 
> 	# List of instances and their current status
> 	if [ "$unknown_are_stopped" = 1 ] && [ -n "$instance_not_found" ]; then
> 		ha_log.sh info "$instance stopped (unknown)"
> 	else
> 		status=`aws ec2 describe-instances $options --instance-ids $instance | awk '{ 
> 			if (/^STATE\t/) { printf "%s", $3 }
> 			}'`
> 		rc=$?
> 	fi
> 	ha_log.sh info "status check for $instance is $status"
> 	echo $status
> 	return $rc
> }
> 
> function monitor()
> {
> 		# Is the device ok?
> 		aws ec2 describe-instances $options | grep INSTANCES &> /dev/null
> }
> 
> TEMP=`getopt -o qVho:e:p:n:t:U --long version,help,action:,port:,option:,profile:,tag:,quiet,unknown-are-stopped \
>      -n 'fence_ec2' -- "$@"`
> 
> if [ $? != 0 ];then 
>     usage
>     exit 1
> fi
> 
> # Note the quotes around `$TEMP': they are essential!
> eval set -- "$TEMP"
> 
> if [ -z $1 ]; then
> 	# If there are no command line args, look for options from stdin
> 	while read line; do
> 		case $line in 
> 			option=*|action=*) action=`echo $line | sed s/.*=//`;;
> 			port=*)        port=`echo $line | sed s/.*=//`;;
> 			profile=*)     ec2_profile=`echo $line | sed s/.*=//`;;
> 			tag=*)         ec2_tag=`echo $line | sed s/.*=//`;;
> 			quiet*)        quiet=1;;
> 			unknown-are-stopped*) unknown_are_stopped=1;;
> 			--);;
> 			*) ha_log.sh err "Invalid command: $line";;
> 		esac
> 	done
> fi
> 
> while true ; do
> 	case "$1" in
> 		-o|--action|--option) action=$2;   shift; shift;;
> 		-n|--port)            port=$2;     shift; shift;;
> 		-p|--profile)         ec2_profile=$2; shift; shift;;
> 		-t|--tag)	      ec2_tag=$2; shift; shift;;
> 		-U|--unknown-are-stopped) unknown_are_stopped=1; shift;;
> 		-q|--quiet) quiet=1; shift;;
> 		-V|--version) echo "1.0.0"; exit 0;;
> 		--help|-h) 
> 			usage;
> 			exit 0;;
> 		--) shift ; break ;;
> 		*) ha_log.sh err "Unknown option: $1. See --help for details."; exit 1;;
> 	esac
> done
> 
> [ -n "$1" ] && action=$1
> [ -n "$2" ] && node_to_fence=$2
> 
> if [ -z "$ec2_profile" ]; then
> 	options="--output text --profile default"
> else
> 	options="--output text --profile $ec2_profile "
> fi
> 
> action=`echo $action | tr 'A-Z' 'a-z'`
> 
> case $action in 
> 	metadata)
> 		metadata
> 	;;
> 	getinfo-xml)
> 		getinfo_xml
> 	;;
> 	getconfignames)
> 		for i in profile port tag unknown_are_stopped
> 		do
> 			echo $i
> 		done
> 		exit 0
> 	;;
> 	getinfo-devid)
> 		echo "EC2 STONITH device"
> 		exit 0
> 	;;
> 	getinfo-devname)
> 		echo "EC2 STONITH external device"
> 		exit 0
> 	;;
> 	getinfo-devdescr)
> 		echo "ec2 is an I/O Fencing agent which can be used with Amazon EC2 instances."
> 		exit 0
> 	;;
> 	getinfo-devurl)
> 		echo ""
> 		exit 0
> 	;;
> esac
> 
> # get my instance id
> myinstance=`curl http://169.254.169.254/latest/meta-data/instance-id`
> 
> # Check my own status.
> # When an EC2 instance is stopped by "aws ec2 stop-instances", the OS runs its shutdown sequence.
> # During that shutdown, Pacemaker can still execute STONITH operations.
> # So if my status is not "running", I assume that I have already been fenced. To prevent the nodes
> # from fencing each other in a split-brain, I do not fence the other node.
> if [ -z "$myinstance" ]; then
> 	ha_log.sh err "Failed to get my instance ID, so cannot check my status."
> 	exit 1
> fi
> mystatus=`instance_status $myinstance`
> if [ "$mystatus" != "running" ]; then #do not fence
> 	ha_log.sh warn "I have already been fenced (my instance status=$mystatus). Not fencing the other node."
> 	exit 1
> fi
> 
> if [ -z "$port" ]; then
> 	port="$node_to_fence"
> fi
> 
> # get target's instance id
> instance=""
> if [ ! -z "$port" ]; then
> 	instance=`instance_for_port $port $options`
> fi
> 
> case $action in 
> 	reboot|reset)
> 		status=`instance_status $instance`
> 		if [ "$status" != "stopped" ]; then
> 			instance_off
> 		fi
> 		while true;
> 		do
> 			status=`instance_status $instance`
> 			if [ "$status" = "stopped" ]; then
> 				break
> 			fi
> 			sleep $sleep_time
> 		done
> 		instance_on
> 		while true;
> 		do
> 			status=`instance_status $instance`
> 			if [ "$status" = "running" ]; then
> 				break
> 			fi
> 			sleep $sleep_time
> 		done
> 	;;
> 	poweron|on)
> 		instance_on
> 		while true;
> 		do
> 			status=`instance_status $instance`
> 			if [ "$status" = "running" ]; then
> 				break
> 			fi
> 			sleep $sleep_time
> 		done
> 	;;
> 	poweroff|off)
> 		instance_off
> 		while true;
> 		do
> 			status=`instance_status $instance`
> 			if [ "$status" = "stopped" ]; then
> 				break
> 			fi
> 			sleep $sleep_time
> 		done
> 	;;
> 	monitor)
> 		monitor
> 	;;
> 	gethosts|hostlist|list)
> 		# List of names we know about
> 		a=`aws ec2 describe-instances $options | awk -v tag_pat="^TAGS\t$ec2_tag\t" -F '\t' '{ 
> 			if (/^INSTANCES/) { printf "%s\n", $8 }
> 			else if ( $1"\t"$2"\t" ~ tag_pat ) { printf "%s\n", $3 }
> 			}' | sort -u`
> 		echo $a
> 	;;
> 	stat|status)
> 		monitor
> 	;;
> 	*) ha_log.sh err "Unknown action: $action"; exit 1;;
> esac
> 
> status=$?
> 
> if [ "$quiet" = 1 ]; then
> 	: nothing
> elif [ $status -eq 0 ]; then
> 	ha_log.sh info "Operation $action passed"
> else
> 	ha_log.sh err "Operation $action failed: $status"
> fi
> exit $status
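
The wait loops in the reboot, on and off branches above poll instance_status with no upper bound, so an instance that never reaches the wanted state would keep the agent busy until the stonith timeout kills it. A minimal sketch of a bounded variant follows. The names wait_for_state and fake_status are hypothetical and not part of the posted agent; fake_status only stands in for instance_status so that the snippet runs without AWS access.

```shell
#!/bin/bash
# Hypothetical helper, not part of the posted agent.
# Usage: wait_for_state WANTED MAX_TRIES SLEEP_SECS COMMAND [ARGS...]
# Runs COMMAND repeatedly until it prints WANTED; gives up after MAX_TRIES.
wait_for_state() {
	local wanted=$1 max_tries=$2 pause=$3
	shift 3
	local tries=0
	while [ "$tries" -lt "$max_tries" ]; do
		# "$@" is the status command, e.g. instance_status <instance-id>
		if [ "$("$@")" = "$wanted" ]; then
			return 0
		fi
		tries=$((tries + 1))
		sleep "$pause"
	done
	return 1
}

# Stand-in for instance_status (assumption for the demo): always "stopped".
fake_status() {
	echo stopped
}

if wait_for_state stopped 3 0 fake_status; then
	echo "reached stopped"
fi
if ! wait_for_state running 2 0 fake_status; then
	echo "gave up waiting for running"
fi
```

In the agent itself the call might look like "wait_for_state stopped 30 $sleep_time instance_status $instance", with the retry count chosen to stay below pcmk_off_timeout.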

> _______________________________________________
> Users mailing list: Users at clusterlabs.org
> http://clusterlabs.org/mailman/listinfo/users
> 
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org




