[ClusterLabs] Comments on stonith:external/ec2 agent

Kristoffer Grönlund kgronlund at suse.com
Tue Dec 1 06:43:47 EST 2015


Hi all,

Markus Guertler, who was part of the original discussion in March, sent
these comments regarding the external/ec2 agent. Kazuhiko-san, what do
you think about the attached patch and the comments below?

Quoting Markus:

> The bug is not really a bug in the fencing agent, but in the Python
> AWS API CLI. It looks like sometimes, when two or more 'aws ec2 ...'
> commands are called within a very short time period, the 'aws ec2'
> command returns an error. I've created a patch (attached) with a
> small workaround that lets the agent wait for two seconds between
> two 'describe' commands. Furthermore, this patch includes slightly
> improved logging and an improved parameter description.
>
> The 'gethostlist' code seems to be a bit strange:
>
> --- SNIP ---
>        gethosts|hostlist|list)
>                # List of names we know about
>                a=`aws ec2 describe-instances $options | awk -v tag_pat="^TAGS\t$ec2_tag\t" -F '\t' '{
>                        if (/^INSTANCES/) { printf "%s\n", $8 }
>                        else if ( $1"\t"$2"\t" ~ tag_pat ) { printf "%s\n", $3 }
>                        }' | sort -u`
>                echo $a
> --- SNIP ---
>
> First, it outputs all instance IDs found (first if-clause in the awk
> command), not only the ones belonging to the cluster, which might
> result in a very large number. Second, why is it printing the
> instance IDs anyway? Shouldn't it return hostnames instead?
> Hostnames can't be obtained at all using this command.
>
> The 'else if' clause works fine and outputs the hostnames belonging to
> tag_pat, which are hostnames that are set using AWS instance tags with
> a unique tag key and the node hostnames as values.
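
To illustrate the point about 'gethosts', here is a minimal sketch of a
host list that drops the INSTANCES branch and prints only the values of
the configured tag. It is not part of the attached patch. The helper
name list_tagged_hosts and the sample data are hypothetical, and the
sketch assumes the text output carries tags as TAGS<tab>key<tab>value
lines, as the awk pattern in the quoted snippet implies.

```shell
#!/bin/sh
# Hypothetical helper (not in the agent): read 'aws ec2 describe-instances'
# text output on stdin and print only the values of the tag named in $1.
list_tagged_hosts() {
    # Exact string comparison on the tag key avoids regex surprises
    # if the tag name contains special characters.
    awk -F '\t' -v tag="$1" '$1 == "TAGS" && $2 == tag { print $3 }' | sort -u
}

# Made-up sample input standing in for the real CLI output.
sample() {
    printf 'INSTANCES\t0\tx86_64\t\tfalse\txen\tami-00000000\ti-11111111\n'
    printf 'TAGS\tName\tnode1\n'
    printf 'TAGS\tRole\tdb\n'
    printf 'INSTANCES\t0\tx86_64\t\tfalse\txen\tami-00000000\ti-22222222\n'
    printf 'TAGS\tName\tnode2\n'
}

hosts=$(sample | list_tagged_hosts Name)
# Unquoted on purpose, like 'echo $a' in the agent: joins names on one line.
echo $hosts
```

In the agent itself the input would come from
'aws ec2 describe-instances $options' instead of sample, and the tag
name from $ec2_tag.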

Best regards,
Kristoffer

-- 
// Kristoffer Grönlund
// kgronlund at suse.com

===File /home/krig/Desktop/ec2.patch========================
--- usr/lib64/stonith/plugins/external/ec2	2015-09-28 10:18:20.000000000 +0200
+++ usr/lib64/stonith/plugins/external/ec2	2015-10-08 23:14:57.092036211 +0200
@@ -94,15 +94,16 @@ 
 <parameters>
 	<parameter name="port" unique="1" required="0">
 		<content type="string" />
-		<shortdesc lang="en">The name/id/tag of a instance to control/check</shortdesc>
+		<shortdesc lang="en">The instance name (not the hostname!) or instance-id of a node to fence</shortdesc>
 	</parameter>
 	<parameter name="profile" unique="0" required="0">
 		<content type="string" default="default" />
-		<shortdesc lang="en">Use a specific profile from your credential file.</shortdesc>
+		<shortdesc lang="en">Use a specific profile from your credential file</shortdesc>
 	</parameter>
 	<parameter name="tag" unique="0" required="0">
 		<content type="string" default="Name" />
-		<shortdesc lang="en">Name of the tag containing the instances uname</shortdesc>
+		<shortdesc lang="en">Name of an AWS instance tag containing the hostname of a node</shortdesc>
+		<longdesc lang="en">Name of an AWS instance tag containing the hostname of a node. When used, the instance tag must be set for all nodes belonging to this cluster. The cluster identifies the list of nodes that can be fenced via the instance tag. Therefore the port parameter can be omitted. The resource can be configured as a clone resource</longdesc>
 	</parameter>
 	<parameter name="unknown_are_stopped" unique="0" required="0">
 		<content type="string" default="false" />
@@ -130,7 +131,7 @@ 
 	<parameter name="port" unique="1" required="0">
 		<getopt mixed="-n, --port=[port]" />
 		<content type="string" />
-		<shortdesc lang="en">The name/id/tag of a instance to control/check</shortdesc>
+		<shortdesc lang="en">The instance name (not the hostname!) or instance-id of a node to fence</shortdesc>
 	</parameter>
 	<parameter name="profile" unique="0" required="0">
 		<getopt mixed="-p, --profile=[profile]" />
@@ -140,7 +141,8 @@ 
 	<parameter name="tag" unique="0" required="0">
 		<getopt mixed="-t, --tag=[tag]" />
 		<content type="string" default="Name" />
-		<shortdesc lang="en">Name of the tag containing the instances uname</shortdesc>
+		<shortdesc lang="en">Name of an AWS instance tag containing the hostname of a node</shortdesc>
+		<longdesc lang="en">Name of an AWS instance tag containing the hostname of a node. When used, the instance tag must be set for all nodes belonging to this cluster. The cluster identifies the list of nodes that can be fenced via the instance tag. Therefore the port parameter can be omitted. The resource can be configured as a clone resource</longdesc>
 	</parameter>
 	<parameter name="unknown-are-stopped" unique="0" required="0">
 		<getopt mixed="-U, --unknown-are-stopped" />
@@ -170,12 +172,15 @@ 
 	# Look for port name -n in the INSTANCE data
 	instance=`aws ec2 describe-instances $options | grep "^INSTANCES[[:space:]].*[[:space:]]$port[[:space:]]" | awk '{print $8}'`
 	if [ -z $instance ]; then
+		# Workaround: Executing too many aws commands within a short period of time might result in an error
+		sleep 2
 		# Look for port name -n in the Name TAG
 		instance=`aws ec2 describe-tags $options | grep "^TAGS[[:space:]]$ec2_tag[[:space:]].*[[:space:]]instance[[:space:]]$port$" | awk '{print $3}'`
 	fi
 
 	if [ -z $instance ]; then
 		instance_not_found=1
+		ha_log.sh err "Instance-id not found for $port!"
 		instance=$port
 	fi
 
============================================================
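
As a possible alternative to the fixed 'sleep 2' in the patch, a failed
call could be retried with a growing delay. This is only a sketch: the
retry function and the RETRY_DELAY variable are hypothetical names, and
it assumes the 'aws' command exits with a non-zero status when the
intermittent error occurs, which has not been verified here.

```shell
#!/bin/sh
# Hypothetical helper: run a command up to $1 times, doubling the delay
# between attempts. RETRY_DELAY (seconds, default 1) is an assumed knob.
# Note: plain sh has no 'local', so these variables are global.
retry() {
    attempts=$1; shift
    delay=${RETRY_DELAY:-1}
    n=1
    while :; do
        # Success: stop immediately and report success to the caller.
        "$@" && return 0
        # Out of attempts: give up and report failure.
        [ "$n" -ge "$attempts" ] && return 1
        sleep "$delay"
        delay=$((delay * 2))
        n=$((n + 1))
    done
}
```

In the agent this might be used as
'retry 3 aws ec2 describe-tags $options', so that the delay is only
paid when a call actually fails.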