<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=us-ascii">
<meta name="Generator" content="Microsoft Word 14 (filtered medium)">
<style><!--
/* Font Definitions */
@font-face
{font-family:Calibri;
panose-1:2 15 5 2 2 2 4 3 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0in;
margin-bottom:.0001pt;
font-size:11.0pt;
font-family:"Calibri","sans-serif";}
a:link, span.MsoHyperlink
{mso-style-priority:99;
color:blue;
text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
{mso-style-priority:99;
color:purple;
text-decoration:underline;}
p
{mso-style-priority:99;
mso-margin-top-alt:auto;
margin-right:0in;
mso-margin-bottom-alt:auto;
margin-left:0in;
font-size:12.0pt;
font-family:"Times New Roman","serif";}
code
{mso-style-priority:99;
font-family:"Courier New";}
pre
{mso-style-priority:99;
mso-style-link:"HTML Preformatted Char";
margin:0in;
margin-bottom:.0001pt;
font-size:10.0pt;
font-family:"Courier New";}
span.EmailStyle17
{mso-style-type:personal-compose;
font-family:"Calibri","sans-serif";
color:windowtext;}
span.HTMLPreformattedChar
{mso-style-name:"HTML Preformatted Char";
mso-style-priority:99;
mso-style-link:"HTML Preformatted";
font-family:"Courier New";}
.MsoChpDefault
{mso-style-type:export-only;
font-family:"Calibri","sans-serif";}
@page WordSection1
{size:8.5in 11.0in;
margin:1.0in 1.0in 1.0in 1.0in;}
div.WordSection1
{page:WordSection1;}
--></style><!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]-->
</head>
<body lang="EN-US" link="blue" vlink="purple">
<div class="WordSection1">
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">*** Adding 100 Resources Locks Cluster for Several Minutes<o:p></o:p></p>
<p class="MsoNormal">Adding 100 resources to the cluster with the command below drives the cib process to 100% CPU on all nodes (as shown by "top"), and the cluster stops responding to commands such as "crm status" and "cibadmin -Q" for several minutes.<br>
<code><span style="font-size:10.0pt">cibadmin -R --scope resources -x rsrc100.xml</span></code><br>
The following listing shows that all of the resources were allocated to node 11; no other node received any resources, even though all nodes were online. Every resource was listed with an error approximately 10 minutes after the resources were added to the cluster.
<o:p></o:p></p>
<pre>[root@pcs_linuxha_11 ~]# crm status<o:p></o:p></pre>
<pre>============<o:p></o:p></pre>
<pre>Last updated: Fri Jan 27 19:21:12 2012<o:p></o:p></pre>
<pre>Last change: Fri Jan 27 19:14:35 2012 via cibadmin on pcs_linuxha_1<o:p></o:p></pre>
<pre>Stack: openais<o:p></o:p></pre>
<pre>Current DC: pcs_linuxha_1 - partition with quorum<o:p></o:p></pre>
<pre>Version: 1.1.6-3.el6-a02c0f19a00c1eb2527ad38f146ebc0834814558<o:p></o:p></pre>
<pre>15 Nodes configured, 15 expected votes<o:p></o:p></pre>
<pre>100 Resources configured.<o:p></o:p></pre>
<pre>============<o:p></o:p></pre>
<pre><o:p> </o:p></pre>
<pre>Online: [ pcs_linuxha_1 pcs_linuxha_2 pcs_linuxha_3 pcs_linuxha_4 pcs_linuxha_5 pcs_linuxha_6 pcs_linuxha_7 pcs_linuxha_8 pcs_linuxha_9 pcs_linuxha_10 pcs_linuxha_11 pcs_linuxha_12 pcs_linuxha_13 pcs_linuxha_14 pcs_linuxha_15 ]<o:p></o:p></pre>
<pre><o:p> </o:p></pre>
<pre> pcs_resource_1 (ocf::idirect:ppct): Started pcs_linuxha_11<o:p></o:p></pre>
<pre> pcs_resource_2 (ocf::idirect:ppct): Started pcs_linuxha_11<o:p></o:p></pre>
<pre>...<o:p></o:p></pre>
<pre> pcs_resource_100 (ocf::idirect:ppct): Started pcs_linuxha_11<o:p></o:p></pre>
<pre><o:p> </o:p></pre>
<pre>Failed actions:<o:p></o:p></pre>
<pre> pcs_resource_1_monitor_0 (node=pcs_linuxha_11, call=-1, rc=1, status=Timed Out): unknown error<o:p></o:p></pre>
<pre> pcs_resource_2_monitor_0 (node=pcs_linuxha_11, call=-1, rc=1, status=Timed Out): unknown error<o:p></o:p></pre>
<pre>...<o:p></o:p></pre>
<pre> pcs_resource_100_monitor_0 (node=pcs_linuxha_11, call=-1, rc=1, status=Timed Out): unknown error<o:p></o:p></pre>
<pre>[root@pcs_linuxha_11 ~]# <o:p></o:p></pre>
<pre><o:p> </o:p></pre>
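<p>The contents of rsrc100.xml are not reproduced in this report. As a rough aid for anyone trying to reproduce the problem, the following is a minimal sketch of how a comparable input file could be generated. It assumes each resource is a bare primitive whose class, provider, and type match the "ocf::idirect:ppct" agent in the listing above. The function names are hypothetical, and the real file may also define operations or instance attributes that are not shown here.</p>

```python
import xml.etree.ElementTree as ET


def build_resources(count, prefix="pcs_resource_"):
    """Return a 'resources' element holding `count` primitive definitions."""
    resources = ET.Element("resources")
    for i in range(1, count + 1):
        # class/provider/type mirror the "ocf::idirect:ppct" agent in the listing.
        # Operations and instance attributes are omitted (assumption): the
        # report does not show them, so the real file may carry more detail.
        ET.SubElement(
            resources,
            "primitive",
            {
                "id": f"{prefix}{i}",
                "class": "ocf",
                "provider": "idirect",
                "type": "ppct",
            },
        )
    return resources


def resources_xml(count):
    """Serialize the generated 'resources' section to an XML string."""
    return ET.tostring(build_resources(count), encoding="unicode")
```

<p>Writing the string returned by resources_xml(100) to rsrc100.xml should give a file that can be loaded with the cibadmin command shown earlier. Because that command replaces the whole resources section, a sketch like this is only suitable for a test cluster.</p>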
<p>Update: Adding a further 300 resources drove the cib process to 100% CPU utilization for approximately 17 minutes and caused the designated controller (DC) to switch from node 1 to node 5. Many errors appeared in the output
of crm status at the 17-minute point, although this time the resources were distributed across the cluster rather than all being placed on node 11, as happened with the first 100 resources.
<o:p></o:p></p>
<p><o:p> </o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
</div>
</body>
</html>