[Pacemaker] socket is incremented after running crm shell

David Vossel dvossel at redhat.com
Tue Apr 3 11:53:57 EDT 2012


----- Original Message -----
> From: "Junko IKEDA" <tsukishima.ha at gmail.com>
> To: "The Pacemaker cluster resource manager" <pacemaker at oss.clusterlabs.org>
> Sent: Friday, March 30, 2012 1:07:37 AM
> Subject: [Pacemaker] socket is incremented after running crm shell
> 
> Hi,
> 
> I encountered the following error message while running the crm
> shell.
> 
> crmd: [6837]: ERROR: socket_accept_connection: accept(sock=10): Too
> many open files
> 
> The same error is discussed here:
> https://developerbugs.linuxfoundation.org/show_bug.cgi?id=2626
> but that report seems to be specific to Ubuntu and upstart,
> so my case is probably a different issue.
> 
> I can reproduce a similar case as follows.
> Here "bl460g6b" is the DC, and the socket count of crmd increases
> by one, on the DC only, after running the crm shell.
> 
> 
> * Initial status
> 
> # date; hostname; lsof -p $(pgrep crmd) | wc -l
> Fri Mar 30 14:26:04 JST 2012
> bl460g6a
> 42
> 
> # date; hostname; lsof -p $(pgrep crmd) | wc -l
> Fri Mar 30 14:26:16 JST 2012
> bl460g6b
> 46
> 
> 
> 
> * Load the resource configuration (it just starts one Dummy RA)
> 
> # crm configure load update cib.crm
> # crm_mon -1
> 
> ============
> Last updated: Fri Mar 30 14:26:52 2012
> Stack: Heartbeat
> Current DC: bl460g6b (22222222-2222-2222-2222-222222222222) -
> partition with quorum
> Version: 1.0.12-unknown
> 2 Nodes configured, unknown expected votes
> 1 Resources configured.
> ============
> 
> Online: [ bl460g6a bl460g6b ]
> 
> dummy01 (ocf::pacemaker:Dummy): Started bl460g6a
> 
> # date; hostname; lsof -p $(pgrep crmd) | wc -l
> Fri Mar 30 14:27:10 JST 2012
> bl460g6a
> 42
> 
> # date; hostname; lsof -p $(pgrep crmd) | wc -l
> Fri Mar 30 14:27:16 JST 2012
> bl460g6b
> 47 <==== +1

I see the same thing.  I'm using the latest Pacemaker source from the master branch, so the problem definitely still exists.  For me, the file descriptor leak occurs every time I issue a "cibadmin --replace --xml-file" command.  The shell runs the same command internally when adding and removing resources, so I see it there as well.
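
For anyone who wants to measure the leak per command, here is a minimal sketch, assuming a Linux system with /proc mounted. The helper names count_fds and fd_delta are hypothetical, not part of Pacemaker. It counts entries in /proc/<pid>/fd rather than lsof lines, so the absolute numbers will be lower than the lsof figures quoted above (lsof also prints a header and entries such as cwd and mapped libraries), but a leaked socket should still show up as a difference of one.

```shell
#!/bin/sh
# Count the open file descriptors of a process by listing /proc/<pid>/fd.
# Linux only; prints 0 if the process does not exist or is not readable.
count_fds() {
    ls "/proc/$1/fd" 2>/dev/null | wc -l
}

# Run a command and print how much the fd count of <pid> changed.
# Usage: fd_delta <pid> <command> [args...]
fd_delta() {
    pid=$1
    shift
    before=$(count_fds "$pid")
    "$@" >/dev/null 2>&1      # output of the command itself is discarded
    after=$(count_fds "$pid")
    echo $((after - before))
}
```

On the DC, something like `fd_delta "$(pgrep crmd)" cibadmin --replace --xml-file cib.xml` (cib.xml being a placeholder for whatever file you replace with) should print 1 if the leak is present.  This assumes a single crmd process, and since crmd may handle the update asynchronously, the count may need a moment to settle.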

I opened a bug report for this.
http://bugs.clusterlabs.org/show_bug.cgi?id=5051

I'll keep investigating it.

-- Vossel



