[ClusterLabs] pcsd processes using 100% CPU
Shobe, Casey
Casey.Shobe at sling.com
Fri May 18 16:04:53 EDT 2018
On a couple of clusters that have been running for a little while (without fencing), I'm seeing runaway server.rb processes using 100% of a single CPU core each.
When I look at ps, I can see that these have something to do with pcsd:
USER       PID %CPU %MEM     VSZ   RSS TTY STAT START      TIME COMMAND
root      6103  0.0  0.3 1076744 59200 ?   Ssl  Apr06     59:09 /usr/bin/ruby -C/var/lib/pcsd -I/usr/share/pcsd -- /usr/share/pcsd/ssl.rb & > /dev/null &
root     17548 99.3  0.2  873648 46308 ?   Rl   Apr18  43356:57  \_ /usr/bin/ruby -C/var/lib/pcsd -I/usr/share/pcsd -- /usr/share/pcsd/ssl.rb & > /dev/null &
root     16688 98.9  0.3  941160 49472 ?   Rl   May01  24300:52  \_ /usr/bin/ruby -C/var/lib/pcsd -I/usr/share/pcsd -- /usr/share/pcsd/ssl.rb & > /dev/null &
root      6009 98.8  0.3  942188 49688 ?   R    May02  22607:08  \_ /usr/bin/ruby -C/var/lib/pcsd -I/usr/share/pcsd -- /usr/share/pcsd/ssl.rb & > /dev/null &
root     15556 98.8  0.3 1076344 51836 ?   R    May03  21410:12  \_ /usr/bin/ruby -C/var/lib/pcsd -I/usr/share/pcsd -- /usr/share/pcsd/ssl.rb & > /dev/null &
Running strace on one of the processes shows that they are looping on sched_yield().
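For anyone wanting to check their own nodes, here is a minimal sketch of how these processes could be picked out of ps output automatically. It assumes the 11-column "ps aux" layout shown above; the function names, the 90% threshold, and the match string are illustrative choices of mine, not anything pcsd provides. Note that ps reports %CPU averaged over the lifetime of the process, so a freshly stuck process may not cross the threshold right away.

```python
import subprocess


def parse_ps_line(line):
    """Split one `ps aux`-style line into (pid, cpu_percent, command).

    Returns None for the header line or anything that does not match
    the expected 11-column layout.
    """
    fields = line.split(None, 10)  # 10 fixed columns, then the full command
    if len(fields) < 11:
        return None
    try:
        return int(fields[1]), float(fields[2]), fields[10]
    except ValueError:
        return None  # e.g. the "USER PID %CPU ..." header


def find_runaway(ps_lines, needle="/usr/share/pcsd/ssl.rb", cpu_threshold=90.0):
    """Return (pid, cpu_percent) for matching processes at or above the threshold.

    `needle` and `cpu_threshold` are hypothetical defaults chosen to match
    the listing in this post.
    """
    hits = []
    for line in ps_lines:
        parsed = parse_ps_line(line)
        if parsed is None:
            continue
        pid, cpu, command = parsed
        if needle in command and cpu >= cpu_threshold:
            hits.append((pid, cpu))
    return hits


if __name__ == "__main__":
    # Take a snapshot of all processes with untruncated command lines.
    out = subprocess.run(
        ["ps", "auxww"], capture_output=True, text=True, check=True
    ).stdout
    for pid, cpu in find_runaway(out.splitlines()):
        print(f"PID {pid} at {cpu}% CPU; inspect with: strace -p {pid}")
```

This only reports candidates; it does not kill anything, so each PID can still be inspected with strace first.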
What are these processes, and what is causing them to get into this state? It appears that killing them frees up the CPU without any detrimental impact on the cluster...
Thanks,
--
Casey