[ClusterLabs] pcsd processes using 100% CPU

Fri May 18 16:04:53 EDT 2018

On a couple of clusters that have been running for a little while (without fencing), I'm seeing runaway ssl.rb processes, each using 100% of a single CPU core.

When I look at ps, I can see that these have something to do with pcsd:

USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root      6103  0.0  0.3 1076744 59200 ?       Ssl  Apr06  59:09 /usr/bin/ruby -C/var/lib/pcsd -I/usr/share/pcsd -- /usr/share/pcsd/ssl.rb & > /dev/null &
root     17548 99.3  0.2 873648 46308 ?        Rl   Apr18 43356:57  \_ /usr/bin/ruby -C/var/lib/pcsd -I/usr/share/pcsd -- /usr/share/pcsd/ssl.rb & > /dev/null &
root     16688 98.9  0.3 941160 49472 ?        Rl   May01 24300:52  \_ /usr/bin/ruby -C/var/lib/pcsd -I/usr/share/pcsd -- /usr/share/pcsd/ssl.rb & > /dev/null &
root      6009 98.8  0.3 942188 49688 ?        R    May02 22607:08  \_ /usr/bin/ruby -C/var/lib/pcsd -I/usr/share/pcsd -- /usr/share/pcsd/ssl.rb & > /dev/null &
root     15556 98.8  0.3 1076344 51836 ?       R    May03 21410:12  \_ /usr/bin/ruby -C/var/lib/pcsd -I/usr/share/pcsd -- /usr/share/pcsd/ssl.rb & > /dev/null &
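
For anyone who wants to check their own nodes for the same symptom, here is a minimal Python sketch that picks the high-CPU ssl.rb rows out of a listing like the one above. It assumes the 11-column layout shown (as printed by something like `ps auxf`). The function name `find_runaway`, the 90% threshold, and the `SAMPLE` text (three rows copied from the listing, with the command lines shortened) are only illustrative choices, not anything that comes from pcsd itself.

```python
# Sketch only: assumes the 11-column ps layout shown above
# (USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND).
SAMPLE = r"""USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root      6103  0.0  0.3 1076744 59200 ?       Ssl  Apr06  59:09 /usr/bin/ruby -C/var/lib/pcsd -- /usr/share/pcsd/ssl.rb
root     17548 99.3  0.2 873648 46308 ?        Rl   Apr18 43356:57  \_ /usr/bin/ruby -C/var/lib/pcsd -- /usr/share/pcsd/ssl.rb
root      6009 98.8  0.3 942188 49688 ?        R    May02 22607:08  \_ /usr/bin/ruby -C/var/lib/pcsd -- /usr/share/pcsd/ssl.rb
"""


def find_runaway(ps_text, pattern="ssl.rb", cpu_threshold=90.0):
    """Return (pid, cpu) pairs for matching rows at or above cpu_threshold."""
    hits = []
    for line in ps_text.splitlines():
        # Split into at most 11 fields so the command stays in one piece.
        fields = line.split(None, 10)
        # Skip the header row and anything that is not a full ps row.
        if len(fields) < 11 or not fields[1].isdigit():
            continue
        pid, cpu, command = int(fields[1]), float(fields[2]), fields[10]
        if pattern in command and cpu >= cpu_threshold:
            hits.append((pid, cpu))
    return hits


if __name__ == "__main__":
    for pid, cpu in find_runaway(SAMPLE):
        print(pid, cpu)
```

On a live node the same function could be fed the captured output of ps instead of the SAMPLE text; the parent process (PID 6103 here) is left out because its CPU usage is below the threshold.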

Running strace on one of the processes shows that it is looping on sched_yield().

What are these processes, and what causes them to get into this state?  Killing them appears to free up the CPU without any detrimental impact on the cluster.

Thanks,
-- 
Casey