[ClusterLabs Developers] Reminder that /proc is just rather an unreliable quirk, not a firm grip on processes

Wed Jul 3 09:45:03 UTC 2019

[in a sense, this is a follow-up for my recent post:
https://lists.clusterlabs.org/pipermail/users/2019-May/025749.html]

I have come across an interesting report regarding /proc traversal:

https://rkeene.org/projects/info/wiki/173

(as well as the danger of exhausting available inodes, mentioned in
the discussion it spurred: https://lobste.rs/s/ihz50b/day_proc_died)

Even though this was not observed on Linux in that particular case, it
adds to the overall arguments for avoiding /proc traversal whenever
possible, be it direct or indirect (ps, pidof, killall etc. all rely
on it), for instance:

- (at least on most systems) no snapshot semantics, meaning the
  scan-through is completely racy and ephemeral processes (or
  a fork chain thereof, see also CVE-2018-1121 for intentional
  carefully crafted abuse) are easy to miss completely

- the problem of recycled PIDs is always looming (however
  theoretical) when the observer cannot subscribe itself to watch for
  changes in the process under supervision (this verges on the
  problems of polling vs. event-based systems, incl. timely responses
  to changes)

- finally, there are all the problems with unexpected behaviour of
  /proc in corner cases like the one mentioned initially; add to that
  the possibility that arbitrary unprivileged users can deliberately
  block /proc enumeration triggered in other processes, incl.
  privileged ones, on Linux systems (see CVE-2018-1120[*])
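
To make the first two points concrete, here is a minimal sketch of
what a /proc scan boils down to.  The helper name scan_proc and its
root parameter are made up for illustration; this is not code from ps,
pidof or any agent:

```python
import os


def scan_proc(root="/proc"):
    """Return {pid: command name} as seen during one pass over root.

    This is not a snapshot: processes can appear and disappear while
    the loop is running, so the result may never have been true at
    any single point in time.
    """
    seen = {}
    for entry in os.listdir(root):
        if not entry.isdigit():
            continue  # not a per-process directory
        try:
            with open(os.path.join(root, entry, "comm")) as f:
                seen[int(entry)] = f.read().strip()
        except (FileNotFoundError, ProcessLookupError, PermissionError):
            # The process went away (or became unreadable) between
            # listdir() and open(); it is silently missed.
            continue
    return seen
```

Even a PID that does make it into the result gives no guarantee: by
the time a signal is sent to it, the number may already belong to
a different process.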

Now, why I am mentioning this: higher layers of the cluster stack rely
heavily on /proc inspection, the net outcome being that they can only
be as reliable as the /proc filesystem is, not more.

So my ask here is to use our brain cluster (pun intended) so as
to devise ways to become less reliant on /proc based enumeration.
One portable idea is to allow for agent persistency, i.e., the
agent would be directly informed about its child (effectively the
service being run, as proxied by this agent instance).  One
non-portable idea would be to leverage the pidfd facility recently
introduced into Linux (as already mentioned in the May post).
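
A rough sketch of both ideas follows, with all names (PersistentAgent,
wait_for_exit_pidfd) being hypothetical and not part of any existing
agent API.  The first part is portable and only assumes the agent
stays around as the parent of the service; the second part assumes
Linux 5.3+ and Python 3.9+, where os.pidfd_open() is available:

```python
import os
import select
import subprocess


class PersistentAgent:
    """Hypothetical long-lived agent owning the service as its child."""

    def __init__(self, argv):
        self.child = subprocess.Popen(argv)

    def is_running(self):
        # poll() asks the kernel about our own child; no /proc lookup.
        return self.child.poll() is None

    def stop(self, timeout=5.0):
        # Until we reap the child, its PID cannot be recycled, so the
        # signal cannot hit an unrelated process.
        if self.child.poll() is None:
            self.child.terminate()
            try:
                self.child.wait(timeout=timeout)
            except subprocess.TimeoutExpired:
                self.child.kill()
                self.child.wait()
        return self.child.returncode


def wait_for_exit_pidfd(pid, timeout):
    """Linux-only: return True if pid exits within timeout seconds.

    The pidfd keeps referring to the same process even if the PID
    number gets reused later.  Obtaining it from a bare PID is still
    racy unless the caller is the parent and has not reaped it yet.
    """
    fd = os.pidfd_open(pid)
    try:
        # A pidfd becomes readable once the process has terminated.
        readable, _, _ = select.select([fd], [], [], timeout)
        return bool(readable)
    finally:
        os.close(fd)
```

In both cases the observer gets told about the exit instead of having
to poll for it, which addresses the polling vs. event-based concern
as well.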

The good news is that there's still room for _also_ cheap
improvements, such as what I did along with the recent security fixes
for pacemaker (in a nutshell: IPC end-points already constitute
system-wide singletons, equivalent for our purposes to checking via
/proc, allowing for a swap, and -- as a paradox -- this positive
change was only secondary, since it was what effectively enabled us
to close the security hole at hand, which was the primary objective).
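
As a generic illustration of that swap (this is not the pacemaker
code, which builds on its own IPC layer; the function name and the
plain UNIX socket are assumptions made for the sketch), liveness can
be derived from whether the daemon's well-known end-point accepts
a connection:

```python
import socket


def endpoint_is_live(path, timeout=1.0):
    """Return True if something accepts connections on the socket at path."""
    s = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    s.settimeout(timeout)
    try:
        s.connect(path)
        return True
    except OSError:
        # No socket file at all, or a stale file with no listener
        # behind it (connection refused), or a timeout.
        return False
    finally:
        s.close()
```

Since only one process can be listening on a given end-point, the
answer refers to the daemon itself rather than to whatever happens
to match a name or a PID in /proc.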

Apparently, the most affected are resource agents.

[*] I've mentioned such risks once on this list already:
    https://lists.clusterlabs.org/pipermail/developers/2018-May/001237.html
    but alas, it received no responses

-- 
Jan (Poki)