[ClusterLabs] [ClusterLabs Developers] checking all procs on system enough during stop action?

Mon Apr 24 13:50:58 EDT 2017

On Mon, 24 Apr 2017 17:52:09 +0200
Jan Pokorný <jpokorny at redhat.com> wrote:

> On 24/04/17 17:32 +0200, Jehan-Guillaume de Rorthais wrote:
> > On Mon, 24 Apr 2017 17:08:15 +0200
> > Lars Ellenberg <lars.ellenberg at linbit.com> wrote:
> >   
> >> On Mon, Apr 24, 2017 at 04:34:07PM +0200, Jehan-Guillaume de Rorthais
> >> wrote:  
> >>> Hi all,
> >>> 
> >>> In the PostgreSQL Automatic Failover (PAF) project, one of the most frequent
> >>> pieces of negative feedback we get is how difficult it is to experiment with
> >>> it because of fencing occurring way too frequently. I am currently hunting
> >>> this kind of useless fencing to make life easier.
> >>> 
> >>> It occurs to me that a frequent reason for fencing is that, during the stop
> >>> action, we check the status of the PostgreSQL instance using our monitor
> >>> function before trying to stop the resource. If the function does not
> >>> return OCF_NOT_RUNNING, OCF_SUCCESS or OCF_RUNNING_MASTER, we just raise
> >>> an error, leading to fencing. See:
> >>> https://github.com/dalibo/PAF/blob/d50d0d783cfdf5566c3b7c8bd7ef70b11e4d1043/script/pgsqlms#L1291-L1301
> >>> 
> >>> I am considering adding a check to determine whether the instance is stopped
> >>> even if the monitor action returns an error. The idea would be to scan
> >>> **all** the local processes looking for at least one pair of
> >>> "/proc/<PID>/{comm,cwd}" related to the PostgreSQL instance we want to
> >>> stop. If none is found, we consider that the instance is not running.
> >>> Whether it stopped gracefully or not, we just know it is down and we can
> >>> return OCF_SUCCESS.
> >>> 
> >>> Just for completeness, the piece of code would be:
> >>> 
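> >>>    # basename() comes from File::Basename
> >>>    my @pids;
> >>>    foreach my $f (glob "/proc/[0-9]*") {
> >>>        # readlink returns undef if the process exited in the meantime
> >>>        my $exe = readlink("$f/exe");
> >>>        my $cwd = readlink("$f/cwd");
> >>>        next unless defined $exe and defined $cwd;
> >>>        push @pids => basename($f)
> >>>            if basename($exe) eq "postgres" and $cwd eq $pgdata;
> >>>    }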
> >>> 
> >>> It feels safe enough to me.
> > 
> > [...]
> > 
> > But anyway, here or there, I would have to add this piece of code looking at
> > each process. In your opinion, is it safe enough? Do you see any hazard
> > with it?
> 
> Just for the sake of completeness, there is indeed a race condition
> in multiple repeated path traversals (which are not pinned to a particular
> entry's inode): they can be interleaved with a new postgres process being
> launched (or whatnot).  But that may happen even before the code
> in question is executed -- naturally, not having a firm grip on the
> process is open to such possible issues, so this is just an aside.

Indeed, a new process can appear right after the glob has listed them.

However, in a Pacemaker cluster, only Pacemaker should be responsible for
starting the resource. PostgreSQL is not able to restart itself on its own.

I don't want to rely on the existence or content of postmaster.pid (the
PostgreSQL pid file), nor track the postmaster pid from the RA itself. Far too
many race conditions and too much complexity appear when I start thinking
about it.
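
For the record, here is a minimal sketch of how the scan could look as a
standalone function that needs no pid file and tolerates processes vanishing
between the glob and the readlink calls. The function name find_instance_pids
and the $proc_root parameter are only illustrative (the latter exists so the
function can be exercised against a fake tree); this is not the code currently
in pgsqlms.

```perl
use strict;
use warnings;
use File::Basename qw(basename);

# Return the PIDs whose executable is named "postgres" and whose current
# working directory is $pgdata.
# $proc_root is a hypothetical parameter (defaults to /proc), added only so
# the function can be tested against a fake directory tree.
sub find_instance_pids {
    my ( $pgdata, $proc_root ) = @_;
    $proc_root = '/proc' unless defined $proc_root;

    my @pids;
    foreach my $dir ( glob "$proc_root/[0-9]*" ) {
        # readlink returns undef when the process has already exited or
        # when the link is not readable by the caller: skip such entries
        # instead of raising warnings or errors.
        my $exe = readlink "$dir/exe";
        my $cwd = readlink "$dir/cwd";
        next unless defined $exe and defined $cwd;

        push @pids, basename($dir)
            if basename($exe) eq 'postgres' and $cwd eq $pgdata;
    }

    return @pids;
}
```

In the stop action, an empty list returned by find_instance_pids($pgdata)
would be the signal to return OCF_SUCCESS even though monitor reported an
error. Two caveats I am aware of: the scan must run as a user allowed to read
the links of the postgres processes (root in the RA case), and if the postgres
binary has been removed or replaced while the instance is running, the kernel
appends " (deleted)" to the exe link target, so such processes would not match.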

Thank you for your answer!

-- 
Jehan-Guillaume de Rorthais
Dalibo