[ClusterLabs] [ClusterLabs Developers] checking all procs on system enough during stop action?

Mon Apr 24 11:32:00 EDT 2017

On Mon, 24 Apr 2017 17:08:15 +0200
Lars Ellenberg <lars.ellenberg at linbit.com> wrote:

> On Mon, Apr 24, 2017 at 04:34:07PM +0200, Jehan-Guillaume de Rorthais wrote:
> > Hi all,
> > 
> > In the PostgreSQL Automatic Failover (PAF) project, one of the most
> > frequent complaints we get is how difficult it is to experiment with it
> > because fencing occurs far too often. I am currently hunting down this
> > kind of unnecessary fencing to make life easier.
> > 
> > It occurs to me that a frequent cause of fencing is that, during the stop
> > action, we check the status of the PostgreSQL instance using our monitor
> > function before trying to stop the resource. If the function does not return
> > OCF_NOT_RUNNING, OCF_SUCCESS or OCF_RUNNING_MASTER, we just raise an error,
> > which leads to fencing. See:
> > https://github.com/dalibo/PAF/blob/d50d0d783cfdf5566c3b7c8bd7ef70b11e4d1043/script/pgsqlms#L1291-L1301
> > 
> > I am considering adding a check to determine whether the instance is
> > stopped even if the monitor action returns an error. The idea would be to
> > scan **all** the local processes looking for at least one pair of
> > "/proc/<PID>/{comm,cwd}" related to the PostgreSQL instance we want to
> > stop. If none is found, we consider that the instance is not running.
> > Gracefully or not, we just know it is down and we can return OCF_SUCCESS.
> > 
> > Just for completeness, the piece of code would be:
> > 
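> >    use File::Basename;
> > 
> >    my @pids;
> >    foreach my $f (glob "/proc/[0-9]*") {
> >        next unless -r $f;
> >        # readlink returns undef for kernel threads and vanished processes
> >        my $exe = readlink( "$f/exe" );
> >        my $cwd = readlink( "$f/cwd" );
> >        next unless defined $exe and defined $cwd;
> >        push @pids => basename($f)
> >            if basename($exe) eq "postgres" and $cwd eq $pgdata;
> >    }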
> > 
> > It feels safe enough to me. The only risk I could think of is in a shared
> > disk cluster with multiple nodes accessing the same data in RW (such a setup
> > can fail in so many ways :)). However, PAF is not supposed to work in such
> > a context, so I can live with this.
> > 
> > Do you have any advice? Do you see any drawbacks or hazards?
> 
> Isn't that the wrong place to "fix" it?
> Why did your _monitor  return something "weird"?

Because this _monitor is the one called by the monitor action. It is able to
determine whether an instance is running and whether it is healthy.

Take the scenario where the slave instance has crashed:
  1/ the monitor action raises an OCF_ERR_GENERIC
  2/ Pacemaker tries to recover the resource (stop->start)
  3/ the stop action fails because _monitor says the resource has crashed
  4/ Pacemaker fences the node.
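
To make step 3 concrete, here is a minimal sketch of how the stop action
reacts to the return code of _monitor, as described above. It is not the
actual pgsqlms code: the sub name stop_outcome and the returned strings are
hypothetical, only the numeric values are the standard OCF return codes.

```perl
use strict;
use warnings;

# Standard OCF return codes used in this thread.
my %OCF = (
    OCF_SUCCESS        => 0,
    OCF_ERR_GENERIC    => 1,
    OCF_NOT_RUNNING    => 7,
    OCF_RUNNING_MASTER => 8,
    OCF_FAILED_MASTER  => 9,
);

# Hypothetical helper: given the return code of the monitor function,
# describe what the stop action does next.
sub stop_outcome {
    my ($monitor_rc) = @_;

    # Nothing to do, stop reports success.
    return 'already stopped' if $monitor_rc == $OCF{OCF_NOT_RUNNING};

    # Healthy slave or master: go on and stop the instance.
    return 'stop instance'
        if $monitor_rc == $OCF{OCF_SUCCESS}
        or $monitor_rc == $OCF{OCF_RUNNING_MASTER};

    # Anything else (e.g. a crashed instance): stop raises an error,
    # and Pacemaker escalates a failed stop to fencing.
    return 'stop fails';
}
```

With this logic, both OCF_ERR_GENERIC and OCF_FAILED_MASTER end up in the
last branch, which is exactly the case that triggers the fencing.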

> What did it return?

Either OCF_ERR_GENERIC or OCF_FAILED_MASTER, for instance.

> Should you not fix it there?

Fixing this in the monitor action? That would bloat the code of this function.
We would have to add a special code path in there to determine whether it is
called as a real monitor action or just as a status check for other actions.

But anyway, here or there, I would have to add this piece of code looking at
every process. In your opinion, is it safe enough? Do you see any hazard
with it?
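
For reference, a rough sketch of what I have in mind, assuming the check is
only used as a fallback when _monitor returns an error during stop. The sub
names and the $proc_root parameter are hypothetical; $proc_root defaults to
"/proc" and only exists so the scan can be exercised against a fake tree.

```perl
use strict;
use warnings;
use File::Basename qw(basename);

# Return the PIDs whose executable is named "postgres" and whose current
# working directory is $pgdata.
sub instance_pids {
    my ( $pgdata, $proc_root ) = @_;
    $proc_root //= '/proc';    # assumes a path without whitespace (glob)

    my @pids;
    foreach my $f ( glob "$proc_root/[0-9]*" ) {
        # readlink returns undef for kernel threads and for processes
        # that vanished between the glob and the readlink.
        my $exe = readlink "$f/exe";
        my $cwd = readlink "$f/cwd";
        next unless defined $exe and defined $cwd;

        push @pids, basename($f)
            if basename($exe) eq 'postgres' and $cwd eq $pgdata;
    }
    return @pids;
}

# Hypothetical fallback for the stop action: called only when the monitor
# function returned an error. Report success if no process of the
# instance is left, an error otherwise.
sub stop_rc_after_monitor_error {
    my ( $pgdata, $proc_root ) = @_;
    my @pids = instance_pids( $pgdata, $proc_root );
    return @pids ? 1 : 0;    # OCF_ERR_GENERIC : OCF_SUCCESS
}
```

One thing to keep in mind with the exe test: on Linux, when the binary has
been unlinked (e.g. after a package upgrade), the link target gets a
" (deleted)" suffix, so a strict comparison with "postgres" would miss such
a process.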

> Just thinking out loud.

Thank you, it helps :)