[ClusterLabs] Q: "crmd: warning: new_event_notification (7281-97955-15): Broken pipe (32)" as response to resource cleanup
kgaillot at redhat.com
Mon Aug 12 19:03:51 EDT 2019
On Mon, 2019-08-12 at 17:46 +0200, Ulrich Windl wrote:
> I just noticed that a "crm resource cleanup <rsc>" caused some
> unexpected behavior and the syslog message:
> crmd: warning: new_event_notification (7281-97955-15): Broken
> pipe (32)
> It's SLES14 SP4 last updated Sept. 2018 (up since then, pacemaker-
> The cleanup was due to a failed monitor. As an unexpected consequence
> of this cleanup, CRM seemed to restart the complete resource (and
> dependencies), even though it was running.
I assume the monitor failure was old, and recovery had already
completed? If not, recovery might have been initiated before the clean-
up was recorded.
> I noticed that a manual "crm_resource -C -r <rsc> -N <node>" command
> has the same effect (multiple resources are "Cleaned up", resources
> are restarted seemingly before the "probe" is done).
Can you verify whether the probes were done? The DC should log a
message when each <rsc>_monitor_0 result comes in.
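One quick way to check is to grep the DC's logs for the `_monitor_0` operations. A minimal sketch, assuming a resource named `myrsc`; the sample lines in the here-document are illustrative placeholders, not verbatim pacemaker output — on a real node you would grep /var/log/messages or the pacemaker log instead:

```shell
# Sketch: confirm the DC logged a probe result for the cleaned-up
# resource. "myrsc", "node1", and the log lines are placeholders; on a
# real system, grep the DC's syslog rather than this here-document.
grep 'myrsc_monitor_0' <<'EOF'
crmd: info: Result of probe: myrsc_monitor_0 on node1: 7 (not running)
crmd: info: Result of monitor: otherrsc_monitor_10000 on node1: 0 (ok)
EOF
```

If nothing matching `<rsc>_monitor_0` appears after the cleanup, the reprobe likely never ran (or ran on a different node than expected).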
> Actually the manual says when cleaning up a single primitive, the
> whole group is cleaned up, unless using --force. Well, I don't like
> this default, as I expect any status change from probe would
> propagate to the group anyway...
In 1.1, clean-up always wipes the history of the affected resources,
regardless of whether the history is for success or failure. That means
all the cleaned resources will be reprobed. In 2.0, clean-up by default
wipes the history only if there's a failed action (--refresh/-R is
required to get the 1.1 behavior). That lessens the impact of the
"default to whole group" behavior.
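The 2.0 distinction can be sketched with the two crm_resource invocations below; the resource and node names are placeholders, and these need a live cluster to actually run:

```shell
# 2.0 default: wipe history only for failed actions on this resource.
crm_resource --cleanup --resource myrsc --node node1

# 1.1-style behavior in 2.0: wipe all history, forcing a full reprobe.
crm_resource --refresh --resource myrsc --node node1
```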
I think the original idea was that a group indicates that the resources
are closely related, so changing the status of one member might affect
what status the others report.
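Per the manual behavior quoted above, limiting a clean-up to one group member takes --force; a sketch, with `grp_member1` and `node1` as placeholder names (again, this needs a live cluster):

```shell
# Sketch: clean up a single group member without touching its siblings.
# Without --force, cleaning "grp_member1" would clean the whole group.
crm_resource --cleanup --resource grp_member1 --node node1 --force
```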
Ken Gaillot <kgaillot at redhat.com>