[ClusterLabs] Postgres clone resource does not get "notice" events

vitaly vitaly at unitc.com
Tue Jul 5 20:50:18 EDT 2022


OK,
Thank you very much for your help!
_Vitaly

> On 07/05/2022 8:47 PM Reid Wahl <nwahl at redhat.com> wrote:
> 
>  
> On Tue, Jul 5, 2022 at 3:03 PM vitaly <vitaly at unitc.com> wrote:
> >
> > Hello,
> > Yes, the snippet has everything there was for the full second of Jul 05 11:54:34. I did not cut anything between the last line of 11:54:33 and first line of 11:54:35.
> >
> > Here is grep from pacemaker config:
> >
> > d19-25-left.lab.archivas.com ~ # egrep -v '^($|#)' /etc/sysconfig/pacemaker
> > PCMK_logfile=/var/log/pacemaker.log
> > SBD_SYNC_RESOURCE_STARTUP="no"
> > PCMK_trace_functions=services_action_sync,svc_read_output
> > d19-25-left.lab.archivas.com ~ #
> >
> > I also grepped the current pacemaker.log for services_action_sync and got just 4 records, logged at a time that does not seem to match the failures:
> >
> > d19-25-left.lab.archivas.com ~ # grep services_action_sync /var/log/pacemaker.log
> > Jul 05 21:20:21 d19-25-left.lab.archivas.com pacemaker-fenced    [47287] (services_action_sync at services.c:901)  trace:  > (null)_(null)_0: /usr/sbin/fence_ipmilan = 0
> > Jul 05 21:20:21 d19-25-left.lab.archivas.com pacemaker-fenced    [47287] (services_action_sync at services.c:903)  trace:  >  stdout: <?xml version="1.0" ?>
> > Jul 05 21:20:21 d19-25-left.lab.archivas.com pacemaker-fenced    [47287] (services_action_sync at services.c:901)  trace:  > (null)_(null)_0: /usr/sbin/fence_sbd = 0
> > Jul 05 21:20:21 d19-25-left.lab.archivas.com pacemaker-fenced    [47287] (services_action_sync at services.c:903)  trace:  >  stdout: <?xml version="1.0" ?>
> >
> > This is grep of messages for failures:
> >
> > d19-25-left.lab.archivas.com ~ # grep " 5 21:[23].*Failed to .*pgsql-rhino" /var/log/messages
> > Jul  5 21:20:43 d19-25-left pacemaker-controld[47291]: error: Failed to receive meta-data for ocf:heartbeat:pgsql-rhino
> > Jul  5 21:20:43 d19-25-left pacemaker-controld[47291]: warning: Failed to get metadata for postgres (ocf:heartbeat:pgsql-rhino)
> > Jul  5 21:20:43 d19-25-left pacemaker-controld[47291]: error: Failed to receive meta-data for ocf:heartbeat:pgsql-rhino
> > Jul  5 21:20:43 d19-25-left pacemaker-controld[47291]: warning: Failed to get metadata for postgres (ocf:heartbeat:pgsql-rhino)
> > Jul  5 21:20:43 d19-25-left pacemaker-controld[47291]: error: Failed to receive meta-data for ocf:heartbeat:pgsql-rhino
> > Jul  5 21:20:43 d19-25-left pacemaker-controld[47291]: warning: Failed to get metadata for postgres (ocf:heartbeat:pgsql-rhino)
> > Jul  5 21:20:44 d19-25-left pacemaker-controld[47291]: error: Failed to receive meta-data for ocf:heartbeat:pgsql-rhino
> > Jul  5 21:20:44 d19-25-left pacemaker-controld[47291]: warning: Failed to get metadata for postgres (ocf:heartbeat:pgsql-rhino)
> > Jul  5 21:20:47 d19-25-left pacemaker-controld[47291]: error: Failed to receive meta-data for ocf:heartbeat:pgsql-rhino
> > Jul  5 21:20:47 d19-25-left pacemaker-controld[47291]: warning: Failed to get metadata for postgres (ocf:heartbeat:pgsql-rhino)
> > Jul  5 21:20:48 d19-25-left pacemaker-controld[47291]: error: Failed to receive meta-data for ocf:heartbeat:pgsql-rhino
> > Jul  5 21:20:48 d19-25-left pacemaker-controld[47291]: warning: Failed to get metadata for postgres (ocf:heartbeat:pgsql-rhino)
> > Jul  5 21:20:48 d19-25-left pacemaker-controld[47291]: error: Failed to receive meta-data for ocf:heartbeat:pgsql-rhino
> > Jul  5 21:20:48 d19-25-left pacemaker-controld[47291]: warning: Failed to get metadata for postgres (ocf:heartbeat:pgsql-rhino)
> > Jul  5 21:20:49 d19-25-left pacemaker-controld[47291]: error: Failed to receive meta-data for ocf:heartbeat:pgsql-rhino
> > Jul  5 21:20:49 d19-25-left pacemaker-controld[47291]: warning: Failed to get metadata for postgres (ocf:heartbeat:pgsql-rhino)
> > Jul  5 21:30:26 d19-25-left pacemaker-controld[47291]: error: Failed to receive meta-data for ocf:heartbeat:pgsql-rhino
> > Jul  5 21:30:26 d19-25-left pacemaker-controld[47291]: warning: Failed to get metadata for postgres (ocf:heartbeat:pgsql-rhino)
> > Jul  5 21:30:26 d19-25-left pacemaker-controld[47291]: error: Failed to receive meta-data for ocf:heartbeat:pgsql-rhino
> > Jul  5 21:30:26 d19-25-left pacemaker-controld[47291]: warning: Failed to get metadata for postgres (ocf:heartbeat:pgsql-rhino)
> > Jul  5 21:30:26 d19-25-left pacemaker-controld[47291]: error: Failed to receive meta-data for ocf:heartbeat:pgsql-rhino
> > Jul  5 21:30:26 d19-25-left pacemaker-controld[47291]: warning: Failed to get metadata for postgres (ocf:heartbeat:pgsql-rhino)
> > Jul  5 21:30:26 d19-25-left pacemaker-controld[47291]: error: Failed to receive meta-data for ocf:heartbeat:pgsql-rhino
> > Jul  5 21:30:26 d19-25-left pacemaker-controld[47291]: warning: Failed to get metadata for postgres (ocf:heartbeat:pgsql-rhino)
> > d19-25-left.lab.archivas.com ~ #
> >
> > Sorry, these logs are not from the same time as this morning's, as I reinstalled the cluster a couple of times today.
> >
> > Thanks,
> > _Vitaly
> >
> 
> Strange. If we reach "Failed to receive meta-data", that means
> services_action_sync() returned true... and if services_action_sync()
> returned true, then we should hit a crm_trace() line no matter what.
> ```
> lrmd_api_get_metadata_params ...
> {
> ...
>     if (!services_action_sync(action)) {
>         crm_err("Failed to retrieve meta-data for %s:%s:%s",
>                 standard, provider, type);
>         services_action_free(action);
>         return -EIO;
>     }
> 
>     if (!action->stdout_data) {
>         crm_err("Failed to receive meta-data for %s:%s:%s",
>                 standard, provider, type);
>         services_action_free(action);
>         return -EIO;
>     }
> ...
> }
> 
> gboolean
> services_action_sync(svc_action_t * op)
> {
>     gboolean rc = TRUE;
> 
>     if (op == NULL) {
>         crm_trace("No operation to execute");
>         return FALSE;
>     }
>     ... snip (no return lines) ...
>     crm_trace(" > " PCMK__OP_FMT ": %s = %d",
>               op->rsc, op->action, op->interval_ms, op->opaque->exec, op->rc);
>     ...
>     return rc;
> }
> ```
> Probably best to file a bug, with the pgsql-rhino resource agent and
> ideally an sosreport or crm_report.
> 
> https://bugs.clusterlabs.org/enter_bug.cgi
> 
> <snip>
> 
> -- 
> Regards,
> 
> Reid Wahl (He/Him), RHCA
> Senior Software Maintenance Engineer, Red Hat
> CEE - Platform Support Delivery - ClusterHA
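
The reasoning in the quoted reply can be modeled with a small stand-alone sketch. This is not Pacemaker code: model_action_t, model_result_t, and model_get_metadata() are hypothetical names introduced only for illustration, and the sync result is passed in as a plain flag instead of calling the real services_action_sync(). It only mirrors the order of the two checks in the excerpt, which is why "Failed to receive meta-data" implies that the sync call returned true but no stdout was captured.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for Pacemaker's svc_action_t; only the field
 * that matters for the quoted excerpt is modeled. */
typedef struct {
    const char *stdout_data;   /* NULL when no agent output was captured */
} model_action_t;

typedef enum {
    MODEL_OK,
    MODEL_FAILED_TO_RETRIEVE,  /* "Failed to retrieve": sync returned false */
    MODEL_FAILED_TO_RECEIVE    /* "Failed to receive": sync true, no stdout */
} model_result_t;

/* Applies the two checks in the same order as the quoted excerpt.
 * sync_ok stands in for the return value of services_action_sync(). */
static model_result_t
model_get_metadata(bool sync_ok, const model_action_t *action)
{
    assert(action != NULL);

    if (!sync_ok) {
        return MODEL_FAILED_TO_RETRIEVE;
    }
    if (action->stdout_data == NULL) {
        return MODEL_FAILED_TO_RECEIVE;
    }
    return MODEL_OK;
}
```

In this model the "receive" result is reachable only when sync_ok is true, which matches the observation that the trace line at the end of services_action_sync() should have been logged for each of the failures above.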

