[ClusterLabs Developers] Extend enumeration of OCF return values
jpokorny at redhat.com
Wed Oct 16 07:41:42 EDT 2019
On 16/10/19 09:18 +0000, Yan Gao wrote:
> On 10/15/19 4:31 PM, Ken Gaillot wrote:
>> On Tue, 2019-10-15 at 13:08 +0200, Tony den Haan wrote:
>>> I ran into getting "error 1" from portblock, so OCF_ERR_GENERIC,
>>> which for me doesn't tell whether the error code came from portblock
>>> or from pacemaker itself.
>>> Wouldn't it be quite useful to
>>> 1) give the agents a unique number to add to the OCF RC code, thus
>>> helping to determine origin of error
>>> 2) show an actual error string instead of "unknown error(1)". This is
>>> the last thing you want to see when a cluster is stuck.
>> I agree it's an issue, but the exit codes have to stay fairly generic.
>> There are only 255 possible exit codes, and half of those most shells
>> use for signals. Meanwhile there are dozens of agents. More
>> importantly, Pacemaker needs standard meanings to know how to respond.
>> However there are possibilities:
>> - OCF could add a few more codes for common error conditions. (This
>> requires updating the standard, as well as software such as Pacemaker
>> to be aware of them.)
>> - OCF already supports an arbitrary string "exit reason" which
>> pacemaker will display beyond just "unknown". It's up to the individual
>> agents to support this, and all of them should. Agents can get as
>> specific as they like with exit reasons.
>> - Agents can also log to the system log, or print error output which
>> pacemaker will log in its detail log. Many already provide good
>> information this way, but there's always room for improvement.
> All of that makes sense. A lot of times, I can feel it's the wording
> "unknown error" that frustrates users, since they are definitely not in
> a good mood seeing any errors in their beloved clusters, not to mention
> ones that are even "unknown" ;-)
> As a matter of fact, it's probably the most commonly returned error. I'd
> prefer to call it something different in user interfaces, for example
> "generic error" or just "error". Since:
\me votes for "sundry error" :-)
Seriously, a distinctive wording is better for getting the right hits
from a random $WEBSEARCHER, since that is the universal first line of
defense for a growing population. This assumes proper documentation that
web bots can crawl.
> - If "exit reason" gives a hint, it's not really "unknown".
> - Even if there's no "exit reason" given, it doesn't mean it's
> "unknown". Usually clues could be found from logs.
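To make the two points above concrete, here is a rough sketch of how a
front end could label a failed action: call code 1 a "generic error" and
append the agent's exit reason when one was given. It assumes the agent
writes the reason to stderr on a line starting with "ocf-exit-reason:"
(the default prefix in the resource-agents shell helpers, as far as I
know). The function names and the sample reason text are made up for
illustration; this is not Pacemaker's actual code, and the table lists
only a subset of the OCF codes.

```python
# Hypothetical sketch, not Pacemaker code: label an agent failure for display.

# Subset of the standard OCF return codes.
OCF_RETURN_CODES = {
    0: "OCF_SUCCESS",
    1: "OCF_ERR_GENERIC",
    2: "OCF_ERR_ARGS",
    3: "OCF_ERR_UNIMPLEMENTED",
    4: "OCF_ERR_PERM",
    5: "OCF_ERR_INSTALLED",
    6: "OCF_ERR_CONFIGURED",
    7: "OCF_NOT_RUNNING",
}

# Assumed prefix that marks an exit reason line in the agent's stderr.
EXIT_REASON_PREFIX = "ocf-exit-reason:"


def extract_exit_reason(stderr_text):
    """Return the last exit reason found in stderr_text, or None."""
    reason = None
    for line in stderr_text.splitlines():
        if line.startswith(EXIT_REASON_PREFIX):
            reason = line[len(EXIT_REASON_PREFIX):].strip()
    return reason


def describe_failure(rc, stderr_text):
    """Build a display string that avoids the wording "unknown error"."""
    name = OCF_RETURN_CODES.get(rc, "unrecognized code")
    # Code 1 is shown as "generic error" rather than "unknown error".
    label = "generic error" if rc == 1 else name
    reason = extract_exit_reason(stderr_text)
    if reason:
        return f"{label} ({rc}): {reason}"
    # No reason given does not mean unknown: point the user at the logs.
    return f"{label} ({rc}); see the agent's logs for details"


if __name__ == "__main__":
    sample_stderr = "some log line\nocf-exit-reason:iptables not installed\n"
    print(describe_failure(1, sample_stderr))
    print(describe_failure(7, ""))
```

With the sample input, the first call shows the agent's own reason next
to the code, and the second falls back to a pointer to the logs instead
of claiming the error is "unknown".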