[ClusterLabs] Heads up for ldirectord in SLES12 SP5 "Use of uninitialized value $ip_port in pattern match (m//) at /usr/sbin/ldirectord line 1830"

Ulrich Windl Ulrich.Windl at rz.uni-regensburg.de
Wed Aug 3 05:13:16 EDT 2022


Hi!

I want to inform you of an unpleasant bug in ldirectord in SLES12 SP5:
We had a short network problem while some redundant paths in the infrastructure were being reconfigured, which left some network services unreachable for a moment.
Unfortunately, the cluster-controlled ldirectord reported a failure (of the director itself, not of the services being directed to):

h11 crmd[28930]:   notice: h11-prm_lvs_mail_monitor_300000:369 [ Use of uninitialized value $ip_port in pattern match (m//) at /usr/sbin/ldirectord line 1830, <CFGFILE> line 21. Error [33159] reading file /etc/ldirectord/mail.conf at line 10: invalid address for virtual service\n ]
h11 ldirectord[33266]: Exiting with exit_status 2: config_error: Configuration Error

You can guess what happened:
Pacemaker tried to recover (stop, then start), but the stop failed, too:
h11 lrmd[28927]:   notice: prm_lvs_mail_stop_0:35047:stderr [ Use of uninitialized value $ip_port in pattern match (m//) at /usr/sbin/ldirectord line 1830, <CFGFILE> line 21. ]
h11 lrmd[28927]:   notice: prm_lvs_mail_stop_0:35047:stderr [ Error [36293] reading file /etc/ldirectord/mail.conf at line 10: invalid address for virtual service ]
h11 crmd[28930]:   notice: Result of stop operation for prm_lvs_mail on h11: 1 (unknown error)

A stop failure meant that the node was fenced, interrupting all the other services.

Examining the logs I also found this interesting type of error:
h11 attrd[28928]:   notice: Cannot update fail-count-prm_lvs_rksapds5#monitor_300000[monitor]=(null) because peer UUID not known (will retry if learned)

Finally, here's the code that caused the error:

sub _ld_read_config_virtual_resolve
{
        my($line, $vsrv, $ip_port, $af)=(@_);

        if($ip_port){
                $ip_port=&ld_gethostservbyname($ip_port, $vsrv->{protocol}, $af);
                if ($ip_port =~ /(\[[0-9A-Fa-f:]+\]):(\d+)/) {
                        $vsrv->{server} = $1;
                        $vsrv->{port} = $2;
                } elsif($ip_port){
                        ($vsrv->{server}, $vsrv->{port}) = split /:/, $ip_port;
                }
                else {
                        &config_error($line,
                                "invalid address for virtual service");
                }
...

The value returned by ld_gethostservbyname is undefined when the lookup fails, so the pattern match on $ip_port triggers the "uninitialized value" warning. I also wonder what the program logic is:
If the host looks like a hex address in square brackets, host and port are split at the colon; otherwise host and port are split at the colon, too.
Why not simply split at the last colon if the value is defined, AND THEN check whether the components look OK?
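
To illustrate the idea, here is a minimal sketch, not the actual ldirectord code and not a tested patch. The name split_host_port is hypothetical, and I am assuming that the resolved value has the form "host:port" or "[v6addr]:port" with a numeric port:

```perl
use strict;
use warnings;

# Hypothetical helper (NOT part of ldirectord): split "host:port" or
# "[v6addr]:port" at the LAST colon, then validate both components.
# Returns (host, port) on success, or an empty list on failure.
sub split_host_port {
    my ($ip_port) = @_;

    # Guard first: a failed lookup yields undef, which must never reach a regex.
    return unless defined $ip_port && length $ip_port;

    # The greedy (.+) makes the final colon the separator.
    my ($host, $port) = $ip_port =~ /^(.+):([^:]+)$/
        or return;

    # Only now check whether the components look OK.
    # Assumption: the port has already been resolved to a number.
    return unless $port =~ /^\d+$/ && $port >= 1 && $port <= 65535;
    return unless $host =~ /^\[[0-9A-Fa-f:.]+\]$/    # bracketed IPv6 address
               || $host =~ /^[0-9A-Za-z.-]+$/;       # IPv4 address or host name

    return ($host, $port);
}
```

With something like this, an undefined lookup result would fall through to the existing config_error branch without any Perl warning. Whether a temporary resolver failure should count as a configuration error at all is a separate question.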

So the address is only reported as an "invalid address for virtual service" when the resolver service (e.g. via LDAP) is unavailable.
I used host and service names for readability.

(I reported the issue to SLES support)

Regards,
Ulrich




