[ClusterLabs] Antw: [EXT] Re: Problem with systemd socket service (start fails when running already)

Ulrich Windl Ulrich.Windl at rz.uni-regensburg.de
Mon Feb 1 02:07:34 EST 2021


>>> Andrei Borzenkov <arvidjaar at gmail.com> wrote on 29.01.2021 at 18:36 in
message <7bd34d6c-642f-0e44-e424-1445ebb30e87 at gmail.com>:
> 29.01.2021 14:19, Ulrich Windl wrote:
>> Hi!
>> 
>> I'm having an odd failure using a systemd socket unit controlled by the
>> cluster.
> 
> Why do you need the socket unit to be controlled by the cluster in the
> first place? The whole point of a socket unit is to auto-start services on
> access, and that defeats the purpose of an HA manager, which controls services.

Virtlockd (used by libvirtd) requires a shared filesystem for locking, which
is OCFS2 here. You cannot start OCFS2 before the cluster, and virtlockd has
issues when it is started before OCFS2. Cluster resources themselves need
libvirtd (via the TLS socket), so I can start libvirt neither before nor
after the cluster. Simple as that ;-)
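
To spell out the ordering described above, here is a minimal sketch in Python. The labels are illustrative only (they are not the actual resource IDs), and the edge "libvirtd needs virtlockd" is my reading of "used by libvirtd". It merely shows that every step depends on the cluster already running, which is why the units have to be started by the cluster itself:

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Each key maps to the set of things that must be running before it starts.
# Labels are illustrative, not real resource or unit names.
needs = {
    "ocfs2": {"cluster"},          # OCFS2 cannot start before the cluster
    "virtlockd": {"ocfs2"},        # virtlockd needs the shared filesystem
    "libvirtd": {"virtlockd"},     # libvirtd uses virtlockd for locking
    "vm-resources": {"libvirtd"},  # cluster resources talk to libvirtd via TLS
}

# A linear chain has exactly one valid start order.
start_order = list(TopologicalSorter(needs).static_order())
print(" -> ".join(start_order))
```

Anything started by systemd at boot, ahead of the cluster, would sit outside this chain.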

> 
>> (Personally I feel: "cluster and systemd: One resource controller too much".
>> But when you need to control a systemd unit...)
>> When the unit is active already, a start operation fails:
>> 
>> Jan 29 12:12:46 h16 pacemaker-execd[7464]:  notice: executing - rsc:prm_libvirtd-tls-sock action:start call_id:198
>> Jan 29 12:12:46 h16 systemd[1]: Reloading.
>> Jan 29 12:12:47 h16 systemd[1]: libvirtd-tls.socket: Socket service libvirtd.service already active, refusing.
> 
> I think the message is pretty clear. If libvirtd.service is under pacemaker
> control, there is absolutely no need to have a matching .socket unit at
> all. If libvirtd.service is not under pacemaker control, you get the
> expected result: the service is started outside of pacemaker and pacemaker
> fails.

You are saying starting libvirtd does not require the ro and tls socket units
to be started?
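
For reference, the condition behind the "already active, refusing" message can be checked before a start is attempted. This is a minimal sketch, not what pacemaker does internally. It assumes only that `systemctl is-active --quiet UNIT` exits with 0 when the unit is active. The function names and the `run` parameter are hypothetical; `run` exists only so the check can be exercised without a live systemd:

```python
import subprocess


def unit_is_active(unit, run=subprocess.run):
    """Return True if `systemctl is-active --quiet UNIT` exits with 0."""
    return run(["systemctl", "is-active", "--quiet", unit]).returncode == 0


def socket_start_would_be_refused(service, run=subprocess.run):
    """True if the service a socket unit triggers is already active.

    This mirrors the log line above: libvirtd-tls.socket was refused
    because libvirtd.service was already running.
    """
    return unit_is_active(service, run=run)
```

On a node, `socket_start_would_be_refused("libvirtd.service")` returning True would mean that a start of libvirtd-tls.socket is expected to fail in the same way as in the log.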

> 
>> Jan 29 12:12:47 h16 systemd[1]: Failed to listen on Libvirt TLS IP socket.
>> Jan 29 12:12:49 h16 pacemaker-controld[7467]:  notice: Transition 313 aborted by operation prm_libvirtd-tls-sock_start_0 'modify' on h19: Event failed
>> Jan 29 12:12:49 h16 pacemaker-controld[7467]:  notice: Transition 313 action 81 (prm_libvirtd-tls-sock_start_0 on h19): expected 'ok' but got 'error'
>> Jan 29 12:12:49 h16 pacemaker-attrd[7465]:  notice: Setting fail-count-prm_libvirtd-tls-sock#start_0[h19]: (unset) -> INFINITY
>> 
>> Is there an easy fix other than a software upgrade?
>> 
> 
> Yes, fix your configuration.

I'd like to, once I know how. ;-)

Regards,
Ulrich
