[ClusterLabs] Fedora 31 - systemd based resources don't start
Maverick
mvrk at sapo.pt
Wed Feb 19 12:21:12 EST 2020
How is it possible that pacemaker reports that it takes about 4.2 minutes
(254930ms) to execute the start of the httpd systemd unit?
Feb 19 17:04:09 boss1 pacemaker-execd [1514] (log_execute) info: executing - rsc:apache action:start call_id:25
Feb 19 17:04:09 boss1 pacemaker-execd [1514] (systemd_unit_exec) debug: Performing asynchronous start op on systemd unit httpd named 'apache'
Feb 19 17:04:09 boss1 pacemaker-execd [1514] (systemd_unit_exec_with_unit) debug: Calling StartUnit for apache: /org/freedesktop/systemd1/unit/httpd_2eservice
Feb 19 17:04:10 boss1 pacemaker-execd [1514] (action_complete) notice: Giving up on apache start (rc=0): timeout (elapsed=254930ms, remaining=-154930ms)
Feb 19 17:04:10 boss1 pacemaker-execd [1514] (log_finished) debug: finished - rsc:apache action:monitor call_id:25 exit-code:198 exec-time:254935ms queue-time:235ms
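To make the numbers in the log easier to compare, here is a small Python sketch. It assumes that "remaining" is simply the operation timeout minus "elapsed" (that relationship is my assumption, not something the log states), and the helper name implied_timeout_ms is made up for illustration. Under that assumption the implied timeout is 100000ms, which would match the timeout=100 configured on the start operation, while the log timestamps themselves are only one second apart.

```python
import re
from datetime import datetime

# Log line copied from the pacemaker-execd output above (joined onto one line).
LOG_LINE = (
    "Feb 19 17:04:10 boss1 pacemaker-execd [1514] (action_complete) "
    "notice: Giving up on apache start (rc=0): timeout "
    "(elapsed=254930ms, remaining=-154930ms)"
)


def implied_timeout_ms(line):
    """Return elapsed + remaining in ms.

    Assumption: remaining = timeout - elapsed, so the sum is the timeout
    the executor was counting against.
    """
    match = re.search(r"elapsed=(-?\d+)ms,\s*remaining=(-?\d+)ms", line)
    if match is None:
        raise ValueError("no elapsed/remaining pair found in log line")
    elapsed, remaining = (int(value) for value in match.groups())
    return elapsed + remaining


# Wall-clock time between the "executing" and "Giving up" log entries.
started = datetime.strptime("17:04:09", "%H:%M:%S")
gave_up = datetime.strptime("17:04:10", "%H:%M:%S")
wall_clock_s = (gave_up - started).total_seconds()

timeout_ms = implied_timeout_ms(LOG_LINE)
print("implied timeout (ms):", timeout_ms)
print("wall-clock between log entries (s):", wall_clock_s)
```

So the reported elapsed time (254930ms) is far larger than the roughly one second that passed between the two log entries.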
Starting the unit manually works fine and is fast:
# time systemctl start httpd
real 0m0.144s
user 0m0.005s
sys 0m0.008s
On 17/02/2020 22:47, Mvrk wrote:
> Attached is the pacemaker.log. In the log I can see that the cluster
> tries to start, the start fails, then it tries to stop, and the stop
> fails as well.
>
> One more thing: my cluster was working fine on Fedora 28. I started
> having this problem after upgrading to Fedora 31.
>
> On 17/02/2020 21:30, Ricardo Esteves wrote:
>> Hi,
>>
>> Yes, I also don't understand why it is trying to stop them first.
>>
>> SELinux is disabled:
>>
>> # getenforce
>> Disabled
>>
>> All systemd services controlled by the cluster are disabled from
>> starting at boot:
>>
>> # systemctl is-enabled httpd
>> disabled
>>
>> # systemctl is-enabled openvpn-server@01-server
>> disabled
>>
>>
>> On 17/02/2020 20:28, Ken Gaillot wrote:
>>> On Mon, 2020-02-17 at 17:35 +0000, Maverick wrote:
>>>> Hi,
>>>>
>>>> When I start my cluster, most of my systemd resources won't start:
>>>>
>>>> Failed Resource Actions:
>>>> * apache_stop_0 on boss1 'OCF_TIMEOUT' (198): call=82,
>>>> status='Timed Out', exitreason='',
>>>> last-rc-change='1970-01-01 01:00:54 +01:00', queued=29ms, exec=197799ms
>>>> * openvpn_stop_0 on boss1 'OCF_TIMEOUT' (198): call=61,
>>>> status='Timed Out', exitreason='',
>>>> last-rc-change='1970-01-01 01:00:54 +01:00', queued=1805ms, exec=198841ms
>>> These show that attempts to stop failed, rather than start.
>>>
>>>> So every time I reboot my node, I need to start the resources manually
>>>> using systemctl, for example:
>>>>
>>>> systemctl start httpd
>>>>
>>>> and then run pcs resource cleanup
>>>>
>>>> Resources configuration:
>>>>
>>>> Clone: apache-clone
>>>> Meta Attrs: maintenance=false
>>>> Resource: apache (class=systemd type=httpd)
>>>> Meta Attrs: maintenance=false
>>>> Operations: monitor interval=60 timeout=100 (apache-monitor-interval-60)
>>>>             start interval=0s timeout=100 (apache-start-interval-0s)
>>>>             stop interval=0s timeout=100 (apache-stop-interval-0s)
>>>>
>>>>
>>>>
>>>> Resource: openvpn (class=systemd type=openvpn-server@01-server)
>>>> Meta Attrs: maintenance=false
>>>> Operations: monitor interval=60 timeout=100 (openvpn-monitor-interval-60)
>>>>             start interval=0s timeout=100 (openvpn-start-interval-0s)
>>>>             stop interval=0s timeout=100 (openvpn-stop-interval-0s)
>>>>
>>>>
>>>>
>>>> Btw, if I try a debug-start / debug-stop, the mentioned resources
>>>> start and stop OK.
>>> Based on that, my first guess would be SELinux. Check the SELinux logs
>>> for denials.
>>>
>>> Also, make sure your systemd services are not enabled in systemd itself
>>> (e.g. via systemctl enable). Clustered systemd services should be
>>> managed by the cluster only.
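
As a cross-check of the "Failed Resource Actions" output quoted above, the following Python sketch restates two things from those lines: the reported exec times compared with the configured timeout=100 (taken here to mean 100 seconds), and the last-rc-change stamps converted to seconds since the Unix epoch. The dictionary layout and the helper name epoch_seconds are made up for illustration, and the sketch does not claim to identify the cause of the timeouts.

```python
from datetime import datetime

# Values copied from the "Failed Resource Actions" output quoted above.
FAILED_ACTIONS = {
    "apache_stop_0": {
        "exec_ms": 197799,
        "last_rc_change": "1970-01-01 01:00:54 +01:00",
    },
    "openvpn_stop_0": {
        "exec_ms": 198841,
        "last_rc_change": "1970-01-01 01:00:54 +01:00",
    },
}

# timeout=100 from the resource operations, assumed to be in seconds.
CONFIGURED_TIMEOUT_S = 100


def epoch_seconds(stamp):
    """Convert a 'YYYY-MM-DD HH:MM:SS +HH:MM' stamp to seconds since the epoch."""
    return datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S %z").timestamp()


summary = {}
for name, info in FAILED_ACTIONS.items():
    exec_s = info["exec_ms"] / 1000
    summary[name] = {
        "exec_s": exec_s,
        "exceeds_timeout": exec_s > CONFIGURED_TIMEOUT_S,
        "last_rc_change_epoch_s": epoch_seconds(info["last_rc_change"]),
    }

for name, row in summary.items():
    print(name, row)
```

Both stop operations report exec times close to twice the configured timeout, and both last-rc-change stamps fall 54 seconds after the epoch rather than on a 2020 date.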