[ClusterLabs] Fedora 31 - systemd based resources don't start
Strahil Nikolov
hunter86_bg at yahoo.com
Wed Feb 19 13:23:09 EST 2020
On February 19, 2020 7:21:12 PM GMT+02:00, Maverick <mvrk at sapo.pt> wrote:
>
>How is it possible that pacemaker reports that it takes 4.2 minutes
>(254930ms) to start the httpd systemd unit?
>
>Feb 19 17:04:09 boss1 pacemaker-execd [1514] (log_execute) info: executing - rsc:apache action:start call_id:25
>Feb 19 17:04:09 boss1 pacemaker-execd [1514] (systemd_unit_exec) debug: Performing asynchronous start op on systemd unit httpd named 'apache'
>Feb 19 17:04:09 boss1 pacemaker-execd [1514] (systemd_unit_exec_with_unit) debug: Calling StartUnit for apache: /org/freedesktop/systemd1/unit/httpd_2eservice
>Feb 19 17:04:10 boss1 pacemaker-execd [1514] (action_complete) notice: Giving up on apache start (rc=0): timeout (elapsed=254930ms, remaining=-154930ms)
>Feb 19 17:04:10 boss1 pacemaker-execd [1514] (log_finished) debug: finished - rsc:apache action:monitor call_id:25 exit-code:198 exec-time:254935ms queue-time:235ms
>
>
>Starting manually works fine and fast:
>
># time systemctl start httpd
>real 0m0.144s
>user 0m0.005s
>sys 0m0.008s
>
>
>On 17/02/2020 22:47, Mvrk wrote:
>> The pacemaker.log is attached. In the log I can see that the cluster
>> tries to start, the start fails, then it tries to stop, and the stop
>> also fails.
>>
>> One more thing: my cluster was working fine on Fedora 28. I started
>> having this problem after upgrading to Fedora 31.
>>
>> On 17/02/2020 21:30, Ricardo Esteves wrote:
>>> Hi,
>>>
>>> Yes, I also don't understand why it is trying to stop them first.
>>>
>>> SELinux is disabled:
>>>
>>> # getenforce
>>> Disabled
>>>
>>> All systemd services controlled by the cluster are disabled from
>>> starting at boot:
>>>
>>> # systemctl is-enabled httpd
>>> disabled
>>>
>>> # systemctl is-enabled openvpn-server@01-server
>>> disabled
>>>
>>>
>>> On 17/02/2020 20:28, Ken Gaillot wrote:
>>>> On Mon, 2020-02-17 at 17:35 +0000, Maverick wrote:
>>>>> Hi,
>>>>>
>>>>> When I start my cluster, most of my systemd resources won't start:
>>>>>
>>>>> Failed Resource Actions:
>>>>> * apache_stop_0 on boss1 'OCF_TIMEOUT' (198): call=82, status='Timed Out', exitreason='', last-rc-change='1970-01-01 01:00:54 +01:00', queued=29ms, exec=197799ms
>>>>> * openvpn_stop_0 on boss1 'OCF_TIMEOUT' (198): call=61, status='Timed Out', exitreason='', last-rc-change='1970-01-01 01:00:54 +01:00', queued=1805ms, exec=198841ms
>>>> These show that attempts to stop failed, rather than start.
>>>>
>>>>> So every time I reboot my node, I need to start the resources
>>>>> manually using systemd, for example:
>>>>>
>>>>> systemctl start httpd
>>>>>
>>>>> and then pcs resource cleanup
>>>>>
>>>>> Resources configuration:
>>>>>
>>>>> Clone: apache-clone
>>>>> Meta Attrs: maintenance=false
>>>>> Resource: apache (class=systemd type=httpd)
>>>>> Meta Attrs: maintenance=false
>>>>> Operations: monitor interval=60 timeout=100 (apache-monitor-interval-60)
>>>>>             start interval=0s timeout=100 (apache-start-interval-0s)
>>>>>             stop interval=0s timeout=100 (apache-stop-interval-0s)
>>>>>
>>>>> Resource: openvpn (class=systemd type=openvpn-server@01-server)
>>>>> Meta Attrs: maintenance=false
>>>>> Operations: monitor interval=60 timeout=100 (openvpn-monitor-interval-60)
>>>>>             start interval=0s timeout=100 (openvpn-start-interval-0s)
>>>>>             stop interval=0s timeout=100 (openvpn-stop-interval-0s)
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> By the way, if I try a debug-start / debug-stop, the mentioned
>>>>> resources start and stop OK.
>>>> Based on that, my first guess would be SELinux. Check the SELinux
>>>> logs for denials.
>>>>
>>>> Also, make sure your systemd services are not enabled in systemd
>>>> itself (e.g. via systemctl enable). Clustered systemd services
>>>> should be managed by the cluster only.
>
You really need to debug the start and stop of the resource.
Please try the debug procedure and provide the output:
https://wiki.clusterlabs.org/wiki/Debugging_Resource_Failures
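
While collecting that output, it may also be worth cross-checking the numbers in the quoted log. The following is only a minimal Python sketch: every value is copied from the log lines and resource configuration quoted above, the year 2020 is taken from the mail date, and treating timeout=100 as seconds is an assumption. It compares the wall-clock gap between the "executing" and "Giving up" lines with the reported elapsed time, and checks that remaining equals timeout minus elapsed.

```python
from datetime import datetime

# Values copied from the quoted pacemaker-execd log and resource configuration.
START_LOGGED = "Feb 19 17:04:09"   # log_execute: executing - rsc:apache action:start
GIVEUP_LOGGED = "Feb 19 17:04:10"  # action_complete: Giving up on apache start
ELAPSED_MS = 254930                # elapsed= in the "Giving up" line
REMAINING_MS = -154930             # remaining= in the same line
TIMEOUT_S = 100                    # start timeout=100; assumed to be seconds
YEAR = 2020                        # assumption: taken from the date of the mail


def wall_clock_ms(start: str, end: str) -> int:
    """Milliseconds between two syslog-style timestamps (1-second resolution)."""
    fmt = "%Y %b %d %H:%M:%S"
    t0 = datetime.strptime(f"{YEAR} {start}", fmt)
    t1 = datetime.strptime(f"{YEAR} {end}", fmt)
    return int((t1 - t0).total_seconds() * 1000)


timeout_ms = TIMEOUT_S * 1000
wall_ms = wall_clock_ms(START_LOGGED, GIVEUP_LOGGED)

print(f"wall clock between log lines: {wall_ms} ms")
print(f"elapsed reported by execd:    {ELAPSED_MS} ms")
print(f"timeout - elapsed:            {timeout_ms - ELAPSED_MS} ms")
print(f"remaining reported by execd:  {REMAINING_MS} ms")
```

If the figures line up this way, the reported elapsed time is much larger than the time that passed between the two log lines, so the timeout may not reflect how long httpd actually took to start. That is only a reading of the log, not a diagnosis, so please still post the debug output.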
Best Regards,
Strahil Nikolov