[ClusterLabs] Failed 'virsh' call when test RA run by crm_resource (con't)

Madison Kelly mkelly at alteeve.com
Thu Jan 12 01:23:59 EST 2023


On 2023-01-12 01:12, Reid Wahl wrote:
> On Wed, Jan 11, 2023 at 10:12 PM Reid Wahl <nwahl at redhat.com> wrote:
>>
>> On Wed, Jan 11, 2023 at 8:11 PM Madison Kelly <mkelly at alteeve.com> wrote:
>>>
>>> Hi all,
>>>
>>>     There were a lot of sub-threads, so I figured it would be helpful to
>>> start a new thread with a summary so far. For context, I have a super
>>> simple Perl script that pretends to be an RA for the sake of debugging.
>>>
>>> https://pastebin.com/9z314TaB
>>>
>>>     I've had variations log environment variables and confirmed that all
>>> the variables in the direct call that works are also in the crm_resource
>>> triggered call. There are no selinux issues logged in audit.log and
>>> selinux is permissive. The script logs the real and effective UID and
>>> GID, and they are the same in both instances. Calling other shell
>>> programs (tested with 'hostname') works fine; the problem is specific to
>>> the crm_resource -> test RA -> virsh call.
>>>
>>>     I ran strace on the virsh call from inside my test script (changing
>>> 'virsh.good' to 'virsh.bad' between running directly and via
>>> crm_resource). The strace runs made six files each time. Below are
>>> pastebin links with the outputs of the six runs in one paste, but each
>>> file's output is in its own block (search for "file:" to see the
>>> different file outputs).
>>>
>>> Good/direct run of the test RA:
>>> - https://pastebin.com/xtqe9NSG
>>>
>>> Bad/crm_resource triggered run of the test RA:
>>> - https://pastebin.com/vBiLVejW
>>>
>>> Still absolutely stumped.
>>
>> The strace outputs show that your bad runs are all getting stopped
>> with SIGTTOU. If you've never heard of that, me either.
>>
>> https://www.gnu.org/software/libc/manual/html_node/Job-Control-Signals.html
>>
>> Macro: int SIGTTOU
>>
>>      This is similar to SIGTTIN, but is generated when a process in a
>> background job attempts to write to the terminal or set its modes.
>> Again, the default action is to stop the process. SIGTTOU is only
>> generated for an attempt to write to the terminal if the TOSTOP output
>> mode is set; see Output Modes.
>>
>>
>> Maybe this has something to do with the buffer settings in the perl
>> script(?). It might be worth trying a version that doesn't fiddle with
>> the outputs and buffer settings.
>>
>> I don't know which difference between your environment and mine is
>> relevant here, such that I can't reproduce the issue using your test
>> script. It works perfectly fine for me.
>>
>> Can you run `stty -a | grep tostop`? If there's a minus sign
>> ("-tostop"), it's disabled; if it's present without a minus sign
>> ("tostop"), it's enabled, as best I can tell.
>>
>> I'm just spitballing here. It's disabled by default on my machine...
>> but even when I enable it, crm_resource --validate works fine. It may
>> be set differently when running under crm_resource.
> 
> I meant to include this:
> https://stackoverflow.com/questions/10588334/unix-background-process-stopped-abnormally

If I understand the post correctly, this is the test to run:

====
[root@mk-a07n02 ~]# /usr/bin/nohup perl /usr/lib/ocf/resource.d/alteeve/server
/usr/bin/nohup: ignoring input and appending output to 'nohup.out'
[root@mk-a07n02 ~]#
====

I see the output of the virsh call in the logs fine, no hang.
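
A rough sketch of how the job-control state could be compared between the direct run and the crm_resource run. It assumes the procps 'ps' (for the pgid, tpgid and tty output fields), and 'tostop_state' and 'job_control_report' are made-up helper names, not part of any tool. The idea is only to show whether the calling process is in the terminal's foreground process group and whether tostop is set; it does not claim to explain the SIGTTOU.

```shell
#!/bin/sh
# Classify the tostop flag from the text that `stty -a` prints.
# tostop_state is a hypothetical helper name used only in this sketch.
tostop_state() {
    # Turn ';' and newlines into spaces so each flag is space-delimited.
    case " $(printf '%s' "$1" | tr ';\n' '  ') " in
        *" -tostop "*) echo "disabled" ;;
        *" tostop "*)  echo "enabled" ;;
        *)             echo "unknown" ;;
    esac
}

# Print the job-control state of the current shell process.
# If pgid differs from tpgid, the process is not in the foreground
# process group of its terminal. procps reports tpgid as -1 when the
# process is not connected to a tty at all.
job_control_report() {
    echo "pid pgid tpgid tty: $(ps -o pid=,pgid=,tpgid=,tty= -p "$$" 2>/dev/null)"
    # /dev/tty only opens if there is a controlling terminal.
    if stty_out=$(stty -a 2>/dev/null </dev/tty); then
        echo "tostop: $(tostop_state "$stty_out")"
    else
        echo "tostop: no controlling terminal available"
    fi
}

job_control_report
```

Running it once from a plain shell and once from inside the test RA under crm_resource should show whether the pgid/tpgid pair or the tostop flag differs between the two cases.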

-- 
Madison Kelly
Alteeve's Niche!
Chief Technical Officer
c: +1-647-471-0951
https://alteeve.com/


