[Pacemaker] killproc not found? o2cb shutdown via resource agent

Andrew Beekhof andrew at beekhof.net
Wed Nov 21 19:54:14 EST 2012


On Sat, Nov 10, 2012 at 9:41 AM, Matthew O'Connor <matt at ecsorl.com> wrote:
>
> On 11/09/2012 04:26 PM, Andrew Beekhof wrote:
>> On Fri, Nov 9, 2012 at 4:43 PM, Matthew O'Connor <matt at ecsorl.com> wrote:
>>> On 11/08/2012 08:15 PM, Andrew Beekhof wrote:
>>>> You're not starting it as a pacemaker resource, are you?
>>>> CMAN should be doing that as part of the init script (which explains
>>>> why it's still there until after pacemaker is gone).
>>> I thought that was the dlm_controld, not ocfs2_controld?
>> I know it starts gfs_controld when using GFS... I assume it's the same for OCFS2
> Yes, I saw that in the cman script...though I can't seem to find the
> magic combination of modules and/or configfs writes to make cman
> actually configure ocfs2/o2cb, though there are times it detects o2cb's
> presence on init (shortly before dying horribly).
>
> I can configure ocfs2 manually (via /etc/ocfs2/cluster.conf), though
> this has no effect on cman (and vice versa), except that cman does not
> crash on shutdown and pacemaker then has no involvement with o2cb.

If you're using cman, then pacemaker should _not_ have anything to do with o2cb.
So this would be correct.

All pacemaker does is mount filesystems in this case.

>  The
> presence of the "cman" option for cluster stack in the o2cb RA is a
> little bewildering.
>
> I will do more research and reading, perhaps trying GFS out just to get
> my head around how it interacts with CMAN.  Perhaps there is a
> corollary, or something simple missing from cluster.conf.  Perhaps GFS
> is the way to go, obviating these problems with OCFS2?

Maybe. I'm really not an authority on cluster filesystems.
About the only time I use them is when I'm updating Clusters from
Scratch, and that's a very controlled environment.

>
> Thank you for your help!
>
>>
>>> dlm_controld
>>> is certainly managed by CMAN, but it hasn't been starting ocfs2_controld
>>> for me...and without it, the OCFS2 shares won't mount.  For reference:
>>>
>>> primitive p_iscsiclient-store0-sandbox ocf:heartbeat:iscsi \
>>>         params portal="10.16.16.5:3260" target="..." \
>>>         ...
>>> primitive p_mount-store0-sandbox ocf:heartbeat:Filesystem \
>>>         params device="-U 443d287f-b98f-45e4-bd6e-d64dd7af0169" \
>>>         directory="/opt/store3" fstype="ocfs2" \
>>>         ...
>>> primitive p_o2cb ocf:pacemaker:o2cb \
>>>         params stack="cman" \
>>>         ...
>>>
>>> (ordering and colocation constraints omitted, along with uninteresting
>>> arguments.)  I'll feel quite dumb if there was just some additional
>>> configuration required for CMAN and OCFS2 and I somehow missed it.  I
>>> guess that would explain why CMAN would try to restart the
>>> ocfs2_controld if the ocfs2 modules were still loaded and configfs was
>>> still alive and well...though technically it failed every time it tried.
>>>
>>>> On Fri, Nov 9, 2012 at 11:14 AM, Matthew O'Connor <matt at ecsorl.com> wrote:
>>>>> I'm honestly beginning to wonder what exactly that killproc does for the
>>>>> ocfs2_controld.cman process... For kicks, I created a script in /sbin
>>>>> and /usr/sbin for killproc, which simply sources the lsb include and
>>>>> calls the function with whatever was passed via the command-line.
>>>>> Perhaps an equivalent fix to modifying the RA or the included shell
>>>>> extensions file, but still not as friendly as installing a .deb. ;-)
>>>>>
>>>>> However, I'm not sure if it's doing anything useful, even though I can
>>>>> see (via echos) that it's being called.  The ocfs2_controld.cman process
>>>>> doesn't go away till pacemaker is stopped (and isn't started until
>>>>> pacemaker is running and the node is online), which blunders into
>>>>> another problem: the o2cb RA appears to be in charge of unloading any
>>>>> modules it loaded, but it fails to unload the ocfs2_stack_user module.
>>>>> This causes CMAN to fail when shutting down; manually running 'service
>>>>> o2cb stop' before 'service cman stop' resolves the problem, but I would
>>>>> think the RA should be doing this.  Even when the ocfs2_controld.cman
>>>>> process dies with pacemaker, the module remains.  :-/
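
For anyone hitting the same shutdown failure, here is a minimal sketch of a
pre-shutdown check, assuming the leftover module shows up in /proc/modules.
The function name module_loaded is made up for illustration and is not part
of the o2cb RA; the optional second argument exists only so the check can be
pointed at a sample file.

```shell
# Return 0 if the named kernel module appears in a /proc/modules-style
# listing (module name is the first whitespace-separated field).
module_loaded() {
    awk -v m="$1" '$1 == m { found = 1 } END { exit !found }' \
        "${2:-/proc/modules}"
}

# Hypothetical manual shutdown order, following the workaround above:
#   module_loaded ocfs2_stack_user && service o2cb stop
#   service cman stop
```

This only detects the leftover module; whether 'service o2cb stop' actually
unloads it on a given distro is something to verify locally.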
>>>>>
>>>>>
>>>>> On 11/08/2012 06:02 AM, Dejan Muhamedagic wrote:
>>>>>> Hi,
>>>>>>
>>>>>> On Thu, Nov 08, 2012 at 08:23:53PM +1100, Tim Serong wrote:
>>>>>>> On 11/08/2012 07:56 PM, Andrew Beekhof wrote:
>>>>>>>> On Thu, Nov 8, 2012 at 5:16 PM, Tim Serong <tserong at suse.com> wrote:
>>>>>>>>> On 11/08/2012 12:11 PM, Andrew Beekhof wrote:
>>>>>>>>>> On Thu, Nov 8, 2012 at 9:59 AM, Matthew O'Connor <matt at ecsorl.com> wrote:
>>>>>>>>>>> Follow-up and additional info:
>>>>>>>>>>>
>>>>>>>>>>> System is Ubuntu 12.04.  Not sure where killproc is supposed to be derived
>>>>>>>>>>> from, or if there is an assumption for it to be a standalone binary or
>>>>>>>>>>> script.  I did find it defined in /lib/lsb/init-functions.  Adding a
>>>>>>>>>>> ". /lib/lsb/init-functions" to the start of the
>>>>>>>>>>> /usr/lib/ocf/resource.d/heartbeat/.ocf-shellfuncs file makes the
>>>>>>>>>>> process-kill work, but I suspect this is not the most desirable solution.
>>>>>>>>>> I think that's as good a solution as any.
>>>>>>>>>> I wonder where other distros are getting it from.
>>>>>>>>> SLES 11 SP2:
>>>>>>>>>
>>>>>>>>> # rpm -qf /sbin/killproc
>>>>>>>>> sysvinit-2.86-210.1
>>>>>>>>>
>>>>>>>>> openSUSE 12.2:
>>>>>>>>>
>>>>>>>>> # rpm -qf /sbin/killproc
>>>>>>>>> sysvinit-tools-2.88+-77.3.1.x86_64
>>>>>>>>>
>>>>>>>>> Can't speak for any others offhand...
>>>>>>>> Definitely not on Fedora or its derivatives.
>>>>>>> Hrm.  Well, I just had a quick skim of the ocfs2-tools source, and I'd
>>>>>>> be willing to bet the o2cb RA was based on the upstream o2cb init
>>>>>>> script, which uses killproc, but also sources /lib/lsb/init-functions.
>>>>>>> Does Fedora have killproc buried somewhere in there maybe?
>>>>>>>
>>>>>>> On SUSE, /lib/lsb/init-functions defines start_daemon(), killproc(), and
>>>>>>> pidofproc() but these just wrap binaries of the same name in /sbin
>>>>>>> (which would explain why o2cb works fine on SUSE, as those "missing"
>>>>>>> things are presumably in $PATH anyway).
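
A distro-neutral way to see which of these cases a given machine is in
(binary on $PATH, shell function, or nothing at all), sketched under the
assumption of a POSIX shell. cmd_origin is a made-up name used only here.

```shell
# Report how a command name resolves in the current shell:
#   "binary"  - an executable found on $PATH
#   "shell"   - a function, builtin, or alias
#   "missing" - not found at all (function also returns non-zero)
cmd_origin() {
    resolved=$(command -v "$1" 2>/dev/null) || { echo missing; return 1; }
    case "$resolved" in
        /*) echo binary ;;
        *)  echo shell ;;
    esac
}

# Example: cmd_origin killproc
```

Run it in the same environment the RA runs in; a function that only exists
after sourcing /lib/lsb/init-functions will not show up in a fresh shell.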
>>>>>>>
>>>>>>> I don't know about sourcing /lib/lsb/init-functions in .ocf-shellfuncs -
>>>>>>> might be a bit broad?  Presumably couldn't hurt to source it in the o2cb
>>>>>>> RA though, unless there's some other cleaner solution...
>>>>>> I'd also say to source it just in this particular RA. Unfortunately,
>>>>>> distro-specific stuff creeps now and again into agents that are
>>>>>> supposed to work everywhere.
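
A rough sketch of what "source it only in this RA, and only when needed"
could look like. This is not the actual o2cb RA code; ensure_killproc is a
hypothetical helper, and the default path is simply the one named in this
thread.

```shell
# Hypothetical helper for a single RA: make killproc available without
# touching .ocf-shellfuncs.
ensure_killproc() {
    lsb_funcs="${1:-/lib/lsb/init-functions}"
    # Nothing to do if a binary or function named killproc already resolves.
    command -v killproc >/dev/null 2>&1 && return 0
    # Otherwise pull in the LSB helpers, if the distro ships them.
    [ -r "$lsb_funcs" ] && . "$lsb_funcs"
    # Report whether killproc is usable now.
    command -v killproc >/dev/null 2>&1
}

# Example use near the top of an RA:
#   ensure_killproc || echo "killproc unavailable" >&2
```

On systems where killproc is already a binary in /sbin, this leaves the
environment untouched, which keeps the change as narrow as possible.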
>>>>>>
>>>>>> Cheers,
>>>>>>
>>>>>> Dejan
>>>>>>
>>>>>>> Regards,
>>>>>>>
>>>>>>> Tim
>>>>>>> --
>>>>>>> Tim Serong
>>>>>>> Senior Clustering Engineer
>>>>>>> SUSE
>>>>>>> tserong at suse.com
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> Pacemaker mailing list: Pacemaker at oss.clusterlabs.org
>>>>>>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>>>>>>>
>>>>>>> Project Home: http://www.clusterlabs.org
>>>>>>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>>>>>>> Bugs: http://bugs.clusterlabs.org



