[ClusterLabs] [Problem]The pengine core dumps when changing attributes of bundle.

Ken Gaillot kgaillot at redhat.com
Fri Mar 9 18:48:54 EST 2018


On Sat, 2018-03-10 at 05:47 +0900, renayama19661014 at ybb.ne.jp wrote:
> Hi All, 
> 
> [Sorry, the line breaks in my previous message were broken, so I am
> sending it again.]
> 
> I was checking the operation of bundles with Pacemaker version
> 2.0.0-9cd0f6cb86.
> When a bundle resource is configured in Pacemaker and one of its
> attributes is changed, pengine dumps core.

Hi Hideo,

At first glance, it's confusing. The backtrace shows that
find_container_child() is being called with a NULL rsc, but I don't see
how it's possible to call it that way.

We'll investigate further and get back to you on the BZ.

> 
> Step1) Start Pacemaker and load the configuration. (replicas and
> replicas-per-host are set to 1.)
> 
> [root at rh74-test ~]# cibadmin --modify --allow-create --scope resources -X '
> <bundle id="httpd-bundle">
>   <docker image="pcmktest:http" replicas="1" replicas-per-host="1"
>           options="--log-driver=journald" />
>   <network ip-range-start="192.168.20.188" host-interface="ens192"
>            host-netmask="24">
>     <port-mapping id="httpd-port" port="80"/>
>   </network>
>   <storage>
>     <storage-mapping id="httpd-root"
>                      source-dir-root="/var/local/containers"
>                      target-dir="/var/www/html" options="rw"/>
>     <storage-mapping id="httpd-logs"
>                      source-dir-root="/var/log/pacemaker/bundles"
>                      target-dir="/etc/httpd/logs" options="rw"/>
>   </storage>
>   <primitive class="ocf" id="httpd" provider="heartbeat" type="apache" >
>     <operations>
>       <op id="rabbitmq-monitor-interval-10" interval="10" name="monitor"
>           timeout="40"/>
>       <op id="rabbitmq-start-interval-0s" interval="0s" name="start"
>           timeout="200s"/>
>       <op id="rabbitmq-stop-interval-0s" interval="0s" name="stop"
>           timeout="200s" on-fail="fence" />
>     </operations>
>   </primitive>
> </bundle>'
> 
> Step2) The bundle is configured.
> 
> [root at rh74-test ~]# crm_mon -1 -Af
> Stack: corosync
> Current DC: rh74-test (version 2.0.0-9cd0f6cb86) - partition WITHOUT quorum
> Last updated: Fri Mar  9 10:09:20 2018
> Last change: Fri Mar  9 10:06:30 2018 by root via cibadmin on rh74-test
> 
> 2 nodes configured
> 4 resources configured
> 
> Online: [ rh74-test ]
> GuestOnline: [ httpd-bundle-0 at rh74-test ]
> 
> Active resources:
> 
>  Docker container: httpd-bundle [pcmktest:http]
>    httpd-bundle-0 (192.168.20.188)      (ocf::heartbeat:apache):        Started rh74-test
> 
> Node Attributes:
> * Node httpd-bundle-0 at rh74-test:
> * Node rh74-test:
> 
> Migration Summary:
> * Node rh74-test:
> * Node httpd-bundle-0 at rh74-test:
> 
> Step3) Change the attributes of the bundle with the cibadmin command.
> (replicas and replicas-per-host are changed to 3.)
> 
> 
> [root at rh74-test ~]# cibadmin --modify -X '<docker
> image="pcmktest:http" replicas="3" replicas-per-host="3" options="
> --log-driver=journald"/>' 
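
For anyone repeating Step 3 with other replica counts, here is a minimal
sketch that builds the <docker> element from a single number. The helper
name docker_element is hypothetical, and the attribute values are copied
from the Step 1 element with only the two replica counts changed. The
sketch only prints the element and does not touch the CIB.

```shell
#!/bin/sh
# Sketch only: docker_element is a hypothetical helper, not part of Pacemaker.
# It prints the <docker> element from the report with both replica
# attributes set to the value given as the first argument.
docker_element() {
    n=$1
    printf '<docker image="pcmktest:http" replicas="%s" replicas-per-host="%s" options="--log-driver=journald"/>\n' "$n" "$n"
}

# Print the element for 3 replicas, the value used in Step 3.
docker_element 3
```

On a test cluster the printed element could then be passed to
cibadmin --modify -X, as in the Step 3 command above.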
> 
> Step4) The pengine dumps core.
> 
> (snip)
> Mar  9 10:10:21 rh74-test pengine[17726]:  notice: On loss of quorum: Ignore
> Mar  9 10:10:21 rh74-test pengine[17726]:    info: Node rh74-test is online
> Mar  9 10:10:21 rh74-test crmd[17727]:  error: Connection to pengine failed
> Mar  9 10:10:21 rh74-test crmd[17727]:  error: Connection to pengine[0x55f2d068bfb0] closed (I/O condition=25)
> Mar  9 10:10:21 rh74-test pacemakerd[17719]:  error: Managed process 17726 (pengine) dumped core
> Mar  9 10:10:21 rh74-test pacemakerd[17719]:  error: pengine[17726] terminated with signal 11 (core=1)
> Mar  9 10:10:21 rh74-test pacemakerd[17719]:  notice: Respawning failed child process: pengine
> Mar  9 10:10:21 rh74-test pacemakerd[17719]:    info: Using uid=990 and group=984 for process pengine
> Mar  9 10:10:21 rh74-test pacemakerd[17719]:    info: Forked child 19275 for process pengine
> (snip)
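
To spot this kind of crash quickly in a longer log, the pacemakerd
"terminated with signal" message can be filtered out with sed. This is a
rough sketch: the function names crashed_children and sample_log are
hypothetical, the pattern assumes the message format shown in the excerpt
above (it may differ in other Pacemaker versions), and it expects each
log message on a single line.

```shell
#!/bin/sh
# Sketch only: crashed_children is a hypothetical helper, not a Pacemaker tool.
# It reads log lines on stdin and prints one summary line for every child
# that pacemakerd reports as terminated by a signal.
crashed_children() {
    sed -n 's/.*pacemakerd\[[0-9]*\]: *error: \([a-z_]*\)\[\([0-9]*\)\] terminated with signal \([0-9]*\) (core=\([01]\)).*/\1 pid=\2 signal=\3 core=\4/p'
}

# Sample input: two messages taken from the report, one per line.
sample_log() {
cat <<'EOF'
Mar  9 10:10:21 rh74-test pacemakerd[17719]:  error: Managed process 17726 (pengine) dumped core
Mar  9 10:10:21 rh74-test pacemakerd[17719]:  error: pengine[17726] terminated with signal 11 (core=1)
EOF
}

sample_log | crashed_children
```

For the two sample lines this prints "pengine pid=17726 signal=11 core=1".
Signal 11 is SIGSEGV on Linux, which would fit a NULL pointer being
dereferenced.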
> 
> This problem is reproducible 100 percent of the time.
> 
> The problem appears to be related to how the clone (httpd) resources
> inside the bundle resource are handled.
> 
> - I have filed this report in the following Bugzilla entry.
> (https://bugs.clusterlabs.org/show_bug.cgi?id=5337)
> 
> Best Regards
> Hideo Yamauchi.
-- 
Ken Gaillot <kgaillot at redhat.com>


