[Pacemaker] How to use monitor action in VirtualDomain resource agent

Fil lists at internyc.net
Wed Dec 7 01:41:41 EST 2011


Hi Andreas,

Below is the grep you requested. While looking into this problem I also
came across an interesting issue with the VirtualDomain resource agent.
Since my /etc/libvirt/qemu directory is an NFS share, VirtualDomain
sometimes complains that it can't read the /etc/libvirt/qemu/test.xml
file. This is a bit puzzling. Looking at the test logic inside the
VirtualDomain file, I ran into this code:


    # check if we can read the config file (otherwise we're unable to
    # deduce $DOMAIN_NAME from it, see below)
    if [ ! -r $OCF_RESKEY_config ]; then
        if ocf_is_probe; then
            ocf_log info "Configuration file $OCF_RESKEY_config not readable during probe."
        else
            ocf_log error "Configuration file $OCF_RESKEY_config does not exist or is not readable."
            return $OCF_ERR_INSTALLED
        fi
    fi

The problem here is that the -r operator returns true whether
$OCF_RESKEY_config is a regular file or a directory. Shouldn't this be
a -f check followed by the -r check?
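
A minimal sketch of the stricter check proposed above, to make the
difference concrete. The helper name config_is_readable_file and the
temporary paths are hypothetical and are not part of the VirtualDomain
agent; this only illustrates how -f and -r combine, and is not a tested
patch for the agent itself.

```shell
#!/bin/sh
# Hypothetical helper (not in the VirtualDomain agent): succeeds only
# when the path is a regular file AND is readable by the current user.
config_is_readable_file() {
    [ -f "$1" ] && [ -r "$1" ]
}

# Scratch directory holding an empty stand-in for the domain XML file.
demo_dir=$(mktemp -d)
demo_file="$demo_dir/test.xml"
: > "$demo_file"

# -r alone is satisfied by a readable directory.
if [ -r "$demo_dir" ]; then
    echo "-r accepts the directory $demo_dir"
fi

# The combined check rejects the directory but accepts the file.
if ! config_is_readable_file "$demo_dir"; then
    echo "combined check rejects the directory"
fi
if config_is_readable_file "$demo_file"; then
    echo "combined check accepts $demo_file"
fi

rm -rf "$demo_dir"
```

In the agent, the same idea would mean testing -f before -r on
$OCF_RESKEY_config. Note that neither test can tell a missing file
apart from one that is temporarily unreachable over NFS.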

thanks
fil

Dec 07 01:25:53 server01.adriaticsolutions.com pengine: [5297]: info:
native_print: vm_test	(ocf::adriatic:VirtualDomain):	Started
server01.adriaticsolutions.com
Dec 07 01:25:53 server01.adriaticsolutions.com lrmd: [5295]: info:
cancel_op: operation monitor[10] on ocf::VirtualDomain::vm_test for
client 5298, its parameters: CRM_meta_timeout=[30000] depth=[0]
CRM_meta_name=[monitor] crm_feature_set=[3.0.5]
config=[/etc/libvirt/qemu/test.xml] CRM_meta_interval=[10000]
hypervisor=[qemu:///system] CRM_meta_depth=[0] migration_transport=[tcp]
 cancelled
Dec 07 01:25:53 server01.adriaticsolutions.com lrmd: [5295]: debug:
on_msg_perform_op: add an operation operation migrate_to[11] on
ocf::VirtualDomain::vm_test for client 5298, its parameters:
CRM_meta_timeout=[120000] CRM_meta_name=[migrate_to]
crm_feature_set=[3.0.5] config=[/etc/libvirt/qemu/test.xml]
CRM_meta_migrate_source=[server01.adriaticsolutions.com]
CRM_meta_migrate_target=[server02.adriaticsolutions.com]
hypervisor=[qemu:///system] migration_transport=[tcp]  to the operation
list.
Dec 07 01:25:57 server01.adriaticsolutions.com lrmd: [5295]: debug:
on_msg_perform_op: add an operation operation stop[12] on
ocf::VirtualDomain::vm_test for client 5298, its parameters:
crm_feature_set=[3.0.5]  to the operation list.
Dec 07 01:25:58 server01.adriaticsolutions.com pengine: [5297]: info:
native_print: vm_test	(ocf::adriatic:VirtualDomain):	Started
server02.adriaticsolutions.com FAILED
Dec 07 01:26:10 server01.adriaticsolutions.com lrmd: [5295]: debug:
on_msg_perform_op: add an operation operation start[13] on
ocf::VirtualDomain::vm_test for client 5298, its parameters:
crm_feature_set=[3.0.5] CRM_meta_name=[start]
config=[/etc/libvirt/qemu/test.xml] migration_transport=[tcp]
CRM_meta_timeout=[120000] hypervisor=[qemu:///system]  to the operation
list.
Dec 07 01:26:11 server01.adriaticsolutions.com lrmd: [5295]: debug:
on_msg_perform_op: add an operation operation monitor[14] on
ocf::VirtualDomain::vm_test for client 5298, its parameters:
CRM_meta_timeout=[30000] depth=[0] CRM_meta_name=[monitor]
crm_feature_set=[3.0.5] config=[/etc/libvirt/qemu/test.xml]
CRM_meta_interval=[10000] hypervisor=[qemu:///system] CRM_meta_depth=[0]
migration_transport=[tcp]  to the operation list.



Dec  7 01:25:53 server01 pengine: [5297]: info: native_print:
vm_test#011(ocf::adriatic:VirtualDomain):#011Started
server01.adriaticsolutions.com
Dec  7 01:25:53 server01 lrmd: [5295]: info: cancel_op: operation
monitor[10] on ocf::VirtualDomain::vm_test for client 5298, its
parameters: CRM_meta_timeout=[30000] depth=[0] CRM_meta_name=[monitor]
crm_feature_set=[3.0.5] config=[/etc/libvirt/qemu/test.xml]
CRM_meta_interval=[10000] hypervisor=[qemu:///system] CRM_meta_depth=[0]
migration_transport=[tcp]  cancelled
Dec  7 01:25:53 server01 VirtualDomain[8680]: INFO: test: Starting live
migration to server02.adriaticsolutions.com (using remote hypervisor URI
qemu+tcp://server02.adriaticsolutions.com/system ).
Dec  7 01:25:57 server01 VirtualDomain[8680]: INFO: test: live migration
to server02.adriaticsolutions.com succeeded.
Dec  7 01:25:57 server01 VirtualDomain[8725]: INFO: Domain name "test"
saved to /var/run/heartbeat/rsctmp/VirtualDomain-vm_test.state.
Dec  7 01:25:58 server01 VirtualDomain[8725]: INFO: Domain test already
stopped.
Dec  7 01:25:58 server01 pengine: [5297]: info: native_print:
vm_test#011(ocf::adriatic:VirtualDomain):#011Started
server02.adriaticsolutions.com FAILED



On 12/06/2011 07:56 PM, Andreas Kurz wrote:
> Hello,
> 
> On 12/05/2011 05:27 AM, Fil wrote:
>> Hi,
>>
>> I have a 2 node cluster (corosync 1.4.2 pacemaker 1.1.6). I need to
>> control a couple of virtual machines in this cluster and be able to
>> live migrate them between nodes. Up until now all my tests worked, but
>> as soon as I started using the monitor action of VirtualDomain my
>> virtual machines fail to migrate and sometimes they don't even start
>> cleanly. Every time I need to manually clean up the resource group and
>> then it seems to work. Could you please explain whether I need the
>> monitor action and how to make it work.
>>
>> thanks
>> fil
>>
>> Here are the error messages I get:
>>
>>     vm_test_monitor_10000 (node=server02.adriaticsolutions.com, call=46, rc=5, status=complete): not installed
>>     vm_test_start_0 (node=server01.adriaticsolutions.com, call=52, rc=5, status=complete): not installed
> 
> Any results when doing a grep for "VirtualDomain"? It would be
> interesting to see what the resource agent is telling us ...
> 
> Regards,
> Andreas
> 