[molpro-user] Killing Molpro Jobs using SGE

Javier Díaz Montes javier.diaz at uclm.es
Tue Dec 4 15:22:40 GMT 2007


Hi,
I am sorry, I am using Open MPI. I could try compiling with MPICH to
see whether that solves the problem.

Thanks

On Tue, 04 Dec 2007 16:05:37 +0100, Reuti <reuti at staff.uni-marburg.de>
wrote:

> Quoting Javier Díaz Montes <javier.diaz at uclm.es>:
>
>> Hi,
>> I cannot kill the jobs with the kill command, because I run them from
>> the front end of the cluster. I would therefore have to log in to each
>> node to run kill there.
>> The cluster was built with Rocks, and I have openmpi-1.1-1.
>
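Logging in to each node by hand can be scripted from the front end. The following is a minimal sketch, not a tested recipe for this cluster: it assumes passwordless ssh to the compute nodes and a saved copy of the job's machine file (one host name per line, possibly repeated once per slot). The function names and the DRY_RUN variable are hypothetical helpers introduced here; ssh, sort, id and pkill are the standard tools.

```shell
#!/bin/sh
# Sketch: clean up leftover processes on every node of a job from the
# front end. Assumes passwordless ssh and a machine file with one host
# name per line (a host may appear once per slot).

# Print each host of the machine file exactly once.
unique_hosts() {
    sort -u "$1"
}

# Send SIGTERM to the current user's processes whose name matches
# the pattern, on every host listed in the machine file.
# With DRY_RUN=1 the ssh commands are only printed, not executed.
kill_on_nodes() {
    machinefile=$1
    pattern=$2
    user=$(id -un)
    for host in $(unique_hosts "$machinefile"); do
        if [ "${DRY_RUN:-0}" = 1 ]; then
            echo "ssh $host pkill -u $user $pattern"
        else
            # -n keeps ssh from reading the script's stdin
            ssh -n "$host" "pkill -u $user $pattern"
        fi
    done
}
```

For example, `DRY_RUN=1 kill_on_nodes machines.saved molprop` (machines.saved being a hypothetical copy of the machine file) only prints the commands that would be run. The pattern is matched against the process name rather than the full command line, so the remote shell started by ssh is not matched by accident.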
> Hi again,
>
> I'm getting confused: are you now using MPICH or Open MPI? The mpirun
> command must be the one belonging to the MPI used for compilation; they
> are not interchangeable.
>
> All my hints up to now were for using MPICH with SGE. If you use Open  
> MPI with SGE it's different.
>
> Also, in your other post I saw "orted" processes. So you are using Open
> MPI, and not MPICH - right?
>
> -- Reuti
>
>> I want one instance of Molpro per node, because I have AMD64 machines.
>> I run Molpro with the following command:
>>
>> 	molprop -n $NSLOTS --mpirun-machinefile $TMPDIR/machines < infile > outfile
>>
>> How can I specify that I want only one instance per node? I think it
>> already starts one instance per node, but I do not know why that
>> instance creates two processes. This is the process tree:
>>
>> 27481 ?        R    654:36  \_ /home/programs/molpro/molpro-Linux-x86_64-i8-2006.1/molprop_2006_1_i8_p4_mpi.exe
>> 27483 ?        S      0:04      \_ /home/programs/molpro/molpro-Linux-x86_64-i8-2006.1/molprop_2006_1_i8_p4_mpi.exe
>>
>> These are the first lines of the Molpro output:
>>
>> ARMCI configured for 6 cluster nodes. Network protocol is 'TCP/IP Sockets'.
>>
>>  MPP nodes  nproc
>>  compute-0-19.loc    1
>>  compute-0-5.loca    1
>>  compute-0-22.loc    1
>>  compute-0-12.loc    1
>> .......
>>
>> Regards,
>> Javi
>>
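On the one-instance-per-node question, one thing that can be checked is whether the machine file lists a host more than once. The sketch below assumes that $TMPDIR/machines holds one host name per slot, so a node with two slots would appear twice; the helper name one_per_node and the file machines.uniq are hypothetical. Since the Molpro output above already reports nproc 1 for every node, this may well not be the cause of the second process.

```shell
#!/bin/sh
# Sketch: reduce a machine file to one entry per host.
# Assumes one host name per line, repeated once per slot.

# Write each host of file $1 once to file $2 and print the host count.
one_per_node() {
    sort -u "$1" > "$2"
    wc -l < "$2" | tr -d ' '
}

# Possible use inside the job script (not executed here):
#   nhosts=$(one_per_node "$TMPDIR/machines" "$TMPDIR/machines.uniq")
#   molprop -n "$nhosts" --mpirun-machinefile "$TMPDIR/machines.uniq" < infile > outfile
```

The molprop options in the comment are the ones from the command quoted above; only the process count and the machine file are swapped for the deduplicated values.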
>> On Tue, 04 Dec 2007 12:03:23 +0100, Miguel Guilherme Fernandes de Souza
>> <guilhermefsmiguel at gmail.com> wrote:
>>
>>> Mr. Javier,
>>> I have Molpro running on six 64-bit computers, using MPICH as the MPI
>>> implementation.
>>> I think that what is happening is normal.
>>> Did you try to kill the job using kill -9 <process ID>? For example:
>>> kill -9 9988
>>> I always use this command to kill my jobs, and when I do, all the
>>> instances on all the nodes are killed.
>>> Molpro can create 2 instances of the same process on each node, but
>>> only if you want it to.
>>> Some questions: did you build this cluster yourself? If so, which MPI
>>> did you use? How are you launching your jobs?
>>> All of these affect the execution.
>>> For example, I will try to explain what I usually do here.
>>> First of all, I add the names of all my cluster machines to my mpdboot
>>> file:
>>> node1
>>> node2
>>> ...
>>> node6
>>>
>>> Then I bring all of them into the cluster, using the command mpdtrace.
>>> If everything goes well, the command lists the names of the machines
>>> that are in the cluster:
>>> node1
>>> node2
>>> ...
>>> node6
>>>
>>> Then I can run Molpro using:
>>>
>>> molpro -n 6 -o CAS.out CAS.in
>>>
>>> In my case, this starts 6 instances of Molpro under MPI.
>>>
>>> Keep in mind that if your computers are multithreaded (for example
>>> Pentium 4 HT), you will need to start 2 instances on the same machine.
>>> That way the processor is fully used; otherwise only 50% of it is used
>>> and your calculation will take much longer ...
>>>
>>> To make MPI start 2 instances on the same machine, you need to specify
>>> it in the mpdboot file.
>>> I don't actually have that set up on my cluster, because we use Athlon 64
>>> processors and, as you know, they don't support Hyper-Threading, so
>>> one instance per node is fine.
>>>
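The hosts-file step described above could be generated rather than typed by hand. This is only a sketch under stated assumptions: the "host:N" form for requesting several processes on one machine is how I understand the MPD hosts file and should be checked against the MPICH2 documentation for the installed version. The function write_hosts and the file name mpd.hosts in the usage line are introduced here for illustration.

```shell
#!/bin/sh
# Sketch: print a hosts file with one line per node.
# ASSUMPTION: "host:N" requests N processes on that host; verify this
# against the MPD documentation before relying on it.

# Usage: write_hosts SLOTS host1 host2 ...
write_hosts() {
    slots=$1
    shift
    for host in "$@"; do
        if [ "$slots" -gt 1 ]; then
            echo "${host}:${slots}"
        else
            # One process per node: the plain host name is enough.
            echo "${host}"
        fi
    done
}
```

For example, `write_hosts 1 node1 node2 > mpd.hosts` gives the plain list shown above, which is the case that applies to processors without Hyper-Threading.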
>>> I hope all of this helps.
>>> Feel free to ask me if you have any other questions.
>>>
>>> Best regards
>>>
>>>
>>> On Dec 4, 2007 8:16 AM, Javier Díaz Montes <javier.diaz at uclm.es> wrote:
>>>
>>>> Hi,
>>>>
>>>> I have a problem when I run Molpro on several nodes of a cluster: I
>>>> cannot kill a job cleanly.
>>>> If I kill the job with SGE's qdel command, some Molpro processes
>>>> remain running. I have seen that Molpro creates 2 processes of
>>>> molprop_2006_1_i4_p4_mpi.exe on each node, one running and the other
>>>> sleeping. When I kill the job, the running processes are killed and
>>>> the processes that were sleeping start to run.
>>>>
>>>> Is it normal for Molpro to create 2 processes of
>>>> molprop_2006_1_i4_p4_mpi.exe on each node?
>>>> How can I kill a Molpro job? At the moment I have to kill these
>>>> processes with the pkill command on each node.
>>>>
>>>> Regards,
>>>> Javi
>>>>
>>>>
>>>> --
>>>> +---------------------------------------------------------------+
>>>> Javier Diaz Montes
>>>> PhD Candidate
>>>> Grupo de Quimica Computacional y Computacion de Alto Rendimiento.
>>>> Departamento de Tecnologias y Sistemas de Informacion.
>>>> Escuela Superior de Informatica.
>>>> Universidad de Castilla-La Mancha.
>>>> Paseo de la Universidad, 4; 13071 Ciudad Real; SPAIN
>>>> Tel.: 34-926295300; Ext: 3724
>>>> e-mail: javier.diaz at uclm.es
>>>> +---------------------------------------------------------------+
>>>>