Unable to get a cisTEM job to parallelize on multiple nodes in cluster

saikat
Hi,

I am trying to run cisTEM after requesting an interactive session over multiple nodes in our cluster. The jobs run only on the processors of a single node and do not use the other nodes.

The cluster file system is GPFS, and we submit jobs through PBS.

Do you have any suggestions for running cisTEM jobs effectively across multiple nodes?

Thanks

timgrant
Hi, 

If using a PBS cluster, can't you just qsub all the jobs as single jobs and not worry about running an interactive session?  

I think if you want to run in an interactive session this way, you will need a run profile that explicitly ssh's to the machines directly. However, each time you start a session the machines will likely be different, so this isn't really optimal.
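
To see which machines a given interactive session actually received, something like the following might help. It is only a sketch: it assumes the scheduler sets the usual PBS_NODEFILE variable inside the session, and list_session_nodes is a made-up helper name. The host names it prints are the ones an ssh-based run profile would have to point at, and they will change from session to session.

```shell
#!/bin/bash
# Print each node allocated to the current PBS session once.
# Assumption: $PBS_NODEFILE names a file with one host name per line,
# typically repeated once per allocated core on that host.
list_session_nodes() {
    local nodefile="${1:-${PBS_NODEFILE:-}}"
    if [ -z "$nodefile" ] || [ ! -r "$nodefile" ]; then
        echo "no readable node file (not inside a PBS session?)" >&2
        return 1
    fi
    # sort -u collapses the repeated entries into one line per host
    sort -u "$nodefile"
}
```

Run list_session_nodes with no arguments inside the interactive session, or pass a file path explicitly to try it elsewhere.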

Tim

saikat
Hi Tim,

Thanks for your response.

So far I have been running the job using the GUI after requesting the interactive session.

What is the best way to launch it as conventional jobs? Do you have a template job file that you could share?

Thanks

saikat
job

Hi Tim,

The reason I asked the above question about a job file is that we had earlier tried to launch jobs on multiple nodes from the cisTEM GUI with the following job template file:

#!/bin/csh
#PBS -l nodes=3:ppn=40
module load cistem
$variable

We specified this job file through the Manager Command in the Settings window of cisTEM.

Unfortunately, this did not get the job launched across the multiple requested nodes.

So any suggestions will be really useful.

Thanks

timgrant
Hi,

I think you need to launch each job as 1 node, 1 ppn. I have not done this on PBS explicitly, but I imagine you need to make a script called "submit_cistem_job.b" or similar that looks something like this:

#!/bin/csh
#PBS -l nodes=1:ppn=1
/path_to_cistem_binaries/$1 $2 $3 $4

Then your run command would be something like "qsub submit_cistem_job.b $command"

Your manager command would depend on exactly where you are running the GUI. If you are running on the head node, it would just be "/path_to_cistem_binaries/$command"; otherwise you may need to change it to ssh to the head node, etc.

You will probably need to edit this somewhat, but I hope it gives you the general idea. 
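
In a script like this, $1 to $4 are simply the first four whitespace-separated words that follow the script name on the command line: $1 is the program to run and the rest are its arguments, so whatever cisTEM substitutes for $command ends up there. The sketch below shows the split in bash with invented values (the address, port and code are placeholders, not necessarily what cisTEM sends), and show_args is a hypothetical helper.

```shell
#!/bin/bash
# Hypothetical stand-in for the submit script: report how the shell
# splits its command line. The values used below are invented.
show_args() {
    echo "program (\$1): $1"
    shift    # drop the program name; the remaining words are its arguments
    echo "remaining arguments (\$2 onwards): $*"
}

show_args refine3d 10.0.0.5 3000 0123456789
```

In bash, "$@" forwards every argument however many there are, which avoids hard-coding exactly four.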

Please feel free to ask questions if this doesn't make sense.

Cheers,

Tim

saikat
Thanks Tim

I will try it and let you know.

-Saikat

saikat
Hi Tim,

A quick question regarding $1, $2, $3 and $4. I guess $1 is the command itself, but what are $2, $3 and $4?

Is there a one-liner command that will get rid of the commas?

Thanks

rnavaza
A way to make cisTEM work with Torque

Hi,

I got cisTEM to work on multiple nodes in a cluster managed with Torque. Here's the way to do it:

0) Verify that the cluster is set up correctly: the same home folder for your user account on every node, ssh without a password between the submit host and the nodes, the qsub command in PATH, and so on.

1) Now create a script (that I called submit.sh) with the content:

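#!/bin/bash
# Wrap the cisTEM command line received as arguments in a one-core PBS job.
# EOF is unquoted, so this shell expands ${1} and ${@} before qsub reads the text.
cat - <<EOF | qsub
#!/bin/bash
#PBS -N cisTEM.${1}
#PBS -l nodes=1:ppn=1
/your_path_to_cisTEM_binaries/${@}
EOF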

2) Make that file executable:

chmod 755 submit.sh

3) In the cisTEM settings, add a new "Run Profile" with the following parameters:

Manager Command: /your_path_to_cisTEM_binaries/$command
Gui Address: Automatic
Controller Address: Automatic
Command -> Edit:
      Command: /your_path_to_your_script/submit.sh $command
      No. Copies: (Total number of CPUs you want to allocate in the whole cluster)
      Delay (ms): 10
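
To check what actually reaches qsub before submitting anything, the here-document from step 1 can be printed instead of piped. The following is only a sketch of that idea: render_job is a hypothetical name, the arguments are invented, and the binaries path is the same placeholder as above.

```shell
#!/bin/bash
# Dry run of the submit.sh idea: build the same job text, but print it
# instead of piping it to qsub.
render_job() {
    cat <<EOF
#!/bin/bash
#PBS -N cisTEM.${1}
#PBS -l nodes=1:ppn=1
/your_path_to_cisTEM_binaries/${@}
EOF
}

# Invented arguments, only to show how ${1} and ${@} are expanded.
render_job refine3d 10.0.0.5 3000 0123456789
```

Each of the "No. Copies" calls to submit.sh produces one such single-core job, which leaves it to the scheduler to spread the copies over whichever nodes have free cores.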

That's all. Hope this helps ;-)
