SLURM jobs

jelka
Hi,

First of all, thank you so much for releasing cisTEM; it looks very promising.

I am trying to set cisTEM up with our queuing system, SLURM. In the documentation (https://cistem.org/documentation#tab-1-15) there is some information on how to do this.

However, the described approach seems to submit every single process as an individual SLURM job, resulting in multiple nodes being fired up to run only a single process each.

This seems suboptimal and, on top of that, adds a significant load on the SLURM head node.

To make better use of resources, one could create a special partition (e.g. "cisTEM") in SLURM that enables job sharing on all the CPU threads (e.g. "SHARED=YES:72" if the machines in the partition have 72 CPU threads each). The cisTEM job command would then look something like this: "srun -n 1 --share -p cisTEM -o /dev/null /cisTEM_bin_directory/$command".
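As a rough sketch of the setup described above, the partition line and the per-process command could look like the following. The node names and the helper function `cistem_srun_cmd` are hypothetical and only there for illustration; the "Shared" and "--share" spellings are the ones used in the post, and newer SLURM releases, as far as I know, call these "OverSubscribe" and "--oversubscribe".

```shell
# slurm.conf fragment (node names are made up; "Shared" is the spelling from the post):
#   PartitionName=cisTEM Nodes=node[01-04] Shared=YES:72 State=UP

# Hypothetical helper that assembles the per-process command from the post.
# It only prints the command line; it does not submit anything.
cistem_srun_cmd() {
    partition="$1"   # e.g. cisTEM
    bin_dir="$2"     # e.g. /cisTEM_bin_directory
    worker="$3"      # stands in for what cisTEM substitutes for $command
    printf 'srun -n 1 --share -p %s -o /dev/null %s/%s\n' \
        "$partition" "$bin_dir" "$worker"
}
```

Calling `cistem_srun_cmd cisTEM /cisTEM_bin_directory worker` prints the command line so it can be checked before it is pasted into a cisTEM run profile.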

This would work to some degree, but it would still generate a lot of noise in the queue and a lot of crosstalk if other cisTEM-SLURM jobs are started simultaneously.

At the moment we are running cisTEM through "srun.x11", which gives the option to run GUI jobs on an allocated SLURM node. The only downside is that users need to close cisTEM and exit the terminal for the SLURM allocation to terminate, so a reasonable wall time needs to be set.

I know cisTEM is not OpenMPI compatible, but it would be nice if it were possible to submit a single srun/sbatch job to a single node that uses all the threads on that node.

In other words, "No. Copies #" would be replaced by "-n #". Does that make sense?
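To picture the single-job idea, here is a minimal sketch of a batch script that reserves one node and starts several copies of a command inside that one allocation. The helper name `run_copies`, the 72-thread count, and the partition name are assumptions for illustration; `$command` is the placeholder from the post. Nothing in this thread confirms that cisTEM's run profiles can drive a script like this as-is.

```shell
#!/bin/sh
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=72
#SBATCH --partition=cisTEM
#SBATCH --output=/dev/null

# Hypothetical helper: start N background copies of a command, then wait
# until all of them have exited so the allocation is not released early.
run_copies() {
    n="$1"
    shift
    i=0
    while [ "$i" -lt "$n" ]; do
        "$@" &
        i=$((i + 1))
    done
    wait
}

# Inside a real job this might be called as (left commented out here):
# run_copies "${SLURM_CPUS_PER_TASK:-1}" /cisTEM_bin_directory/$command
```

This way the queue sees a single job per node instead of one job per process, which is the load reduction the post is after.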

I was hoping that everybody running cisTEM on SLURM could comment on my thoughts, especially if you have a better solution for running cisTEM through SLURM.

Cheers,

Jesper

 

timgrant
Hi Jesper,

I don't know if you've seen it, but there is some information from another user (Craig Yoshioka) with a SLURM cluster here:

https://cistem.org/frequently-asked-questions#tab-1-3

In a future version we will incorporate a more flexible run profile system, which will allow you to specify the number of jobs that each run profile contributes, and add a $number_of_jobs variable.  This way, you could even use mpirun to launch the jobs.

Cheers,

Tim

 

jelka
Thank you so much Tim, 

This works excellently.

Sorry, I did not see the FAQ part on SLURM.

Cheers, 

Jesper 
