SLURM jobs

Hi,

First of all thank you so much for the release of cisTEM, it looks very promising.

I am trying to set cisTEM up with our queuing system, SLURM. In the documentation (https://cistem.org/documentation#tab-1-15) there is some information on how to do this.

However, the described approach seems to submit every single process as an individual SLURM job, resulting in multiple nodes being fired up to run only a single process each.

This seems suboptimal and on top of that, it adds a significant load on the SLURM head node.

To make better use of resources, one could create a special partition (e.g. "cisTEM") in SLURM that enables job sharing across all the CPU threads (e.g. "SHARED=YES:72" if the machines in the partition have 72 CPU threads each). The cisTEM job command would then look something like this: "srun -n 1 --share -p cisTEM -o /dev/null /cisTEM_bin_directory/$command".
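To make the command above easier to adapt, here is a minimal sketch of a helper that only prints the line one would paste into the run profile; it does not submit anything. The function name `shared_srun_line` is hypothetical, and the partition name and install path are the placeholders from the post. On recent SLURM releases the `--share` flag has, as far as I know, been renamed to `--oversubscribe`, so the flag may need adjusting for your site.

```shell
#!/bin/sh
# Hypothetical helper (not part of cisTEM or SLURM): prints the text for the
# "Command" field of a cisTEM run profile. Nothing is submitted to SLURM here.
shared_srun_line() {
    partition="$1"   # placeholder, e.g. the shared partition "cisTEM"
    bin_dir="$2"     # placeholder, e.g. /cisTEM_bin_directory
    # '$command' stays literal on purpose: cisTEM fills it in, not the shell.
    printf 'srun -n 1 --share -p %s -o /dev/null %s/$command\n' "$partition" "$bin_dir"
}

shared_srun_line cisTEM /cisTEM_bin_directory
```

Each copy started by cisTEM would still be its own SLURM job with this line; sharing only lets several of those jobs land on the same node.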

This would work to some degree, but it would still generate a lot of noise in the queue and a lot of cross talk if other cisTEM-SLURM jobs are started simultaneously.

At the moment we are running cisTEM through "srun.x11", which gives the option of running GUI jobs on an allocated SLURM node. The only downside is that users need to close cisTEM and exit the terminal for the SLURM allocation to terminate. Because of this, a reasonable wall time needs to be set.

I know cisTEM is not OpenMPI compatible, but it would be nice if it were possible to submit a single srun/sbatch job to a single node that uses all the threads on that node.

In other words, "No. Copies #" would be replaced by "-n #". Does that make sense?
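A minimal sketch of that idea, assuming a single srun call is allowed to start all copies on one node: `srun -N 1 -n <copies>` asks SLURM for one node and launches that many tasks of the same command. The helper name `single_node_srun_line` is hypothetical, and the partition and path are the placeholders used earlier. Whether cisTEM accepts all of these workers from one run profile line has not been tested here, so treat this as an illustration of the proposal rather than a working setup.

```shell
#!/bin/sh
# Hypothetical helper (not part of cisTEM or SLURM): prints one srun line that
# requests a single node and starts "$copies" tasks of the same command on it.
single_node_srun_line() {
    copies="$1"      # the number that would otherwise go into "No. Copies"
    partition="$2"   # placeholder partition name
    bin_dir="$3"     # placeholder install directory
    # Reject empty, non-numeric, or zero values before printing anything.
    case "$copies" in
        ''|*[!0-9]*|0) echo "copies must be a positive integer" >&2; return 1 ;;
    esac
    # '$command' stays literal on purpose: cisTEM fills it in, not the shell.
    printf 'srun -N 1 -n %s -p %s -o /dev/null %s/$command\n' "$copies" "$partition" "$bin_dir"
}

single_node_srun_line 72 cisTEM /cisTEM_bin_directory
```

With this form the queue would show one job per run instead of one job per process, which is the reduction in head node load the post is after.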

I was hoping that everybody running cisTEM on SLURM could comment on my thoughts, especially if you have a better solution for running cisTEM through SLURM.

Cheers,

Jesper

Hi Jesper,

I don't know if you've seen it, but there is some information from another user (Craig Yoshioka) with a SLURM cluster here:

https://cistem.org/frequently-asked-questions#tab-1-3

In a future version we will incorporate a more flexible run profile system, which will allow you to specify the number of jobs that each run profile contributes, and add a $number_of_jobs variable. This way, you could even use mpirun to launch the jobs.

Cheers,

Tim

Thank you so much Tim,

This works excellently.

Sorry I did not see the FAQ part on SLURM.

Cheers,

Jesper