I have been learning to use cisTEM recently. It runs very well interactively, but I am having trouble running it on our cluster with the Slurm scheduler.
We set it up with our system administrator as explained under the simple configuration in the FAQ. Jobs start normally and the initial batches of Slurm jobs complete successfully, but then the run usually crashes and gives only a “Master socket disconnected” error message (there is usually no red warning message inside the GUI). The crash occurs at a random point, and only when running 3D refinements with my dataset or ab-initio 3D reconstructions with the apoferritin test dataset. Movie alignment, CTF estimation, and 2D classification all worked with the test dataset under Slurm. I have already tried different partitions, numbers of cores, memory requirements (between 4G and 20G), and delay times (between 1 s and 5 s) without success. Any advice is appreciated.