I want to be able to call cisTEM commands from the command line without using interactive mode.
I know you can call the binaries like $ ./refine3d from a terminal and get an interactive session. But this is not conducive to scripting, since the arguments have to be passed in individually.
I think this must be possible because the way the GUI calls commands (from "Run Profiles") appears to be by issuing some bash command. However, the contents of $command are not useful for running jobs independently: they contain only some IP addresses and a number.
In the source code, "userinput.cpp" appears to have support for a "scripted mode", based on whether the caller is a tty. However, I am not sure whether this is implemented.
My go-to approach is:
1) open a new file in gedit
2) add #!/bin/bash
3) run the CLI program interactively
4) kill it just after entering the last option
5) copy the prompt & answers from the terminal to the text editor
6) delete all the prompts, but keep the answers, in order, one per line; watch for trailing ":"
7) anything I'm likely to vary, I replace with a variable. E.g. if my answer for pixel size was 1.2, I would instead put $pixelSize and at the top of the file define pixelSize=$1 (or a fixed value)
8) on the line just prior to your first option:
program-name << eof
9) on the line following your last answer:
eof
10) there are a number of ways to then run this in parallel; my favorite is the command-line tool "gnu parallel", which you can find plenty of info for on the web
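To make the steps above concrete, here is a minimal sketch of the here-document wrapper. The function name feed_answers, the two answers (an input stack and a pixel size) and their order are only placeholders I made up for illustration; the real list and order have to come from running the program interactively, as in steps 3 to 6.

```shell
#!/bin/bash
# Sketch only: feed pre-recorded answers to an interactive program on stdin.
# The answers and their order are hypothetical placeholders, NOT the real
# prompt order of any cisTEM program.
feed_answers() {
    local program="$1"       # e.g. ./refine3d, or cat for a dry run
    local input_stack="$2"   # hypothetical first answer
    local pixel_size="$3"    # hypothetical second answer
    # Everything between "<< eof" and the closing "eof" is sent to the
    # program's standard input, one answer per line.
    "$program" << eof
$input_stack
$pixel_size
eof
}
```

Calling feed_answers cat particles.mrc 1.2 just prints the answers back, which is a cheap way to check what the program would receive before pointing it at a real binary.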
Thanks himesb for your response. I think this can work for me.
I can see how gnu parallel might be used to dispatch many different refinement jobs, but I'm wondering what needs to be done to have multiple workers process the same job -- I'm not sure that dispatching N jobs with the same parameters will actually allow multiprocessing of the same refinement. Any thoughts?
You can use the inputs "first particle to refine" and "last particle to refine" to split the job up. I.e. if you have 10000 particles and want to run 10 processes, you run the job 10 times, but with first/last particle set to 1-1000, 1001-2000, 2001-3000 etc. You then need to append all the resulting .par files together to get the final output.
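As a rough sketch of the splitting described above, the helper below prints one "first last" pair per process. The name particle_ranges is made up for this example, and it assumes the number of processes is not larger than the number of particles; any remainder is spread over the first few ranges.

```shell
#!/bin/bash
# Sketch only: split particles 1..total into contiguous first/last ranges,
# one range per worker. Assumes workers <= total.
particle_ranges() {
    local total="$1" workers="$2"
    local base=$(( $total / $workers ))    # minimum particles per worker
    local extra=$(( $total % $workers ))   # leftover particles to distribute
    local first=1 last size i=0
    while [ "$i" -lt "$workers" ]; do
        size=$base
        # The first "extra" workers each take one additional particle.
        if [ "$i" -lt "$extra" ]; then size=$(( base + 1 )); fi
        last=$(( first + size - 1 ))
        echo "$first $last"
        first=$(( last + 1 ))
        i=$(( i + 1 ))
    done
}
```

For example, particle_ranges 10000 10 gives 1 1000, 1001 2000, and so on up to 9001 10000. Each pair would then be used as the "first particle to refine" and "last particle to refine" answers for one run.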
I would also like to be able to run things like a long series of 2D class averaging runs "headless" on our HPC using an sbatch script. If there is a way to do this, or a place to find the commands somewhere in the docs, that would be really helpful.
Right now, there is no way to run cisTEM outside of the GUI in a way that will integrate with a cisTEM project. You can run the programs, but then you have to keep track of the data and do all the parallelisation yourself with scripting. I'm afraid there is not really any documentation for this. However, if you run a program yourself it will ask you for inputs, and some brief help is available by answering ? to a question. For 2D classification, you would be running the refine2d program, and if you want to do parallelisation, then you will also need the merge2d program.
Is there a reason you cannot run cisTEM on your cluster with the jobs launched through the GUI?