GUI crashes while running automatic and manual refinements

kbasore
Hello!

I've received great results so far with cisTEM, but haven't been able to successfully run an automatic or manual refinement. When I try to start a refinement, the upper left panel that usually shows what exact job is running remains blank, and after a few seconds, the GUI crashes. In the terminal where I opened the GUI, I usually see 'Segmentation fault (core dumped)'. It doesn't look like I'm hitting the memory limit on the cluster I'm using (Torque). Any idea what could be causing this?

Thanks!

-Katherine

timgrant
Hi Katherine,

Sorry to hear you are having problems.  I take it you have managed to run the earlier jobs ok using the same run profile?

One way to get to the bottom of this is if you could run a debug version and give us the backtrace output. Can you download a debug version of the cisTEM gui from here :-

http://grigoriefflab.janelia.org/sites/default/files/timgrant_18-04-25_5...

unzip it with :-

gzip -d cisTEM_debug.gz

then run it in GDB :-

gdb ./cisTEM_debug

At this point you need to type "run".  The gui should then run, and you should be able to open your project.  When it crashes, please type "bt" in the window and post the output.
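
Putting the steps together, the whole session should look something like this (the "(gdb)" prompt is printed by gdb itself; if the file isn't executable after decompressing, you may need to chmod +x cisTEM_debug first):

gzip -d cisTEM_debug.gz
gdb ./cisTEM_debug
(gdb) run
(gdb) bt

"run" is where you reproduce the crash in the gui; "bt" afterwards prints the backtrace to copy and paste here.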

Thanks!

Tim

kbasore
Hi Tim,

Thanks for the help! And yes, I'm able to run earlier jobs ok.

Here's the output from the debug version. I tried running auto refine, and the GUI didn't crash, but it stopped responding (plus no jobs ever initiated). After I forced it to quit, I typed bt and got this:

(timgrant_18-04-25_5920_cisTEM_debug:8948): Gtk-CRITICAL **: IA__gtk_tree_row_reference_new_proxy: assertion `path->depth > 0' failed
Warning: copying a resolution statistics object
Warning: copying a resolution statistics object
[New Thread 0x2aaabbb48700 (LWP 9503)]

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x2aaabbb48700 (LWP 9503)]
Image::AddFFTWPadding (this=0x510) at core/image.cpp:4326
4326 in core/image.cpp
(gdb) bt
#0  Image::AddFFTWPadding (this=0x510) at core/image.cpp:4326
#1  0x0000000000758e5c in Image::ReadSlices (this=0x510, input_file=0x510, start_slice=648, end_slice=0) at core/image.cpp:4158
#2  0x000000000044f75f in AutoMaskerThread::Entry (this=0x510) at gui/my_controls.cpp:1639
#3  0x0000000000f58350 in wxThread::CallEntry() ()
#4  0x0000000000f5c4e3 in wxThreadInternal::PthreadStart(wxThread*) ()
#5  0x0000000000f5481c in wxPthreadStart ()
#6  0x0000003c72607aa1 in start_thread () from /lib64/libpthread.so.0
#7  0x0000003c71ae8bcd in clone () from /lib64/libc.so.6

I get the same output when I try a manual refinement (global). Hope this helps!

Best,

Katherine

timgrant
Hi Katherine,

Looks a bit like a memory problem. What is the box size of the particles, and how much memory is on the machine that you are running the GUI on?

Tim

kbasore
Hi Tim,

The box size associated with the dataset I'm using to debug is big (1296 pixels). This is much larger than necessary, so over the weekend I re-ran the ab-initio reconstruction with a box size of 600 pixels on a different dataset (I can't go smaller than this for either dataset). I tried running the refinement steps afterward and had the same issue of no jobs initiating. I'm currently running jobs with that db file, so I want to hold off on running the debug version with it. I imagine I'd get the same output.

I'm running my jobs on a cluster, and I'm limited to 16GB of RAM. Is this not enough? The machine has much more, and perhaps I can request more. 

Best,

Katherine

timgrant
Hi Katherine,

The info you sent me suggests it is crashing in the automasking step. This is done by the GUI process, so it is the memory of the machine that the GUI is running on that matters.  Are you running that on the head node of the cluster, for example?  For 600 pixels, I believe you would need ~2.5GB of memory; for 1296 you would need ~25GB of memory.
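
As a rough back-of-the-envelope check (assuming the masking step holds on the order of three single-precision copies of the full volume in memory, which is an estimate rather than an exact figure from the code):

600^3 voxels x 4 bytes x ~3 copies ≈ 2.6 GB
1296^3 voxels x 4 bytes x ~3 copies ≈ 26 GB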

If you turn off automasking in the expert options, it will skip this step, and you will likely be able to proceed, although you will also need a decent amount of memory for the reconstruction in the refinement steps.
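
If you are not sure how much memory is actually available on the machine the GUI is running on, a quick check on any Linux box (a standard command, nothing cisTEM-specific) is:

free -h

which reports total, used, and available RAM.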

Tim

kbasore
Hi Tim,

Turning off automasking did the trick! Auto refine jobs are running now! I am running the GUI on the login node of the cluster, where I have 16GB of memory, so it sounds like my dataset with the smaller box size should be able to complete (I'll give it a try once I'm done with my ab-initio reconstruction). Thank you so much for your help! I'll let you know if I run into any issues.

Best,

Katherine
