Calculation thread has been waiting

Daniel Asarnow
I'm now seeing another error leading to hung jobs:

"Calculation thread has been waiting for something to do for 30.000000.2 seconds - going to finish"

The behavior is similar to "slave disconnected" in that there is one message for each hung job, and the hung jobs never complete. Again, I'm not ready to blame cisTEM, but I'd appreciate any insight into why the error could be thrown.



Hi Daniel,

This means that there has been no communication between the master and the job for over 30 seconds, so the job kills itself.
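
To make the mechanism concrete, here is a minimal Python sketch of an idle timeout of this kind. It is not cisTEM's actual code; the names `calculation_thread`, `run_demo`, and `idle_timeout` are made up for illustration, and the real program's internals may differ. The worker blocks waiting for work, and if nothing arrives within the timeout it logs a message and exits.

```python
import queue
import threading


def calculation_thread(jobs, idle_timeout, log):
    """Process work items until none arrives within idle_timeout seconds.

    Hypothetical illustration only, not cisTEM's implementation.
    """
    while True:
        try:
            # Block until the master hands over work, or give up after the timeout.
            item = jobs.get(timeout=idle_timeout)
        except queue.Empty:
            log.append(
                f"waited {idle_timeout} seconds for something to do - going to finish"
            )
            return
        log.append(f"processed {item}")


def run_demo(idle_timeout=0.2):
    """Send one item, then go silent so the worker times out and exits."""
    jobs = queue.Queue()
    log = []
    jobs.put("job 1")  # the master sends a single piece of work
    worker = threading.Thread(
        target=calculation_thread, args=(jobs, idle_timeout, log)
    )
    worker.start()
    worker.join()  # nothing else arrives, so the worker exits on its own
    return log
```

Calling `run_demo()` returns a log with one processed item followed by the timeout message. In this picture, anything that delays the master's next message past the timeout (for example a stalled network connection) would produce the same exit.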

I guess this points to a possible network problem - is this on a cluster?


Daniel Asarnow
Yes, it is, with all storage on large traditional RAIDs mounted over NFS with multiple simultaneous users. I'm going to get the admins to watch carefully while I try to induce the error.

