EMAN2 box files to Particle Position Assets text file conversion

kushalsejwal

Dear All,

I am looking for a simple script or bash command that can batch convert the box files generated in EMAN2 into one big Particle Position Asset text file that cisTEM can read, according to the format mentioned in the FAQ ( https://cistem.org/documentation#tab-1-3 ).

Has anybody tried this already? Thanks in advance.

Best,

Kushal

timgrant
Hi Kushal,

We do not have such a script - although I think it shouldn't be too difficult to create.

If you do give it a go, I'm happy to help out with any questions you have.  If you manage to get one working, please post your results here.

Cheers,

Tim

kushalsejwal
Hi Tim,

With my limited Python programming skills, I came up with the following script, which gets the work done (I am sure there is a more efficient way to do this in fewer lines with Bash commands). It generates one .txt file containing the coordinates of all the particles from all micrographs (picked in EMAN2's boxer). However, when I check the particle stack generated by cisTEM, it doesn't seem to have picked the correct particles: I do not see my particles in particle_stack_0.mrc, which appears to contain random background. Does cisTEM define the coordinates differently from EMAN2?

Here is the script : 

___________________________

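import glob
import os

with open('coordinates.txt', 'w') as outfile:
    for filename in glob.glob("*.box"):
        # Swap only the extension; replace("box", "mrc") would also alter "box" elsewhere in the name.
        micrograph = os.path.splitext(filename)[0] + ".mrc"
        with open(filename) as infile:
            for line in infile:
                # Write the first two tab-separated columns of each line unchanged.
                outfile.write(micrograph + " " + " ".join(line.split("\t")[:2]) + "\n")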

___________________________

Best,

Kushal

timgrant
Hi Kushal,

One complication may be that the cisTEM co-ordinates are stored in angstroms, not pixels.  I think the EMAN positions may be stored in pixels, so you would need to multiply the position by the pixel size.

Also, I remember that EMAN used to store the co-ordinate of the corner, and then size of the box (I'm not sure if this is still true).  cisTEM needs the centre of the box.
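The two adjustments described above could be sketched in Python as follows. This is only an illustration under stated assumptions: it assumes each box line holds four whitespace-separated values (corner x, corner y, box width, box height) in pixels, which should be checked against the box files actually produced, and it keeps the "micrograph x y" output layout of the script posted earlier in this thread, which should be checked against the FAQ. The function names and the two-decimal output are hypothetical choices, not part of EMAN2 or cisTEM.

```python
import glob
import os


def box_line_to_centre_angstrom(line, pixel_size):
    """Turn one 'x y width height' box line into a centre position in angstroms.

    Assumes x and y are the box corner in pixels (an assumption to verify).
    """
    x, y, width, height = (float(value) for value in line.split()[:4])
    centre_x = (x + width / 2.0) * pixel_size
    centre_y = (y + height / 2.0) * pixel_size
    return centre_x, centre_y


def convert_box_files(pattern, pixel_size, out_path):
    """Write one 'micrograph x y' line per particle for all matching box files."""
    with open(out_path, "w") as outfile:
        for box_file in sorted(glob.glob(pattern)):
            # Assumes the micrograph shares the box file's name with an .mrc extension.
            micrograph = os.path.splitext(box_file)[0] + ".mrc"
            with open(box_file) as infile:
                for line in infile:
                    if len(line.split()) < 4:
                        continue  # skip blank or incomplete lines
                    x, y = box_line_to_centre_angstrom(line, pixel_size)
                    outfile.write("%s %.2f %.2f\n" % (micrograph, x, y))
```

For example, convert_box_files("*.box", 1.5, "coordinates.txt") would process every box file in the current directory with a pixel size of 1.5 angstroms; the pixel size here is a made-up value.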

Cheers,

Tim

kushalsejwal
Hi Tim,

Thanks for the reply. Indeed, I had to incorporate these two changes into the script. For storing the coordinates in Angstroms, I simply set the Apix value to 1. When I now import this new coordinate file and generate a new refinement package based on it, I see all my particles in the Display Stack GUI. But I still have some issues:

1) In the display stack, the particles are colored black. I have negative stain data, and while importing the micrographs I ticked "Particles are white". Is it a bug? Will this affect 2D classification?

2) When I use this particle stack (~2000 particles) for 2D classification, after the random start, all the classes look grey (see screenshot: https://ibb.co/heQayb)

 

This is a small negative stain dataset and I do not wish to do the CTF correction.

Best,

Kushal

timgrant
Hi Kushal,

cisTEM expects protein to be black, so if you tick "particles are white" the contrast will be inverted.  This is the expected result, so that all sounds fine.

There is no way to turn off CTF correction at the moment.  What defocus values do you have for the particles that you imported?

Cheers,

Tim

kushalsejwal
Hi Tim,

I have a rather large defocus for negative stain particles, and the particle density is very high. I am not targeting 3D and only wish to do 2D classification with the dataset.

I have now run the CTF estimation in cisTEM, and 2D classification subsequently seems to work fine.

So in principle the particle coordinates picked with EMAN2 work well with cisTEM.

Thanks to you and the whole cisTEM team for the support and for wonderful software.

Best,

Kushal

Arne
Order of micrographs

Dear Tim and all, 

 

We are trying something similar, and it works for a single micrograph, or for several if their order is exactly as stored in the database. However, this is not necessarily always the case: if micrographs were imported from various folders, the order in the database may not correspond to the logical order (for example, alphabetical).

Is there a quick way to obtain a list with micrograph id and micrograph name from cisTEM? 

This would resolve our issue.

 

many thanks

Arne

 

Arne
Order of micrographs

OK, I just found that you can also import using the micrograph name, so the identifier is not needed.

 

Nevertheless, is it possible to generate a list with ID and micrograph name?

 

cheers

Arne

 

timgrant
This is listed in the image assets panel.  If you want to output it as text, the only way would be to directly access the information in the database.  The following command would do this on the command line :-

sqlite3 name_of_database.db "select image_asset_id, filename from image_assets;"
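For anyone who would rather do this from a script, the same query can be run with Python's built-in sqlite3 module. This is a minimal sketch: the table and column names are the ones used in the command above, while the function name and the database path are hypothetical.

```python
import sqlite3


def list_image_assets(db_path):
    """Return (image_asset_id, filename) pairs from a cisTEM project database."""
    conn = sqlite3.connect(db_path)
    try:
        cursor = conn.execute("select image_asset_id, filename from image_assets")
        return cursor.fetchall()
    finally:
        conn.close()
```

Calling list_image_assets("name_of_database.db") returns a list of tuples, which can then be printed or written to a text file. It is probably safest to run this on a copy of the database, or while the project is closed.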

Cheers,

Tim

 

dovile
The list of selected micrographs

Dear Tim,

How can I modify this command (sqlite3 name_of_database.db "select filename from image_assets") to list only the selected micrographs that have been placed in a new image group (for example, one called ctf-better-than-4) rather than all micrographs?

Cheers,

Dovile

 

timgrant
Hi Dovile,

First you need to know the group ID for the group you want.  If you run :-

sqlite3 my_database.db "select * from image_group_list;"

you will get a list of the image groups; the ID is in the first column.

Then if you run the command below replacing $group_id (it appears twice) with the id of the group you want, you will get the filenames you want.

sqlite3 my_database.db "select filename from image_assets, image_group_$group_id where image_assets.image_asset_id = image_group_$group_id.image_asset_id;"

e.g. if you want the filenames of all the images in group 1 :-

sqlite3 my_database.db "select filename from image_assets, image_group_1 where image_assets.image_asset_id = image_group_1.image_asset_id;"
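The two steps can also be combined in a short Python sketch that looks the group up by name. It relies only on what is described above (the ID is in the first column of image_group_list, and the members are in a table called image_group_ followed by that ID). It additionally assumes that the group name appears in one of the other columns of image_group_list, which has not been checked here. The function name is hypothetical.

```python
import sqlite3


def filenames_in_group(db_path, group_name):
    """Return the filenames of all images in the image group called group_name."""
    conn = sqlite3.connect(db_path)
    try:
        group_id = None
        for row in conn.execute("select * from image_group_list"):
            # The ID is the first column; the name is assumed to be in a later one.
            if group_name in row[1:]:
                group_id = int(row[0])
                break
        if group_id is None:
            raise ValueError("no image group named %r" % group_name)
        # Table names cannot be bound as parameters, so build the name from the integer ID.
        table = "image_group_%d" % group_id
        query = (
            "select filename from image_assets, {t} "
            "where image_assets.image_asset_id = {t}.image_asset_id"
        ).format(t=table)
        return [row[0] for row in conn.execute(query)]
    finally:
        conn.close()
```

For example, filenames_in_group("my_database.db", "ctf-better-than-4") would return the filenames for the group mentioned in the question, provided the assumptions above hold.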

Thanks!

Tim
