Name
e2boxer.py - particle picking utility (This page is still under construction)
Usage
This page focuses almost exclusively on the Gauss mode of the e2boxer tool, which includes both automatic particle picking and CTF estimation in GUI and command-line mode. The Gauss mode in e2boxer is intended to replace the functionality of sxboxer, which will soon become obsolete. For additional information on the e2boxer tool, see EMAN2's e2boxer help page.
e2boxer.py encapsulates several functionalities
1. Interactive particle picking using GUI
Run e2boxer.py as follows:
e2boxer.py mic0.hdf &
The GUI will automatically start up. Select 'manual' in the drop down menu labeled Current Boxing Tool and use your mouse to select particles by clicking on the micrograph window.
2. Automated particle picking using Gauss convolution method via GUI
Run e2boxer.py as above and select Gauss in the Current Boxing Tool drop down menu. Set the desired parameters listed under Parameters of Gauss convolution and press the Run button to start automatic particle picking. The boxed particles will appear in the Particles window. To write the particle images to disk, use the Write output button. This saves the particle images to bdb:particles#micname_ptcls, where micname is the name of the micrograph from which the particles were picked. To redo autoboxing with different parameters, use the Clear Boxes button, change the parameters, and press the Run button again.
3. Command-line (batch) automated particle picking (no GUI) using Gauss convolution method:
There are two ways to do this:
The recommended way is to run automated particle picking for a micrograph, say mic0.hdf, using Gauss convolution method via GUI as described above. This will cause the particle picking parameters for mic0.hdf to be saved to the database. Next, invoke e2boxer.py with the option --gauss_autoboxer set to mic0.hdf:
e2boxer.py --gauss_autoboxer=mic0.hdf --write_ptcls --boxsize=64 mic*.hdf
In the example above, mic*.hdf matches all micrographs in the current directory whose names start with mic and have the hdf extension. The --write_ptcls flag causes the boxed particles to be saved to the particles directory in the current directory: particles picked from mic0.hdf are saved to bdb:particles#mic0_ptcls, those from mic1.hdf to bdb:particles#mic1_ptcls, etc.
Alternatively, the GUI can be avoided altogether by filling in the required parameters in the database manually using python commands:
from EMAN2db import db_open_dict  # EMAN2 database access
gbdb = db_open_dict('bdb:e2boxercache#gauss_box_DB')
gbdb['testingtesting'] = {'gauss_width':1.0, 'pixel_input':5.2, 'pixel_output':5.2, 'thr_low':1.0, 'thr_hi':30.0, "invert_contrast":False, "use_variance":True}
Next, invoke e2boxer.py with the option --gauss_autoboxer set to the key the parameters were saved under:
e2boxer.py --gauss_autoboxer='testingtesting' --write_ptcls --boxsize=64 mic*.hdf
4. CTF estimation, in GUI mode (only defocus, it is not recommended in sparx to use envelope functions).
Start e2boxer.py in gui mode:
e2boxer.py mic0.hdf &
and select Gauss under the Current Boxing Tool drop down menu. Adjust or enter the CTF parameters, then use the Estimate CTF button to estimate the defocus and the Inspect CTF button to examine the CTF estimation result.
5. CTF estimation, in command-line mode (only defocus, it is not recommended in sparx to use envelope functions).
There are two ways to do this:
The recommended way is to use the GUI to adjust/enter CTF estimation parameters and then estimate CTF for a micrograph, say mic0.hdf, as described above. Next, invoke e2boxer.py with --do_ctf option set to mic0.hdf:
e2boxer.py --do_ctf=mic0.hdf mic*.hdf
The CTF estimation parameters entered for mic0.hdf in the GUI will be used to estimate CTF of mic*.hdf, and the associated CTF objects with the estimated defocus will be stored in the micrograph.
If one does not wish to invoke the GUI to enter CTF parameters, one can also fill the database manually with the desired parameters using python commands:
from EMAN2db import db_open_dict  # EMAN2 database access
gbdb = db_open_dict('bdb:e2boxercache#gauss_box_DB')
gbdb['testingtesting'] = {"pixel_input":1.84, "pixel_output":1.84, "ctf_cs":2.0, "ctf_fstart":0.02, "ctf_fstop":0.5, "ctf_ampcont":10, "ctf_volt":120, "ctf_window":512, "ctf_edge":0, "ctf_overlap":50}
Next, invoke e2boxer.py with the option --do_ctf set to the key the parameters were saved under:
e2boxer.py --do_ctf='testingtesting' mic*.hdf
6. The windowed particles must be properly normalized in order for SPARX to perform optimally. In both GUI and command-line mode, there is a choice of normalization methods. To ensure the windowed particles are normalized properly as described in Methods below, set option --norm=normalize.ramp.normvar if autoboxing in command-line mode. If autoboxing is done via the GUI, use Write output in the main window to initiate writing the particles to file. A separate Write Particle Output window will pop up, and in the drop down menu next to Normalize Images, select 'normalize.ramp.normvar'.
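When many sets of micrographs are processed, the command lines shown above can be assembled from a script. The following is a minimal sketch and not part of e2boxer itself: the helper names picking_command and ctf_command are hypothetical, and only flags that appear on this page are used. The functions only build argument lists; nothing is executed.

```python
def picking_command(key, micrographs, boxsize, norm="normalize.ramp.normvar"):
    """Build the argument list for batch Gauss autoboxing.

    key is the value given to --gauss_autoboxer: either the micrograph
    whose parameters were saved via the GUI, or a database key that was
    filled in manually.
    """
    return [
        "e2boxer.py",
        "--gauss_autoboxer=%s" % key,
        "--write_ptcls",
        "--boxsize=%d" % boxsize,
        "--norm=%s" % norm,
    ] + sorted(micrographs)


def ctf_command(key, micrographs):
    """Build the argument list for batch CTF (defocus) estimation."""
    return ["e2boxer.py", "--do_ctf=%s" % key] + sorted(micrographs)
```

On a machine with EMAN2 installed, the returned list could be passed to subprocess.run(cmd, check=True), with the micrograph list obtained from glob.glob('mic*.hdf').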
Input
- <micrograph>
- micrograph filenames to process
Output
- <particles>
- particles, either as boxed particle images or coordinates list
Parameters
Automatic particle picking and CTF estimation parameters can be adjusted in the GUI after switching to Gauss mode via the drop down menu Current Boxing Tool. They can also be set manually in the database for command-line mode (see above).
- --Box Size
- Box size in pixels
- --Use Variance
- use variance micrograph for particle picking.
- --Invert Contrast
- invert densities in micrograph(s).
- --Overlap
- (auto:grid) number of pixels of overlap between boxes. May be negative.
- --Gauss Width Adjust
- width of the Gaussian kernel used for automated particle picking.
- --Threshold Low
- low CCF threshold for automated particle picking. Particles with a CCF value below this threshold will be discarded automatically.
- --Threshold High
- high CCF threshold for automated particle picking. Particles with a CCF value above this threshold will be discarded automatically.
- --Input Pixel Size
- pixel size of the input micrograph. It is determined from the microscope magnification and the physical pixel size of the scanner. It has to be set correctly in order for CTF determination to work properly.
- --Output Pixel Size
- pixel size of the windowed particles. Note: the output pixel size must not be smaller than the input pixel size (pix_out >= pix_in). If the output pixel size is larger than the micrograph pixel size, the windowed particles will be decimated accordingly. This is highly recommended if the scanning was done with an excessively small pixel size.
- --F_start
- Lower frequency for CTF determination in [1/Å].
- --F_stop
- Higher frequency for CTF determination in [1/Å]. Only the region of the estimated power spectrum between F_start and F_stop will be used for the CTF determination.
- --Window size
- window size for power spectrum determination. Default is 512 pixels.
- --Overlap
- overlap of windows for power spectrum determination (Welch method).
- --Edge size
- edge size of individual windows disregarded during power spectrum determination.
- --Amplitude Contrast
- amplitude contrast value in %. Default is 10%.
- --Cs
- spherical aberration coefficient in mm. Default is 2.0.
- --Voltage
- microscope voltage. Default is 200kV.
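Some relationships between these parameters can be checked before a batch run. The sketch below is only an illustration with hypothetical function names (check_parameters, keep_peak). It encodes the constraints stated in the list above (output pixel size not smaller than input pixel size, F_start below F_stop, Threshold Low below Threshold High) and does not call e2boxer. Treating the threshold bounds as inclusive is an assumption.

```python
def check_parameters(pixel_input, pixel_output, f_start, f_stop, thr_low, thr_hi):
    """Validate parameter relationships and return the decimation ratio.

    The ratio pixel_output / pixel_input is the factor by which the
    windowed particles are reduced relative to the micrograph.
    """
    if pixel_input <= 0:
        raise ValueError("Input Pixel Size must be positive")
    if pixel_output < pixel_input:
        raise ValueError("Output Pixel Size must be >= Input Pixel Size")
    if not 0 <= f_start < f_stop:
        raise ValueError("F_start must be non-negative and below F_stop")
    if not thr_low < thr_hi:
        raise ValueError("Threshold Low must be below Threshold High")
    return pixel_output / pixel_input


def keep_peak(ccf, thr_low, thr_hi):
    """Return True if a peak with this CCF value survives both thresholds."""
    # Inclusive bounds are an assumption; the page only states that values
    # below the low threshold or above the high threshold are discarded.
    return thr_low <= ccf <= thr_hi
```

For example, an input pixel size of 1.84 and an output pixel size of 3.68 give a decimation ratio of 2.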
Description of typical usage
- Select a "typical" micrograph and activate e2boxer.py in GUI mode.
- If the intention is to process micrographs interactively, select Manual in the drop down menu Current Boxing Tool and use the mouse to select particles in the window that shows the micrograph. The selected particles appear in the small window, where those selected by mistake can be deleted. Once done, use the Write output button in the main window. If the CTF is needed, switch to Gauss mode before writing the images by selecting Gauss in the drop down menu Current Boxing Tool, and estimate the CTF. The resulting CTF, including the pixel size, will be added to the header of the written images. It is also possible to restart the session and pick additional particles.
- If the intention is to process micrographs automatically in command-line mode, one first has to establish (a) reasonable parameters for the automated method and (b) fstart and fstop for CTF defocus estimation. For the automated method, run the program with default parameters and examine the outcome in the small window. Some adjustment of the Gaussian width (using the slider) may improve the result. Also consider whether to use the "variance image" (see Method below) or the original image. Once this is settled, adjust the max and min thresholds for the CCF using the sliders within the Gauss Advanced panel. Unless the micrographs were collected under widely varying conditions (for example, mixing film and CCD), the CCF thresholds should hold for the entire set of micrographs. Once the thresholds are found, they can be used in subsequent automated runs of e2boxer.py.
Each time you process a micrograph with e2boxer.py and write particle images, the windowed particles are stored in a bdb file <name>_ptcls in the particles directory, where <name> is the micrograph name. To gather particles from different micrographs, you can link the individual bdb particle files into a single virtual stack (metafile) using e2bdb.py. For example, after processing micrographs mic0.hdf, mic1.hdf and mic2.hdf, e2boxer will have written the bdb files mic0_ptcls, mic1_ptcls and mic2_ptcls in the directory particles. To link them into a virtual stack named data2:
e2bdb.py bdb:particles#mic0_ptcls bdb:particles#mic1_ptcls bdb:particles#mic2_ptcls --appendvstack=bdb:.#data2
All bdb files are virtually appended to data2. Only the binary image data is linked to the virtual stack, not the headers. This means that changing header attributes in data2 does not affect the original headers of mic0, mic1 and mic2, but modifying an image in the virtual stack also modifies the original image.
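For more than a handful of micrographs, the e2bdb.py command line can be generated rather than typed. The following is a small sketch under the naming convention described above; the helper name vstack_command is hypothetical and the function only builds the argument list.

```python
def vstack_command(micrograph_names, stack="data2", db_dir="particles"):
    """Build the e2bdb.py argument list that appends particle files to a virtual stack.

    micrograph_names are micrograph names without extension, e.g. "mic0".
    """
    sources = ["bdb:%s#%s_ptcls" % (db_dir, name) for name in micrograph_names]
    return ["e2bdb.py"] + sources + ["--appendvstack=bdb:.#%s" % stack]
```

Joining the result of vstack_command(["mic0", "mic1", "mic2"]) with spaces reproduces the command shown in the example with mic0, mic1 and mic2.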
The CTF can be readjusted using e2ctf.py. The minimal command is:
e2ctf.py bdb:data2 --storeparm
For each set of particles that originated from the same micrograph, the CTF will be readjusted and stored in the header (option --storeparm). By default, e2ctf.py processes all particles from all micrographs. To readjust the CTF only for the particles of a specific micrograph, specify the micrograph name:
e2ctf.py bdb:data2 --storeparm --id_micrograph=mic2
You can also adjust the CTF parameters in the e2boxer.py GUI. In this case the CTF parameters will be written automatically to the micrographs, and to the particle images if they are written to disk, after closing the GUI.
Eventually, particles will be stored in a series of bdb files linked into a metafile, so the files can be located in different directories (EMAN2DB subdirectories). If so desired, one can copy them all into one bdb file using the sxcpy command or, in order to transfer them to another computer, into an hdf file.
Method
The following describes the Gauss convolution based automatic particle picking performed in e2boxer's Gauss mode.
- The micrograph is high-pass filtered using a Gaussian function with half-width derived from the particle window size.
- If requested, the overall average is subtracted and the micrograph is squared, resulting in a variance image. For data with low contrast, this step significantly improves the detection rate.
- The micrograph is convolved with another Gaussian function with width derived from the expected particle size (can be adjusted using slider).
- Locations of putative particles are selected as maxima of the Gaussian-convolved image, with additional exclusion of locations that are too close to each other, as determined by the output window size.
- The micrograph is decimated using a convolution with a sinc-Blackman kernel and subsampling. (Note: do not use micrographs decimated by so-called pixel binning; it is incorrect and creates problems in sparx.)
- The particles are windowed.
- Within each particle window, a 2D linear trend (a ramp) is subtracted.
- Each particle window is normalized using values within window corners (thus presumably only the background noise): the average and standard deviation of the background noise are calculated and then the average is subtracted from the entire image and the entire image is divided by the standard deviation. In effect, the background noise is normalized (zero average and standard deviation one), while particles have values depending on their intensity. It is essential to have this (proper) normalization in order for many sparx commands to perform optimally.
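The last two steps can be sketched in plain Python. This is only an illustration of the idea, not the code used by e2boxer. The function names are hypothetical, the ramp is fitted by least squares over the whole window, the "corners" are taken to be the pixels outside the circle inscribed in the window, and images are plain lists of lists; all of these are assumptions.

```python
import math


def subtract_ramp(img):
    """Subtract the least-squares plane a*x + b*y + c from a 2D list of numbers."""
    ny, nx = len(img), len(img[0])
    cx, cy = (nx - 1) / 2.0, (ny - 1) / 2.0
    mean = sum(sum(row) for row in img) / float(nx * ny)
    # With coordinates centered on a full rectangular grid the normal
    # equations decouple, so each slope is a simple ratio of sums.
    sxx = ny * sum((x - cx) ** 2 for x in range(nx))
    syy = nx * sum((y - cy) ** 2 for y in range(ny))
    a = sum((x - cx) * img[y][x] for y in range(ny) for x in range(nx)) / sxx
    b = sum((y - cy) * img[y][x] for y in range(ny) for x in range(nx)) / syy
    return [[img[y][x] - (a * (x - cx) + b * (y - cy) + mean)
             for x in range(nx)] for y in range(ny)]


def corner_pixels(img):
    """Return the values outside the circle inscribed in the window (assumed background)."""
    ny, nx = len(img), len(img[0])
    cx, cy = (nx - 1) / 2.0, (ny - 1) / 2.0
    r2 = (min(nx, ny) / 2.0) ** 2
    return [img[y][x] for y in range(ny) for x in range(nx)
            if (x - cx) ** 2 + (y - cy) ** 2 > r2]


def normalize_background(img):
    """Shift and scale the whole window so the corner pixels have mean 0 and std 1."""
    bg = corner_pixels(img)
    avg = sum(bg) / float(len(bg))
    sigma = math.sqrt(sum((v - avg) ** 2 for v in bg) / float(len(bg)))
    if sigma == 0:
        raise ValueError("background has zero variance")
    return [[(v - avg) / sigma for v in row] for row in img]
```

A window would be processed as normalize_background(subtract_ramp(window)). Only the background statistics are fixed by this procedure, so the particle densities keep values that reflect their intensity relative to the noise.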
Reference
Please consult:
Penczek, P.A.: Single Particle Reconstruction, in U. Shmueli (Ed.), International Tables for Crystallography 3 edn., vol. B Reciprocal Space, 2008.
for a broad description of the automated method and rationale behind normalization.
Author / Maintainer
This is a version of e2boxer by Steve Ludtke. The Gauss method was written by P. Penczek. The command-line version was written by J. Loerke. Additional changes by the Penczek group.
Keywords
- category 1
- APPLICATIONS
Files
e2boxer.py
See also
Maturity
- beta, still under development
- not all features and formats have been tested rigorously.
Bugs
Command-line version is still fairly unstable and needs further testing.