DockEM is a software package to quantitatively dock, or fit, a crystal structure into an EM map of a macromolecular complex that contains that structure. A density map is calculated from the domain to form a density search object. This search object is then correlated locally (in real space) throughout the asymmetric unit of the map in order to find the position where the density of the search object and the map match best. The positions corresponding to the top 10,000 normalised correlation coefficients are available for inspection.
The algorithm and a validation example are presented in Roseman (2000) Acta Cryst. D56, 1332-1340. An example of an application is published in Roseman et al. (2001).
Some notes on the algorithm
The domain to be docked is treated as a rigid body. If you suspect it has two parts joined by a flexible region then the best approach is to break it in two pieces and dock them separately. If the two solutions are compatible this will provide some degree of validation. Otherwise a larger domain, or one with more distinctive features for docking, could provide constraints for another.
The real space local correlation coefficient is computed between the search object and the EM map at every point in the 6 dimensional search space (3 translational degrees of freedom and 3 in orientation). Sample points from the search object are compared with corresponding points from the map that lie under the footprint of the search object at the current position. Therefore no part of the map beyond the boundary of the search object affects the correlation coefficient, though there is no explicit masking of the map density.
Therefore the algorithm can deal with molecules in tight complexes and interfaces. Search object density that extends into weaker map density, or outside the molecule, will penalise the score.
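The footprint idea can be illustrated with a minimal one-dimensional sketch. It assumes the score is a Pearson-style normalised correlation over the sample points under the footprint; the exact formula used by DockEM is given in Roseman (2000) and may differ in detail. The function names local_correlation and scan_1d are hypothetical and are not part of DockEM.

```python
import math

def local_correlation(search_samples, map_samples):
    """Normalised correlation between paired samples under the footprint.

    Assumption: a Pearson-style coefficient; not necessarily DockEM's exact form.
    """
    n = len(search_samples)
    if n == 0 or n != len(map_samples):
        raise ValueError("need two non-empty sample lists of equal length")
    mean_s = sum(search_samples) / n
    mean_m = sum(map_samples) / n
    cov = sum((s - mean_s) * (m - mean_m)
              for s, m in zip(search_samples, map_samples))
    var_s = sum((s - mean_s) ** 2 for s in search_samples)
    var_m = sum((m - mean_m) ** 2 for m in map_samples)
    if var_s == 0 or var_m == 0:
        return 0.0  # flat density carries no information to correlate
    return cov / math.sqrt(var_s * var_m)

def scan_1d(search, threshold, emap):
    """Slide a 1-D search object along a 1-D map, scoring each shift.

    Only map points under the footprint (search density > threshold)
    enter the score, so map density elsewhere cannot affect it.
    """
    footprint = [i for i, v in enumerate(search) if v > threshold]
    search_values = [search[i] for i in footprint]
    scores = []
    for shift in range(len(emap) - len(search) + 1):
        map_values = [emap[shift + i] for i in footprint]
        scores.append(local_correlation(search_values, map_values))
    return scores

scores = scan_1d([0, 1, 3, 1, 0], 0.5, [5, 5, 2, 6, 2, 5, 9, 9])
best_shift = scores.index(max(scores))
```

In this toy example the strong density at the end of the map (the two 9s) lies outside the footprint at the best shift and so has no influence on the winning score.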
The EM map of the complex should be band pass filtered. The low pass Fourier filter should be set at the maximum resolution at which the EM map is reliable. I have been using the FSC = 0.5 limit (see Frank, 1996). The high pass filter should be set at a resolution a bit larger than the maximum dimension of the domain to be docked.
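As a rough aid for choosing the two filter limits, the sketch below turns the rule above into numbers. The margin of 1.2 for "a bit larger" is an assumption made for illustration, not a value from DockEM, and the function names are hypothetical. The units Mapfilter expects for its filter inputs are not stated here, so the conversion to a Fourier radius is shown only as the standard relation between resolution and Fourier pixel index.

```python
def filter_limits(fsc_resolution, max_domain_dimension, margin=1.2):
    """Return (low_pass, high_pass) resolutions in Angstrom.

    low_pass : resolution at which the map is still reliable (e.g. FSC = 0.5).
    high_pass: a bit larger than the largest dimension of the domain;
               `margin` is an assumed safety factor.
    """
    if fsc_resolution <= 0 or max_domain_dimension <= 0:
        raise ValueError("resolutions and dimensions must be positive")
    if margin < 1.0:
        raise ValueError("margin should be >= 1 so the limit exceeds the domain size")
    return fsc_resolution, margin * max_domain_dimension

def fourier_radius(n_voxels, sampling, resolution):
    """Fourier-space radius, in transform pixels, for a resolution in Angstrom.

    For a box of n_voxels at `sampling` Angstrom/pixel, the spatial frequency
    1/resolution falls at index n_voxels * sampling / resolution.
    """
    return n_voxels * sampling / resolution

low_pass, high_pass = filter_limits(20.0, 50.0)
```

For a 100 voxel map at 2 Å/pixel, a 20 Å low pass limit corresponds to a Fourier radius of 10 pixels.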
The program Mapfilter (included with DockEM) will Fourier filter the map, and set up the unit cell dimensions. The standard cell is defined as P1, with the origin (0,0,0) at the first pixel. Maps are assumed to be cube shaped, i.e. the X,Y,Z dimensions are equal to one another.
Then an initial placement of the coordinates into the map should be made, using O or possibly another program. This needs only to be approximate and defines the origin of the search. The domain should at least be placed somewhere inside the map. Save the coordinates at their new position. A density model can be made from the coordinates using Makedensity. This must be Fourier filtered at the same resolution as the map, using Mapfilter.
The density matching search is executed within the program DockEM. A file describing the angular range to search is required. This is in the same format as used for SPIDER angles files and can be generated with the SPIDER VO EA command. The DockEM program can run for a long time, depending on the size of the search object, the sampling, the step size and the angular range to be searched. Progress can be monitored, and the run time estimated, from the standard output of the program, which reports progress through the angles list. If the search region can be restricted, the search will run faster. In some cases the sampling of the EM map and search object can be reduced to be compatible with the resolution, i.e. a sampling (in Å/pixel) of one third of the maximum resolution is adequate. Finer sampling only makes the computation unnecessarily long.
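The sampling rule of thumb can be written as a short calculation. This is a sketch only: the function names are hypothetical, and restricting the reduction to a whole-number binning factor is an assumption for illustration (resampling by a non-integer factor is also possible).

```python
def target_sampling(resolution):
    """Coarsest adequate sampling (Angstrom/pixel): one third of the resolution."""
    return resolution / 3.0

def bin_factor(current_sampling, resolution):
    """Largest whole-number binning that stays at or finer than the target sampling.

    Returns 1 when the map is already as coarse as, or coarser than, the target.
    """
    target = target_sampling(resolution)
    return max(1, int(target // current_sampling))
```

For example, a map sampled at 1.5 Å/pixel with a reliable resolution of 18 Å has a target sampling of 6 Å/pixel, so it could be binned by a factor of 4.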
Several files are output on completion of the search. A three digit run code 00n is associated with each DockEM search run, and is part of the output filenames.
Atomic models for any of the scores listed in the file searchdoc00n.dkm can be generated using the program DockXsoln. The same PDB file and sampling as used for the search should be input. The first solution to inspect is the top score. PDB files with the name pattern tophit00n.00I.pdb are generated, where n is the run code and I is the rank in the scores. Many of the solutions listed will be neighbours of, or shoulders to, a local maximum. It is possible to detect clusters of solutions based on the X,Y,Z shift. Clusters in angular space can also be detected, but sometimes it is not obvious that two sets of Euler angles describe a very similar rotation.
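One way to check whether two sets of Euler angles describe nearly the same rotation is to convert both to rotation matrices and measure the single rotation angle that separates them. The sketch below assumes a ZYZ convention with the matrix built as Rz(psi) Ry(theta) Rz(phi); check this against the convention of your angles file before relying on it. The function names are hypothetical.

```python
import math

def rot_z(angle_deg):
    c, s = math.cos(math.radians(angle_deg)), math.sin(math.radians(angle_deg))
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def rot_y(angle_deg):
    c, s = math.cos(math.radians(angle_deg)), math.sin(math.radians(angle_deg))
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def euler_zyz(phi, theta, psi):
    """Rotation matrix for ZYZ Euler angles in degrees (assumed convention)."""
    return matmul(rot_z(psi), matmul(rot_y(theta), rot_z(phi)))

def angle_between(euler_a, euler_b):
    """Angle in degrees of the rotation taking orientation a to orientation b."""
    ra, rb = euler_zyz(*euler_a), euler_zyz(*euler_b)
    # trace(ra^T rb) equals the sum of element-wise products of ra and rb
    trace = sum(ra[i][j] * rb[i][j] for i in range(3) for j in range(3))
    cos_angle = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.degrees(math.acos(cos_angle))
```

With theta = 0 the angle sets (30, 0, 0) and (0, 0, 30) look different but give a separation of essentially zero degrees, which is the kind of hidden similarity mentioned above. Solutions whose separation falls below a chosen tolerance can be treated as one angular cluster.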
5 main steps:
1. Set up the cell dimensions and filter the EM map.
2. Define starting position by manual placement into the map, using O.
3. Convert the coordinates to a density, and filter.
4. Run DockEM to do the local correlation search.
5. Extract the solutions as coordinates and examine in O.
Plus 2 others:
Convert the map to mrc format.
1. Set up the cell dimensions and filter the EM map
Run Mapfilter, to filter the EM map and set up the cell parameters.
Cell angles are 90 degrees. The cell dimension is C = (N-1) x D, where D is the sampling (Å/pixel) and N is the dimension of the EM map in voxels. The P1 cell implies Cx, Cy, Cz = 90, 90, 90.
Input EM map filename.
Enter sampling in Å/pixel.
Enter low pass (high resolution cutoff) and high pass (low resolution cutoff) filters.
Enter filename for output map.
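The cell dimension rule C = (N-1) x D can be checked with a short calculation before running Mapfilter. This is a sketch; the function name p1_cell is hypothetical and the map is assumed to be cubic, as stated earlier.

```python
def p1_cell(n_voxels, sampling):
    """Return (a, b, c, alpha, beta, gamma) for a cubic map in a P1 cell.

    Edge length follows C = (N - 1) x D, with D in Angstrom/pixel.
    """
    if n_voxels < 2 or sampling <= 0:
        raise ValueError("need at least 2 voxels and a positive sampling")
    edge = (n_voxels - 1) * sampling
    return (edge, edge, edge, 90.0, 90.0, 90.0)
```

For a 100 voxel map at 2.0 Å/pixel the cell edge is 198 Å.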
2. Define starting position by manual placement into the map, using O
a) Convert mrc format EM map to BRIX format for display in O. Use mrc2ccp to convert from mrc to CCP4 format. MAPMAN will read the CCP4 file and convert it to BRIX format.
Starting with somefile.map, in mrc format.
somefile_cnv.map (CCP4 filename)
b) Run O
Read in the map, display.
Read in the coordinates, display them.
Move them around.
Write them out at the new position.
O manuals are at their website.
3. Converting to density
Enter coordinates filename.
Enter a map file name to take header information from. This would be the MRC format version of the EM map you made the placement into.
Enter the sampling of the EM map in Å/pixel.
Enter an output filename for the search object density.
Filter this search object map using Mapfilter.
Use the filters as for the EM map.
Choose a density threshold to include in the search object (after filtering) using a visualisation program such as Web (part of SPIDER) or O. The threshold value should define an isosurface that encloses approximately the expected volume of the domain, or slightly less in order to pick the strongest features, which have a higher signal to noise ratio. The DockEM search algorithm is not very sensitive to the exact threshold, and I have found it adequate to choose the threshold by eye using Web.
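As an alternative or a cross-check to choosing the threshold by eye, the threshold that encloses a given volume can be computed from the voxel values. The sketch below assumes the filtered search object density has been read into a flat list of voxel values by some other means; the function name is hypothetical and this is not a DockEM program.

```python
def threshold_for_volume(densities, sampling, target_volume):
    """Density threshold whose isosurface encloses about target_volume (Angstrom^3).

    densities    : flat list of voxel values from the filtered search object
    sampling     : Angstrom/pixel, so each voxel has volume sampling ** 3
    target_volume: expected volume of the domain, or slightly less
    """
    voxel_volume = sampling ** 3
    n_inside = int(round(target_volume / voxel_volume))
    if n_inside < 1:
        raise ValueError("target volume is smaller than one voxel")
    ranked = sorted(densities, reverse=True)
    if n_inside >= len(ranked):
        return ranked[-1]  # the whole map is needed to reach the volume
    # Midway between the weakest included and strongest excluded voxel
    return 0.5 * (ranked[n_inside - 1] + ranked[n_inside])

example = [0.1, 0.9, 0.5, 0.7, 0.2, 0.0]
threshold = threshold_for_volume(example, 2.0, 24.0)
```

Here three voxels of 8 Å^3 each are needed to reach 24 Å^3, so the threshold falls between the third and fourth strongest values.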
4. Run DockEM
Enter the search object density (created at step 3).
Enter the sampling of the EM map in Å/pixel.
Enter the density threshold for the search object.
Enter the name of the pdb file. (This is used to define the origin of the transformations.)
Enter a three digit run code for the output files.
Enter the angles file filename.
Enter the angular sampling (in degrees). The same as used to create the angles file.
Enter the target EM map filename to dock into (filtered in step 1).
Enter a search radius in voxels.
Enter a step size for the search, in voxels.
5. Examine output files
Make coordinates for the solutions.
Enter the coordinates file (pdb)
Enter the three digit run code
Enter the sampling (Å/pixel)
Enter the solution range required. (e.g. 1,10 for the top ten solutions).
Local refinement using finer steps and angular sampling can be done, starting from the best solution, or set of solutions, from a previous run.
7. Validation, or additional constraints
Check symmetries, or filter by them.
Check compatibility of different domain solutions
Check possible connectivity.
Check point mutation and biochemical data.
Angles files are in Euler angles convention, in spider document format.
Coordinates files are in pdb format.
Maps are in mrc/ccp4 format (but must be converted to brix to display in O).
Use CCP4 programs to apply symmetries or other manipulations, or to detect clashes automatically.
The P1 unit cell goes 0 -> (N-1)*D.
Resample the EM map, at ~ resolution/3, to get optimum speed.
Run time is linear with the x, y, z range, because it is a real space search.
For each angle the search visits every position in the range, so the angles form the outermost loop.
A run code is associated with every output file, and links it to a specific DockEM run.
Columns in order left to right are-
Some example angles files are included.
These sample all orientations at 8,4,2, or 1 degree intervals respectively:
These sample up to 30 degrees tilt (range of Euler angle theta = 0,30) at 8, 4, 2 or 1 degree intervals respectively:
To set up your own angles file use the SPIDER VO EA command.
Only change delta theta and range of theta to specify restricted searches.
The program searches one further degree of freedom: a rotation of -180 to +180 degrees about the old Z axis, applied after the rotation defined by theta and phi. The angular sampling, the 7th input to the program DockEM, defines the step size for this rotation.
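To get a feel for how the search parameters affect the size of a run, the number of correlation evaluations can be estimated as below. This is a sketch under stated assumptions: the translational search region is taken to be a cube extending the search radius in each direction, and the -180 to +180 rotation is counted once per step with the two equivalent end points counted as one. DockEM's actual loop bounds may differ, and the function names are hypothetical.

```python
def inplane_angles(step):
    """Angles from -180 (inclusive) to +180 (exclusive) at `step` degrees."""
    count = int(round(360.0 / step))
    return [-180.0 + i * step for i in range(count)]

def positions_per_angle(radius, step):
    """Translational positions per orientation; radius and step in whole voxels.

    Assumption: a cubic region of +/- radius along x, y and z.
    """
    per_axis = 2 * (radius // step) + 1
    return per_axis ** 3

def total_evaluations(n_angle_lines, angular_step, radius, step):
    """Angles file lines x in-plane rotations x translational positions."""
    return (n_angle_lines
            * len(inplane_angles(angular_step))
            * positions_per_angle(radius, step))
```

Halving the angular step roughly doubles the number of in-plane rotations and also increases the number of lines needed in the angles file, so restricting the theta range and the search radius is usually the most effective way to shorten a run.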
Roseman, A. M. (2000). Docking structures of domains into maps from cryo-electron microscopy using local correlation. Acta Cryst. D56, 1332-1340.
Roseman et al. (2001). Journal of Structural Biology, accepted.
Frank (1996) Three-Dimensional Electron Microscopy of Macromolecular Assemblies. Academic Press, San Diego.
Uppsala Software Factory: http://xray.bmc.uu.se/usf/