Written by Nicholas M Luscombe & Roman A Laskowski
Appendix A - PDB file format
Appendix B - The *.bond file format
Appendix C - The HBADD program
a. pre-processor for HBPLUS
b. the algorithm
Last modified 15 Jan 1998
NUCPLOT is a program that automatically generates schematic 2D representations of protein-nucleic acid interactions. The input is a standard PDB file and the output is a colour or black-and-white PostScript file which gives a simple, at-a-glance representation of the hydrogen bonds and hydrophobic contacts between proteins and nucleic acids.
By default, NUCPLOT expects the hydrogen bonds and non-bonded contacts to be calculated by the HBPLUS program (McDonald & Thornton, 1994) and can read the files output by that program. It is possible to use data computed by other means, provided they are supplied in the format described in Appendix B.
HBPLUS was primarily designed for calculating interactions within large molecules found in the PDB. This means that it is unable to recognize the majority of small ligands and as a result may miss certain hydrogen bonds between the protein, nucleic acid and ligand. This would result in plots lacking the required small molecule-nucleic acid interactions.
HBPLUS does allow the user to define ligands in a separate input file as described in the HBPLUS Operating Manual. The file describes the nature of hydrogen bonds each atom in the ligand is able to make and therefore allows HBPLUS to correctly calculate the interactions to it.
A program called HBADD is supplied which aims to reduce the work involved in creating this input file for HBPLUS. The program looks for any HETATM records in a given PDB file and searches for the corresponding structures in the Het Group Dictionary. The input file is then created. The Het Group Dictionary is available directly from the PDB at:
The script file provided in the package automatically runs HBADD, HBPLUS and NUCPLOT. We recommend the use of this script for producing your plots.
Change directories to hbplus/ and type the following command:
Change directories to nucplot/ and type the following command:
-lm is a linker option that links in the standard maths library.
Place the Het Group Dictionary in the nucplot/ directory.
Under UNIX, the following aliases need to be defined to run NUCPLOT and HBPLUS. They can be placed in the user's .cshrc file.
# NUCPLOT
** nucplot-directory gives the full path to where the NUCPLOT program files are held. Please change accordingly.
# HBPLUS
After modifying the .cshrc file, type the following command to set up the aliases:
We recommend running NUCPLOT outside the nucplot/ and hbplus/ directories.
To run NUCPLOT on a structure for the first time, simply type:
This runs a script which automatically executes the HBADD, HBPLUS and NUCPLOT programs in turn to produce the plot.
For subsequent runs of NUCPLOT on the same structure (e.g. after changing the parameters to alter the appearance of the plot - see below), simply type:
This runs the NUCPLOT program only, saving the processing time needed to run HBADD and HBPLUS. The *.hb2 and *.nb2 files for the structure must be in the directory in which you are running NUCPLOT.
This script may also be used if you wish to supply your own list of hydrogen bonds and non-bonded contacts.
The resulting plots can be altered by changing the plot parameters in the parameter file "nucplot.par", which can be found in the current directory. When running NUCPLOT in a new directory, the default parameter file is copied from the nucplot directory. To return to the default parameters for the plot, simply remove the parameter file from the working directory and run the NUCPLOT script. The file can be edited with any text editor.
The colours of the background and various symbols of the plot can be defined. The definitions of the different colours are described below. The colour name must be one that is defined in the list of colour definitions.
The colour definitions table allows you to modify any of the default colour definitions. You can also set up new colours of your own (up to 20 different colours are allowed).
Each entry contains three numbers, each between 0.0 and 1.0, giving the ratios of red, green and blue, respectively, making up the given colour.
Each colour has a name, defined within the single quotes, which may be altered. These names are referred to when assigning colours in the "Colour parameters" section above.
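As a rough illustration of how such a red-green-blue entry can be interpreted, the Python sketch below checks that each ratio lies between 0.0 and 1.0 and converts the triple to 8-bit values and to a PostScript setrgbcolor command. The function names and the example colour are hypothetical and are not taken from NUCPLOT itself.

```python
def check_rgb(red, green, blue):
    """Return the triple unchanged if every ratio lies in 0.0-1.0."""
    for value in (red, green, blue):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"colour ratio out of range: {value}")
    return (red, green, blue)


def to_8bit(rgb):
    """Scale ratios in 0.0-1.0 to integers in 0-255."""
    return tuple(round(v * 255) for v in rgb)


def to_postscript(rgb):
    """Format the triple as a PostScript setrgbcolor command."""
    return "{:.4f} {:.4f} {:.4f} setrgbcolor".format(*rgb)


# Hypothetical colour entry: full red, half green, no blue.
orange = check_rgb(1.0, 0.5, 0.0)
print(to_8bit(orange))
print(to_postscript(orange))
```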
The list allows you to include non-standard bases in the plot. New base types may be added by appending the name to the list as it appears in the PDB.
If you are familiar with PostScript files, you can make simple amendments to the plot by editing the nucplot.ps file.
The file is an ASCII text file, and so can be modified using any text editor. The sorts of amendments you can make are: changes to labels (in terms of size, colour and text), addition of other text, changes to colours, sizes, etc.
Some changes, of course, are easier by altering the nucplot.par parameter file and re-running NUCPLOT.
There are several possible explanations:
As mentioned in the introduction, NUCPLOT by default uses the list of bonds supplied by the HBPLUS program. HBPLUS sometimes gives incorrect results: when it encounters a ligand, residue or base that it does not recognize, it may be unable to calculate all the hydrogen bonds made by that entity.
HBPLUS does allow the user to define ligands in a separate input file as described in section 2.6 of the HBPLUS Operating Manual. The file describes the nature of hydrogen bonds each atom in the ligand is able to make and therefore allows HBPLUS to correctly calculate the interactions to it.
A program called HBADD (see Appendix C) is supplied which aims to reduce the work involved in creating this input file for HBPLUS. The program looks for any HETATM records in a given PDB file and searches for the corresponding structures in the Het Group Dictionary. HBADD relies on the atom names and connectivities of the ligand atoms in the PDB file matching those in the dictionary.
If there is a mismatch, you may have to edit the PDB file or create your own HBPLUS input file.
HBPLUS calculates hydrogen bonds as follows. All possible hydrogen atom (H) positions are calculated for donor atoms (D) which satisfy specified geometrical criteria with acceptor atoms (A) in the vicinity. The criteria used are: the H-A distance is < 2.7Å, the D-A distance is < 3.35Å, the D-H-A angle is > 90° and the H-A-AA angle is > 90°, where AA is the atom attached to the acceptor.
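The four criteria above can be sketched as a single test on the coordinates of D, H, A and AA. The Python below is only an illustration of the stated thresholds, not the HBPLUS implementation; the function names and example coordinates are hypothetical, and the placement of the hydrogen atom (which HBPLUS computes itself) is assumed to be given.

```python
import math


def angle(a, b, c):
    """Angle a-b-c in degrees, with b at the vertex."""
    v1 = [a[i] - b[i] for i in range(3)]
    v2 = [c[i] - b[i] for i in range(3)]
    dot = sum(x * y for x, y in zip(v1, v2))
    norm = math.hypot(*v1) * math.hypot(*v2)
    cosine = max(-1.0, min(1.0, dot / norm))  # guard against rounding error
    return math.degrees(math.acos(cosine))


def is_hbond(d, h, a, aa, max_ha=2.7, max_da=3.35, min_angle=90.0):
    """Apply the distance and angle criteria quoted in the text."""
    return (
        math.dist(h, a) < max_ha          # H-A distance
        and math.dist(d, a) < max_da      # D-A distance
        and angle(d, h, a) > min_angle    # D-H-A angle
        and angle(h, a, aa) > min_angle   # H-A-AA angle
    )


# Hypothetical coordinates in Angstroms: a near-linear D-H...A arrangement.
donor, hydrogen = (0.0, 0.0, 0.0), (1.0, 0.0, 0.0)
acceptor, attached = (2.9, 0.0, 0.0), (3.5, 1.1, 0.0)
print(is_hbond(donor, hydrogen, acceptor, attached))
```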
For non-bonded contacts, all atoms within 3.9Å of each other are considered to be interacting by HBPLUS.
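A distance cut-off of this kind can be illustrated with a short Python sketch that lists every protein-nucleic acid atom pair lying within 3.9Å. The data layout (a label plus an x, y, z tuple), the function name and the coordinates are assumptions made for the example; HBPLUS itself works from the PDB file.

```python
import math
from itertools import product


def contacts(protein_atoms, nucleic_atoms, cutoff=3.9):
    """Return (protein label, nucleic label, distance) for pairs within the cut-off."""
    found = []
    for (p_name, p_xyz), (n_name, n_xyz) in product(protein_atoms, nucleic_atoms):
        separation = math.dist(p_xyz, n_xyz)
        if separation <= cutoff:
            found.append((p_name, n_name, round(separation, 2)))
    return found


# Hypothetical atoms: one protein atom and two nucleic acid atoms.
protein = [("THR CG2", (0.0, 0.0, 0.0))]
nucleic = [("C P", (3.63, 0.0, 0.0)), ("C O2P", (5.0, 0.0, 0.0))]
print(contacts(protein, nucleic))
```

Passing a smaller cutoff value to the same function mimics a tighter filter applied after the contacts have been listed.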
NUCPLOT applies an additional distance cut-off filter, which is specified in the nucplot.par parameter file. This is easier to alter than the input for HBPLUS, and we recommend this method as long as the required distance is less than the cut-offs specified by HBPLUS.
NUCPLOT only considers hydrogen bonds between water molecules and nucleic acids. Non-bonded contacts are not included, to prevent overcrowding in the diagram.
NUCPLOT can be instructed to read the interaction information from the *.bond file. The simplest way to make alterations to this file is to run NUCPLOT, see which bonds are missing, and add these manually to the file before re-running the program using NUCONLY. The file can also be edited to remove any unwanted interactions. See Section 5 for details on the nucplot.par file and Appendix B for an explanation of the *.bond file format.
Luscombe N M, Laskowski R A, Thornton J M. (1997). NUCPLOT: a program to generate schematic diagrams of protein-DNA interactions. Nucleic Acids Research, 25, 4940-4945.
NUCPLOT reads the atomic coordinates from standard PDB format files only.
In particular, NUCPLOT may not produce coherent outputs if the following are not observed:
For complete details of the PDB format please see:
The table below shows the NUCPLOT file format for the *.bond file containing the list of hydrogen bonds, non-bonded contacts and covalent bonds to be plotted. The file is automatically produced when the HBPLUS files are used as inputs.
                    10        20        30        40
            12345678901234567890123456789012345678901234567
            ===============================================
Line 1:     NUCPLOT v.1.0 - Bond file (pdb1zaa.bond)
Line 2:     -----------------------------------------------
   |
Line 4:     **** Hydrogen Bonds ***************************
   |        Donor                Acceptor          Distance
   |        ARG C   70   NH2       G A    2   O1P    2.87
   |        ARG C   80   NH2       G A    2   N7     2.86
   |        ARG C   80   NH1       G A    2   O6     2.99
Line X:     HOH    319   O         G B    4   O6     2.60
   |
   |
Line X+3:   **** Non Bonded Contacts **********************
   |        protein              DNA               Distance
   |        THR C   56   CG2       C A    3   P      3.63
   |        THR C   56   CG2       C A    3   O1P    3.82
Line Y:     THR C   56   CB        C A    3   O2P    3.38
   |
   |
Line Y+3:   **** Covalent Bonds ***************************
   |        protein              DNA               Distance
------------------------------------------------------------
Field | Column  | Description
No.   | range   |
------------------------------------------------------------
 1.   |  1 -  3 | Donor residue 3-letter code
 -    |  4 -  4 | Blank
 2.   |  5 -  5 | Donor chain ID
 -    |  6 -  7 | Blank
 3.   |  8 - 10 | Donor residue number
 -    | 11 - 13 | Blank
 4.   | 14 - 17 | Donor residue atom name
 -    | 18 - 21 | Blank
 5.   | 22 - 24 | Acceptor residue 3-letter code
 -    | 25 - 25 | Blank
 6.   | 26 - 26 | Acceptor chain ID
 -    | 27 - 28 | Blank
 7.   | 29 - 31 | Acceptor residue number
 -    | 32 - 34 | Blank
 8.   | 35 - 38 | Acceptor residue atom name
 -    | 39 - 41 | Blank
 9.   | 42 - 45 | H-bond distance
------------------------------------------------------------
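For anyone editing or generating *.bond records by script, the column ranges in the table above translate directly into fixed-width slicing. The Python below is a minimal sketch, not part of the NUCPLOT package: the function and field names are hypothetical, and the justification of values within each field (names left-justified, numbers right-justified, single-letter bases at the right of their field) is an assumption.

```python
# 1-based inclusive column ranges, copied from the table above.
FIELDS = [
    ("donor_residue", 1, 3),
    ("donor_chain", 5, 5),
    ("donor_number", 8, 10),
    ("donor_atom", 14, 17),
    ("acceptor_residue", 22, 24),
    ("acceptor_chain", 26, 26),
    ("acceptor_number", 29, 31),
    ("acceptor_atom", 35, 38),
    ("distance", 42, 45),
]


def parse_bond_line(line):
    """Split one fixed-width bond record into a dictionary."""
    record = {name: line[start - 1:end].strip() for name, start, end in FIELDS}
    for key in ("donor_number", "acceptor_number"):
        record[key] = int(record[key])
    record["distance"] = float(record["distance"])
    return record


def format_bond_line(d_res, d_chain, d_num, d_atom,
                     a_res, a_chain, a_num, a_atom, dist):
    """Write one record; justification inside each field is an assumption."""
    return (
        f"{d_res:<3} {d_chain:1}  {d_num:>3}   {d_atom:<4}    "
        f"{a_res:>3} {a_chain:1}  {a_num:>3}   {a_atom:<4}   {dist:4.2f}"
    )


line = format_bond_line("ARG", "C", 70, "NH2", "G", "A", 2, "O1P", 2.87)
print(line)
print(parse_bond_line(line))
```

Because the parser strips each field, it does not depend on how values are justified within their columns.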
This program identifies all Het groups in the input PDB file and searches for them in the Het Group Dictionary, available from the PDB. Any matches are used to generate a definition of the residue type in HBPLUS format. Each residue's definition gives the atom connectivities and, for the polar atoms, defines which are hydrogen-bond donors/acceptors and how many hydrogens they can donate/accept.
The output file, containing the residue definitions, is called hbplus.rc. The information allows the HBPLUS program to calculate all potential H-bonds between ligand and protein correctly.
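The first step, identifying the Het groups in the input file, can be sketched in a few lines of Python. This is an illustration only and not the HBADD source: it relies on the standard PDB layout, in which the record name occupies columns 1-6 and the residue name columns 18-20, and the function name and example records are hypothetical. Note that water molecules also appear as HETATM records.

```python
def het_groups(pdb_lines):
    """Return the distinct residue names in HETATM records, in order of first appearance."""
    names = []
    for line in pdb_lines:
        if line.startswith("HETATM"):
            name = line[17:20].strip()  # residue name, columns 18-20
            if name and name not in names:
                names.append(name)
    return names


# Hypothetical records; only the first 26 columns matter for this sketch.
example = [
    "ATOM      1  N   ARG C  70",
    "HETATM 1201 ZN    ZN C 101",
    "HETATM 1202  O   HOH   319",
    "HETATM 1203  O   HOH   320",
]
print(het_groups(example))
```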
It locates each HET group in the HET Group Dictionary. Connectivities are recorded. Atoms are matched by name and then by connectivity. Where the HET group in the PDB file successfully matches the dictionary definition on both these criteria, the relevant bond angles are computed where necessary.
The rules for H-bond formation, applied to all O, N and S atoms, are as follows (John Mitchell, personal communication):
In the first case, the residue definition might still need to be manually prepared for HBPLUS (as described in section 2.6 of that program's documentation).
In the second case, a simple solution might be to rename the relevant atoms in the PDB file so that they correspond to the atom-naming in the HET Group Dictionary.