Appendices
Appendix A : File Formats
The following file types are described below:
Grasp Surface File
DelPhi Potential Map
DelPhi Charge File
DelPhi Size File
PDB File (and Grasp variants)
Grasp Property File
Grasp Script File
Pair Wise Interaction File
Grasp Surface File
This unformatted file starts with five "lines" of eighty characters. The first line contains a format specifier, i.e. the words "format=1". (There are no other formats at this time.) The second and third lines contain keywords for the information contained within, i.e. "vertices" for the vertex positions, "accessible" for the associated accessible surface point coordinates, "normals" for the normal vector (of unit length) for each vertex, and "triangles" for the triangle index list. The latter is a list of integers such that entries i-2, i-1 and i, where mod(i,3)=0, give the vertices that make up triangle i/3. NOTE that the index integers for the triangles are INTEGER*2! Line three lists which variables are also written to this file, with "potentials", "curvature", "distances", "gproperty1" and "gproperty2" being the keywords for the appropriate quantities.
Line four contains the number of vertices, the number of triangles, the grid size of the lattice used to create the surface (i.e. the number of points along one edge of the cube, always 65), and the reciprocal lattice spacing. Line five contains the midpoint of the coordinate system from which the vertices were derived (i.e. the midpoint of the Grasp box). The data then follow in the order of the keywords. Note that all are REAL*4 except for the triangle indexes, which are INTEGER*2.
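The grouping of the triangle index list described above can be sketched in Python as follows. This is an illustration only, not Grasp code: the function name decode_triangles is hypothetical, the bytes are assumed to have been extracted from the triangle record already, and little-endian byte order is an assumption that depends on the machine that wrote the file.

```python
import struct


def decode_triangles(raw: bytes, byteorder: str = "<"):
    """Group a flat list of INTEGER*2 indices into one vertex triple per triangle.

    byteorder is a struct prefix: "<" little-endian (assumed default), ">" big-endian.
    """
    count = len(raw) // 2
    if len(raw) % 2 or count % 3:
        raise ValueError("index list must hold a whole number of INTEGER*2 triples")
    flat = struct.unpack(f"{byteorder}{count}h", raw)
    # Entries i-2, i-1 and i (counting from 1, with i divisible by 3)
    # are the three vertices of triangle i/3.
    return [tuple(flat[i:i + 3]) for i in range(0, count, 3)]


# Two triangles stored as six 16-bit integers.
sample = struct.pack("<6h", 1, 2, 3, 2, 4, 3)
print(decode_triangles(sample))
```

Because the indices are 16-bit signed integers, the sketch reads them with the struct code "h" rather than the 32-bit "i".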
DelPhi Potential Map
This is also an unformatted file. Its contents follow the format:
character*20 uplbl
character*10 nxtlbl, character*60 toplbl
real*4 phi(65,65,65)
character*16 botlbl
real*4 scale, mid(3)
Here phi contains the map information, mid the grid midpoint, and scale the reciprocal grid spacing. The rest are just character strings containing non-Grasp information.
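A reader for this layout might look like the Python sketch below. It rests on assumptions not stated in this appendix: that each of the five entries is one Fortran sequential record framed by 4-byte length markers (a common compiler convention, but not universal), and that the file is little-endian. The names split_records, parse_phimap and phi_at are hypothetical.

```python
import struct


def split_records(data: bytes, byteorder: str = "<"):
    """Yield record payloads, assuming 4-byte length markers before and after each."""
    pos = 0
    while pos < len(data):
        (size,) = struct.unpack_from(f"{byteorder}i", data, pos)
        payload = data[pos + 4:pos + 4 + size]
        (tail,) = struct.unpack_from(f"{byteorder}i", data, pos + 4 + size)
        if tail != size:
            raise ValueError("record markers disagree; wrong byte order or marker size?")
        yield payload
        pos += size + 8


def parse_phimap(data: bytes, grid: int = 65, byteorder: str = "<"):
    """Split the five records into labels, the phi array, scale and midpoint."""
    records = list(split_records(data, byteorder))
    if len(records) != 5:
        raise ValueError(f"expected 5 records, found {len(records)}")
    uplbl, labels, phi_raw, botlbl, tail = records
    scale, *mid = struct.unpack(f"{byteorder}4f", tail)
    return {
        "uplbl": uplbl.decode("ascii", "replace"),
        "nxtlbl": labels[:10].decode("ascii", "replace"),
        "toplbl": labels[10:70].decode("ascii", "replace"),
        "botlbl": botlbl.decode("ascii", "replace"),
        "grid": grid,
        "phi": struct.unpack(f"{byteorder}{grid ** 3}f", phi_raw),
        "scale": scale,
        "mid": tuple(mid),
    }


def phi_at(phimap, i: int, j: int, k: int) -> float:
    """Look up phi(i,j,k) with 1-based indices; Fortran stores the first index fastest."""
    g = phimap["grid"]
    return phimap["phi"][(i - 1) + g * (j - 1) + g * g * (k - 1)]
```

The grid parameter defaults to 65 as in the declaration above; it is exposed only so that the sketch can be tried on small synthetic data.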
DelPhi Charge File
This file can have any number of header comment lines as long as the first character of each is an exclamation point "!". After the last comment line there should appear a line as below,
atom__resnumbc_charge_
Any comments after this line must be to the right of the twenty-second character of the line, since this part of the line is not interpreted as an assignment.
To make an assignment statement the user places the atom name specification under the four characters "atom" in the above line, the three characters of the residue name under "res", the number of the residue under "numb" and the chain designator under "c". Finally the charge value is placed under "charge". A descriptor field left totally blank is treated as wild, i.e. if the chain specifier is blank the assignments spelt out by the other fields in that line are applied to all chains in the molecule. However, blank spaces within a field are treated as true blanks, i.e. a "c " in the atom field will not apply to a "ca " atom. This is different from how DelPhi radii files are dealt with, as is described below.
Individual assignment statements are interpreted identically in both Grasp and DelPhi. However, the method of eventual assignment differs. In DelPhi each such specification is entered in a hash table. When all lines have been read, the charge assigned to a particular atom is that which is most specifically declared in the assignment statements. Grasp, on the other hand, interprets each line of a DelPhi charge file individually before the next is read.
As an example of the different approaches, the lines:
atom__resnumbc_charge_
nz    lys  25A  0.0
nz    lys       1.0
will charge all zeta nitrogens of the molecule for Grasp, and all BUT lysine 25 on chain A in DelPhi. If the line order were reversed, however, the charging would be identical, since Grasp would change the charge on lysine 25 on chain A back to zero after first setting it to 1.0. Hence when using DelPhi charge files in Grasp be sure that the more general charge assignments precede the more specific.
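The two assignment strategies can be compared with a small Python sketch. It is a model of the behavior described above, not code from either program. The function names are hypothetical, and "most specific" is modeled as "most non-blank fields", which is an assumption about how DelPhi ranks matches.

```python
FIELDS = ("atom", "res", "numb", "chain")


def matches(rule, atom):
    """A totally blank rule field is wild; any other field must equal the atom's value."""
    return all(rule[f] == "" or rule[f] == atom[f] for f in FIELDS)


def grasp_charge(rules, atom):
    """Apply each line as it is read, so the last matching line wins."""
    charge = None
    for rule in rules:
        if matches(rule, atom):
            charge = rule["charge"]
    return charge


def delphi_charge(rules, atom):
    """Collect all lines first, then pick the most specific match."""
    hits = [r for r in rules if matches(r, atom)]
    if not hits:
        return None
    # Assumption: specificity is the number of non-blank descriptor fields.
    best = max(hits, key=lambda r: sum(r[f] != "" for f in FIELDS))
    return best["charge"]


rules = [
    {"atom": "nz", "res": "lys", "numb": "25", "chain": "A", "charge": 0.0},
    {"atom": "nz", "res": "lys", "numb": "", "chain": "", "charge": 1.0},
]
lys25 = {"atom": "nz", "res": "lys", "numb": "25", "chain": "A"}

print(grasp_charge(rules, lys25), delphi_charge(rules, lys25))              # 1.0 0.0
print(grasp_charge(rules[::-1], lys25), delphi_charge(rules[::-1], lys25))  # 0.0 0.0
```

With the general line placed first (the reversed list), both strategies agree, which is the ordering recommended above.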
DelPhi Size File
The radii assignment files in DelPhi are similar to, but simpler than, the charge assignment files. There may again be comment lines before a "key word" line, which in this case is,
atom__res_radius
i.e. size files only contain room for atom names and residue names.
There is one further difference between DelPhi size files and charge files, namely that in size files empty spaces within a descriptor field are treated as wild cards. Thus "c " under the "atom" header in a size file applies to all atoms whose names begin with "c".
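The difference between the two matching rules can be written out in Python as a minimal sketch. Both function names are hypothetical, and the sketch assumes that names are left-justified in a four-character field.

```python
def size_field_matches(pattern: str, name: str, width: int = 4) -> bool:
    """Size-file rule: every blank position in the pattern is a wild card."""
    p = pattern.ljust(width)[:width]
    n = name.ljust(width)[:width]
    return all(pc == " " or pc == nc for pc, nc in zip(p, n))


def charge_field_matches(pattern: str, name: str, width: int = 4) -> bool:
    """Charge-file rule: only a totally blank field is wild; other blanks are literal."""
    p = pattern.ljust(width)[:width]
    return p.strip() == "" or p == name.ljust(width)[:width]


print(size_field_matches("c", "ca"))    # True: "c" followed by wild positions
print(charge_field_matches("c", "ca"))  # False: the blanks after "c" must match blanks
print(charge_field_matches("", "ca"))   # True: a totally blank field is wild
```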
The same difference in mode of assignment between DelPhi and Grasp occurs as is described above for charge files, i.e. the user should be careful to put more general assignments first if the same assignment is to be made by both programs.
Both charge and size files are supported by Grasp for the convenience of DelPhi users. However, the Grasp user might want to consider using neither and instead using the more general format of Grasp "alt" commands (see section on command line syntax) in a script file to assign radii and charges.
Protein Data Bank File
The complete specification for these files is quite complex. Grasp (and DelPhi) only use information on lines beginning with "ATOM" or "HETATM" or "TER" (the latter only in Grasp). These lines are expected to have the format,
(A6, 5X, A5, X, A3, X, A1, A4, 4X, 3F8.3)
in the first fifty-four characters, where A6 is the "ATOM  " or "HETATM" header, A5 is the atom name (only four characters of which are used in Grasp), A3 the residue name, A1 the chain name, A4 the residue number and 3F8.3 the atom coordinates in Angstroms.
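Translated into zero-based column slices, the format above can be read with a short Python sketch. The function name parse_atom_line and the sample values are made up for illustration. The slices follow directly from the field widths: 6, then 5 skipped, 5, 1 skipped, 3, 1 skipped, 1, 4, 4 skipped, and three fields of 8.

```python
def parse_atom_line(line: str):
    """Read the first 54 characters of an ATOM/HETATM line; return None for other lines."""
    record = line[0:6].strip()
    if record not in ("ATOM", "HETATM"):
        return None
    return {
        "record": record,                 # A6
        "atom": line[11:16].strip(),      # A5, after 5X
        "res": line[17:20].strip(),       # A3, after X
        "chain": line[21:22],             # A1, after X
        "resnum": line[22:26].strip(),    # A4, kept as text
        "xyz": (                          # 3F8.3, after 4X
            float(line[30:38]),
            float(line[38:46]),
            float(line[46:54]),
        ),
    }


# A synthetic line assembled field by field so the columns are explicit.
sample = (
    "ATOM  " + "    1" + "  NZ " + " " + "LYS" + " " + "A" + "  25" + "    "
    + "%8.3f%8.3f%8.3f" % (11.104, 6.134, -6.504)
)
print(parse_atom_line(sample))
```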
In some files the fields to the right of these (i.e. characters 55 to 80) are also used to store information. In standard PDB format columns 55-60 are used for the occupancy (F6.2) and columns 61-67 for the B-value (F7.3). The Grasp variants are as follows. If the first two lines of the file are:
GRASP PDB FILE
FORMAT NUMBER= 1
then these two fields will be read in as the radius and charge of the atom, in that order. If the first two lines are:
GRASP PDB FILE
FORMAT NUMBER= 3
then the values in these fields are read into general property arrays one and two respectively. If instead they are:
GRASP PDB FILE
FORMAT NUMBER= 2
then the entire field of 55-80 is read in free format as general properties one and two. This clearly allows the user to store values to much higher accuracy.
Note that in using any of these formats there MUST be two numbers in some format or an error may occur.
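How the three variants treat characters 55 to 80 can be sketched as below. The function name read_extra_fields is hypothetical, and the column boundaries are taken from the text above (55-60 and 61-67 for formats 1 and 3, free format for format 2). The sketch raises an error when two numbers are not present, mirroring the warning above.

```python
def read_extra_fields(line: str, format_number: int):
    """Return the two trailing values of an atom line, labeled per Grasp format number."""
    if format_number == 2:
        # Free format: any two whitespace-separated numbers in columns 55-80.
        tokens = line[54:80].split()
        if len(tokens) < 2:
            raise ValueError("two numbers are required in columns 55-80")
        first, second = tokens[0], tokens[1]
    elif format_number in (1, 3):
        # Fixed fields: columns 55-60 and 61-67.
        first, second = line[54:60], line[60:67]
    else:
        raise ValueError(f"unknown format number {format_number}")
    names = ("radius", "charge") if format_number == 1 else ("gproperty1", "gproperty2")
    # float() raises ValueError on a blank field, so a missing number is reported.
    return dict(zip(names, (float(first), float(second))))


fixed = " " * 54 + "%6.2f%7.3f" % (1.9, -0.5)
free = " " * 54 + " 1.23456  7.891011"
print(read_extra_fields(fixed, 1))
print(read_extra_fields(free, 2))
```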
End of molecule statements, i.e. "TER" lines, should occur after each molecule described in a PDB file for Grasp to treat them as independent molecules.
Grasp Property File
These ASCII files contain a listing of a single property for either all atoms or all surface vertices. The user may write as many comment lines as desired before a line which has the keyword "surface=" or "atoms=" (so be careful not to include these in comment lines). All lines after that are assumed by the program to be numerical values (in any format). On the same line as the above keywords the user should have one of the following: "potential", "distance", "curvature", "gproperty1", "gproperty2", "accessible", or "charge", e.g.
surface=distance
will inform the program that the list of values should be placed in the distance array for surface vertices.
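A reader for this layout can be sketched in a few lines of Python. The function name parse_property_file is hypothetical, and the sketch only accepts numbers that Python's float() understands, which is narrower than "any format" (Fortran-style exponents such as 1.0D0 would need extra handling).

```python
def parse_property_file(text: str):
    """Return (target, property_name, values) from the text of a property file."""
    lines = text.splitlines()
    for n, line in enumerate(lines):
        # Everything before the first keyword line is treated as comment.
        key = next((k for k in ("surface=", "atoms=") if k in line), None)
        if key is None:
            continue
        words = line.split(key, 1)[1].split()
        prop = words[0] if words else None
        # All remaining lines are numeric values, possibly several per line.
        values = [float(tok) for rest in lines[n + 1:] for tok in rest.split()]
        return key[:-1], prop, values
    raise ValueError("no 'surface=' or 'atoms=' line found")


sample = "distances computed elsewhere\nsurface=distance\n1.0 2.5\n-3\n"
print(parse_property_file(sample))
```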
Grasp Script File
An ASCII file in which the user can store commands to be executed together, for instance a set of coloring commands for atoms. Any line beginning with an exclamation mark (!) is disregarded. All other lines are sent to the command interpreter to be acted upon.
Interaction Energy (Matrix) File
These come in two flavors. Both should have as their first line the words,
GRASP RESIDUE INTERACTION
The next line should either be,
FORMAT= 1
or
FORMAT = 2
The first format is more complete and expects lines in the FORTRAN format (a3,3x,i3,10x,a3,3x,i3,10x,12g). Here the first a3 is the first residue name, the first i3 its residue number, the second a3 the next residue name, the second i3 the next residue number, and finally 12g is the format of the interaction energy.
The second format contains only the residue numbers and the interaction strength, i.e. (i5,x,i5,4x,g12). There can be at most 10,000 lines of such descriptions. There is no provision for comment lines. Reading in a file automatically replaces all previous interaction values.
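As an illustration, the second format can be read with fixed column slices as in the Python sketch below. The function name parse_interactions_v2 is hypothetical, the g12 field is assumed to occupy exactly twelve columns, and skipping blank lines is a convenience of the sketch rather than documented Grasp behavior.

```python
MAX_LINES = 10_000  # limit stated in the text above


def parse_interactions_v2(text: str):
    """Return a list of (residue_number, residue_number, energy) from a FORMAT 2 file."""
    lines = text.splitlines()
    if len(lines) < 2 or lines[0].strip() != "GRASP RESIDUE INTERACTION":
        raise ValueError("missing 'GRASP RESIDUE INTERACTION' header")
    # Accept both "FORMAT= 1" and "FORMAT = 2" spellings of the second line.
    format_number = int(lines[1].split("=", 1)[1])
    if format_number != 2:
        raise ValueError("this sketch only reads FORMAT 2")
    rows = []
    for line in lines[2:]:
        if not line.strip():
            continue
        # (i5,x,i5,4x,g12): columns 1-5, 7-11 and 16-27.
        rows.append((int(line[0:5]), int(line[6:11]), float(line[15:27])))
    if len(rows) > MAX_LINES:
        raise ValueError("more than 10,000 interaction lines")
    return rows


sample = (
    "GRASP RESIDUE INTERACTION\n"
    "FORMAT = 2\n"
    + "%5d %5d    %12.4f\n" % (12, 47, -1.25)
    + "%5d %5d    %12.4f\n" % (12, 103, 0.5)
)
print(parse_interactions_v2(sample))
```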