A FAB file is an array that assigns a bonus or penalty to each residue of a protein for being at the interface. FAB files are used primarily for antibodies, whose CDRs (complementarity determining regions) indicate the binding region with a reasonable degree of confidence.
In a FAB file, residues considered likely to be at the interface (the CDRs in antibodies) are marked as true (T), residues that could be interface residues are marked as neutral (N), and all other residues are disallowed and marked as false (.); an F can be substituted for the period. For antibodies, 3-4 neutrals should both precede and follow each true (CDR) region. Since two or more "false" residues at the interface will cause a "FAB failure" and invalidate a decoy, the "neutral" status should be distributed fairly generously.
FAB files should be named with the four-letter code of their PDB file and a .fab suffix, for example "1FBI.fab".
1FBI L         DIQMTQTTSSLSASLGDRVTISCRASQDISNY-----LNWYQKKPDGTVKLLIYYTSRLH
TFN            NN...............NNNTTTTTTTTTTTTTTTTNNN.........NNNTTTTTTTTN
1FBI L         SGVPSRFSGSGSGTDYSLTIRNLEQEDIATYFCQQGYTLP--YTFGGGTKLEIK-RADAA
TFN            NN..........................NNNTTTTTTTTTTTNNNNNNNNN.........
1FBI L         PTVSIFPPSSEQLTSGGASVVCFLNNFYPKDINVKWKIDGSERQNGVLNSWTDQDSKDST
TFN            ............................................................
1FBI L         YSMSSTLTLTKDEYERHNSYTCEATHKTSTSPIVKSFNRNEC
TFN            ..........................................
1FBI H         QVQLQQPGAELVKPGASVKLSCKASGYTFT--SYWMHWVKQGPGQGLEWIGEIDPSDSYP
TFN            NN................NNNNTTTTTTTTTTTTNNNN.......NNNTTTTTTTTTTTN
1FBI H         NYNEKFKGKATLTVDKSSSTAYMQLSSLTSEDSAVYYCASLYYYGTSYGVLD--------
TFN            NNNNN..............................NNNTTTTTTTTTTTTTTTTTTTTTT
1FBI H         ---YWGQGTSVTVSSAKTTPPSVYPLAPGSAAQTNSMVTLGCLVKGYFPEPVTVTWNSGS
TFN            TTTNNNNNNNNN................................................
1FBI H         LSSGVHTFPAVLQSDLYTLSSSVTVPSSPRPSETVTCNVAHPASSTKVDKKIVP
TFN            ......................................................
The format, down to the column number of the first residue of the alignment, is fixed. The four-letter code preceding the chain ID need not match the PDB code, but should for the sake of clarity. Tabs should not be used. The sequence and the TFN string should start at column 16 and end at or before column 75.
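The column rules above can be sketched in Python. This is an illustration only, not part of Rosetta or of the fab script: the function names format_fab_block and check_fab_lines are hypothetical, and the sketch assumes that the sequence and the TFN string have the same length (gap characters included) and that each pair of lines carries at most 60 positions (columns 16 through 75).

```python
SEQ_START = 15   # zero-based index of column 16
MAX_WIDTH = 75   # last column a line may occupy


def format_fab_block(code, chain, sequence, mask):
    """Return FAB lines for one chain as alternating sequence/TFN rows.

    Hypothetical helper: pads the label field with spaces (never tabs)
    so that the sequence and the mask both start at column 16.
    """
    if len(sequence) != len(mask):
        raise ValueError("sequence and TFN mask must have the same length")
    per_line = MAX_WIDTH - SEQ_START  # 60 positions per row
    lines = []
    for start in range(0, len(sequence), per_line):
        label = f"{code} {chain}".ljust(SEQ_START)
        lines.append(label + sequence[start:start + per_line])
        lines.append("TFN".ljust(SEQ_START) + mask[start:start + per_line])
    return lines


def check_fab_lines(lines):
    """Return a list of human-readable problems found in FAB lines."""
    problems = []
    for number, line in enumerate(lines, start=1):
        if "\t" in line:
            problems.append(f"line {number}: contains a tab")
        if len(line) > MAX_WIDTH:
            problems.append(f"line {number}: extends past column {MAX_WIDTH}")
        # Column 15 must be blank and column 16 must hold the first character.
        if (len(line) <= SEQ_START or line[SEQ_START] == " "
                or line[SEQ_START - 1] != " "):
            problems.append(f"line {number}: data does not start at column 16")
    return problems
```

Running check_fab_lines on the lines of a hand-edited file is a quick way to catch a stray tab or a shifted column before starting a docking run.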
For Rosetta to use a FAB file, it must be run in fab mode (-fab1 or -fab2). The number depends on the order of the docking partners: 1 if the antibody is listed first in the PDB file and 2 if it is listed second. With the fab flag set and the FAB file in the running directory (the same directory as the starting structure), Rosetta will run in fab mode. Rosetta will then give a small bonus to the score for each "true" residue at the interface. If two or more "false" residues are at the interface, the decoy will be declared a fab failure and should be discarded in post-processing. If, during post-processing, a large fraction of the output, perhaps more than one fifth, is made up of fab failures, there is most likely an error in the FAB file.
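The counting rule described above can be illustrated with a small Python sketch. It is not Rosetta's implementation: how Rosetta decides which residues are at the interface, and the size of the bonus, are not stated here, so the sketch simply takes a list of zero-based interface positions as given. The names classify_decoy and failure_fraction are hypothetical.

```python
def classify_decoy(mask, interface_positions):
    """Count 'true' interface residues and flag a fab failure.

    mask: TFN string for the antibody, one character per position.
    interface_positions: zero-based indices into mask (assumed input).
    Returns (number of T residues at the interface, is_fab_failure).
    """
    labels = [mask[i] for i in interface_positions]
    n_true = sum(1 for label in labels if label == "T")
    # Both '.' and 'F' mean "false" in a FAB file.
    n_false = sum(1 for label in labels if label in ".F")
    return n_true, n_false >= 2


def failure_fraction(failure_flags):
    """Fraction of decoys flagged as fab failures (0.0 for an empty list)."""
    if not failure_flags:
        return 0.0
    return sum(failure_flags) / len(failure_flags)
```

If failure_fraction over a whole run comes out above roughly 0.2, the FAB file itself is the first thing to re-check.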
FAB files like the one above can be generated by hand if one knows the location of the CDRs. These can be determined easily using the WAM (Web Antibody Modeling) canonical definitions. Residues considered to be part of the CDRs should be labeled as true and bracketed with neutrals; the remainder of the antibody should be labeled as false.
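The labeling rule can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the automated fab script: the function name build_mask is hypothetical, CDR locations are supplied as zero-based inclusive (start, end) ranges, and a default flank of 3 neutrals is used on each side.

```python
def build_mask(length, cdr_ranges, flank=3):
    """Build a TFN string: T inside CDRs, N in the flanks, '.' elsewhere.

    length: number of positions in the chain (alignment gaps included).
    cdr_ranges: zero-based inclusive (start, end) pairs (assumed input).
    flank: number of neutral positions placed before and after each CDR.
    """
    mask = ["."] * length
    # First pass: mark each CDR plus its flanks as neutral.
    for start, end in cdr_ranges:
        for i in range(max(0, start - flank), min(length, end + flank + 1)):
            mask[i] = "N"
    # Second pass: overwrite the CDR positions themselves with T,
    # so that a flank never downgrades a neighbouring CDR.
    for start, end in cdr_ranges:
        for i in range(start, end + 1):
            mask[i] = "T"
    return "".join(mask)
```

For a 12-position chain with one CDR at positions 5-6, build_mask(12, [(5, 6)]) gives "..NNNTTNNN..". The result should still be compared by eye against the known CDRs, and the neutral regions widened where the interface is uncertain.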
To save time, the process of creating a FAB file from a PDB file has been automated. This Python script, when run on a PDB file, will automatically generate an appropriately named FAB file from it. Be advised: the script has not been widely tested and may fail, especially with non-natural or novel antibodies. The output should always be examined and compared to the expected results. The script requires its two FASTA files (Light.fasta and Heavy.fasta) to run. It rewrites the PDB file so that the chain order is antigen-light-heavy, which allows a constant "-fab2" flag to be used. The script also requires clustalw and biopython; instructions for downloading them can be found in the header of the script.
KHP / JHU / 8-16-04