AUTHOR==Andrew Leaver-Fay and Elizabeth Kellogg
METADATA==The documentation was last updated on 4/7/2011, by Andrew Leaver-Fay. Questions about this documentation should be directed to David Baker: dabaker@u.washington.edu.
EXAMPLES==The ddg_monomer application lives in src/apps/public/ddg/ddg_monomer.cc. (This file was previously named "fix_bb_monomer_ddg.cc" but was renamed because it now moves the backbone.) This file houses the main() function. The central subroutines it invokes live in the ddGMover class, defined in src/protocols/moves/ddGMover.hh and src/protocols/moves/ddGMover.cc.\nA helper application, minimize_with_cst, lives in src/apps/public/ddg/minimize_with_cst.cc.\nA helper script for generating one of the input files needed by the ddg_monomer application lives in src/apps/public/ddg/convert_to_cst_file.sh.\nAn integration test for this application lives in test/integration/tests/ddG_of_mutation/. The test in this directory runs a shortened trajectory for predicting the wild-type and mutant energies. To turn this into a production-run example, set the value of the "-ddg:iterations" flag in the file "test/integration/tests/ddG_of_mutation/flags" to 50.
REFERENCES==The new algorithm for performing limited relaxation of the backbone was published in:\nE. Kellogg, A. Leaver-Fay, and D. Baker (2011) "Role of conformational sampling in computing mutation-induced changes in protein structure and stability", Proteins: Structure, Function, and Bioinformatics 79, pp. 830-838.\n\nThe older, fixed-backbone, soft-repulsive scorefunction algorithm is analogous to that described in row 4 of [Kellogg 2011], but with weights trained towards recapitulating alanine-scanning mutation experiments. The weights are in minirosetta_database/scoring/weights/ddg_monomer.wts. This algorithm was published in:\nT. Kortemme and D. Baker (2002) "A simple physical model for binding energy hot spots in protein-protein complexes", PNAS 99(22), pp. 14116-14121.
DESCRIPTION==The purpose of this application is to predict the change in stability (the ddG) of a monomeric protein induced by a point mutation. The application takes as input the crystal structure of the wild type (which must first be pre-minimized) and generates a structural model of the point mutant. The ddG is given by the difference in Rosetta energy between the wild-type structure and the point-mutant structure. More precisely, 50 models each of the wild-type and mutant structures should be generated, and the most accurate ddG is taken as the difference between the mean score of the three best-scoring wild-type structures and the mean score of the three best-scoring point-mutant structures.
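The top-3 averaging described above can be sketched in a few lines of Python. This is a minimal illustration, not code from the application: the function names and the example scores are hypothetical. It assumes that lower Rosetta energies are better (so "best-scoring" means lowest) and that the ddG is reported as mutant minus wild type.

```python
from statistics import mean


def mean_of_best(scores, n_best=3):
    """Mean of the n_best lowest scores (assumes lower energy is better)."""
    if len(scores) < n_best:
        raise ValueError(f"need at least {n_best} scores, got {len(scores)}")
    return mean(sorted(scores)[:n_best])


def predicted_ddg(wt_scores, mut_scores, n_best=3):
    """ddG as mutant minus wild type (sign convention is an assumption)."""
    return mean_of_best(mut_scores, n_best) - mean_of_best(wt_scores, n_best)


# Hypothetical total scores; a production run would have 50 of each.
wt_scores = [-210.0, -208.0, -212.0, -205.0, -209.0]
mut_scores = [-207.0, -206.0, -209.0, -203.0, -204.0]
print(predicted_ddg(wt_scores, mut_scores))
```

With this sign convention, a positive value would indicate a mutation predicted to be destabilizing.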
ALGORITHM==There are two main ways to use this application: a high-resolution protocol and a low-resolution protocol. The two are nearly equally accurate, with correlation coefficients of 0.69 and 0.68, respectively, on a set of 1210 mutations. They correspond to the protocols on rows 16 and 3 of [Kellogg 2011], respectively.\nA) High Resolution Protocol:\nThis protocol allows a small degree of backbone conformational freedom. The protocol optimizes both the initial input structure for the wild type and the generated structure for the point mutant in the same way, for the same number of iterations (recommended 50). It begins by optimizing the rotamers at all residues in the protein using Rosetta's standard side-chain optimization module (the packer). It follows this initial side-chain optimization with three rounds of gradient-based minimization, where the repulsive component of the Lennard-Jones (van der Waals) term is downweighted in the first round (10% of its regular strength), weighted at an intermediate value in the second round (33% of its regular strength), and weighted at its standard value in the third round. This repacking followed by minimization is run several times, always starting from the same structure. Scores and, optionally, PDBs are written out.\nDistance Restraints: The high-resolution protocol relies on Calpha-Calpha distance restraints during the optimization to prevent the backbone from moving too far from the starting conformation. These distance restraints may be generated externally before the protocol is run. If none are specified, constraints are generated automatically from the input structure, but the results in the published paper were obtained with constraints based on the high-resolution crystal structure.
The constraints used to generate the data for row 16 in [Kellogg 2011] were distance restraints between all Calpha pairs within 9 Angstroms of each other in the wild-type structure. For each harmonic restraint, the ideal value was taken as the distance in the original crystal structure (not the pre-minimized structure, which should be given as input), and the standard deviation of the harmonic constraint was set to 0.5 Angstroms. For example, the distance restraint between the Calpha of residue 1 and the Calpha of residue 2 of the PDB 1hz6 is described in the input constraint file by the line "AtomPair CA 2 CA 1 HARMONIC 3.79007 0.5".\n\nDistance restraints can be generated with this shell script:\n./convert_to_cst_file.sh mincst.log > input.cst\nThis shell script simply takes the log output of the minimization (from pre-minimization of the input structure) and converts it to the appropriate constraint file format. (See the TIPS section below for how to obtain the minimization log.)\n\nB) Low Resolution Protocol:\nThis protocol allows only side-chain conformational flexibility. It optimizes the rotamers for the residues in the neighborhood of the mutation, that is, those with CBeta atoms within 8 Angstroms of the CBeta atom of the mutated residue (or Calpha for glycine). The same set of residues is optimized for both the wild-type and mutant structures. The optimization is performed a recommended 50 times for both the wild type and the point mutant, and the scores (and, optionally, PDBs) are written out.
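To make the constraint file format concrete, here is a minimal Python sketch that writes one "AtomPair ... HARMONIC" line per Calpha pair within the cutoff. It is not a reimplementation of convert_to_cst_file.sh, which works from the minimization log. The function name, the coordinate dictionary, and the example coordinates are hypothetical. The inclusive cutoff and the ordering of the two residues on each line (higher residue number first, mirroring the 1hz6 example) are assumptions.

```python
import math
from itertools import combinations


def ca_distance_restraints(ca_coords, cutoff=9.0, std_dev=0.5):
    """Build harmonic Calpha-Calpha restraint lines.

    ca_coords maps residue number -> (x, y, z) of the Calpha atom,
    taken from the original crystal structure.
    """
    lines = []
    for (res_i, xyz_i), (res_j, xyz_j) in combinations(sorted(ca_coords.items()), 2):
        dist = math.dist(xyz_i, xyz_j)  # Euclidean distance, Python 3.8+
        if dist <= cutoff:  # inclusive cutoff is an assumption
            lines.append(
                f"AtomPair CA {res_j} CA {res_i} HARMONIC {dist:.5f} {std_dev}"
            )
    return lines


# Hypothetical coordinates: residues 1 and 2 are close, residue 3 is far away.
example = {1: (0.0, 0.0, 0.0), 2: (3.8, 0.0, 0.0), 3: (20.0, 0.0, 0.0)}
for line in ca_distance_restraints(example):
    print(line)
```

Only the pair (1, 2) falls within 9 Angstroms here, so a single restraint line is produced.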
TIPS==The experimentally determined structure of the wild type (only crystal structures have been used) should be pre-minimized to reduce collisions that would otherwise introduce large amounts of noise into the relaxation process. Structures are backbone-and-sidechain minimized using harmonic distance constraints between all Calpha atom pairs within 9 Angstroms of each other in the crystal structure. Usually, one generates a set of constraints based on the crystal structure and uses these constraints both for the initial minimization and for the ddg backbone-and-sidechain minimization later on.\nThe command to perform pre-minimization is as follows:\n/path/to/minimize_with_cst.linuxgccrelease -in:file:l lst -in:file:fullatom -ignore_unrecognized_res -fa_max_dis 9.0 -database /path/to/minirosetta_database/ -ddg::harmonic_ca_tether 0.5 -score:weights standard -ddg::constraint_weight 1.0 -ddg::out_pdb_prefix min_cst_0.5 -ddg::sc_min_only false -score:patch minirosetta_database/scoring/weights/score12.wts_patch > mincst.log\nThis application only takes a list of PDB structures as input, designated by -in:file:l lst. The resulting minimized structures are named with the prefix designated by -ddg::out_pdb_prefix. In this case, the structures will have the prefix "min_cst_0.5" followed by the original input PDB name.\nAs explained in the previous section, to obtain constraints based on the crystal structure, run the shell script ./convert_to_cst_file.sh on the log-file output (mincst.log).
LIMITATIONS==The high-resolution protocol is designed to limit the amount of conformational flexibility allowed to protein backbones, because the noise that seeps into the protocol from increased flexibility tends to drown out the signal that might be gained by searching a larger region of conformation space. As a result, this protocol is probably not well suited to modeling several simultaneous mutations where backbone motion would be expected.
INPUTS==All PDBs should be renumbered so that the first residue is residue 1 and the remaining residues are numbered consecutively. If residues are missing from the structure (for example, because of missing density), they are simply skipped in the residue numbering. The numbering of all residues in both the distance-restraint file and the mutation-list file should follow this numbering.\nThere are two main input files: a) restraint / constraint files for Calpha atom pairs, and b) mutation-list files, describing which set of point mutations to entertain for an input backbone.\na) Restraint / constraint files for Calpha atom pairs: The constraint file should list all Calpha atom pairs within 9 Angstroms, giving the distance measured from the original (non-pre-minimized) experimental structure. See RosettaCommons (Maybe a part of the toolkit).
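The renumbering rule can be illustrated with a short Python sketch that maps the residue numbers found in an input PDB onto consecutive numbers starting at 1. This reflects one reading of the rule above, in which gaps left by missing residues are closed up. The function name and the example numbers are hypothetical, and the sketch ignores chain identifiers and insertion codes.

```python
def renumber_map(original_numbers):
    """Map original residue numbers, in file order, to 1, 2, 3, ..."""
    mapping = {}
    for new_number, old_number in enumerate(original_numbers, start=1):
        if old_number in mapping:
            raise ValueError(f"duplicate residue number: {old_number}")
        mapping[old_number] = new_number
    return mapping


# Hypothetical structure starting at residue 5 with residues 8 and 9 missing.
mapping = renumber_map([5, 6, 7, 10, 11])
print(mapping)
```

A mutation at original residue 10 would then be listed as residue 4 in both the distance-restraint file and the mutation-list file.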
TIPS==Make sure to use exactly as described in protocol 13 of the main paper referenced above.
OUTPUTS==The output of the ddg protocol is a file named 'ddg_predictions.out', which contains, for each mutation, the total predicted ddg and a breakdown of all the score components that contribute to that total. In addition, output structures are dumped in either silent-file or PDB format. If silent files are output, the following naming convention is used. All wild-type structures are dumped into wt_<WT_AA><RESIDUE_NUM><MUTANT_AA>.out. Wild-type structures are always dumped because, if local optimization around the site of mutation is being done, the wild-type structures can differ from one another due to different constraint definitions or different packing definitions. Mutant structures follow a similar convention: mut_<WT_AA><RESIDUE_NUM><MUTANT_AA>.out. For example, if you made an A to Q mutation at residue 123, you would see two silent files as output: wt_A123Q.out and mut_A123Q.out. This is done regardless of the protocol used.
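When collecting results for many mutations, the silent-file naming convention above can be reproduced with a small helper. This is an illustrative sketch only: the function name is hypothetical, and it assumes one-letter amino acid codes as in the A123Q example.

```python
def silent_file_names(wt_aa, residue_number, mutant_aa):
    """Return the (wild-type, mutant) silent-file names for one point mutation."""
    tag = f"{wt_aa}{residue_number}{mutant_aa}"
    return f"wt_{tag}.out", f"mut_{tag}.out"


wt_file, mut_file = silent_file_names("A", 123, "Q")
print(wt_file, mut_file)
```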
ANALYSIS==No Comments
