Phi value analysis
Phi value analysis is an experimental protein engineering method used to study the structure of the folding transition state in small protein domains that fold in a two-state manner. Because the folding transition state is by definition transient and partially unstructured, its structure is difficult to determine by traditional methods such as protein NMR or X-ray crystallography. In phi value analysis, the folding kinetics and conformational folding stability of the wild-type protein are compared with those of one or more point mutants. This comparison yields a phi value (defined below) that seeks to measure the mutated residue's energetic contribution to the folding transition state, and thus the degree of native structure around the mutated residue in the transition state, from the relative free energies of the unfolded state, the folded state and the transition state for the wild-type and mutant proteins.
Typically, a large fraction of the protein's residues are mutated one by one to identify clusters of residues that are well ordered in the folding transition state. The interactions of these residues can be validated using double-mutant-cycle phi analysis, in which the effects of the single mutants are compared with those of the double mutant. In general, the mutations are conservative and replace the original residue with a smaller one (cavity-creating mutations), most commonly alanine; however, tyrosine-to-phenylalanine, isoleucine-to-valine and threonine-to-serine mutations are also used. Examples of proteins that have been studied by phi value analysis include chymotrypsin inhibitor, SH3 domains, individual domains of proteins L and G, ubiquitin, and barnase.
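The double-mutant-cycle comparison mentioned above can be illustrated with a short Python sketch. It assumes the usual additivity argument, which is not spelled out in this article: if two residues do not interact, the destabilization caused by the double mutant equals the sum of the two single-mutant destabilizations, so any shortfall is attributed to their interaction. The function names and the example numbers are hypothetical.

```python
def coupling_energy(ddg_a, ddg_b, ddg_ab):
    """Interaction energy between two residues from a double-mutant cycle.

    ddg_a, ddg_b: destabilization caused by each single mutant (e.g. kJ/mol).
    ddg_ab: destabilization caused by the double mutant, in the same units.
    Returns 0 when the two mutations are purely additive (no interaction).
    """
    return ddg_a + ddg_b - ddg_ab


def interaction_phi(ts_cycle, native_cycle):
    """Ratio of the coupling energy in the transition state to that in the native state.

    Each argument is a tuple (ddg_a, ddg_b, ddg_ab) measured for that state.
    """
    native_coupling = coupling_energy(*native_cycle)
    if native_coupling == 0:
        raise ValueError("residues show no coupling in the native state")
    return coupling_energy(*ts_cycle) / native_coupling


# Hypothetical values in kJ/mol, chosen only for illustration.
example = interaction_phi(ts_cycle=(2.0, 1.5, 2.5), native_cycle=(4.0, 3.0, 5.0))
print(example)
```

Under these assumptions, a ratio near 1 would suggest that the pairwise interaction is already formed in the transition state, and a ratio near 0 that it is not.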
Mathematical definition
The phi value is defined as:

$$\Phi = \frac{\Delta G^{TS-D}_{wt} - \Delta G^{TS-D}_{mut}}{\Delta G^{N-D}_{wt} - \Delta G^{N-D}_{mut}} = \frac{\Delta\Delta G^{TS-D}}{\Delta\Delta G^{N-D}}$$

where $\Delta G^{TS-D}_{wt}$ represents the energy difference between the transition state and the denatured state for the wild-type protein, $\Delta G^{TS-D}_{mut}$ represents this energy difference for the mutant protein, and the $\Delta G^{N-D}$ terms represent the energy difference between the native state and the denatured state. Thus, the phi value is the ratio of the energetic destabilization that the mutation introduces into the transition state to the destabilization it introduces into the native folded state.
The phi value is expected to range from 0 to 1. A phi value of 0 means that the mutation does not change the energy of the rate-limiting transition state on the folding pathway relative to the denatured state, while a phi value of 1 indicates that the transition state is destabilized by the mutation to exactly the same degree as the folded state. A phi value near 0 therefore suggests that the region surrounding the mutation is relatively unfolded or unstructured in the transition state, while a value near 1 implies that the local structure around the mutation site in the transition state closely resembles the structure in the native state. It is generally the case that conservative substitutions on the surface of a protein yield phi values near 1. When the phi value is intermediate between 0 and 1, the method cannot directly distinguish between two alternative hypotheses: that the transition state is partially structured, or that there are two populations of mostly-unfolded and mostly-folded states.
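As a rough illustration of how the ratio described above might be evaluated from measured data, the Python sketch below estimates the transition-state term from folding rate constants and the native-state term from equilibrium unfolding free energies. It assumes, as is common in the literature but not stated in this article, that the change in activation free energy is RT ln(k_wt / k_mut) with the same pre-exponential factor for both proteins. The function names, the default temperature and the example numbers are hypothetical.

```python
import math

R_KJ_PER_MOL_K = 8.314462618e-3  # gas constant in kJ/(mol*K)


def ddg_transition_state(kf_wt, kf_mut, temperature_k=298.15):
    """Destabilization of the transition state relative to the denatured state (kJ/mol).

    Assumes both proteins share the same pre-exponential factor, so it cancels
    in the ratio of folding rate constants. Positive when the mutant folds slower.
    """
    if kf_wt <= 0 or kf_mut <= 0:
        raise ValueError("rate constants must be positive")
    return R_KJ_PER_MOL_K * temperature_k * math.log(kf_wt / kf_mut)


def phi_value(kf_wt, kf_mut, dg_unfold_wt, dg_unfold_mut, temperature_k=298.15):
    """Phi as the ratio of transition-state to native-state destabilization.

    dg_unfold_wt, dg_unfold_mut: equilibrium unfolding free energies (kJ/mol),
    so their difference is positive when the mutant is less stable.
    """
    ddg_native = dg_unfold_wt - dg_unfold_mut
    if ddg_native == 0:
        raise ValueError("mutation does not change stability; phi is undefined")
    return ddg_transition_state(kf_wt, kf_mut, temperature_k) / ddg_native


# Hypothetical mutant: folds 10x slower and is 10 kJ/mol less stable.
print(round(phi_value(100.0, 10.0, 30.0, 20.0), 3))
```

The explicit check for a zero stability change reflects the fact that the ratio is undefined when the mutation does not perturb the native state at all.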
Key assumptions
Phi value analysis fundamentally assumes a close relationship between structure and energy. If the energy landscape has a well-defined and relatively deep global minimum, the resemblance of a folding intermediate structure to the native state may closely correlate with the energy of that structure. However, if the energy landscape is relatively flat or has many local minima, the relationship may not hold strongly enough for free energy destabilizations to provide useful structural information. The method also assumes that the mutation does not significantly alter the folding pathway, although it may alter the folding energies. For nonconservative mutations this assumption might be fundamentally flawed; conservative substitutions are therefore preferred, though they may yield smaller energetic destabilizations that are more difficult to detect experimentally. In addition, restricting the phi value to nonnegative values assumes that a mutation will not increase the stability, and thus lower the energy, of either the native state or the transition state relative to those of the wild-type protein. Finally, it is implicitly assumed that the interactions that stabilize a folding transition state are native-like in nature. Many studies of protein folding, however, have suggested that stabilizing non-native interactions in a folding transition state may aid in folding. An example is given in Zarrine-Afsar et al. (2008) PNAS, where the authors demonstrated that stabilizing non-native interactions in the Fyn SH3 domain accelerated the folding rate of this protein.
Example: barnase
Alan Fersht pioneered the phi value analysis method by first applying it to the small bacterial protein barnase. In conjunction with molecular dynamics simulations, the analysis illustrated that, at least for this protein, the transition state between folding and unfolding is the same in both reaction directions and closely resembles the native state. Phi values were found to vary considerably with the location of the mutation, with some regions of the protein yielding values near 0 and others yielding values near 1. The distribution of phi values over the protein agreed well with the simulated unfolding transition state in all but one helix, which was later identified as folding semi-independently and forming native-like contacts with the remainder of the protein only after the complete transition state has been reached. Such variations in the folding rate within a protein present another challenge in interpreting phi values, since the transition state structure cannot be determined experimentally. Folding and unfolding simulations, though computationally expensive, can provide valuable structural information that complements phi value results.
Variants of Φ-value analysis
Other kinetic-perturbation techniques for analyzing the folding transition state have also been developed. The best-known variant is the psi (ψ) value, in which two metal-binding residues such as histidine are engineered into a protein; the folding kinetics are then studied as a function of the metal ion concentration. However, Fersht has illustrated some difficulties with this approach. An alternative "cross-linking" variant of the Φ-value was developed in the course of studying the association of segments in the folding transition state through the introduction of covalent crosslinks such as disulfide bonds.
Errors associated with Φ-value analysis
Experimental errors can be high in measuring the equilibrium stability as well as the folding and unfolding rates in water for the wild-type protein and mutants. The necessity of extrapolating phi values in pure water from measurements made in solutions containing denaturants adds uncertainty to the reported values. When the stability difference between the wild-type and mutant proteins is low (< 7 kJ/mol), experimental error can be very large; unusual phi values outside the 0-1 range may arise from these errors rather than from deviations from the conditions assumed by the method. In addition, calculated phi values have been shown to depend strongly on the number of data points collected and the laboratory in which the experiment was performed.
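The sensitivity to small stability differences can be made concrete with a first-order error propagation sketch in Python. It assumes that the uncertainties in the two free energy changes are independent and small enough for a linear approximation, which may not hold for real data sets. The function names and example numbers are hypothetical; only the 7 kJ/mol threshold is taken from the text above.

```python
import math


def phi_with_error(ddg_ts, sigma_ts, ddg_native, sigma_native):
    """Return (phi, sigma_phi) for phi = ddg_ts / ddg_native.

    Uses first-order propagation of independent errors:
    sigma_phi^2 = (sigma_ts / ddg_native)^2 + (ddg_ts * sigma_native / ddg_native^2)^2
    All energies must share the same units (e.g. kJ/mol).
    """
    if ddg_native == 0:
        raise ValueError("phi is undefined when the stability change is zero")
    phi = ddg_ts / ddg_native
    sigma_phi = math.sqrt(
        (sigma_ts / ddg_native) ** 2
        + (ddg_ts * sigma_native / ddg_native ** 2) ** 2
    )
    return phi, sigma_phi


def is_low_stability_change(ddg_native, threshold_kj_per_mol=7.0):
    """Flag mutants whose stability change falls below the threshold cited in the text."""
    return abs(ddg_native) < threshold_kj_per_mol


# Same phi and same measurement errors, but different stability changes (kJ/mol).
for ddg_ts, ddg_native in [(2.0, 4.0), (10.0, 20.0)]:
    phi, sigma = phi_with_error(ddg_ts, 0.5, ddg_native, 1.0)
    print(ddg_native, round(phi, 2), round(sigma, 3), is_low_stability_change(ddg_native))
```

With identical measurement errors, the uncertainty in phi for the 4 kJ/mol mutant is five times that for the 20 kJ/mol mutant, which is consistent with the caution about weakly destabilizing mutations.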