Protein engineering
Protein engineering is the process of developing useful or valuable proteins. It is a young discipline, with much research taking place into the understanding of protein folding and recognition for protein design principles.
There are two general strategies for protein engineering, rational design and directed evolution. These techniques are not mutually exclusive; researchers will often apply both. In the future, more detailed knowledge of protein structure and function, as well as advancements in high-throughput technology, may greatly expand the capabilities of protein engineering. Eventually, even unnatural amino acids may be incorporated thanks to a new method that allows the inclusion of novel amino acids in the genetic code.
Rational design of proteins
In rational protein design, the scientist uses detailed knowledge of the structure and function of the protein to make desired changes. This generally has the advantage of being inexpensive and technically easy, since site-directed mutagenesis techniques are well developed. However, its major drawback is that detailed structural knowledge of a protein is often unavailable, and even when it is available, it can be extremely difficult to predict the effects of various mutations.
Computational protein design algorithms seek to identify novel amino acid sequences that are low in energy when folded to the pre-specified target structure. While the sequence-conformation space that needs to be searched is large, the most challenging requirement for computational protein design is a fast, yet accurate, energy function that can distinguish optimal sequences from similar suboptimal ones.
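The search described above can be illustrated with a deliberately tiny sketch. Everything in it is an assumption made for illustration: the four-letter residue alphabet, the buried/exposed target pattern, and the scoring terms are invented and do not correspond to any real force field or design package. It enumerates every sequence for a five-position target and keeps the one with the lowest toy energy.

```python
import itertools

# Hypothetical target structure, reduced to a burial pattern:
# True = buried (core) position, False = solvent-exposed position.
TARGET_BURIED = [True, False, True, True, False]

# Toy alphabet of four residues with a made-up hydrophobicity flag.
HYDROPHOBIC = {"L": True, "V": True, "K": False, "E": False}


def toy_energy(sequence, buried=TARGET_BURIED):
    """Invented energy function; lower is better.

    Rewards hydrophobic residues at buried positions and polar residues
    at exposed positions, and penalizes identical charged neighbours.
    """
    energy = 0.0
    for residue, is_buried in zip(sequence, buried):
        energy += -1.0 if HYDROPHOBIC[residue] == is_buried else 1.0
    for left, right in zip(sequence, sequence[1:]):
        if left == right and left in "KE":
            energy += 0.5
    return energy


def design(buried=TARGET_BURIED):
    """Exhaustively search sequence space for the lowest-energy sequence."""
    candidates = itertools.product(HYDROPHOBIC, repeat=len(buried))
    best = min(candidates, key=lambda seq: toy_energy(seq, buried))
    return "".join(best), toy_energy(best, buried)


if __name__ == "__main__":
    sequence, energy = design()
    print(sequence, energy)
```

Exhaustive enumeration is only possible here because the space holds 4^5 = 1024 sequences. With 20 amino acids and realistic chain lengths the space is far too large to enumerate, which is why the speed and accuracy of the energy function matter so much in practice.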
Directed evolution
In directed evolution, random mutagenesis is applied to a protein, and a selection regime is used to pick out variants that have the desired qualities. Further rounds of mutation and selection are then applied. This method mimics natural evolution and generally produces superior results to rational design. An additional technique known as DNA shuffling mixes and matches pieces of successful variants in order to produce better results. This process mimics the recombination that occurs naturally during sexual reproduction.
The great advantage of directed evolution is that it requires no prior structural knowledge of a protein, nor is it necessary to be able to predict what effect a given mutation will have. Indeed, the results of directed evolution experiments are often surprising, in that the desired changes are frequently caused by mutations that were not expected to have that effect. The drawback is that directed evolution requires high-throughput screening, which is not feasible for all proteins. Large amounts of recombinant DNA must be mutated and the products screened for desired qualities. The sheer number of variants often requires expensive robotic equipment to automate the process. Furthermore, not all desired activities can be easily screened for.
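The mutate, screen, and repeat cycle can be sketched as a small simulation. This is a toy model under stated assumptions, not a laboratory protocol: the `fitness` function stands in for a real screening assay and scores variants against a hypothetical `TARGET` sequence, and `shuffle` reduces DNA shuffling to a single crossover between two parents.

```python
import random

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids
TARGET = "MKTAYIAKQR"  # hypothetical optimum, used only to score variants


def fitness(seq):
    """Stand-in for a screening assay: count positions matching TARGET."""
    return sum(a == b for a, b in zip(seq, TARGET))


def mutate(seq, rng, rate=0.1):
    """Random mutagenesis: each position is replaced with probability `rate`."""
    return "".join(
        rng.choice(ALPHABET) if rng.random() < rate else aa for aa in seq
    )


def shuffle(parent_a, parent_b, rng):
    """Simplified DNA shuffling: one crossover point between two parents."""
    cut = rng.randrange(1, len(parent_a))
    return parent_a[:cut] + parent_b[cut:]


def evolve(start, rounds=30, library_size=200, keep=10, seed=0):
    """Run repeated rounds of diversification and selection."""
    rng = random.Random(seed)
    parents = [start]
    for _ in range(rounds):
        # Parents stay in the library, so the best score never decreases.
        library = list(parents)
        while len(library) < library_size:
            a, b = rng.choice(parents), rng.choice(parents)
            library.append(mutate(shuffle(a, b, rng), rng))
        # Selection: keep only the top-scoring variants for the next round.
        library.sort(key=fitness, reverse=True)
        parents = library[:keep]
    return parents[0]


if __name__ == "__main__":
    best = evolve("A" * len(TARGET))
    print(best, fitness(best))
```

Note that the loop never inspects structure or predicts the effect of a mutation; it only needs a way to score each variant. The `library_size` parameter also hints at the cost described above: every variant in every round has to be screened.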
Examples of engineered proteins
Using computational methods, a protein with a novel fold, known as Top7, has been designed, as well as sensors for unnatural molecules. The engineering of fusion proteins has yielded rilonacept, a pharmaceutical which has secured FDA approval for the treatment of cryopyrin-associated periodic syndrome.
Another computational method, Iterative Protein Redesign and Optimization (IPRO), redesigns proteins to increase or confer specificity for native or novel substrates and cofactors. It was used to successfully switch the cofactor specificity of Candida boidinii xylose reductase. IPRO works by repeatedly applying random perturbations to the protein backbone around specified design positions, identifying the lowest-energy combination of rotamers, and determining whether the new design has a lower binding energy than previous ones. The iterative nature of this process allows IPRO to make additive mutations to the protein sequence that collectively improve the specificity towards the desired substrates and/or cofactors. Details on how to download the software, which is implemented in Python, and on experimental testing of its predictions are outlined in the following paper.
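The perturb, repack, and compare loop described for IPRO can be sketched in a few lines. This is not the IPRO software and does not use its API; it is a minimal illustration in which the backbone is reduced to one number per design position, and the names `LIGAND`, `ROTAMER_OFFSETS`, and `binding_energy` are hypothetical with invented values.

```python
import random

# Hypothetical ideal contact geometry, one value per design position.
LIGAND = [1.0, -0.5, 2.0]

# Made-up side-chain offsets standing in for a rotamer library.
ROTAMER_OFFSETS = {"A": 0.0, "S": 0.4, "D": -0.6, "R": 1.2}


def binding_energy(backbone, residues):
    """Toy binding energy: squared mismatch to the ligand geometry."""
    return sum(
        (b + ROTAMER_OFFSETS[r] - t) ** 2
        for b, r, t in zip(backbone, residues, LIGAND)
    )


def best_residues(backbone):
    """Pick the lowest-energy residue at each design position."""
    return [
        min(ROTAMER_OFFSETS, key=lambda r: (b + ROTAMER_OFFSETS[r] - t) ** 2)
        for b, t in zip(backbone, LIGAND)
    ]


def redesign(backbone, residues, iterations=200, step=0.2, seed=0):
    """Iteratively perturb, repack, and keep designs that bind better."""
    rng = random.Random(seed)
    best_bb, best_res = list(backbone), list(residues)
    best_e = binding_energy(best_bb, best_res)
    for _ in range(iterations):
        # 1. Randomly perturb the backbone around the design positions.
        trial_bb = [b + rng.uniform(-step, step) for b in best_bb]
        # 2. Find the lowest-energy residue combination for that backbone.
        trial_res = best_residues(trial_bb)
        # 3. Accept only if the binding energy improves on the best so far.
        trial_e = binding_energy(trial_bb, trial_res)
        if trial_e < best_e:
            best_bb, best_res, best_e = trial_bb, trial_res, trial_e
    return best_bb, best_res, best_e


if __name__ == "__main__":
    backbone, residues, energy = redesign([0.0, 0.0, 0.0], ["A", "A", "A"])
    print("".join(residues), round(energy, 4))
```

Because each accepted step builds on the previous best design, mutations accumulate across iterations, which mirrors the additive behaviour described above. In this toy model the positions are independent, so repacking is a per-position minimum; a real rotamer search has to handle interactions between positions.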
See also
- Display: bacterial display, phage display, mRNA display, ribosome display, yeast display
- Enzyme engineering
- Enzymology
- Expanded genetic code
- Gene synthesis
- Meganucleases
- Nucleic acid analogues
- Protein folding
- Protein design
- Proteomics
- Proteome
- SCOPE (protein engineering)
- Structural biology
- Synthetic biology
External links
- Max Planck
- Centre for Protein Engineering
- Protein Engineering Design and Selection
- Structure with Folding & Design
- EGAD; a free and open-source program for automated protein design
- Servers for protein engineering and related topics based on the WHAT IF software
- Enzymes Built from Scratch - Researchers engineer never-before-seen catalysts using a new computational technique, Technology Review, March 10, 2008
- SeSaM-Biotech - Directed Evolution
- DNA2.0 Protein Engineering
- IPRO Software