Site-specific recombination
Encyclopedia
Site-specific recombination, also known as conservative site-specific recombination, is a type of genetic recombination
in which DNA
strand exchange takes place between segments possessing only a limited degree of sequence homology. Site-specific recombinases perform rearrangements of DNA segments by recognizing and binding to short DNA sequences (sites), at which they cleave the DNA backbone, exchange the two DNA helices involved and rejoin the DNA strands. While in some site-specific recombination systems having just a recombinase enzyme together with the recombination sites is enough to perform all these reactions, in some other systems a number of accessory proteins and accessory sites are also needed.
Site-specific recombination systems are highly specific, fast and efficient, even when faced with complex eukaryotic genomes. They are employed in a variety of cellular processes, including bacterial genome replication, differentiation and pathogenesis, and movement of mobile genetic elements
(Nash 1996). For the same reasons, they present a potential basis for the development of genetic engineering
tools.
Recombination sites are typically between 30 and 200 nucleotide
s in length and consist of two motifs with a partial inverted-repeat symmetry, to which the recombinase binds, and which flank a central crossover sequence at which the recombination takes place. The pairs of sites between which the recombination occurs are usually identical, but there are exceptions (e.g. attP and attB of λ integrase, see lambda phage
).
reactions. During strand exchange, the DNA cut at fixed points within crossover region of the site releases a deoxyribose
hydroxyl group, while recombinase protein forms a transient covalent bond
to a DNA backbone phosphate
. This phosphodiester bond
between hydroxyl group of the nucleophilic, serine
or tyrosine
residue conserves the energy that was expended in cleaving the DNA. Energy stored in this bond is subsequently used for the rejoining of the DNA to the corresponding deoxyribose hydroxyl group on the other site. The entire process therefore goes through without the need for external energy rich cofactors
such as ATP
.
The recombination sites are slightly asymmetric, which allows the enzyme to tell apart the left and right ends of the site. When generating products left ends are always joined to the right ends of their partner sites and vice versa. This causes the recombination sites to be reconstituted in the recombination products. Joining of left ends to left or right to right is avoided due to the asymmetric “overlap” sequence between the staggered points of top and bottom strand exchange. Left-left or right-right half-site recombinants would contain mismatched base pairs.
Although the basic chemical reaction is the same for both tyrosine and serine recombinases there are marked differences. Tyrosine recombinases, such as Cre
or FLP
, cleave one DNA strand at a time at points that are staggered by 6-8bp, linking 3’ end of DNA to the hydroxyl group of the tyrosine nucleophile
. Strand exchange then proceeds via a crossed strand intermediate analogous to the Holliday junction
in which only one pair of strands has been exchanged.
Conversely, serine recombinases like gamma-delta and Tn3 resolvase
cut all four DNA strands simultaneously at points that are staggered by 2bp. During cleavage protein-DNA bond is formed via transesterification
reaction in which a phosphodiester bond
is replaced by a phosphoserine bond between a 5’ phosphate at the cleavage site and hydroxyl group of the conserved serine
residue (S10) in resolvase. It is still not entirely clear how the strand exchange occurs after the DNA has been cleaved. However, it has been shown that the strands are exchanged while covalently linked to the protein with a resulting net rotation of 180°. Two current models can account for this, namely the subunit rotation model and the domain swapping model. In both of these models DNA duplexes are situated outside of the protein complex, and large movement of protein is needed to achieve the strand exchange. This is in stark contrast to the mechanism employed by the tyrosine recombinases.
The reaction catalysed by the recombinase may lead to excision of the DNA segment flanked by the two sites, but may also lead to integration or inversion of the orientation of the flanked DNA segment. What the outcome of the reaction will be, is dictated mainly by the relative location and the orientation of sites that are to be recombined, but also by the innate specificity of the site-specific system in question. Excisions and inversions occur if the recombination takes place between two sites that are found on the same molecule (intramolecular recombination), and if the sites are in the same (direct repeat) or in an opposite orientation (inverted repeat), respectively. Insertions on the other hand take place if the recombination occurs on sites that are situated on two different DNA molecules (intermolecular recombination), provided that at least one of these molecules is circular. Most site-specific systems are highly specialised catalysing only one of these different types of reaction and have evolved to ignore the sites that are in the ‘wrong’ orientation.
Typical examples of tyrosine recombinases are the well known enzymes such as Cre
(from the P1 phage
), FLP
(from yeast S. cerevisiae) and λ integrase (from lambda phage
) while famous serine recombinases include enzymes such as: gamma-delta resolvase (from the Tn1000 transposon), Tn3 resolvase
(from the Tn3 transposon) and φC31 integrase (from the φC31 phage).
Although the individual members of the two recombinase families can perform reactions with same practical outcomes, the two families are unrelated to each other, having different protein structures and reaction mechanisms. Unlike tyrosine recombinases, serine recombinases are highly modular as was first hinted by biochemical studies, and later shown by crystallographic structures. Knowledge of these protein structures could prove useful when attempting to reengineer recombinase proteins as tools for genetic manipulation.
Genetic recombination
Genetic recombination is a process by which a molecule of nucleic acid is broken and then joined to a different one. Recombination can occur between similar molecules of DNA, as in homologous recombination, or dissimilar molecules, as in non-homologous end joining. Recombination is a common method...
in which DNA
DNA
Deoxyribonucleic acid is a nucleic acid that contains the genetic instructions used in the development and functioning of all known living organisms . The DNA segments that carry this genetic information are called genes, but other DNA sequences have structural purposes, or are involved in...
strand exchange takes place between segments possessing only a limited degree of sequence homology. Site-specific recombinases perform rearrangements of DNA segments by recognizing and binding to short DNA sequences (sites), at which they cleave the DNA backbone, exchange the two DNA helices involved and rejoin the DNA strands. While in some site-specific recombination systems having just a recombinase enzyme together with the recombination sites is enough to perform all these reactions, in some other systems a number of accessory proteins and accessory sites are also needed.
Site-specific recombination systems are highly specific, fast and efficient, even when faced with complex eukaryotic genomes. They are employed in a variety of cellular processes, including bacterial genome replication, differentiation and pathogenesis, and movement of mobile genetic elements
Mobile genetic elements
Mobile genetic elements are a type of DNA that can move around within the genome. They include:*Transposons **Retrotransposons**DNA transposons**Insertion sequences*Plasmids...
(Nash 1996). For the same reasons, they present a potential basis for the development of genetic engineering
Genetic engineering
Genetic engineering, also called genetic modification, is the direct human manipulation of an organism's genome using modern DNA technology. It involves the introduction of foreign DNA or synthetic genes into the organism of interest...
tools.
Recombination sites are typically between 30 and 200 nucleotide
Nucleotide
Nucleotides are molecules that, when joined together, make up the structural units of RNA and DNA. In addition, nucleotides participate in cellular signaling , and are incorporated into important cofactors of enzymatic reactions...
s in length and consist of two motifs with a partial inverted-repeat symmetry, to which the recombinase binds, and which flank a central crossover sequence at which the recombination takes place. The pairs of sites between which the recombination occurs are usually identical, but there are exceptions (e.g. attP and attB of λ integrase, see lambda phage
Lambda phage
Enterobacteria phage λ is a temperate bacteriophage that infects Escherichia coli.Lambda phage is a virus particle consisting of a head, containing double-stranded linear DNA as its genetic material, and a tail that can have tail fibers. The phage particle recognizes and binds to its host, E...
).
Mechanism
Recombination between two DNA sites begins by the recognition and binding of these sites by the recombinase protein. This is followed by the synapsis i.e. bringing the sites together to form the synaptic complex. It is within this synaptic complex that the strand exchange takes place, as the DNA is cleaved and rejoined by controlled transesterificationTransesterification
In organic chemistry, transesterification is the process of exchanging the organic group R″ of an ester with the organic group R′ of an alcohol. These reactions are often catalyzed by the addition of an acid or base catalyst...
reactions. During strand exchange, the DNA cut at fixed points within crossover region of the site releases a deoxyribose
Deoxyribose
Deoxyribose, more, precisely 2-deoxyribose, is a monosaccharide with idealized formula H---3-H. Its name indicates that it is a deoxy sugar, meaning that it is derived from the sugar ribose by loss of an oxygen atom...
hydroxyl group, while recombinase protein forms a transient covalent bond
Covalent bond
A covalent bond is a form of chemical bonding that is characterized by the sharing of pairs of electrons between atoms. The stable balance of attractive and repulsive forces between atoms when they share electrons is known as covalent bonding....
to a DNA backbone phosphate
Phosphate
A phosphate, an inorganic chemical, is a salt of phosphoric acid. In organic chemistry, a phosphate, or organophosphate, is an ester of phosphoric acid. Organic phosphates are important in biochemistry and biogeochemistry or ecology. Inorganic phosphates are mined to obtain phosphorus for use in...
. This phosphodiester bond
Phosphodiester bond
A phosphodiester bond is a group of strong covalent bonds between a phosphate group and two 5-carbon ring carbohydrates over two ester bonds. Phosphodiester bonds are central to all known life, as they make up the backbone of each helical strand of DNA...
between hydroxyl group of the nucleophilic, serine
Serine
Serine is an amino acid with the formula HO2CCHCH2OH. It is one of the proteinogenic amino acids. By virtue of the hydroxyl group, serine is classified as a polar amino acid.-Occurrence and biosynthesis:...
or tyrosine
Tyrosine
Tyrosine or 4-hydroxyphenylalanine, is one of the 22 amino acids that are used by cells to synthesize proteins. Its codons are UAC and UAU. It is a non-essential amino acid with a polar side group...
residue conserves the energy that was expended in cleaving the DNA. Energy stored in this bond is subsequently used for the rejoining of the DNA to the corresponding deoxyribose hydroxyl group on the other site. The entire process therefore goes through without the need for external energy rich cofactors
Cofactor (biochemistry)
A cofactor is a non-protein chemical compound that is bound to a protein and is required for the protein's biological activity. These proteins are commonly enzymes, and cofactors can be considered "helper molecules" that assist in biochemical transformations....
such as ATP
Adenosine triphosphate
Adenosine-5'-triphosphate is a multifunctional nucleoside triphosphate used in cells as a coenzyme. It is often called the "molecular unit of currency" of intracellular energy transfer. ATP transports chemical energy within cells for metabolism...
.
The recombination sites are slightly asymmetric, which allows the enzyme to tell apart the left and right ends of the site. When generating products left ends are always joined to the right ends of their partner sites and vice versa. This causes the recombination sites to be reconstituted in the recombination products. Joining of left ends to left or right to right is avoided due to the asymmetric “overlap” sequence between the staggered points of top and bottom strand exchange. Left-left or right-right half-site recombinants would contain mismatched base pairs.
Although the basic chemical reaction is the same for both tyrosine and serine recombinases there are marked differences. Tyrosine recombinases, such as Cre
Cre recombinase
Cre recombinase, often abbreviated to Cre, is a Type I topoisomerase from P1 bacteriophage that catalyzes site-specific recombination of DNA between loxP sites. The enzyme does not require any energy cofactors and Cre-mediated recombination quickly reaches equilibrium between substrate and reaction...
or FLP
FLP-FRT Recombination
In genetics, FLP-FRT recombination is a site-directed recombination technology used to manipulate an organism's DNA under controlled conditions in vivo. It is analogous to Cre-Lox recombination...
, cleave one DNA strand at a time at points that are staggered by 6-8bp, linking 3’ end of DNA to the hydroxyl group of the tyrosine nucleophile
Nucleophile
A nucleophile is a species that donates an electron-pair to an electrophile to form a chemical bond in a reaction. All molecules or ions with a free pair of electrons can act as nucleophiles. Because nucleophiles donate electrons, they are by definition Lewis bases.Nucleophilic describes the...
. Strand exchange then proceeds via a crossed strand intermediate analogous to the Holliday junction
Holliday junction
A Holliday junction is a mobile junction between four strands of DNA. The structure is named after Robin Holliday, who proposed it in 1964 to account for a particular type of exchange of genetic information he observed in yeast known as homologous recombination...
in which only one pair of strands has been exchanged.
Conversely, serine recombinases like gamma-delta and Tn3 resolvase
Tn3 resolvase
The Tn3 transposon is a 4957 base pair mobile genetic element, found in prokaryotes.It encodes three proteins:* β-lactamase, an enzyme that confers resistance to β-lactam antibiotics .* Tn3 transposase...
cut all four DNA strands simultaneously at points that are staggered by 2bp. During cleavage protein-DNA bond is formed via transesterification
Transesterification
In organic chemistry, transesterification is the process of exchanging the organic group R″ of an ester with the organic group R′ of an alcohol. These reactions are often catalyzed by the addition of an acid or base catalyst...
reaction in which a phosphodiester bond
Phosphodiester bond
A phosphodiester bond is a group of strong covalent bonds between a phosphate group and two 5-carbon ring carbohydrates over two ester bonds. Phosphodiester bonds are central to all known life, as they make up the backbone of each helical strand of DNA...
is replaced by a phosphoserine bond between a 5’ phosphate at the cleavage site and hydroxyl group of the conserved serine
Serine
Serine is an amino acid with the formula HO2CCHCH2OH. It is one of the proteinogenic amino acids. By virtue of the hydroxyl group, serine is classified as a polar amino acid.-Occurrence and biosynthesis:...
residue (S10) in resolvase. It is still not entirely clear how the strand exchange occurs after the DNA has been cleaved. However, it has been shown that the strands are exchanged while covalently linked to the protein with a resulting net rotation of 180°. Two current models can account for this, namely the subunit rotation model and the domain swapping model. In both of these models DNA duplexes are situated outside of the protein complex, and large movement of protein is needed to achieve the strand exchange. This is in stark contrast to the mechanism employed by the tyrosine recombinases.
The reaction catalysed by the recombinase may lead to excision of the DNA segment flanked by the two sites, but may also lead to integration or inversion of the orientation of the flanked DNA segment. What the outcome of the reaction will be, is dictated mainly by the relative location and the orientation of sites that are to be recombined, but also by the innate specificity of the site-specific system in question. Excisions and inversions occur if the recombination takes place between two sites that are found on the same molecule (intramolecular recombination), and if the sites are in the same (direct repeat) or in an opposite orientation (inverted repeat), respectively. Insertions on the other hand take place if the recombination occurs on sites that are situated on two different DNA molecules (intermolecular recombination), provided that at least one of these molecules is circular. Most site-specific systems are highly specialised catalysing only one of these different types of reaction and have evolved to ignore the sites that are in the ‘wrong’ orientation.
Classification: tyrosine vs. serine recombinases
Based on amino acid sequence homology and mechanistic relatedness most site-specific recombinases are grouped into one of two families: the tyrosine recombinase family or the serine recombinase family. The names stem from the conserved nucleophilic amino acid residue that they use to attack the DNA and which becomes covalently linked to it during strand exchange. Serine recombinase family is also sometimes known as resolvase/invertase family, while tyrosine recombinases are known as the integrase family, which reflects the types of reaction that most known members in each family have evolved to catalyse.Typical examples of tyrosine recombinases are the well known enzymes such as Cre
Cre recombinase
Cre recombinase, often abbreviated to Cre, is a Type I topoisomerase from P1 bacteriophage that catalyzes site-specific recombination of DNA between loxP sites. The enzyme does not require any energy cofactors and Cre-mediated recombination quickly reaches equilibrium between substrate and reaction...
(from the P1 phage
P1 phage
P1 is a temperate bacteriophage . The P1 phage can be used to create the P1-derived artificial chromosome cloning vector.-Life cycle:Temperate phage, such as P1, have the ability to exist within the bacterial cell they infect in two different ways...
), FLP
FLP-FRT Recombination
In genetics, FLP-FRT recombination is a site-directed recombination technology used to manipulate an organism's DNA under controlled conditions in vivo. It is analogous to Cre-Lox recombination...
(from yeast S. cerevisiae) and λ integrase (from lambda phage
Lambda phage
Enterobacteria phage λ is a temperate bacteriophage that infects Escherichia coli.Lambda phage is a virus particle consisting of a head, containing double-stranded linear DNA as its genetic material, and a tail that can have tail fibers. The phage particle recognizes and binds to its host, E...
) while famous serine recombinases include enzymes such as: gamma-delta resolvase (from the Tn1000 transposon), Tn3 resolvase
Tn3 resolvase
The Tn3 transposon is a 4957 base pair mobile genetic element, found in prokaryotes.It encodes three proteins:* β-lactamase, an enzyme that confers resistance to β-lactam antibiotics .* Tn3 transposase...
(from the Tn3 transposon) and φC31 integrase (from the φC31 phage).
Although the individual members of the two recombinase families can perform reactions with same practical outcomes, the two families are unrelated to each other, having different protein structures and reaction mechanisms. Unlike tyrosine recombinases, serine recombinases are highly modular as was first hinted by biochemical studies, and later shown by crystallographic structures. Knowledge of these protein structures could prove useful when attempting to reengineer recombinase proteins as tools for genetic manipulation.