Complementary DNA
In genetics, complementary DNA (cDNA) is DNA synthesized from a messenger RNA (mRNA) template in a reaction catalyzed by the enzymes reverse transcriptase and DNA polymerase. cDNA is often used to clone eukaryotic genes in prokaryotes. When scientists want to express a specific protein in a cell that does not normally express that protein (i.e., heterologous expression), they transfer the cDNA that codes for the protein to the recipient cell. cDNA is also produced by retroviruses (such as HIV-1, HIV-2 and simian immunodeficiency virus) and is then integrated into the host genome to create a provirus.
Overview
The central dogma of molecular biology outlines that in synthesizing proteins, DNA is transcribed into mRNA, which is translated into protein. One difference between eukaryotic and prokaryotic genes is that eukaryotic genes can contain introns (intervening sequences), which are not coding sequences (in contrast with exons, which include the coding sequences) and must be removed from the primary RNA transcript before it becomes mRNA and can be translated into protein. Prokaryotic protein-coding genes generally have no introns, so their RNA is not subject to splicing.
Often it is desirable to express eukaryotic genes in prokaryotic cells. A simplified method of doing so would be to insert the eukaryotic DNA into a vector and introduce it into a prokaryotic host, which would then transcribe the DNA to mRNA and translate it to protein. However, because eukaryotic DNA has introns and prokaryotes lack the machinery to splice them, the introns must be removed before the eukaryotic sequence is introduced into the host. This is achieved by making a DNA copy of the spliced mRNA rather than of the gene itself. This DNA, which was made as a complementary copy of the RNA and has no introns, is called complementary DNA (cDNA). To obtain expression of the protein encoded by the eukaryotic cDNA, prokaryotic regulatory sequences (e.g. a promoter) would also be required.
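The intron removal described above can be illustrated with a minimal sketch. The toy transcript, the exon coordinates and the function name `splice` are made up for illustration. Real splicing is carried out by the spliceosome at sequence-defined splice sites, not from a coordinate list.

```python
def splice(pre_mrna, exons):
    """Join the exon intervals (0-based, end-exclusive) and drop the introns between them."""
    return "".join(pre_mrna[start:end] for start, end in exons)


# Hypothetical pre-mRNA: two exons (upper case) separated by one intron (lower case).
pre_mrna = "AUGGCC" + "guaaguuuucag" + "UUUUAG"
exons = [(0, 6), (18, 24)]

mature_mrna = splice(pre_mrna, exons)
print(mature_mrna)  # AUGGCCUUUUAG
```

A cDNA copied from `mature_mrna` carries only the joined exons, which is why a prokaryotic host can translate its transcript without any splicing machinery.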
Synthesis
Though there are several methods for doing so, cDNA is most often synthesized from mature (fully spliced) mRNA using the enzyme reverse transcriptase. This enzyme operates on a single strand of mRNA, generating its complementary DNA based on the pairing of RNA bases (A, U, G and C) to their DNA complements (T, A, C and G respectively).
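As a rough illustration of this base-pairing rule, the sketch below maps each RNA base to its DNA complement. The function name `first_strand_cdna` and the toy sequence are hypothetical. The result is reversed so that it reads 5' to 3', an added assumption based on the two strands of a hybrid being antiparallel.

```python
# RNA base -> complementary DNA base, as given in the text above.
RNA_TO_DNA = {"A": "T", "U": "A", "G": "C", "C": "G"}


def first_strand_cdna(mrna):
    """Return the DNA strand complementary to an mRNA, written 5'->3'."""
    # Walk the template from its 3' end so the product reads 5'->3'.
    return "".join(RNA_TO_DNA[base] for base in reversed(mrna))


print(first_strand_cdna("AUGGCCUUUUAG"))  # CTAAAAGGCCAT
```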
To obtain eukaryotic cDNA whose introns have been removed:
- A eukaryotic cell transcribes the DNA (from genes) into RNA (pre-mRNA).
- The same cell processes the pre-mRNA strands by removing introns and adding a poly-A tail and a 5' 7-methylguanosine cap.
- This mixture of mature mRNA strands is extracted from the cell. The poly-A tail of the processed mRNA can be exploited to purify it with oligo(dT) beads in affinity chromatography.
- A poly-T oligonucleotide primer is hybridized onto the poly-A tail of the mature mRNA template. Alternatively, random hexamer primers can be added; these contain every possible 6-base single strand of DNA and can therefore hybridize anywhere on the RNA. (Reverse transcriptase requires this double-stranded segment as a primer to start its operation.)
- Reverse transcriptase is added, along with deoxynucleotide triphosphates (dATP, dTTP, dGTP and dCTP). This synthesizes one complementary strand of DNA hybridized to the original mRNA strand.
- To synthesize an additional DNA strand, the RNA of the hybrid strand is digested using an enzyme such as RNase H.
- After digestion of the RNA, a single-stranded DNA (ssDNA) is left. Because the exposed bases of single-stranded nucleic acids are hydrophobic, it tends to fold back on itself. It is likely that the ssDNA forms a hairpin loop at the 3' end.
- A DNA polymerase can then use the 3' end of the hairpin loop as a primer to synthesize a strand complementary to the single-stranded cDNA.
- The result is a double-stranded cDNA, one strand of which has the same sequence as the mRNA of interest (with T in place of U).
The reverse transcriptase scans the mature mRNA and synthesizes a sequence of DNA that complements the mRNA template. This strand of DNA is complementary DNA.
Note that this process runs opposite to the usual DNA-to-RNA direction of transcription. It is sometimes described as breaking the central dogma of molecular biology, although the dogma as stated by Francis Crick rules out only the transfer of sequence information from protein back to nucleic acid.
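The priming and second-strand steps listed above can be sketched in a simplified form. All names here (`oligo_dt_site`, `double_stranded_cdna`, the `min_tail` threshold) and the toy mRNA are assumptions made for illustration. The hairpin priming of the second strand is not modelled; the second strand is simply computed as the complement of the first.

```python
import re

RNA_TO_DNA = {"A": "T", "U": "A", "G": "C", "C": "G"}
DNA_TO_DNA = {"A": "T", "T": "A", "G": "C", "C": "G"}


def oligo_dt_site(mrna, min_tail=10):
    """Return the index where the 3' poly-A tail starts, or None if it is too short to prime."""
    match = re.search(r"A+$", mrna)
    if match is None or len(match.group()) < min_tail:
        return None
    return match.start()


def reverse_complement(seq, table):
    """Complement each base and reverse, so the result reads 5'->3'."""
    return "".join(table[base] for base in reversed(seq))


def double_stranded_cdna(mrna):
    """Return (first strand, second strand) of the cDNA for a mature mRNA."""
    first = reverse_complement(mrna, RNA_TO_DNA)     # reverse transcriptase step
    second = reverse_complement(first, DNA_TO_DNA)   # DNA polymerase step
    return first, second


# Hypothetical mature mRNA: short coding region followed by a 12-base poly-A tail.
mrna = "AUGGCCUUUUAG" + "A" * 12

print(oligo_dt_site(mrna))           # 12
first, second = double_stranded_cdna(mrna)
print(first)                         # TTTTTTTTTTTTCTAAAAGGCCAT
print(second)                        # ATGGCCTTTTAGAAAAAAAAAAAA
```

The second strand matches the mRNA with T in place of U, and the first strand begins with the poly-T stretch laid down opposite the poly-A tail.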
Applications
Complementary DNA is often used in gene cloning, as gene probes, or in the creation of a cDNA library. When scientists transfer a gene from one cell into another in order to express the new genetic material as a protein in the recipient cell, the cDNA is added to the recipient (rather than the entire gene), because the DNA for an entire gene may include DNA that does not code for the protein or that interrupts the coding sequence of the protein (e.g., introns). Partial sequences of cDNAs are often obtained as expressed sequence tags.
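Since an expressed sequence tag is a short partial sequence of a cDNA, a tag can in principle be placed on a known cDNA by searching both strands. The sketch below assumes exact matching and uses hypothetical names (`locate_tag`) and toy sequences. Real reads contain sequencing errors and would normally be compared with an alignment tool.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the opposite DNA strand, written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def locate_tag(cdna, tag):
    """Return (strand, offset) of an exact match of tag in cdna, or None."""
    pos = cdna.find(tag)
    if pos != -1:
        return "+", pos
    # The tag may have been read from the opposite strand.
    pos = cdna.find(reverse_complement(tag))
    if pos != -1:
        return "-", pos
    return None


cdna = "ATGGCCTTTTAGAAAAAAAAAAAA"
print(locate_tag(cdna, "GCCTTT"))    # ('+', 3)
print(locate_tag(cdna, "CTAAAAGG"))  # ('-', 4)
```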