Transcription (genetics)
Transcription is the process of creating a complementary RNA copy of a sequence of DNA. Both RNA and DNA are nucleic acids, which use base pairs of nucleotides as a complementary language that can be converted back and forth from DNA to RNA by the action of the correct enzymes. During transcription, a DNA sequence is read by RNA polymerase, which produces a complementary, antiparallel RNA strand. As opposed to DNA replication, transcription results in an RNA complement that includes uracil (U) in all instances where thymine (T) would have occurred in a DNA complement.
Transcription factories
Active transcription units are clustered in the nucleus, in discrete sites called transcription factories or euchromatin. Such sites can be visualized by allowing engaged polymerases to extend their transcripts in tagged precursors (Br-UTP or Br-U) and immuno-labeling the tagged nascent RNA. Transcription factories can also be localized using fluorescence in situ hybridization or marked by antibodies directed against polymerases. There are ~10,000 factories in the nucleoplasm of a HeLa cell, among which are ~8,000 polymerase II factories and ~2,000 polymerase III factories. Each polymerase II factory contains ~8 polymerases. As most active transcription units are associated with only one polymerase, each factory usually contains ~8 different transcription units. These units might be associated through promoters and/or enhancers, with loops forming a ‘cloud’ around the factory.
History
A molecule that allows the genetic material to be realized as a protein was first hypothesized by François Jacob and Jacques Monod. RNA synthesis by RNA polymerase was established in vitro by several laboratories by 1965; however, the RNA synthesized by these enzymes had properties that suggested the existence of an additional factor needed to terminate transcription correctly.
In 1972, Walter Fiers became the first person to prove the existence of the terminating enzyme.
Roger D. Kornberg won the 2006 Nobel Prize in Chemistry "for his studies of the molecular basis of eukaryotic transcription".
Reverse transcription
Some viruses (such as HIV, the cause of AIDS) have the ability to transcribe RNA into DNA. HIV has an RNA genome that is duplicated into DNA. The resulting DNA can be merged with the DNA genome of the host cell. The main enzyme responsible for synthesis of DNA from an RNA template is called reverse transcriptase. In the case of HIV, reverse transcriptase is responsible for synthesizing a complementary DNA strand (cDNA) to the viral RNA genome. An associated enzyme, ribonuclease H, digests the RNA strand, and reverse transcriptase synthesizes a complementary strand of DNA to form a double-helix DNA structure. This cDNA is integrated into the host cell's genome via another enzyme (integrase), causing the host cell to generate viral proteins that reassemble into new viral particles. In HIV, subsequent to this, the host cell (a T cell) undergoes programmed cell death (apoptosis). However, in other retroviruses, the host cell remains intact as the virus buds out of the cell.
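The two synthesis steps in the passage on HIV above (a cDNA strand copied from the viral RNA, then a second DNA strand copied from that cDNA) can be sketched in Python. This is only an illustration of the base-pairing logic, not a model of the enzyme: the function names and the example sequence are hypothetical, and all sequences are written 5' → 3'.

```python
# Pairing rules used when DNA is copied from an RNA or a DNA template.
RNA_TO_DNA = {"A": "T", "U": "A", "G": "C", "C": "G"}
DNA_TO_DNA = {"A": "T", "T": "A", "G": "C", "C": "G"}


def first_strand_cdna(rna_5to3):
    """Return the cDNA complementary to an RNA, written 5'->3'.

    The new strand is antiparallel to its template, so the template
    is walked from its 3' end (hence reversed()).
    """
    return "".join(RNA_TO_DNA[base] for base in reversed(rna_5to3))


def second_strand_dna(cdna_5to3):
    """Return the DNA strand complementary to the first cDNA strand."""
    return "".join(DNA_TO_DNA[base] for base in reversed(cdna_5to3))


viral_rna = "AUGGCCUAA"                    # hypothetical RNA fragment
cdna = first_strand_cdna(viral_rna)        # "TTAGGCCAT"
second = second_strand_dna(cdna)           # "ATGGCCTAA"
print(cdna, second)
```

The second strand carries the same sequence as the original RNA, with thymine in place of uracil, which is why the double-stranded product can stand in for the RNA genome once integrated.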
Some eukaryotic cells contain an enzyme with reverse transcription activity called telomerase. Telomerase is a reverse transcriptase that lengthens the ends of linear chromosomes. Telomerase carries an RNA template from which it synthesizes a repeating DNA sequence, or "junk" DNA. This repeated sequence of DNA is important because, every time a linear chromosome is duplicated, it is shortened in length. With "junk" DNA at the ends of chromosomes, the shortening eliminates some of the non-essential, repeated sequence rather than the protein-encoding DNA sequence farther away from the chromosome end. Telomerase is often activated in cancer cells, enabling them to duplicate their genomes indefinitely without losing important protein-coding DNA sequence. Activation of telomerase could be part of the process that allows cancer cells to become immortal. However, the true in vivo significance of telomerase has still not been empirically proven.
Transcription can be summarized in four or five steps, each moving like a wave along the DNA:
- RNA polymerase unwinds ("unzips") the DNA by breaking the hydrogen bonds between complementary nucleotides.
- Matching RNA nucleotides are paired with the complementary DNA bases.
- The RNA sugar-phosphate backbone forms with assistance from RNA polymerase.
- Hydrogen bonds of the untwisted RNA-DNA helix break, freeing the newly synthesized RNA strand.
- If the cell has a nucleus, the RNA is further processed (addition of a 3' poly-A tail and a 5' cap) and exits to the cytoplasm through the nuclear pore complex.
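The base-pairing step in the list above can be illustrated with a short Python sketch. It only models the pairing rule (A with U, T with A, G with C, C with G); unwinding, backbone formation and RNA processing are not represented. The function name and the example template are hypothetical.

```python
# RNA base that pairs with each base of the DNA template strand.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3to5):
    """Build an RNA (written 5'->3') from a template written 3'->5'.

    Bases are added one at a time, in the order in which the
    polymerase would meet them along the template.
    """
    rna = []
    for base in template_3to5.upper():
        rna.append(DNA_TO_RNA[base])
    return "".join(rna)


print(transcribe("TACGGATTC"))  # AUGCCUAAG
```

Note that uracil, not thymine, appears wherever the template carries adenine.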
Transcription is the first step leading to gene expression. The stretch of DNA transcribed into an RNA molecule is called a transcription unit and encodes at least one gene. If the gene transcribed encodes a protein, the result of transcription is messenger RNA (mRNA), which will then be used to create that protein via the process of translation. Alternatively, the transcribed gene may encode either ribosomal RNA (rRNA) or transfer RNA (tRNA), other components of the protein-assembly process, or other ribozymes.
A DNA transcription unit encoding a protein contains not only the sequence that will eventually be directly translated into the protein (the coding sequence) but also regulatory sequences that direct and regulate the synthesis of that protein. The regulatory sequence before (upstream from) the coding sequence is called the five prime untranslated region (5'UTR), and the sequence following (downstream from) the coding sequence is called the three prime untranslated region (3'UTR).
Transcription has some proofreading mechanisms, but they are fewer and less effective than the controls for copying DNA; therefore, transcription has a lower copying fidelity than DNA replication.
As in DNA replication, DNA is read from 3' → 5' during transcription. Meanwhile, the complementary RNA is created in the 5' → 3' direction, which means that its 5' end is created first. Although DNA is arranged as two antiparallel strands in a double helix, only one of the two DNA strands, called the template strand, is used for transcription. This is because RNA is only single-stranded, as opposed to double-stranded DNA. The other DNA strand is called the coding (non-template) strand, because its sequence is the same as the newly created RNA transcript (except for the substitution of uracil for thymine). The use of only the 3' → 5' strand eliminates the need for the Okazaki fragments seen in DNA replication.
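The relationship between the two DNA strands and the transcript can be checked with a small Python sketch. It assumes the common convention of writing every sequence 5' → 3', so the template strand is the reverse complement of the coding strand; the helper names and the example sequence are hypothetical.

```python
# Translation table for complementing DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")
# RNA base paired with each template base.
RNA_PAIR = {"A": "U", "T": "A", "G": "C", "C": "G"}


def reverse_complement(dna_5to3):
    """Return the antiparallel partner strand, also written 5'->3'."""
    return dna_5to3.translate(COMPLEMENT)[::-1]


def transcript_of(template_5to3):
    """Read the template 3'->5' (i.e. backwards) and emit RNA 5'->3'."""
    return "".join(RNA_PAIR[base] for base in reversed(template_5to3))


coding = "ATGGCCTAA"                       # hypothetical coding strand
template = reverse_complement(coding)      # "TTAGGCCAT"
rna = transcript_of(template)              # "AUGGCCUAA"

# The transcript matches the coding strand apart from U replacing T.
print(rna == coding.replace("T", "U"))     # True
```

Reading the template backwards in `transcript_of` is what makes the 3' → 5' reading direction and the 5' → 3' synthesis direction consistent with each other.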
Transcription is divided into 5 stages: pre-initiation, initiation, promoter clearance, elongation and termination.
Pre-initiation
In eukaryotes, RNA polymerase, and therefore the initiation of transcription, requires the presence of a core promoter sequence in the DNA. Promoters are regions of DNA that promote transcription and, in eukaryotes, are found at positions -30, -75, and -90 relative to the transcription start site (abbreviated to TSS), that is, 30, 75, and 90 base pairs upstream of it. Core promoters are sequences within the promoter that are essential for transcription initiation. RNA polymerase is able to bind to core promoters in the presence of various specific transcription factors.
The best-characterized type of core promoter in eukaryotes is a short DNA sequence known as a TATA box, found 25-30 base pairs upstream from the TSS. The TATA box, as a core promoter, is the binding site for a transcription factor known as TATA-binding protein (TBP), which is itself a subunit of another transcription factor, called Transcription Factor II D (TFIID). After TFIID binds to the TATA box via the TBP, five more transcription factors and RNA polymerase combine around the TATA box in a series of stages to form a preinitiation complex. One of these transcription factors has DNA helicase activity and so is involved in separating the opposing strands of double-stranded DNA to provide access to a single-stranded DNA template. However, only a low, or basal, rate of transcription is driven by the preinitiation complex alone. Other proteins known as activators and repressors, along with any associated coactivators or corepressors, are responsible for modulating transcription rate.
Thus, the preinitiation complex involves:
1. Core Promoter Sequence
2. Transcription Factors
3. DNA Helicase
4. RNA Polymerase
5. Activators and Repressors
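The positional rule given above for the TATA box (25-30 base pairs upstream from the TSS) can be turned into a simple search. The Python sketch below is a deliberately crude illustration: the literal motif "TATAAA" is an assumed stand-in for the real, more variable consensus, and the function name, parameters and example sequence are all hypothetical.

```python
def find_tata_boxes(dna, tss, motif="TATAAA", nearest=25, farthest=30):
    """Return the upstream distances (in bp) at which `motif` starts.

    `tss` is the 0-based index of the transcription start site in `dna`.
    Only start positions `nearest`..`farthest` bp upstream are checked.
    """
    hits = []
    for distance in range(nearest, farthest + 1):
        start = tss - distance
        if start >= 0 and dna[start:start + len(motif)] == motif:
            hits.append(distance)
    return hits


# Hypothetical sequence: 38 bp of upstream DNA, then the transcribed region.
upstream = "G" * 10 + "TATAAA" + "C" * 22
dna = upstream + "ATGGCC"
tss = len(upstream)

print(find_tata_boxes(dna, tss))  # [28]
```

A realistic promoter scan would score matches against a position weight matrix rather than require an exact string, since only a minority of promoters carry a clear TATA box.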
The transcription preinitiation in archaea is, in essence, homologous to that of eukaryotes, but is much less complex. The archaeal preinitiation complex assembles at a TATA-box binding site; however, in archaea, this complex is composed of only an RNA polymerase (resembling eukaryotic RNA polymerase II), TBP, and TFB (the archaeal homologue of eukaryotic transcription factor II B (TFIIB)).
Initiation
In bacteria, transcription begins with the binding of RNA polymerase to the promoter in DNA. RNA polymerase is a core enzyme consisting of five subunits: 2 α subunits, 1 β subunit, 1 β' subunit, and 1 ω subunit. At the start of initiation, the core enzyme is associated with a sigma factor that aids in finding the appropriate promoter sequences, located about 35 and 10 base pairs upstream of the transcription start site (the -35 and -10 elements). When the sigma factor and the core enzyme combine, they form a holoenzyme.
Transcription initiation is more complex in eukaryotes. Eukaryotic RNA polymerase does not directly recognize the core promoter sequences. Instead, a collection of proteins called transcription factors mediates the binding of RNA polymerase and the initiation of transcription. Only after certain transcription factors are attached to the promoter does the RNA polymerase bind to it. The completed assembly of transcription factors and RNA polymerase on the promoter forms a transcription initiation complex. Transcription in archaea is similar to transcription in eukaryotes.
Promoter clearance
After the first bond is synthesized, the RNA polymerase must clear the promoter. During this time there is a tendency to release the RNA transcript and produce truncated transcripts. This is called abortive initiation and is common for both eukaryotes and prokaryotes. Abortive initiation continues to occur until the σ factor rearranges, resulting in the transcription elongation complex (which gives a 35 bp moving footprint). The σ factor is released before 80 nucleotides of mRNA are synthesized. Once the transcript reaches approximately 23 nucleotides, it no longer slips and elongation can occur. This, like most of the remainder of transcription, is an energy-dependent process, consuming adenosine triphosphate (ATP).
In eukaryotes, promoter clearance coincides with phosphorylation of serine 5 on the carboxy-terminal domain of RNA polymerase II by TFIIH.
Elongation
One strand of the DNA, the template strand (or noncoding strand), is used as a template for RNA synthesis. As transcription proceeds, RNA polymerase traverses the template strand and uses base pairing complementarity with the DNA template to create an RNA copy. Although RNA polymerase traverses the template strand from 3' → 5', the coding (non-template) strand and newly formed RNA can also be used as reference points, so transcription can be described as occurring 5' → 3'. This produces an RNA molecule from 5' → 3', an exact copy of the coding strand, except that thymines are replaced with uracils and the nucleotides contain a ribose (5-carbon) sugar where DNA has deoxyribose (one fewer oxygen atom) in its sugar-phosphate backbone.
Unlike DNA replication, mRNA transcription can involve multiple RNA polymerases on a single DNA template and multiple rounds of transcription (amplification of particular mRNA), so many mRNA molecules can be rapidly produced from a single copy of a gene.
Elongation also involves a proofreading mechanism that can replace incorrectly incorporated bases. In eukaryotes, this may correspond with short pauses during transcription that allow appropriate RNA editing factors to bind. These pauses may be intrinsic to the RNA polymerase or due to chromatin structure.
Termination
Bacteria use two different strategies for transcription termination. In Rho-independent transcription termination, RNA transcription stops when the newly synthesized RNA molecule forms a G-C-rich hairpin loop followed by a run of Us. When the hairpin forms, the mechanical stress breaks the weak rU-dA bonds, now filling the DNA-RNA hybrid. This pulls the poly-U transcript out of the active site of the RNA polymerase, in effect terminating transcription. In the Rho-dependent type of termination, a protein factor called Rho destabilizes the interaction between the template and the mRNA, thus releasing the newly synthesized mRNA from the elongation complex.
Transcription termination in eukaryotes is less well understood but involves cleavage of the new transcript followed by template-independent addition of As at its new 3' end, in a process called polyadenylation.
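The bacterial Rho-independent signal described above (a G-C-rich hairpin followed by a run of Us) can be searched for with a simple pattern scan. The Python sketch below is only an illustration under stated assumptions: the fixed stem length, loop length, U-run length and G-C threshold are arbitrary choices rather than measured values, perfect base pairing is required, and the function names and example RNA are hypothetical.

```python
RNA_PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement_rna(rna):
    """Return the sequence that would pair with `rna` in a hairpin stem."""
    return "".join(RNA_PAIR[base] for base in reversed(rna))


def gc_fraction(seq):
    """Fraction of bases in `seq` that are G or C."""
    return sum(base in "GC" for base in seq) / len(seq)


def find_terminators(rna, stem=6, loop=4, u_run=4, min_gc=0.6):
    """Return start indices of stem / loop / stem / U-run patterns."""
    hits = []
    total = 2 * stem + loop + u_run
    for i in range(len(rna) - total + 1):
        left = rna[i:i + stem]
        right = rna[i + stem + loop:i + 2 * stem + loop]
        tail = rna[i + 2 * stem + loop:i + total]
        if (right == reverse_complement_rna(left)   # the two arms can pair
                and gc_fraction(left) >= min_gc     # stem is G-C rich
                and tail == "U" * u_run):           # followed by a run of Us
            hits.append(i)
    return hits


# Hypothetical transcript: stem GCCGCG, loop GAAA, stem CGCGGC, then Us.
rna = "AAUA" + "GCCGCG" + "GAAA" + "CGCGGC" + "UUUUUU" + "AC"
print(find_terminators(rna))  # [4]
```

Real terminators vary in stem and loop length and tolerate mismatches, so a practical search would scan a range of sizes and score folding stability instead of demanding an exact match.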
Measuring and detecting transcription
Transcription can be measured and detected in a variety of ways:
- Nuclear run-on assay: measures the relative abundance of newly formed transcripts
- RNase protection assay and ChIP-Chip of RNAP: detect active transcription sites
- RT-PCR: measures the absolute abundance of total or nuclear RNA levels, which may however differ from transcription rates
- DNA microarrays: measure the relative abundance of the global total or nuclear RNA levels; however, these may differ from transcription rates
- In situ hybridization: detects the presence of a transcript
- MS2 tagging: by incorporating RNA stem loops, such as MS2, into a gene, these become incorporated into newly synthesized RNA. The stem loops can then be detected using a fusion of GFP and the MS2 coat protein, which has a high-affinity, sequence-specific interaction with the MS2 stem loops. The recruitment of GFP to the site of transcription is visualised as a single fluorescent spot. This approach has revealed that transcription occurs in discontinuous bursts, or pulses (see Transcriptional bursting). With the notable exception of in situ techniques, most other methods provide cell population averages and are not capable of detecting this fundamental property of genes.
- Northern blot: the traditional method and, until the advent of RNA-Seq, the most quantitative
- RNA-Seq: applies next-generation sequencing techniques to sequence whole transcriptomes, which allows the measurement of relative abundance of RNA, as well as the detection of additional variations such as fusion genes, post-transcriptional edits and novel splice sites
Transcription factories
Active transcription units are clustered in the nucleus, in discrete sites called transcription factories or euchromatin. Such sites can be visualized by allowing engaged polymerases to extend their transcripts with tagged precursors (Br-UTP or Br-U) and immuno-labeling the tagged nascent RNA. Transcription factories can also be localized using fluorescence in situ hybridization or marked by antibodies directed against polymerases. There are ~10,000 factories in the nucleoplasm of a HeLa cell, among which are ~8,000 polymerase II factories and ~2,000 polymerase III factories. Each polymerase II factory contains ~8 polymerases. As most active transcription units are associated with only one polymerase, each factory usually contains ~8 different transcription units. These units might be associated through promoters and/or enhancers, with loops forming a 'cloud' around the factory.
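The approximate figures quoted above imply a rough count of engaged polymerase II molecules per nucleus. The short Python calculation below only multiplies out those figures; the variable names are illustrative, and the result is an order-of-magnitude estimate that assumes every polymerase II factory holds the quoted ~8 polymerases.

```python
# Approximate figures quoted in the text for the nucleoplasm of a HeLa cell.
POL_II_FACTORIES = 8_000
POL_III_FACTORIES = 2_000
POLYMERASES_PER_POL_II_FACTORY = 8

# Total factories: polymerase II plus polymerase III sites.
total_factories = POL_II_FACTORIES + POL_III_FACTORIES

# Engaged polymerase II molecules, assuming ~8 per factory throughout.
active_pol_ii = POL_II_FACTORIES * POLYMERASES_PER_POL_II_FACTORY

# With roughly one polymerase per active transcription unit, this is also
# the approximate number of active polymerase II transcription units.
active_pol_ii_units = active_pol_ii

print(f"factories: ~{total_factories:,}")
print(f"engaged polymerase II molecules: ~{active_pol_ii:,}")
```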
History
A molecule that allows the genetic material to be realized as a protein was first hypothesized by François Jacob and Jacques Monod. RNA synthesis by RNA polymerase was established in vitro by several laboratories by 1965; however, the RNA synthesized by these enzymes had properties that suggested the existence of an additional factor needed to terminate transcription correctly.
In 1972, Walter Fiers became the first person to actually prove the existence of the terminating enzyme.
Roger D. Kornberg won the 2006 Nobel Prize in Chemistry "for his studies of the molecular basis of eukaryotic transcription".
Reverse transcription
Some viruses (such as HIV, the cause of AIDS) have the ability to transcribe RNA into DNA. HIV has an RNA genome that is duplicated into DNA. The resulting DNA can be merged with the DNA genome of the host cell. The main enzyme responsible for synthesis of DNA from an RNA template is called reverse transcriptase. In the case of HIV, reverse transcriptase is responsible for synthesizing a complementary DNA strand (cDNA) to the viral RNA genome. An associated enzyme, ribonuclease H, digests the RNA strand, and reverse transcriptase synthesises a complementary strand of DNA to form a double helix DNA structure. This cDNA is integrated into the host cell's genome via another enzyme (integrase), causing the host cell to generate viral proteins that reassemble into new viral particles. In HIV, the host cell, a T cell, subsequently undergoes programmed cell death (apoptosis). However, in other retroviruses, the host cell remains intact as the virus buds out of the cell.
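The two synthesis steps described above (a first cDNA strand copied from the RNA template, then a second DNA strand copied from the cDNA) can be sketched at the level of base pairing. The Python below is a minimal illustration only: the function names and the example sequence are hypothetical, all strands are written 5' to 3', and priming, ribonuclease H digestion and integration are not modelled.

```python
# Base-pairing rules: RNA template -> DNA copy, and DNA template -> DNA copy.
RNA_TO_DNA = {"A": "T", "U": "A", "G": "C", "C": "G"}
DNA_TO_DNA = {"A": "T", "T": "A", "G": "C", "C": "G"}


def first_strand_cdna(rna):
    """cDNA complementary and antiparallel to the RNA, written 5'->3'."""
    return "".join(RNA_TO_DNA[base] for base in reversed(rna))


def second_strand(cdna):
    """DNA strand complementary and antiparallel to the cDNA, written 5'->3'."""
    return "".join(DNA_TO_DNA[base] for base in reversed(cdna))


rna = "AUGGCUUAA"                 # hypothetical stretch of a viral RNA genome
cdna = first_strand_cdna(rna)     # first strand, made from the RNA template
sense = second_strand(cdna)       # second strand, made after the RNA is digested

print(cdna)
print(sense)
```

The second strand has the same sequence as the original RNA with thymine in place of uracil, which is why the resulting double-stranded DNA carries the viral genetic information.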
Some eukaryotic cells contain an enzyme with reverse transcription activity called telomerase. Telomerase is a reverse transcriptase that lengthens the ends of linear chromosomes. Telomerase carries an RNA template from which it synthesizes a repeating DNA sequence, or "junk" DNA. This repeated sequence of DNA is important because every time a linear chromosome is duplicated, it is shortened in length. With "junk" DNA at the ends of chromosomes, the shortening eliminates some of the non-essential, repeated sequence rather than the protein-encoding DNA sequence farther away from the chromosome end. Telomerase is often activated in cancer cells, enabling them to duplicate their genomes indefinitely without losing important protein-coding DNA sequence. Activation of telomerase could be part of the process that allows cancer cells to become immortal. However, the true in vivo significance of telomerase has still not been empirically proven.
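The buffering role of the repeated sequence can be illustrated with a toy model. The Python below is a deliberately simplified sketch: the function name, the coding sequence, the number of bases lost per duplication and the number of repeats added by telomerase are hypothetical values chosen so that the arithmetic is easy to follow. The repeat TTAGGG is the human telomeric repeat.

```python
TELOMERE_REPEAT = "TTAGGG"  # human telomeric repeat unit


def simulate(coding, n_repeats, divisions, loss=12, telomerase_repeats=0):
    """Return a chromosome end after a number of duplications.

    coding             -- protein-coding sequence next to the telomere (hypothetical)
    n_repeats          -- telomeric repeats initially present
    divisions          -- number of rounds of duplication
    loss               -- bases lost from the end per round (hypothetical value)
    telomerase_repeats -- repeats added back per round (0 = telomerase inactive)
    """
    end = coding + TELOMERE_REPEAT * n_repeats
    for _ in range(divisions):
        end = end[:max(0, len(end) - loss)]          # shortening on duplication
        end += TELOMERE_REPEAT * telomerase_repeats  # extension by telomerase
    return end


coding = "ATGGCTTAA"

# Without telomerase, the repeats absorb the first losses, then coding DNA erodes.
print(simulate(coding, n_repeats=4, divisions=2).startswith(coding))
print(simulate(coding, n_repeats=4, divisions=3).startswith(coding))

# With telomerase adding back as much as is lost, the coding sequence is preserved.
print(simulate(coding, n_repeats=4, divisions=50, telomerase_repeats=2).startswith(coding))
```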
See also
- RNA polymerase
- Translation - process of decoding RNA to form polypeptides
- Splicing - process of removing introns from precursor messenger RNA (pre-mRNA) to make messenger RNA (mRNA)
- Reverse transcription - process viruses use to make DNA from RNA
- Crick's central dogma - DNA is transcribed to RNA, which is translated to polypeptides; information never flows from protein back to nucleic acid.
- Gene regulation
External links
- Interactive Java simulation of transcription initiation. From Center for Models of Life at the Niels Bohr Institute.
- Interactive Java simulation of transcription interference--a game of promoter dominance in bacterial virus. From Center for Models of Life at the Niels Bohr Institute.
- Biology animations about this topic under Chapter 15 and Chapter 18
- Virtual Cell Animation Collection, Introducing Transcription