Gene expression
Overview
 
Gene expression is the process by which information from a gene is used in the synthesis of a functional gene product. These products are often proteins, but in non-protein-coding genes such as ribosomal RNA (rRNA), transfer RNA (tRNA) or small nuclear RNA (snRNA) genes, the product is a functional RNA. The process of gene expression is used by all known life (eukaryotes, including multicellular organisms, and prokaryotes, that is bacteria and archaea), and is possibly induced by viruses, to generate the macromolecular machinery for life.
Several steps in the gene expression process may be modulated, including transcription, RNA splicing, translation, and post-translational modification of a protein.
Gene regulation gives the cell control over structure and function, and is the basis for cellular differentiation, morphogenesis and the versatility and adaptability of any organism. Gene regulation may also serve as a substrate for evolutionary change, since control of the timing, location, and amount of gene expression can have a profound effect on the functions (actions) of the gene in a cell or in a multicellular organism.

In genetics, gene expression is the most fundamental level at which the genotype gives rise to the phenotype. The genetic code stored in DNA is "interpreted" by gene expression, and the properties of the expression give rise to the organism's phenotype.

Transcription


The gene itself is typically a long stretch of DNA which carries genetic information encoded by the genetic code. Every molecule of DNA consists of two strands, each having 5' and 3' ends, oriented in anti-parallel directions. The coding strand contains the genetic information, while the template strand (non-coding strand) serves as a blueprint for the production of RNA. The production of RNA copies of the DNA is called transcription, and is performed by RNA polymerase, which adds one RNA nucleotide at a time to a growing RNA strand. This RNA is complementary to the template 3' → 5' DNA strand, which is itself complementary to the coding 5' → 3' DNA strand. Therefore, the resulting 5' → 3' RNA strand is identical to the coding DNA strand with the exception that thymines (T) are replaced with uracils (U) in the RNA. A coding DNA strand reading "ATG" yields "AUG" in the RNA transcript.
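
The strand relationships described above can be illustrated with a minimal Python sketch. The function names and the example sequence are hypothetical, chosen only for illustration; the template strand is written 3' → 5' so that each position lines up with the coding strand.

```python
# Base pairing used when the template strand is copied into RNA,
# and ordinary DNA:DNA pairing between the two strands of the gene.
DNA_TO_RNA_PAIR = {"A": "U", "T": "A", "G": "C", "C": "G"}
DNA_PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def template_from_coding(coding_5to3: str) -> str:
    """Return the template strand, written 3'->5' and aligned to the coding strand."""
    return "".join(DNA_PAIR[base] for base in coding_5to3)


def transcribe(template_3to5: str) -> str:
    """Build the RNA 5'->3' by pairing one nucleotide at a time with the template."""
    return "".join(DNA_TO_RNA_PAIR[base] for base in template_3to5)


coding = "ATGGCC"                        # hypothetical coding strand, 5'->3'
template = template_from_coding(coding)  # complementary strand, read 3'->5'
rna = transcribe(template)

# The transcript equals the coding strand with T replaced by U.
assert rna == coding.replace("T", "U")
```

The sketch ignores promoters, terminators and strand unwinding; it only shows why the transcript matches the coding strand rather than the template.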

Transcription in prokaryotes is carried out by a single type of RNA polymerase, which needs a DNA sequence called the Pribnow box and a sigma factor (σ factor) to start transcription. In eukaryotes, transcription is performed by three types of RNA polymerase, each of which needs a special DNA sequence called a promoter and a set of DNA-binding proteins, the transcription factors, to initiate the process. RNA polymerase I is responsible for transcription of rRNA genes, while RNA polymerase II transcribes all protein-coding genes as well as some non-coding RNAs (e.g. snRNAs, snoRNAs or long non-coding RNAs). RNA polymerase II contains a special part called the C-terminal domain (CTD) that is rich in serine residues, which after being phosphorylated recruit factors necessary for RNA modification and maturation. RNA polymerase III transcribes 5S rRNA and tRNA genes as well as some small non-coding RNA genes (e.g. 7SK). Transcription ends at a special sequence called the terminator.
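
As a rough illustration of how a promoter element such as the Pribnow box might be located in a sequence, the sketch below scans DNA for windows close to the consensus TATAAT. The consensus is background knowledge rather than something stated in the text above, and the function name, the mismatch threshold and the test sequence are assumptions made for this example. Real promoter recognition depends on the sigma factor and surrounding sequence context, which this does not model.

```python
PRIBNOW_CONSENSUS = "TATAAT"  # commonly cited consensus of the bacterial -10 element


def find_pribnow_like(dna: str, max_mismatches: int = 1) -> list[tuple[int, str]]:
    """Return (position, window) pairs whose sequence is within
    max_mismatches substitutions of the consensus."""
    hits = []
    k = len(PRIBNOW_CONSENSUS)
    for i in range(len(dna) - k + 1):
        window = dna[i:i + k]
        mismatches = sum(a != b for a, b in zip(window, PRIBNOW_CONSENSUS))
        if mismatches <= max_mismatches:
            hits.append((i, window))
    return hits


# Hypothetical sequence containing one exact match at index 3.
exact_hits = find_pribnow_like("GGCTATAATGCG", max_mismatches=0)
```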

RNA processing

While transcription of prokaryotic protein-coding genes creates messenger RNA (mRNA) that is ready for translation, transcription of eukaryotic genes leaves a primary transcript of RNA (pre-mRNA), which first has to undergo a series of modifications to become a mature mRNA.

These include 5' capping, which is a set of enzymatic reactions that add 7-methylguanosine (m7G) to the 5' end of pre-mRNA and thus protect the RNA from degradation by exonucleases. The m7G cap is then bound by the cap binding complex heterodimer (CBC20/CBC80), which aids in mRNA export to the cytoplasm and also protects the RNA from decapping.

Another modification is 3' cleavage and polyadenylation. They occur if a polyadenylation signal sequence (5'-AAUAAA-3') is present in the pre-mRNA, which is usually between the protein-coding sequence and the terminator. The pre-mRNA is first cleaved and then a series of ~200 adenines (A) is added to form a poly(A) tail, which protects the RNA from degradation. The poly(A) tail is bound by multiple poly(A)-binding proteins (PABP) necessary for mRNA export and translation re-initiation.
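
The cleavage and polyadenylation steps can be sketched as a simple string operation. This is a toy model under stated assumptions: the function name is hypothetical, and `cleavage_offset` (how far downstream of the signal the cut is made) is an illustrative parameter, not a value given in the text. Only the AAUAAA signal and the tail length of about 200 come from the paragraph above.

```python
POLYA_SIGNAL = "AAUAAA"  # polyadenylation signal, 5'->3'


def cleave_and_polyadenylate(pre_mrna: str,
                             cleavage_offset: int = 20,
                             tail_length: int = 200) -> str:
    """Cut the transcript downstream of the first AAUAAA signal and append
    a poly(A) tail. Transcripts without the signal are returned unchanged."""
    signal_start = pre_mrna.find(POLYA_SIGNAL)
    if signal_start == -1:
        return pre_mrna
    # Assumed cut position: a fixed number of nucleotides after the signal,
    # capped at the end of the transcript.
    cleavage_site = min(signal_start + len(POLYA_SIGNAL) + cleavage_offset,
                        len(pre_mrna))
    return pre_mrna[:cleavage_site] + "A" * tail_length


# Hypothetical pre-mRNA: 3 nt, the signal, then 30 nt of downstream sequence.
example = cleave_and_polyadenylate("GGG" + POLYA_SIGNAL + "C" * 30,
                                   cleavage_offset=5, tail_length=10)
```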

A very important modification of eukaryotic pre-mRNA is RNA splicing. The majority of eukaryotic pre-mRNAs consist of alternating segments called exons and introns. During splicing, an RNA-protein catalytic complex known as the spliceosome catalyzes two transesterification reactions, which remove an intron, release it in the form of a lariat structure, and then splice the neighbouring exons together. In certain cases, some introns or exons can be either removed or retained in the mature mRNA. This so-called alternative splicing creates a series of different transcripts originating from a single gene. Because these transcripts can potentially be translated into different proteins, splicing extends the complexity of eukaryotic gene expression.
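
A minimal sketch of constitutive and alternative splicing follows, assuming the exon coordinates are already known. The `splice` function, the coordinates and the toy sequence are hypothetical; introns are written in lowercase only to make them visible, and the chemistry of the spliceosome is not modelled.

```python
def splice(pre_mrna: str, exons: list[tuple[int, int]],
           skip: frozenset = frozenset()) -> str:
    """Join exons (half-open start/end coordinates) in order, dropping
    everything between them. Exon indices listed in `skip` are left out,
    which mimics exon skipping, one form of alternative splicing."""
    return "".join(pre_mrna[start:end]
                   for index, (start, end) in enumerate(exons)
                   if index not in skip)


# Toy pre-mRNA: three exons (uppercase) separated by two introns (lowercase).
pre_mrna = "AAAA" + "gugcag" + "CCCC" + "guuuag" + "GGGG"
exons = [(0, 4), (10, 14), (20, 24)]

full_transcript = splice(pre_mrna, exons)                        # all exons kept
skipped_transcript = splice(pre_mrna, exons, skip=frozenset({1}))  # middle exon skipped
```

Two different mature transcripts arise from the same pre-mRNA here, which is the sense in which alternative splicing increases the number of products per gene.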

Extensive RNA processing may be an evolutionary advantage made possible by the nucleus of eukaryotes. In prokaryotes, transcription and translation happen together, whilst in eukaryotes the nuclear membrane separates the two processes, giving time for RNA processing to occur.

Non-coding RNA maturation

In most organisms, non-coding RNA (ncRNA) genes are transcribed as precursors which undergo further processing. Ribosomal RNAs (rRNA) are often transcribed as a pre-rRNA that contains one or more rRNAs. The pre-rRNA is cleaved and modified (2′-O-methylation and pseudouridine formation) at specific sites by approximately 150 different small nucleolus-restricted RNA species, called snoRNAs. SnoRNAs associate with proteins, forming snoRNPs. While the snoRNA part base-pairs with the target RNA and thus positions the modification at a precise site, the protein part performs the catalytic reaction. In eukaryotes in particular, a snoRNP called RNase MRP cleaves the 45S pre-rRNA into the 28S, 5.8S, and 18S rRNAs. The rRNA and RNA processing factors form large aggregates called the nucleolus.

In the case of transfer RNA (tRNA), for example, the 5' sequence is removed by RNase P, whereas the 3' end is removed by the tRNase Z enzyme and the non-templated 3' CCA tail is added by a nucleotidyl transferase. MicroRNAs (miRNAs) are first transcribed as primary transcripts, or pri-miRNAs, with a cap and poly-A tail, and are processed to short, 70-nucleotide stem-loop structures known as pre-miRNAs in the cell nucleus by the enzymes Drosha and Pasha. After being exported, the pre-miRNA is processed to mature miRNAs in the cytoplasm by interaction with the endonuclease Dicer, which also initiates the formation of the RNA-induced silencing complex (RISC), which contains the Argonaute protein.
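
The three tRNA end-processing steps named above (5' leader removal, 3' trailer removal, CCA addition) can be sketched as follows. The function name, the leader and trailer lengths and the example sequence are hypothetical; in the cell the cut sites are determined by the enzymes recognising the folded tRNA, not by fixed lengths.

```python
def mature_trna(pre_trna: str, leader_len: int, trailer_len: int) -> str:
    """Trim a precursor tRNA and add the non-templated CCA tail.

    leader_len  -- nucleotides removed from the 5' end (the RNase P step)
    trailer_len -- nucleotides removed from the 3' end (the tRNase Z step)
    """
    body = pre_trna[leader_len:len(pre_trna) - trailer_len]
    return body + "CCA"  # the nucleotidyl transferase step


# Hypothetical precursor: 3 nt leader, 11 nt body, 4 nt trailer.
precursor = "AUA" + "GCGGAUUUAGC" + "UUUU"
trna = mature_trna(precursor, leader_len=3, trailer_len=4)
```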

Even snRNAs and snoRNAs themselves undergo a series of modifications before they become part of a functional RNP complex. This is done either in the nucleoplasm or in specialized compartments called Cajal bodies. Their bases are methylated or pseudouridylated by a group of small Cajal body-specific RNAs (scaRNAs), which are structurally similar to snoRNAs.

RNA export

In eukaryotes most mature RNA must be exported to the cytoplasm from the nucleus. While some RNAs function in the nucleus, many RNAs are transported through the nuclear pores and into the cytosol. Notably, this includes all RNA types involved in protein synthesis. In some cases RNAs are additionally transported to a specific part of the cytoplasm, such as a synapse; they are then towed by motor proteins that bind through linker proteins to specific sequences (called "zipcodes") on the RNA.

Translation


For some RNA (non-coding RNA) the mature RNA is the final gene product. In the case of messenger RNA (mRNA) the RNA is an information carrier coding for the synthesis of one or more proteins. mRNA carrying a single protein sequence (common in eukaryotes) is monocistronic whilst mRNA carrying multiple protein sequences (common in prokaryotes) is known as polycistronic.

Every mRNA consists of three parts: a 5' untranslated region (5'UTR), a protein-coding region or open reading frame (ORF), and a 3' untranslated region (3'UTR). The coding region carries information for protein synthesis encoded by the genetic code in the form of triplets. Each triplet of nucleotides of the coding region is called a codon and corresponds to a binding site complementary to an anticodon triplet in transfer RNA. Transfer RNAs with the same anticodon sequence always carry an identical type of amino acid. Amino acids are then chained together by the ribosome according to the order of triplets in the coding region. The ribosome helps transfer RNA to bind to messenger RNA, takes the amino acid from each transfer RNA, and links the amino acids into a polypeptide chain that does not yet have a folded structure. Each mRNA molecule is translated into many protein molecules, on average ~900 in mammals.
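
The division of an mRNA into 5'UTR, ORF and 3'UTR, and the codon-by-codon reading of the ORF, can be sketched in Python. The function names and the example mRNA are hypothetical. The sketch takes the first AUG as the start codon and uses the standard genetic code with one-letter amino acid symbols; it is a simplification, since real start-site selection involves more than finding the first AUG.

```python
from itertools import product

# Standard genetic code, listed in UCAG order for each codon position;
# "*" marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def split_mrna(mrna: str) -> tuple[str, str, str]:
    """Split an mRNA into (5'UTR, ORF, 3'UTR); the ORF runs from the first
    AUG through the first in-frame stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        raise ValueError("no start codon found")
    for i in range(start, len(mrna) - 2, 3):
        if CODON_TABLE[mrna[i:i + 3]] == "*":
            return mrna[:start], mrna[start:i + 3], mrna[i + 3:]
    raise ValueError("no in-frame stop codon found")


def translate(orf: str) -> str:
    """Read the ORF one codon at a time and chain the amino acids together."""
    protein = []
    for i in range(0, len(orf) - 2, 3):
        amino_acid = CODON_TABLE[orf[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


utr5, orf, utr3 = split_mrna("GGAUGGCCUAAUU")  # hypothetical mRNA
peptide = translate(orf)
```

The dictionary lookup plays the role of codon-anticodon matching: each codon maps to exactly one amino acid, just as tRNAs with the same anticodon carry the same amino acid.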

In prokaryotes translation generally occurs at the point of transcription (co-transcriptionally), often using a messenger RNA which is still in the process of being created. In eukaryotes translation can occur in a variety of regions of the cell, depending on where the protein being made is supposed to be. Major locations are the cytoplasm for soluble cytoplasmic proteins and the membrane of the endoplasmic reticulum for proteins which are for export from the cell or insertion into a cell membrane. Proteins which are supposed to be expressed at the endoplasmic reticulum are recognised part-way through the translation process. This is governed by the signal recognition particle, a ribonucleoprotein which binds to the ribosome and directs it to the endoplasmic reticulum when it finds a signal sequence on the growing (nascent) amino acid chain.

Folding

The polypeptide folds into its characteristic and functional three-dimensional structure from random coil. Each protein exists as an unfolded polypeptide or random coil when translated from a sequence of mRNA to a linear chain of amino acids. This polypeptide lacks any developed three-dimensional structure (the left-hand side of the neighboring figure). Amino acids interact with each other to produce a well-defined three-dimensional structure, the folded protein (the right-hand side of the figure), known as the native state. The resulting three-dimensional structure is determined by the amino acid sequence (Anfinsen's dogma).

The correct three-dimensional structure is essential to function, although some parts of functional proteins may remain unfolded. Failure to fold into the intended shape usually produces inactive proteins with different properties, including toxic prions. Several neurodegenerative and other diseases are believed to result from the accumulation of misfolded (incorrectly folded) proteins. Many allergies are caused by the folding of the proteins, for the immune system does not produce antibodies for certain protein structures.

Proteins called chaperones assist the newly formed protein to attain (fold into) the three-dimensional structure it needs to function. Similarly, RNA chaperones help RNAs attain their functional shapes. Assisting protein folding is one of the main roles of the endoplasmic reticulum in eukaryotes.

Protein transport

Many proteins are destined for parts of the cell other than the cytosol, and a wide range of signalling sequences are used to direct proteins to where they are supposed to be. In prokaryotes this is normally a simple process due to limited compartmentalisation of the cell. In eukaryotes, however, there is a great variety of targeting processes to ensure the protein arrives at the correct organelle.

Not all proteins remain within the cell and many are exported, for example digestive enzymes, hormones and extracellular matrix proteins. In eukaryotes the export pathway is well developed and the main mechanism for the export of these proteins is translocation to the endoplasmic reticulum, followed by transport via the Golgi apparatus.

Regulation of gene expression


Regulation of gene expression refers to the control of the amount and timing of appearance of the functional product of a gene. Control of expression is vital to allow a cell to produce the gene products it needs when it needs them; in turn this gives cells the flexibility to adapt to a variable environment, external signals, damage to the cell, etc. Some simple examples of where regulation of gene expression is important are:
  • Control of insulin expression so it gives a signal for blood glucose regulation.
  • X chromosome inactivation in female mammals to prevent an "overdose" of the genes it contains.
  • Cyclin expression levels control progression through the eukaryotic cell cycle.

More generally, gene regulation gives the cell control over all structure and function, and is the basis for cellular differentiation, morphogenesis and the versatility and adaptability of any organism.

Any step of gene expression may be modulated, from the DNA-RNA transcription step to post-translational modification of a protein. The stability of the final gene product, whether it is RNA or protein, also contributes to the expression level of the gene - an unstable product results in a low expression level. In general gene expression is regulated through changes in the number and type of interactions between molecules that collectively influence transcription of DNA and translation of RNA.
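The link between product stability and expression level can be illustrated with a minimal sketch. It assumes a simple model that is not taken from the text above: the product is made at a constant rate and removed by first-order decay, so its steady-state level is the synthesis rate divided by the decay constant. The function name, parameter names, units and numbers are all hypothetical.

```python
import math


def steady_state_level(synthesis_rate, half_life):
    """Steady-state amount of a gene product under an assumed model:
    constant synthesis and first-order decay (dP/dt = k_syn - k_deg * P).

    synthesis_rate: molecules produced per minute (hypothetical units)
    half_life: product half-life in minutes
    """
    if synthesis_rate < 0 or half_life <= 0:
        raise ValueError("synthesis_rate must be >= 0 and half_life > 0")
    k_deg = math.log(2) / half_life  # first-order decay constant
    return synthesis_rate / k_deg


# Same synthesis rate, different stability (illustrative numbers only).
stable = steady_state_level(synthesis_rate=10.0, half_life=600.0)
unstable = steady_state_level(synthesis_rate=10.0, half_life=6.0)
print(stable > unstable)
```

Under this assumed model the steady-state level scales in proportion to the half-life, which is one way to read the statement that an unstable product results in a low expression level.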

Numerous terms are used to describe types of genes depending on how they are regulated; these include:
  • A constitutive gene is a gene that is transcribed continually, as opposed to a facultative gene, which is only transcribed when needed.
  • A housekeeping gene is typically a constitutive gene that is transcribed at a relatively constant level. The housekeeping gene's products are typically needed for maintenance of the cell. It is generally assumed that their expression is unaffected by experimental conditions. Examples include actin, GAPDH and ubiquitin.
  • A facultative gene is a gene which is only transcribed when needed, as opposed to a constitutive gene.
  • An inducible gene is a gene whose expression is either responsive to environmental change or dependent on the position in the cell cycle.

Transcriptional regulation

Regulation of transcription can be broken down into three main routes of influence: genetic (direct interaction of a control factor with the gene), modulation (interaction of a control factor with the transcription machinery) and epigenetic (non-sequence changes in DNA structure which influence transcription).

Direct interaction with DNA is the simplest and the most direct method by which a protein can change transcription levels. Genes often have several protein binding sites around the coding region with the specific function of regulating transcription. There are many classes of regulatory DNA binding sites known as enhancers, insulators, repressors and silencers. The mechanisms for regulating transcription are very varied, from blocking key binding sites on the DNA for RNA polymerase to acting as an activator and promoting transcription by assisting RNA polymerase binding.

The activity of transcription factors is further modulated by intracellular signals causing protein post-translational modifications, including phosphorylation, acetylation, or glycosylation. These changes influence a transcription factor's ability to bind, directly or indirectly, to promoter DNA, to recruit RNA polymerase, or to favor elongation of a newly synthesized RNA molecule.

The nuclear membrane in eukaryotes allows further regulation of transcription factors by the duration of their presence in the nucleus which is regulated by reversible changes in their structure and by binding of other proteins. Environmental stimuli or endocrine signals may cause modification of regulatory proteins eliciting cascades of intracellular signals, which result in regulation of gene expression.

More recently it has become apparent that non-DNA-sequence specific effects have a large influence on transcription. These effects are referred to as epigenetic and involve the higher order structure of DNA, non-sequence specific DNA binding proteins and chemical modification of DNA. In general, epigenetic effects alter the accessibility of DNA to proteins and so modulate transcription.

DNA methylation is a widespread mechanism for epigenetic influence on gene expression. It is seen in bacteria and eukaryotes and has roles in heritable transcription silencing and transcription regulation. In eukaryotes the structure of chromatin, controlled by the histone code, regulates access to DNA with significant impacts on the expression of genes in euchromatin and heterochromatin areas.

Post-transcriptional regulation

In eukaryotes, where export of RNA is required before translation is possible, nuclear export is thought to provide additional control over gene expression. All transport in and out of the nucleus is via the nuclear pore and transport is controlled by a wide range of importin and exportin proteins.

Expression of a gene coding for a protein is only possible if the messenger RNA carrying the code survives long enough to be translated. In a typical cell an RNA molecule is only stable if specifically protected from degradation. RNA degradation has particular importance in regulation of expression in eukaryotic cells where mRNA has to travel significant distances before being translated. In eukaryotes RNA is stabilised by certain post-transcriptional modifications, particularly the 5' cap and poly-adenylated tail.

Intentional degradation of mRNA is used not just as a defence mechanism from foreign RNA (normally from viruses) but also as a route of mRNA destabilisation. If an mRNA molecule has a complementary sequence to a small interfering RNA then it is targeted for destruction via the RNA interference pathway.
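The complementarity test described above can be sketched in a few lines. This is a deliberately simplified illustration that assumes perfect Watson-Crick pairing along the full length of one siRNA strand and ignores everything else about the RNA interference pathway; the function names and sequences are hypothetical.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Return the reverse complement of an RNA sequence (A, C, G, U only)."""
    return "".join(COMPLEMENT[base] for base in reversed(rna.upper()))


def find_target_sites(mrna, guide):
    """Return 0-based start positions in `mrna` that are perfectly
    complementary to the siRNA strand `guide` (both written 5'->3')."""
    site = reverse_complement(guide)
    mrna = mrna.upper()
    positions = []
    start = mrna.find(site)
    while start != -1:
        positions.append(start)
        start = mrna.find(site, start + 1)  # allow overlapping matches
    return positions


# Hypothetical sequences, much shorter than a real siRNA.
mrna = "AUGGCUACGUUAGCCGAUAA"
guide = "CUAACGUAGC"
print(find_target_sites(mrna, guide))
```

An empty list means the sketch found no perfectly complementary site, so under this simplified assumption the mRNA would not be targeted by that strand.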

Translational regulation


Direct regulation of translation is less prevalent than control of transcription or mRNA stability but is occasionally used. Protein translation is a major target for toxins and antibiotics, which inhibit it in order to kill a cell by overriding its normal gene expression control. Protein synthesis inhibitors include the antibiotic neomycin and the toxin ricin.

Protein degradation

Once protein synthesis is complete the level of expression of that protein can be reduced by protein degradation. There are major protein degradation pathways in all prokaryotes and eukaryotes of which the proteasome is a common component. An unneeded or damaged protein is often labelled for degradation by addition of ubiquitin.

Measurement

Measuring gene expression is an important part of many life sciences, because the ability to quantify the level at which a particular gene is expressed within a cell, tissue or organism can give a huge amount of information. For example, measuring gene expression can:
  • Identify viral infection of a cell (viral protein expression)
  • Determine an individual's susceptibility to cancer (oncogene expression)
  • Find if a bacterium is resistant to penicillin (beta-lactamase expression)

Similarly, the analysis of the location of protein expression is a powerful tool, and this can be done on an organism or cellular scale. Investigation of localisation is particularly important for study of development in multicellular organisms and as an indicator of protein function in single cells. Ideally measurement of expression is done by detecting the final gene product (for many genes this is the protein); however, it is often easier to detect one of the precursors, typically mRNA, and infer the gene expression level.

mRNA quantification

Levels of mRNA can be quantitatively measured by Northern blotting, which gives size and sequence information about the mRNA molecules. A sample of RNA is separated on an agarose gel and hybridized to a radio-labeled RNA probe that is complementary to the target sequence. The radio-labeled RNA is then detected by an autoradiograph. The main problems with Northern blotting stem from the use of radioactive reagents (which make the procedure time consuming and potentially dangerous) and lower quality quantification than more modern methods (because quantification is done by measuring band strength in an image of a gel). Northern blotting is, however, still widely used as the additional mRNA size information allows the discrimination of alternatively spliced transcripts.

A more modern low-throughput approach for measuring mRNA abundance is reverse transcription quantitative polymerase chain reaction (RT-PCR followed by qPCR). RT-PCR first generates a DNA template from the mRNA by reverse transcription; this template is called cDNA. The cDNA template is then used for qPCR, where the fluorescence of a probe changes as the DNA amplification process progresses. With a carefully constructed standard curve, qPCR can produce an absolute measurement such as number of copies of mRNA, typically in units of copies per nanolitre of homogenized tissue or copies per cell. qPCR is very sensitive (detection of a single mRNA molecule is possible), but can be expensive due to the fluorescent probes required.
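How a standard curve turns a qPCR readout into an absolute copy number can be sketched as follows. The sketch assumes the readout is a threshold cycle (Ct) value that falls linearly with the logarithm of the starting copy number, a common convention that is not stated in the text above. The function names and all numbers are hypothetical, and the dilution series is made-up data chosen to lie on a straight line.

```python
import math


def fit_standard_curve(copies, ct_values):
    """Least-squares fit of Ct = slope * log10(copies) + intercept."""
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def copies_from_ct(ct, slope, intercept):
    """Invert the fitted curve to estimate starting copies for a sample."""
    return 10 ** ((ct - intercept) / slope)


# Hypothetical dilution series of a standard with known copy numbers.
standard_copies = [1e2, 1e3, 1e4, 1e5]
standard_ct = [30.00, 26.68, 23.36, 20.04]

slope, intercept = fit_standard_curve(standard_copies, standard_ct)
estimate = copies_from_ct(25.02, slope, intercept)
print(round(slope, 2), round(estimate))
```

The estimate is only meaningful for Ct values inside the range covered by the standards; extrapolating beyond the dilution series is outside what this sketch supports.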

An even more advanced approach is to individually tag single mRNA molecules with fluorescent barcodes (nanostrings), which can be detected one-by-one and counted for direct digital quantification. The advantage of this approach is that it does not rely on analog quantification of fluorescent intensity, which can be problematic due to noise, lack of linearity, and narrow dynamic range. Instead, the technique relies on fluorescence to detect simply the presence of a single mRNA molecule in a binary ("yes" or "no") mode. This method was invented by Dr. Krassen Dimitrov at the Institute for Systems Biology and commercialized through his start-up company, NanoString Technologies.

Northern blots and RT-qPCR are good for detecting whether a single gene is being expressed, but they quickly become impractical if many genes within the sample are being studied. Using DNA microarrays, transcript levels for many genes at once (expression profiling) can be measured. Recent advances in microarray technology allow for the quantification, on a single array, of transcript levels for every known gene in the genomes of several organisms, including humans.

Alternatively, "tag based" technologies like serial analysis of gene expression (SAGE), which can provide a relative measure of the cellular concentration of different mRNAs, can be used. The great advantage of tag-based methods is the "open architecture", allowing for the exact measurement of any transcript, with a known or unknown sequence.
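The idea of a relative measure from tag counts can be sketched as below. It assumes each sequenced tag has already been extracted as a short string and that a tag's share of all tags stands in for the relative abundance of its transcript. The function name and the tag sequences are hypothetical and do not reproduce any actual SAGE pipeline.

```python
from collections import Counter


def relative_abundance(tags):
    """Map each distinct tag to its fraction of all observed tags."""
    counts = Counter(tags)
    total = sum(counts.values())
    return {tag: n / total for tag, n in counts.items()}


# Hypothetical tags; a tag needs no prior annotation to be counted,
# which is the "open architecture" property mentioned above.
observed = [
    "GATTACAGGC", "GATTACAGGC", "GATTACAGGC",
    "TTGCACCATA", "TTGCACCATA",
    "CCGTAAGTTC",
]
abundance = relative_abundance(observed)
for tag, fraction in sorted(abundance.items(), key=lambda kv: -kv[1]):
    print(tag, round(fraction, 3))
```

Because the fractions sum to one, they compare transcripts within a sample; comparing across samples would need additional normalisation that this sketch does not attempt.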

Protein quantification

For genes encoding proteins the expression level can be directly assessed by a number of means with some clear analogies to the techniques for mRNA quantification.

The most commonly used method is to perform a Western blot against the protein of interest; this gives information on the size of the protein in addition to its identity. A sample (often cellular lysate) is separated on a polyacrylamide gel, transferred to a membrane and then probed with an antibody to the protein of interest. The antibody can either be conjugated to a fluorophore or to horseradish peroxidase for imaging and/or quantification. The gel-based nature of this assay makes quantification less accurate, but it has the advantage of being able to identify later modifications to the protein, for example proteolysis or ubiquitination, from changes in size.

Localisation

Analysis of expression is not limited to only quantification; localisation can also be determined. mRNA can be detected with a suitably labelled complementary mRNA strand and protein can be detected via labelled antibodies. The probed sample is then observed by microscopy to identify where the mRNA or protein is.

By replacing the gene with a new version fused to a green fluorescent protein (or similar) marker, expression may be directly quantified in live cells. This is done by imaging using a fluorescence microscope. It is very difficult to clone a GFP-fused protein into its native location in the genome without affecting expression levels, so this method often cannot be used to measure endogenous gene expression. It is, however, widely used to measure the expression of a gene artificially introduced into the cell, for example via an expression vector. It is important to note that by fusing a target protein to a fluorescent reporter the protein's behavior, including its cellular localization and expression level, can be significantly changed.

The enzyme-linked immunosorbent assay works by using antibodies immobilised on a microtiter plate
Microtiter plate
A Microtiter plate or microplate or microwell plate, is a flat plate with multiple "wells" used as small test tubes. The microplate has become a standard tool in analytical research and clinical diagnostic testing laboratories...

 to capture proteins of interest from samples added to the well. Using a detection antibody conjugated to an enzyme or fluorophore, the quantity of bound protein can be accurately measured by fluorometric or colourimetric
Colorimetry (chemical method)
In physical and analytical chemistry, colorimetry or colourimetry is a technique "used to determine the concentration of colored compounds in solution."...

 detection. The detection process is very similar to that of a Western blot, but by avoiding the gel steps, more accurate quantification can be achieved.

Expression system

An expression system is a system specifically designed for the production of a gene product of choice. This is normally a protein, although it may also be an RNA, such as tRNA or a ribozyme
Ribozyme
A ribozyme is an RNA molecule with a well defined tertiary structure that enables it to catalyze a chemical reaction. Ribozyme means ribonucleic acid enzyme. It may also be called an RNA enzyme or catalytic RNA. Many natural ribozymes catalyze either the hydrolysis of one of their own...

. An expression system consists of a gene, normally encoded by DNA
DNA
Deoxyribonucleic acid is a nucleic acid that contains the genetic instructions used in the development and functioning of all known living organisms. The DNA segments that carry this genetic information are called genes, but other DNA sequences have structural purposes, or are involved in...

, and the molecular machinery required to transcribe
Transcription (genetics)
Transcription is the process of creating a complementary RNA copy of a sequence of DNA. Both RNA and DNA are nucleic acids, which use base pairs of nucleotides as a complementary language that can be converted back and forth from DNA to RNA by the action of the correct enzymes...

 the DNA into mRNA and translate the mRNA into protein
Protein
Proteins are biochemical compounds consisting of one or more polypeptides typically folded into a globular or fibrous form, facilitating a biological function. A polypeptide is a single linear polymer chain of amino acids bonded together by peptide bonds between the carboxyl and amino groups of...

 using the reagents provided. In the broadest sense this includes every living cell, but the term is more commonly used to refer to expression as a laboratory tool. An expression system is therefore often artificial in some manner. Expression systems are, however, fundamentally based on a natural process. Viruses are an excellent example: they replicate by using the host cell as an expression system for the viral proteins and genome.

Inducible expression

Doxycycline
Doxycycline
Doxycycline INN is a member of the tetracycline antibiotics group, and is commonly used to treat a variety of infections. Doxycycline is a semisynthetic tetracycline invented and clinically developed in the early 1960s by Pfizer Inc. and marketed under the brand name Vibramycin. Vibramycin...

 is used in "Tet-on" and "Tet-off" tetracycline-controlled transcriptional activation to regulate transgene
Transgene
A transgene is a gene or genetic material that has been transferred naturally or by any of a number of genetic engineering techniques from one organism to another....

 expression in organisms and cell culture
Cell culture
Cell culture is the complex process by which cells are grown under controlled conditions. In practice, the term "cell culture" has come to refer to the culturing of cells derived from singlecellular eukaryotes, especially animal cells. However, there are also cultures of plants, fungi and microbes,...

s.

In nature

In addition to these biological tools, certain naturally observed configurations of DNA (genes, promoters, enhancers, repressors) and the associated machinery itself are referred to as an expression system. This term is normally used where a gene or set of genes is switched on under well-defined conditions. Examples include the simple repressor switch expression system in Lambda phage
Lambda phage
Enterobacteria phage λ is a temperate bacteriophage that infects Escherichia coli. Lambda phage is a virus particle consisting of a head, containing double-stranded linear DNA as its genetic material, and a tail that can have tail fibers. The phage particle recognizes and binds to its host, E...

 and the lac operator system in bacteria. Several natural expression systems are used directly, or modified and used, as artificial expression systems, such as the Tet-on and Tet-off expression systems.

Gene networks

Genes have sometimes been regarded as nodes in a network, with inputs being proteins such as transcription factor
Transcription factor
In molecular biology and genetics, a transcription factor is a protein that binds to specific DNA sequences, thereby controlling the flow of genetic information from DNA to mRNA...

s, and outputs being the level of gene expression. The node itself performs a function, and the operation of these functions has been interpreted as performing a kind of information processing
Information processing
Information processing is the change of information in any manner detectable by an observer. As such, it is a process which describes everything which happens in the universe, from the falling of a rock to the printing of a text file from a digital computer system...

 within cells that determines cellular behavior.
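The idea of a gene as a node that computes an output from its inputs can be illustrated with a toy model. The sketch below is only an assumption made for illustration: the article does not specify any functional form, and the Hill-type curve, the function names (hill_activation, gene_node) and all parameter values are hypothetical choices.

```python
def hill_activation(tf_level, half_max=1.0, steepness=2.0, max_output=1.0):
    """Output of a node driven by one activating transcription factor.

    Assumed Hill-type curve: the output rises from 0 toward max_output as
    tf_level grows, and equals half of max_output when tf_level == half_max.
    """
    if tf_level <= 0:
        return 0.0
    ratio = (tf_level / half_max) ** steepness
    return max_output * ratio / (1.0 + ratio)


def gene_node(activator_level, repressor_level):
    """Toy node: strongly expressed only when the activator is high
    and the repressor is low (a soft 'A AND NOT R' function)."""
    activation = hill_activation(activator_level)
    repression = 1.0 - hill_activation(repressor_level)
    return activation * repression


# Hypothetical input levels (arbitrary units) for the two regulators.
for activator, repressor in [(0.0, 0.0), (5.0, 0.0), (5.0, 5.0)]:
    level = gene_node(activator, repressor)
    print(f"activator={activator}, repressor={repressor} -> output={level:.3f}")
```

In this toy model the node behaves like a logic gate on its inputs, which is one simple way to read the "information processing" interpretation; real regulatory functions are generally more complex.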

Gene networks can also be constructed without formulating an explicit causal model. This is often the case when assembling networks from large expression data sets. Covariation and correlation of expression are computed across a large sample of cases and measurements (often transcriptome
Transcriptome
The transcriptome is the set of all RNA molecules, including mRNA, rRNA, tRNA, and other non-coding RNA produced in one or a population of cells...

 or proteome
Proteome
The proteome is the entire set of proteins expressed by a genome, cell, tissue or organism. More specifically, it is the set of expressed proteins in a given type of cells or an organism at a given time under defined conditions. The term is a portmanteau of proteins and genome. The term has been...

 data). The source of variation can be either experimental or natural (observational). There are several ways to construct gene expression networks, but one common approach is to compute a matrix of all pair-wise correlations of expression across conditions, time points, or individuals and convert the matrix (after thresholding at some cut-off value) into a graphical representation in which nodes represent genes, transcripts, or proteins and edges connecting these nodes represent the strength of association (see http://www.genenetwork.org).
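The correlation-based approach described above can be sketched in a few lines. This is a minimal illustration, not the method of any particular tool: it assumes Pearson correlation as the association measure and an arbitrary cut-off of 0.8, and the gene names and expression values are made up.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # constant profile: treated here as uncorrelated
    return cov / sqrt(var_x * var_y)


def coexpression_network(expression, cutoff=0.8):
    """Return {(gene_a, gene_b): r} for every gene pair with |r| >= cutoff.

    Nodes are genes; each retained pair is an edge weighted by r.
    """
    edges = {}
    for gene_a, gene_b in combinations(sorted(expression), 2):
        r = pearson(expression[gene_a], expression[gene_b])
        if abs(r) >= cutoff:
            edges[(gene_a, gene_b)] = r
    return edges


# Hypothetical expression levels of four made-up genes across five samples.
expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneB": [2.1, 3.9, 6.2, 8.0, 9.8],
    "geneC": [5.0, 4.1, 2.9, 2.2, 1.0],
    "geneD": [3.0, 1.0, 4.0, 1.0, 5.0],
}

network = coexpression_network(expression, cutoff=0.8)
for (gene_a, gene_b), r in sorted(network.items()):
    print(f"{gene_a} -- {gene_b}: r = {r:+.2f}")
```

The choice of cut-off strongly affects how dense the resulting graph is, and a strong correlation between two genes does not by itself imply a causal regulatory link.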

Techniques and tools

The following experimental techniques are used to measure gene expression and are listed in roughly chronological order, starting with the older, more established technologies. They are divided into two groups based on their degree of multiplexity
Multiplex (assay)
A multiplex assay is a type of laboratory procedure that simultaneously measures multiple analytes in a single assay. It is distinguished from procedures that measure one or a few analytes at a time...

.
  • Low-to-mid-plex techniques:
    • Reporter gene
      Reporter gene
      In molecular biology, a reporter gene is a gene that researchers attach to a regulatory sequence of another gene of interest in cell culture, animals or plants. Certain genes are chosen as reporters because the characteristics they confer on organisms expressing them are easily identified and...

    • Northern blot
      Northern blot
      The northern blot is a technique used in molecular biology research to study gene expression by detection of RNA in a sample. With northern blotting it is possible to observe cellular control over structure and function by determining the particular gene expression levels during differentiation,...

    • Western blot
      Western blot
      The western blot is a widely used analytical technique used to detect specific proteins in the given sample of tissue homogenate or extract. It uses gel electrophoresis to separate native proteins by 3-D structure or denatured proteins by the length of the polypeptide...

    • Fluorescent in situ hybridization
      Fluorescent in situ hybridization
      FISH is a cytogenetic technique developed by biomedical researchers in the early 1980s that is used to detect and localize the presence or absence of specific DNA sequences on chromosomes. FISH uses fluorescent probes that bind to only those parts of the chromosome with which they show a high...

    • Reverse transcription PCR
      Reverse transcription polymerase chain reaction
      Reverse transcription polymerase chain reaction is a variant of polymerase chain reaction, a laboratory technique commonly used in molecular biology to generate many copies of a DNA sequence, a process termed "amplification"...

    • Digital counting of single transcript molecules, see NanoString Technologies
      NanoString Technologies
      NanoString Technologies is a privately held life sciences company that develops and manufactures solutions for detecting and counting large sets of target molecules in biological samples. The company was founded in 2003 and is based in Seattle, WA....


  • Higher-plex techniques:
    • SAGE
      Serial Analysis of Gene Expression
      Serial analysis of gene expression is a technique used by molecular biologists to produce a snapshot of the messenger RNA population in a sample of interest in the form of small tags that correspond to fragments of those transcripts. The original technique was developed by Dr. Victor Velculescu...

    • DNA microarray
      DNA microarray
      A DNA microarray is a collection of microscopic DNA spots attached to a solid surface. Scientists use DNA microarrays to measure the expression levels of large numbers of genes simultaneously or to genotype multiple regions of a genome...

    • Tiling array
      Tiling array
      Tiling Arrays are a subtype of microarray chips. Like traditional microarrays, they function by hybridizing labeled DNA or RNA target molecules to probes fixed onto a solid surface. Tiling arrays differ from traditional microarrays in the nature of the probes...

    • RNA-Seq
      RNA-Seq
      RNA-seq, also called "Whole Transcriptome Shotgun Sequencing" and dubbed "a revolutionary tool for transcriptomics", refers to the use of high-throughput sequencing technologies to sequence cDNA in order to get information about a sample's RNA content, a technique that is quickly becoming...


See also

  • Transcriptional noise
    Transcriptional noise
    A primary cause of the variability in gene expression occurring between cells in isogenic populations. A major source of transcriptional noise is likely to be transcriptional bursting although other sources of heterogeneity, such as unequal separation of cell contents at mitosis are also likely...

  • Transcriptional bursting
    Transcriptional bursting
    Transcriptional bursting, also known as transcriptional pulsing, is a fundamental property of genes from bacteria to humans. Transcription of genes, the process which transforms the stable code written in DNA into the mobile RNA message can occur in "bursts" or "pulses"...

  • Bookmarking
    Bookmarking
    In genetics and epigenetics, bookmarking is a biological phenomenon believed to function as an epigenetic mechanism for transmitting cellular memory of the pattern of gene expression in a cell, throughout mitosis, to its daughter cells...

  • Expression profiling
    Expression profiling
    In the field of molecular biology, gene expression profiling is the measurement of the activity of thousands of genes at once, to create a global picture of cellular function. These profiles can, for example, distinguish between cells that are actively dividing, or show how the cells react to a...

  • Expressed sequence tag
    Expressed sequence tag
    An expressed sequence tag or EST is a short sub-sequence of a cDNA sequence. They may be used to identify gene transcripts, and are instrumental in gene discovery and gene sequence determination. The identification of ESTs has proceeded rapidly, with approximately 65.9 million ESTs now available in...

  • Paramutation
    Paramutation
    In epigenetics, paramutation is an interaction between two alleles of a single locus, resulting in a heritable change of one allele that is induced by the other allele...

  • Sequence profiling tool
    Sequence profiling tool
    A sequence profiling tool in bioinformatics is a type of software that presents information related to a genetic sequence, gene name, or keyword input. Such tools generally take a query such as a DNA, RNA, or protein sequence or ‘keyword’ and search one or more databases for information related to...

  • Genetically modified organism
    Genetically modified organism
    A genetically modified organism or genetically engineered organism is an organism whose genetic material has been altered using genetic engineering techniques. These techniques, generally known as recombinant DNA technology, use DNA molecules from different sources, which are combined into one...

  • Genetic engineering
    Genetic engineering
    Genetic engineering, also called genetic modification, is the direct human manipulation of an organism's genome using modern DNA technology. It involves the introduction of foreign DNA or synthetic genes into the organism of interest...

  • Epigenetics
    Epigenetics
    In biology, and specifically genetics, epigenetics is the study of heritable changes in gene expression or cellular phenotype caused by mechanisms other than changes in the underlying DNA sequence – hence the name epi- -genetics...

  • List of human genes
  • Oscillating gene
    Oscillating gene
    In molecular biology, an oscillating gene or clock gene is a gene that is expressed in an oscillating pattern, often circadian....

  • Ridges
    Ridge (biology)
    Ridges are domains of the genome with a high gene expression; the opposite of ridges are antiridges. The term was first used by Caron et al. in 2001...

  • AlloMap Molecular Expression Testing
    AlloMap Molecular Expression Testing
    AlloMap molecular expression testing, developed and commercialized by XDx, is a gene expression profiling test to identify heart transplant recipients with a low probability of one type of transplant rejection. The test is performed on a blood sample, providing a non-invasive test to help manage...

The source of this article is wikipedia, the free encyclopedia.  The text of this article is licensed under the GFDL.
 