Gene

A gene is a molecular unit of heredity of a living organism. It is a name given to some stretches of DNA and RNA that code for a type of protein or for an RNA chain that has a function in the organism. Living beings depend on genes, as they specify all proteins and functional RNA chains. Genes hold the information to build and maintain an organism's cells and pass genetic traits to offspring, although some organelles (e.g. mitochondria) are self-replicating and carry a small genome of their own, separate from the DNA in the cell nucleus. All organisms have many genes corresponding to various biological traits, some of which are immediately visible, such as eye color or number of limbs, and some of which are not, such as blood type, increased risk for specific diseases, or the thousands of basic biochemical processes that make up life.

A modern working definition of a gene is "a locatable region of genomic sequence, corresponding to a unit of inheritance, which is associated with regulatory regions, transcribed regions, and or other functional sequence regions". Colloquial usage of the term gene (e.g. "good genes", "hair color gene") may actually refer to an allele: a gene is the basic instruction, a sequence of nucleic acids (DNA or, in the case of certain viruses, RNA), while an allele is one variant of that gene. Referring to having a gene for a trait is no longer the scientifically accepted usage. In most cases, all people would have a gene for the trait in question, but certain people will have a specific allele of that gene, which results in the trait variant. Further, genes code for proteins, which might result in identifiable traits, but it is the gene, not the trait, which is inherited.

RNA genes and genomes

When proteins are manufactured, the gene is first copied into RNA as an intermediate product. In other cases, the RNA molecules are the actual functional products. For example, RNAs known as ribozymes are capable of enzymatic function, and microRNA has a regulatory role. The DNA sequences from which such RNAs are transcribed are known as RNA genes.

Some viruses store their entire genomes in the form of RNA, and contain no DNA at all. Because they use RNA to store genes, their cellular hosts may synthesize their proteins as soon as they are infected, without the delay of waiting for transcription. On the other hand, RNA retroviruses, such as HIV, require the reverse transcription of their genome from RNA into DNA before their proteins can be synthesized.

In 2006, French researchers came across a puzzling example of RNA-mediated inheritance in mice. Mice with a loss-of-function mutation in the gene Kit have white tails. Offspring of these mutants can have white tails despite having only normal Kit genes. The research team traced this effect back to mutated Kit RNA. While RNA is common as genetic storage material in viruses, in mammals in particular RNA inheritance has been observed very rarely.

Functional structure of a gene

The vast majority of living organisms encode their genes in long strands of DNA. DNA (deoxyribonucleic acid) consists of a chain made from four types of nucleotide subunits, each composed of: a five-carbon sugar (2'-deoxyribose), a phosphate group, and one of the four bases adenine, cytosine, guanine, and thymine. The most common form of DNA in a cell is in a double helix structure, in which two individual DNA strands twist around each other in a right-handed spiral. In this structure, the base pairing rules specify that guanine pairs with cytosine and adenine pairs with thymine. The base pairing between guanine and cytosine forms three hydrogen bonds, whereas the base pairing between adenine and thymine forms two hydrogen bonds. The two strands in a double helix must therefore be complementary, that is, their bases must align such that the adenines of one strand are paired with the thymines of the other strand, and so on.
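
The pairing rules above can be written down as a small lookup table. The following Python sketch is only an illustration: the function names and the example sequence are invented here, and strands are treated as plain strings of the letters A, C, G and T.

```python
# Base pairing rules from the text: G pairs with C, A pairs with T.
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}

# Hydrogen bonds per base pair, as stated in the text:
# a G-C pair forms three, an A-T pair forms two.
H_BONDS = {"A": 2, "T": 2, "G": 3, "C": 3}


def complement(strand):
    """Return the base-by-base partner of a strand, in the same left-to-right order."""
    return "".join(PAIR[base] for base in strand)


def are_complementary(strand_a, strand_b):
    """True if every base of strand_a pairs with the base at the same position in strand_b."""
    return len(strand_a) == len(strand_b) and complement(strand_a) == strand_b


def hydrogen_bonds(strand):
    """Total number of hydrogen bonds between a strand and its complement."""
    return sum(H_BONDS[base] for base in strand)


example = "GATTACA"  # made-up sequence, for illustration only
partner = complement(example)
```

Because G-C pairs contribute three bonds and A-T pairs two, the total from `hydrogen_bonds` grows with the share of G and C in the sequence.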

Due to the chemical composition of the pentose residues of the nucleotides, DNA strands have directionality. One end of a DNA polymer contains an exposed hydroxyl group on the deoxyribose; this is known as the 3' end of the molecule. The other end contains an exposed phosphate group; this is the 5' end. The directionality of DNA is vitally important to many cellular processes, since double helices are necessarily directional (a strand running 5'-3' pairs with a complementary strand running 3'-5'), and processes such as DNA replication occur in only one direction. All nucleic acid synthesis in a cell occurs in the 5'-3' direction, because new monomers are added via a dehydration reaction that uses the exposed 3' hydroxyl as a nucleophile.
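
Since the two strands of a helix run in opposite directions, writing the partner strand in the usual 5'-3' order means complementing each base and also reversing the sequence. A minimal Python sketch, with a hypothetical function name and a made-up input, assuming strands are strings over A, C, G and T:

```python
# Base pairing rules: G pairs with C, A pairs with T.
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(strand_5_to_3):
    """Given one strand written 5'->3', return its partner strand, also written 5'->3'.

    The partner physically runs 3'->5' alongside the input, so reading it
    from its own 5' end means walking the input backwards.
    """
    return "".join(PAIR[base] for base in reversed(strand_5_to_3))


partner = reverse_complement("ATGC")
```

Applying the function twice returns the original strand, which is a quick sanity check that the two strands describe the same helix.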

The expression of genes encoded in DNA begins by transcribing the gene into RNA, a second type of nucleic acid that is very similar to DNA, but whose monomers contain the sugar ribose rather than deoxyribose. RNA also contains the base uracil in place of thymine. RNA molecules are less stable than DNA and are typically single-stranded. Genes that encode proteins are composed of a series of three-nucleotide sequences called codons, which serve as the words in the genetic language. The genetic code specifies the correspondence during protein translation between codons and amino acids. The genetic code is nearly the same for all known organisms.
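
The two steps described above, copying DNA into RNA and reading the RNA three bases at a time, can be sketched in a few lines of Python. This is a simplification under stated assumptions: the input is taken to be the coding strand (so the RNA has the same sequence with uracil in place of thymine), the codon table is a small excerpt of the standard genetic code rather than the full 64 entries, and the function names and example sequence are invented for illustration.

```python
# Excerpt of the standard genetic code (RNA codons). Only a handful of the
# 64 codons are listed here; a codon outside this excerpt raises KeyError.
CODON_TABLE = {
    "AUG": "Met", "UUU": "Phe", "GGC": "Gly", "GCU": "Ala", "AAA": "Lys",
    "UAA": "Stop", "UAG": "Stop", "UGA": "Stop",
}


def transcribe(coding_strand):
    """Return the RNA sequence for a DNA coding strand: uracil replaces thymine."""
    return coding_strand.replace("T", "U")


def translate(mrna):
    """Read an mRNA codon by codon from its first base until a stop codon."""
    peptide = []
    for i in range(0, len(mrna) - 2, 3):
        residue = CODON_TABLE[mrna[i:i + 3]]
        if residue == "Stop":
            break
        peptide.append(residue)
    return peptide


mrna = transcribe("ATGTTTGGCTAA")  # made-up gene fragment
protein = translate(mrna)
```

In this example the fragment is read as four codons, the last of which is a stop codon and contributes no amino acid.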

All genes have regulatory regions in addition to regions that explicitly code for a protein or RNA product. A regulatory region shared by almost all genes is known as the promoter, which provides a position that is recognized by the transcription machinery when a gene is about to be transcribed and expressed. A gene can have more than one promoter, resulting in RNAs that differ in how far they extend at the 5' end. Although promoter regions have a consensus sequence that is the most common sequence at this position, some genes have "strong" promoters that bind the transcription machinery well, and others have "weak" promoters that bind poorly. These weak promoters usually permit a lower rate of transcription than the strong promoters, because the transcription machinery binds to them and initiates transcription less frequently. Other possible regulatory regions include enhancers, which can compensate for a weak promoter. Most regulatory regions are "upstream", that is, before or toward the 5' end of the transcription initiation site. Eukaryotic promoter regions are much more complex and difficult to identify than prokaryotic promoters.
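
One rough way to picture the idea of a consensus sequence is to count how many positions of a candidate site agree with it. The Python sketch below is only an illustration and not a model of promoter strength: it assumes the six-base consensus TATAAT, which is commonly quoted for the -10 element of bacterial promoters and is not given in the text above, and the function name and candidate sites are invented. Real binding depends on much more than a position-by-position match.

```python
# Assumption for illustration: the commonly quoted consensus of the
# bacterial -10 promoter element. The article itself names no sequence.
CONSENSUS = "TATAAT"


def consensus_matches(site, consensus=CONSENSUS):
    """Count the positions at which a candidate site agrees with the consensus."""
    if len(site) != len(consensus):
        raise ValueError("site and consensus must have the same length")
    return sum(1 for a, b in zip(site, consensus) if a == b)


# Hypothetical candidate sites, ordered from closest to furthest from the consensus.
scores = [consensus_matches(site) for site in ("TATAAT", "TATGAT", "GATACT")]
```

A site that matches the consensus at more positions would, all else being equal, be expected to resemble a "strong" promoter more closely than one that matches at fewer.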

Many prokaryotic genes are organized into operons, or groups of genes whose products have related functions and which are transcribed as a unit. By contrast, eukaryotic genes are transcribed only one at a time, but may include long stretches of DNA called introns which are transcribed but never translated into protein (they are spliced out before translation). Splicing can also occur in prokaryotic genes, but is less common than in eukaryotes.
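
Splicing can be pictured as cutting marked intervals out of a transcript and joining what remains. In the Python sketch below the function name, the sequence and the intron coordinates are all hypothetical; intervals are 0-based and half-open, and the intron is written in lowercase only to make it visible.

```python
def splice(pre_mrna, introns):
    """Remove intron intervals from a transcript and join the remaining pieces.

    `introns` is a list of (start, end) pairs, 0-based and half-open,
    assumed not to overlap.
    """
    mature = []
    position = 0
    for start, end in sorted(introns):
        mature.append(pre_mrna[position:start])  # keep the stretch before this intron
        position = end                           # skip over the intron itself
    mature.append(pre_mrna[position:])           # keep whatever follows the last intron
    return "".join(mature)


# Made-up transcript: two kept stretches (uppercase) around one intron (lowercase).
pre_mrna = "AUGGCC" + "guaaguuuucag" + "AAAUAA"
mature_mrna = splice(pre_mrna, [(6, 18)])
```

Only the spliced sequence would then be read by the translation machinery.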

Chromosomes

The total complement of genes in an organism or cell is known as its genome, which may be stored on one or more chromosomes; the region of the chromosome at which a particular gene is located is called its locus. A chromosome consists of a single, very long DNA helix on which thousands of genes are encoded. Prokaryotes (bacteria and archaea) typically store their genomes on a single large, circular chromosome, sometimes supplemented by additional small circles of DNA called plasmids, which usually encode only a few genes and are easily transferable between individuals. For example, the genes for antibiotic resistance are usually encoded on bacterial plasmids and can be passed between individual cells, even those of different species, via horizontal gene transfer.

Although some simple eukaryotes also possess plasmids with small numbers of genes, the majority of eukaryotic genes are stored on multiple linear chromosomes, which are packed within the nucleus in complex with storage proteins called histones. The manner in which DNA is stored on the histone, as well as chemical modifications of the histone itself, are regulatory mechanisms governing whether a particular region of DNA is accessible for gene expression. The ends of eukaryotic chromosomes are capped by long stretches of repetitive sequences called telomeres, which do not code for any gene product but are present to prevent degradation of coding and regulatory regions during DNA replication. The length of the telomeres tends to decrease each time the genome is replicated in preparation for cell division; the loss of telomeres has been proposed as an explanation for cellular senescence, or the loss of the ability to divide, and by extension for the aging process in organisms.

Whereas the chromosomes of prokaryotes are relatively gene-dense, those of eukaryotes often contain so-called "junk DNA", or regions of DNA that serve no obvious function. Simple single-celled eukaryotes have relatively small amounts of such DNA, whereas the genomes of complex multicellular organisms, including humans, contain an absolute majority of DNA without an identified function. However, it now appears that, although protein-coding DNA makes up barely 2% of the human genome, about 80% of the bases in the genome may be expressed, so the term "junk DNA" may be a misnomer.

Gene expression

In all organisms, there are two major steps separating a protein-coding gene from its protein: First, the DNA on which the gene resides must be transcribed
Transcription (genetics)
Transcription is the process of creating a complementary RNA copy of a sequence of DNA. Both RNA and DNA are nucleic acids, which use base pairs of nucleotides as a complementary language that can be converted back and forth from DNA to RNA by the action of the correct enzymes...

 from DNA to messenger RNA
Messenger RNA
Messenger RNA is a molecule of RNA encoding a chemical "blueprint" for a protein product. mRNA is transcribed from a DNA template, and carries coding information to the sites of protein synthesis: the ribosomes. Here, the nucleic acid polymer is translated into a polymer of amino acids: a protein...

 (mRNA); and, second, it must be translated
Translation (genetics)
In molecular biology and genetics, translation is the third stage of protein biosynthesis. In translation, messenger RNA produced by transcription is decoded by the ribosome to produce a specific amino acid chain, or polypeptide, that will later fold into an active protein...

 from mRNA to protein. RNA-coding genes must still go through the first step, but are not translated into protein. The process of producing a biologically functional molecule of either RNA or protein is called gene expression
Gene expression
Gene expression is the process by which information from a gene is used in the synthesis of a functional gene product. These products are often proteins, but in non-protein coding genes such as ribosomal RNA, transfer RNA or small nuclear RNA genes, the product is a functional RNA...

, and the resulting molecule itself is called a gene product
Gene product
A gene product is the biochemical material, either RNA or protein, resulting from expression of a gene. A measurement of the amount of gene product is sometimes used to infer how active a gene is. Abnormal amounts of gene product can be correlated with disease-causing alleles, such as the...

.

Genetic code

The genetic code is the set of rules by which a gene is translated into a functional protein. Each gene consists of a specific sequence of nucleotides encoded in a DNA strand (or, in some viruses, an RNA strand); a correspondence between nucleotides, the basic building blocks of genetic material, and amino acids, the basic building blocks of proteins, must be established for genes to be successfully translated into functional proteins. Sets of three nucleotides, known as codons, each correspond to a specific amino acid or to a signal; three codons are known as "stop codons" and, instead of specifying a new amino acid, alert the translation machinery that the end of the coding sequence has been reached. There are 64 possible codons (four possible nucleotides at each of three positions, hence 4^3 = 64) and only 20 standard amino acids; hence the code is redundant and multiple codons can specify the same amino acid. The correspondence between codons and amino acids is nearly universal among all known living organisms.
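
The codon arithmetic above can be checked with a short sketch. It is only an illustration: the variable names are made up for this example, and the three stop codons listed are those of the standard genetic code.

```python
from itertools import product

NUCLEOTIDES = "UCAG"                  # the four RNA nucleotides
STOP_CODONS = {"UAA", "UAG", "UGA"}   # stop codons of the standard code

# Every ordered triplet of nucleotides is a possible codon.
codons = ["".join(triplet) for triplet in product(NUCLEOTIDES, repeat=3)]

# Codons that remain available to specify amino acids.
sense_codons = [c for c in codons if c not in STOP_CODONS]

print(len(codons))        # four choices at each of three positions
print(len(sense_codons))  # more codons than the 20 standard amino acids
```

Because the number of amino-acid-specifying codons exceeds 20, at least some amino acids must be specified by more than one codon, which is the redundancy described in the text.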

Transcription

The process of genetic transcription
Transcription (genetics)
Transcription is the process of creating a complementary RNA copy of a sequence of DNA. Both RNA and DNA are nucleic acids, which use base pairs of nucleotides as a complementary language that can be converted back and forth from DNA to RNA by the action of the correct enzymes...

 produces a single-stranded RNA
RNA
Ribonucleic acid, or RNA, is one of the three major macromolecules that are essential for all known forms of life....

 molecule known as messenger RNA
Messenger RNA
Messenger RNA is a molecule of RNA encoding a chemical "blueprint" for a protein product. mRNA is transcribed from a DNA template, and carries coding information to the sites of protein synthesis: the ribosomes. Here, the nucleic acid polymer is translated into a polymer of amino acids: a protein...

, whose nucleotide sequence is complementary to the DNA from which it was transcribed. The DNA strand whose sequence matches that of the RNA is known as the coding strand
Coding strand
When referring to DNA transcription, the coding strand is the DNA strand which has the same base sequence as the RNA transcript produced...

 and the strand from which the RNA was synthesized is the template strand. Transcription is performed by an enzyme
Enzyme
Enzymes are proteins that catalyze chemical reactions. In enzymatic reactions, the molecules at the beginning of the process, called substrates, are converted into different molecules, called products. Almost all chemical reactions in a biological cell need enzymes in order to occur at rates...

 called an RNA polymerase
RNA polymerase
RNA polymerase is an enzyme that produces RNA. In cells, RNAP is needed for constructing RNA chains from DNA genes as templates, a process called transcription. RNA polymerase enzymes are essential to life and are found in all organisms and many viruses...

, which reads the template strand in the 3' to 5' direction and synthesizes the RNA from 5' to 3'. To initiate transcription, the polymerase first recognizes and binds a promoter region of the gene. Thus a major mechanism of gene regulation is the blocking or sequestering of the promoter region, either by tight binding by repressor
Repressor
In molecular genetics, a repressor is a DNA-binding protein that regulates the expression of one or more genes by binding to the operator and blocking the attachment of RNA polymerase to the promoter, thus preventing transcription of the genes. This blocking of expression is called...

 molecules that physically block the polymerase, or by organizing the DNA so that the promoter region is not accessible.
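
The relationship between the template strand, the coding strand, and the transcript can be illustrated with a minimal sketch. The function names and the example sequence are hypothetical, and the model ignores promoters, termination, and everything else about how a real polymerase works; it only applies the base-pairing rule.

```python
# Base-pairing rule used when RNA is synthesized against a DNA template.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}
DNA_TO_DNA = {"A": "T", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3_to_5: str) -> str:
    """Return the mRNA, written 5'->3', for a template strand written 3'->5'."""
    return "".join(DNA_TO_RNA[base] for base in template_3_to_5)


def coding_strand(template_3_to_5: str) -> str:
    """Return the coding strand, written 5'->3', paired with the template."""
    return "".join(DNA_TO_DNA[base] for base in template_3_to_5)


template = "TACGGCATT"          # hypothetical template, written 3'->5'
mrna = transcribe(template)
print(mrna)                     # complementary to the template
print(coding_strand(template))  # same as the mRNA, with T in place of U
```

Writing the template 3'->5' means the transcript comes out already in 5'->3' order, so no reversal step is needed in this simplified model.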

In prokaryote
Prokaryote
The prokaryotes are a group of organisms that lack a cell nucleus, or any other membrane-bound organelles. The organisms that have a cell nucleus are called eukaryotes. Most prokaryotes are unicellular, but a few such as myxobacteria have multicellular stages in their life cycles...

s, transcription occurs in the cytoplasm
Cytoplasm
The cytoplasm is a small gel-like substance residing between the cell membrane holding all the cell's internal sub-structures, except for the nucleus. All the contents of the cells of prokaryote organisms are contained within the cytoplasm...

; for very long transcripts, translation may begin at the 5' end of the RNA while the 3' end is still being transcribed. In eukaryote
Eukaryote
A eukaryote is an organism whose cells contain complex structures enclosed within membranes. Eukaryotes may more formally be referred to as the taxon Eukarya or Eukaryota. The defining membrane-bound structure that sets eukaryotic cells apart from prokaryotic cells is the nucleus, or nuclear...

s, transcription necessarily occurs in the nucleus, where the cell's DNA is sequestered; the RNA molecule produced by the polymerase is known as the primary transcript
Primary transcript
A primary transcript is an RNA molecule that has not yet undergone any modification after its synthesis. For example, a precursor messenger RNA is a primary transcript that becomes a messenger RNA after processing, and a primary microRNA precursor becomes a microRNA after processing....

 and must undergo post-transcriptional modification
Post-transcriptional modification
Post-transcriptional modification is a process in cell biology by which, in eukaryotic cells, primary transcript RNA is converted into mature RNA. A notable example is the conversion of precursor messenger RNA into mature messenger RNA, which includes splicing and occurs prior to protein synthesis...

s before being exported to the cytoplasm for translation. The splicing
Splicing (genetics)
In molecular biology and genetics, splicing is a modification of an RNA after transcription, in which introns are removed and exons are joined. This is needed for the typical eukaryotic messenger RNA before it can be used to produce a correct protein through translation...

 of intron
Intron
An intron is any nucleotide sequence within a gene that is removed by RNA splicing to generate the final mature RNA product of a gene. The term intron refers to both the DNA sequence within a gene, and the corresponding sequence in RNA transcripts. Sequences that are joined together in the final...

s present within the transcribed region is a modification unique to eukaryotes; alternative splicing
Alternative splicing
Alternative splicing is a process by which the exons of the RNA produced by transcription of a gene are reconnected in multiple ways during RNA splicing...

 mechanisms can result in mature transcripts from the same gene having different sequences and thus coding for different proteins. This is a major form of regulation in eukaryotic cells.
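
As a rough illustration of splicing and alternative splicing, the sketch below models a primary transcript as an ordered list of named exons and introns. The segment names, the sequences, and the `splice` helper are all invented for this example; real splicing depends on splice-site signals and the spliceosome, which are not modeled here.

```python
# A hypothetical primary transcript: (name, kind, sequence) in 5'->3' order.
PRIMARY_TRANSCRIPT = [
    ("E1", "exon", "AUGGCU"),
    ("I1", "intron", "GUAAGUCAG"),
    ("E2", "exon", "GAUCGA"),
    ("I2", "intron", "GUGAGUUAG"),
    ("E3", "exon", "UUUUAA"),
]


def splice(segments, skipped_exons=frozenset()):
    """Remove all introns and join the exons, optionally skipping some exons."""
    return "".join(
        seq
        for name, kind, seq in segments
        if kind == "exon" and name not in skipped_exons
    )


full_length = splice(PRIMARY_TRANSCRIPT)
exon2_skipped = splice(PRIMARY_TRANSCRIPT, skipped_exons={"E2"})
print(full_length)     # all three exons joined
print(exon2_skipped)   # an alternative mature transcript from the same gene
```

The two outputs differ in sequence even though they come from the same primary transcript, which is how one gene can give rise to more than one mature mRNA.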

Translation

Translation
Translation (genetics)
In molecular biology and genetics, translation is the third stage of protein biosynthesis. In translation, messenger RNA produced by transcription is decoded by the ribosome to produce a specific amino acid chain, or polypeptide, that will later fold into an active protein...

 is the process by which a mature mRNA
Mature messenger RNA
Mature messenger RNA, often abbreviated as mature mRNA, is a eukaryotic RNA transcript that has been spliced and processed and is ready for translation in the course of protein synthesis...

 molecule is used as a template for synthesizing a new protein
Protein
Proteins are biochemical compounds consisting of one or more polypeptides typically folded into a globular or fibrous form, facilitating a biological function. A polypeptide is a single linear polymer chain of amino acids bonded together by peptide bonds between the carboxyl and amino groups of...

. Translation is carried out by ribosome
Ribosome
A ribosome is a component of cells that assembles the twenty specific amino acid molecules to form the particular protein molecule determined by the nucleotide sequence of an RNA molecule....

s, large complexes of RNA and protein responsible for carrying out the chemical reactions to add new amino acid
Amino acid
Amino acids are molecules containing an amine group, a carboxylic acid group and a side-chain that varies between different amino acids. The key elements of an amino acid are carbon, hydrogen, oxygen, and nitrogen...

s to a growing polypeptide chain by the formation of peptide bond
Peptide bond
This article is about the peptide link found within biological molecules, such as proteins. A similar article for synthetic molecules is being created...

s. The genetic code is read three nucleotides at a time, in units called codons, via interactions with specialized RNA molecules called transfer RNA
Transfer RNA
Transfer RNA is an adaptor molecule composed of RNA, typically 73 to 93 nucleotides in length, that is used in biology to bridge the three-letter genetic code in messenger RNA with the twenty-letter code of amino acids in proteins. The role of tRNA as an adaptor is best understood by...

 (tRNA). Each tRNA has three unpaired bases known as the anticodon that are complementary to the codon it reads; the tRNA is also covalently attached to the amino acid
Amino acid
Amino acids are molecules containing an amine group, a carboxylic acid group and a side-chain that varies between different amino acids. The key elements of an amino acid are carbon, hydrogen, oxygen, and nitrogen...

 specified by the complementary codon. When the tRNA binds to its complementary codon in an mRNA strand, the ribosome ligates its amino acid cargo to the new polypeptide chain, which is synthesized from amino terminus to carboxyl terminus. During and after its synthesis, the new protein must fold
Protein folding
Protein folding is the process by which a protein structure assumes its functional shape or conformation. It is the physical process by which a polypeptide folds into its characteristic and functional three-dimensional structure from random coil....

 to its active three-dimensional structure
Tertiary structure
In biochemistry and molecular biology, the tertiary structure of a protein or any other macromolecule is its three-dimensional structure, as defined by the atomic coordinates...

 before it can carry out its cellular function.
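
Reading an mRNA codon by codon can be sketched as follows. This is a simplified illustration, not a model of the ribosome: it assumes the reading frame starts at the first base, uses the standard genetic code written as one-letter amino acid symbols, and the `translate` function and example sequence are hypothetical.

```python
from itertools import product

# Standard genetic code, listed for codons in UCAG order at each position.
# '*' marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna: str) -> str:
    """Translate an mRNA (5'->3') from its first base until a stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":        # stop codon: release the chain
            break
        protein.append(amino_acid)   # chain grows from amino to carboxyl end
    return "".join(protein)


print(translate("AUGCCGUAA"))  # methionine, proline, then a stop codon
```

Each appended symbol corresponds to one amino acid added to the growing chain, in the same amino-to-carboxyl order described in the text.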

DNA replication and inheritance

The growth, development, and reproduction of organisms rely on cell division
Cell division
Cell division is the process by which a parent cell divides into two or more daughter cells. Cell division is usually a small segment of a larger cell cycle. This type of cell division in eukaryotes is known as mitosis, and leaves the daughter cell capable of dividing again. The corresponding sort...

, or the process by which a single cell
Cell (biology)
The cell is the basic structural and functional unit of all known living organisms. It is the smallest unit of life that is classified as a living thing, and is often called the building block of life. The Alberts text discusses how the "cellular building blocks" move to shape developing embryos....

 divides into two usually identical daughter cells. This requires first making a duplicate copy of every gene in the genome
Genome
In modern molecular biology and genetics, the genome is the entirety of an organism's hereditary information. It is encoded either in DNA or, for many types of virus, in RNA. The genome includes both the genes and the non-coding sequences of the DNA/RNA....

 in a process called DNA replication
DNA replication
DNA replication is a biological process that occurs in all living organisms and copies their DNA; it is the basis for biological inheritance. The process starts with one double-stranded DNA molecule and produces two identical copies of the molecule...

. The copies are made by specialized enzyme
Enzyme
Enzymes are proteins that catalyze chemical reactions. In enzymatic reactions, the molecules at the beginning of the process, called substrates, are converted into different molecules, called products. Almost all chemical reactions in a biological cell need enzymes in order to occur at rates...

s known as DNA polymerase
DNA polymerase
A DNA polymerase is an enzyme that helps catalyze the polymerization of deoxyribonucleotides into a DNA strand. DNA polymerases are best known for their feedback role in DNA replication, in which the polymerase "reads" an intact DNA strand as a template and uses it to synthesize the new strand....

s, which "read" one strand of the double-helical DNA, known as the template strand, and synthesize a new complementary strand. Because the DNA double helix is held together by base pair
Base pair
In molecular biology and genetics, the linking between two nitrogenous bases on opposite complementary DNA or certain types of RNA strands that are connected via hydrogen bonds is called a base pair...

ing, the sequence of one strand completely specifies the sequence of its complement; hence only one strand needs to be read by the enzyme to produce a faithful copy. The process of DNA replication is semiconservative
Semiconservative replication
Semiconservative replication describes the mechanism by which DNA is replicated in all known cells. This mechanism of replication was one of three models originally proposed for DNA replication:...

; that is, the copy of the genome inherited by each daughter cell contains one original and one newly synthesized strand of DNA.
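
Semiconservative replication can be illustrated with a small sketch. The representation is an assumption made for this example: a double helix is a pair of position-aligned strings (the first written 5'->3', the second 3'->5'), and the `replicate` function is hypothetical. Enzymatic details such as primers and leading or lagging strands are ignored.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(strand: str) -> str:
    """Return the base-paired partner of a strand, position by position."""
    return "".join(COMPLEMENT[base] for base in strand)


def replicate(duplex):
    """Return two daughter duplexes, each keeping one parental strand."""
    top, bottom = duplex
    daughter_1 = (top, complement(top))        # old top + newly made bottom
    daughter_2 = (complement(bottom), bottom)  # newly made top + old bottom
    return daughter_1, daughter_2


parent = ("ATGCCGTAA", "TACGGCATT")
first, second = replicate(parent)
print(first)
print(second)
```

Because each strand fully specifies its partner, both daughters carry the same sequence as the parent, yet each one contains exactly one strand inherited from the parental molecule.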

After DNA replication is complete, the cell must physically separate the two copies of the genome and divide into two distinct membrane-bound cells. In prokaryote
Prokaryote
The prokaryotes are a group of organisms that lack a cell nucleus, or any other membrane-bound organelles. The organisms that have a cell nucleus are called eukaryotes. Most prokaryotes are unicellular, but a few such as myxobacteria have multicellular stages in their life cycles...

s - bacteria
Bacteria
Bacteria are a large domain of prokaryotic microorganisms. Typically a few micrometres in length, bacteria have a wide range of shapes, ranging from spheres to rods and spirals...

 and archaea
Archaea
The Archaea are a group of single-celled microorganisms. A single individual or species from this domain is called an archaeon...

 - this usually occurs via a relatively simple process called binary fission, in which each circular genome attaches to the cell membrane
Cell membrane
The cell membrane or plasma membrane is a biological membrane that separates the interior of all cells from the outside environment. The cell membrane is selectively permeable to ions and organic molecules and controls the movement of substances in and out of cells. It basically protects the cell...

 and is separated into the daughter cells as the membrane invaginates
Invagination
Invagination means to fold inward or to sheath. In biology, this can refer to a number of processes. Invagination is the morphogenetic process by which an embryo takes form, and is the initial step of gastrulation, the massive reorganization of the embryo from a simple spherical ball of cells,...

 to split the cytoplasm
Cytoplasm
The cytoplasm is a small gel-like substance residing between the cell membrane holding all the cell's internal sub-structures, except for the nucleus. All the contents of the cells of prokaryote organisms are contained within the cytoplasm...

 into two membrane-bound portions. Binary fission is extremely fast compared to the rates of cell division in eukaryote
Eukaryote
A eukaryote is an organism whose cells contain complex structures enclosed within membranes. Eukaryotes may more formally be referred to as the taxon Eukarya or Eukaryota. The defining membrane-bound structure that sets eukaryotic cells apart from prokaryotic cells is the nucleus, or nuclear...

s. Eukaryotic cell division is a more complex process known as the cell cycle
Cell cycle
The cell cycle, or cell-division cycle, is the series of events that takes place in a cell leading to its division and duplication. In cells without a nucleus, the cell cycle occurs via a process termed binary fission...

; DNA replication occurs during a phase of this cycle known as S phase
S phase
S-phase is the part of the cell cycle in which DNA is replicated, occurring between G1 phase and G2 phase. Precise and accurate DNA replication is necessary to prevent genetic abnormalities which often lead to cell death or disease. Due to the importance, the regulatory pathways that govern this...

, whereas the process of segregating chromosome
Chromosome
A chromosome is an organized structure of DNA and protein found in cells. It is a single piece of coiled DNA containing many genes, regulatory elements and other nucleotide sequences. Chromosomes also contain DNA-bound proteins, which serve to package the DNA and control its functions.Chromosomes...

s and splitting the cytoplasm
Cytoplasm
The cytoplasm is a small gel-like substance residing between the cell membrane holding all the cell's internal sub-structures, except for the nucleus. All the contents of the cells of prokaryote organisms are contained within the cytoplasm...

 occurs during M phase. In many single-celled eukaryotes such as yeast
Yeast
Yeasts are eukaryotic micro-organisms classified in the kingdom Fungi, with 1,500 species currently described estimated to be only 1% of all fungal species. Most reproduce asexually by mitosis, and many do so by an asymmetric division process called budding...

, reproduction by budding
Budding
Budding is a form of asexual reproduction in which a new organism grows on another one. The new organism remains attached as it grows, separating from the parent organism only when it is mature. Since the reproduction is asexual, the newly created organism is a clone and is genetically identical...

 is common, which results in asymmetrical portions of cytoplasm in the two daughter cells.

Molecular inheritance

The duplication and transmission of genetic material from one generation of cells to the next is the basis for molecular inheritance, and the link between the classical and molecular pictures of genes. Organisms inherit the characteristics of their parents because the cells of the offspring contain copies of the genes in their parents' cells. In asexually reproducing
Asexual reproduction
Asexual reproduction is a mode of reproduction by which offspring arise from a single parent, and inherit the genes of that parent only, it is reproduction which does not involve meiosis, ploidy reduction, or fertilization. A more stringent definition is agamogenesis which is reproduction without...

 organisms, the offspring will be a genetic copy or clone of the parent organism. In sexually reproducing
Sexual reproduction
Sexual reproduction is the creation of a new organism by combining the genetic material of two organisms. There are two main processes during sexual reproduction; they are: meiosis, involving the halving of the number of chromosomes; and fertilization, involving the fusion of two gametes and the...

 organisms, a specialized form of cell division called meiosis
Meiosis
Meiosis is a special type of cell division necessary for sexual reproduction. The cells produced by meiosis are gametes or spores. The animals' gametes are called sperm and egg cells....

 produces cells called gamete
Gamete
A gamete is a cell that fuses with another cell during fertilization in organisms that reproduce sexually...

s or germ cell
Germ cell
A germ cell is any biological cell that gives rise to the gametes of an organism that reproduces sexually. In many animals, the germ cells originate near the gut of an embryo and migrate to the developing gonads. There, they undergo cell division of two types, mitosis and meiosis, followed by...

s that are haploid, or contain only one copy of each gene. The gametes produced by females are called eggs
Egg (biology)
An egg is an organic vessel in which an embryo first begins to develop. In most birds, reptiles, insects, molluscs, fish, and monotremes, an egg is the zygote, resulting from fertilization of the ovum, which is expelled from the body and permitted to develop outside the body until the developing...

 or ova, and those produced by males are called sperm
Sperm
The term sperm is derived from the Greek word sperma and refers to the male reproductive cells. In the types of sexual reproduction known as anisogamy and oogamy, there is a marked difference in the size of the gametes with the smaller one being termed the "male" or sperm cell...

. Two gametes fuse to form a fertilized egg, a single cell that once again has a diploid number of genes—each with one copy from the mother and one copy from the father.

During the process of meiotic cell division, an event called genetic recombination
Genetic recombination
Genetic recombination is a process by which a molecule of nucleic acid is broken and then joined to a different one. Recombination can occur between similar molecules of DNA, as in homologous recombination, or dissimilar molecules, as in non-homologous end joining. Recombination is a common method...

 or crossing-over can sometimes occur, in which a length of DNA on one chromatid
Chromatid
A chromatid is one of the two identical copies of DNA making up a duplicated chromosome, which are joined at their centromeres, for the process of cell division. They are called sister chromatids so long as they are joined by the centromeres...

 is swapped with the corresponding length of DNA on a non-sister chromatid of the homologous chromosome. This has no effect if the allele
Allele
An allele is one of two or more forms of a gene or a genetic locus. "Allel" is an abbreviation of allelomorph. Sometimes, different alleles can result in different observable phenotypic traits, such as different pigmentation...

s on the chromatids are the same, but results in reassortment of otherwise linked alleles if they are different. The Mendelian principle of independent assortment asserts that the alleles of different genes sort independently of one another into gametes; which allele an organism inherits for one trait is unrelated to which allele it inherits for another trait. This is in fact only true for genes that reside on different chromosomes or lie very far apart on the same chromosome. The closer two genes lie on the same chromosome, the more closely they will be associated in gametes and the more often they will appear together; genes that are very close are essentially never separated because it is extremely unlikely that a crossover point will occur between them. This is known as genetic linkage
Genetic linkage
Genetic linkage is the tendency of certain loci or alleles to be inherited together. Genetic loci that are physically close to one another on the same chromosome tend to stay together during meiosis, and are thus genetically linked...

.
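
The contrast between independent assortment and linkage can be made concrete with the textbook two-locus model. The sketch below is an illustration under stated assumptions: a parent carries alleles A and B on one chromosome and a and b on its homolog, and `r` is the recombination fraction between the two genes (the function name and the example values of `r` are made up for this example).

```python
def gamete_frequencies(r: float) -> dict:
    """Expected gamete frequencies from an AB/ab parent.

    r is the recombination fraction between the two genes:
    0.0 means complete linkage, 0.5 means independent assortment.
    """
    if not 0.0 <= r <= 0.5:
        raise ValueError("recombination fraction must lie between 0 and 0.5")
    return {
        "AB": (1 - r) / 2,  # parental combination
        "ab": (1 - r) / 2,  # parental combination
        "Ab": r / 2,        # recombinant, produced by a crossover
        "aB": r / 2,        # recombinant, produced by a crossover
    }


print(gamete_frequencies(0.5))   # unlinked genes: all four types equally likely
print(gamete_frequencies(0.02))  # closely linked genes: recombinants are rare
```

With `r = 0.5` the four gamete types are equally frequent, as independent assortment predicts; as `r` approaches zero the recombinant types all but disappear, which is the pattern described above for genes that lie very close together.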

History

The notion of a gene is evolving with the science of genetics
Genetics
Genetics, a discipline of biology, is the science of genes, heredity, and variation in living organisms....

, which began when Gregor Mendel
Gregor Mendel
Gregor Johann Mendel was an Austrian scientist and Augustinian friar who gained posthumous fame as the founder of the new science of genetics. Mendel demonstrated that the inheritance of certain traits in pea plants follows particular patterns, now referred to as the laws of Mendelian inheritance...

 noticed that biological variations are inherited from parent organisms as specific, discrete traits. The biological entity responsible for defining traits was later termed a gene, but the biological basis for inheritance remained unknown until DNA was identified as the genetic material in the 1940s. Prior to Mendel's work, the dominant theory of heredity was one of blending inheritance
Blending inheritance
Many biologists and other academics held to the idea of blending inheritance during the 19th century, prior to the discovery of genetics. Blending inheritance was merely a widespread hypothetical model, rather than a formalized scientific theory, in...

, which proposes that the traits of the parents blend or mix in a smooth, continuous gradient in the offspring. Although Mendel's work was largely unrecognized after its first publication in 1866, it was rediscovered in 1900 by three European scientists, Hugo de Vries
Hugo de Vries
Hugo Marie de Vries ForMemRS was a Dutch botanist and one of the first geneticists. He is known chiefly for suggesting the concept of genes, rediscovering the laws of heredity in the 1890s while unaware of Gregor Mendel's work, for introducing the term "mutation", and for developing a mutation...

, Carl Correns
Carl Correns
Carl Erich Correns was a German botanist and geneticist, who is notable primarily for his independent discovery of the principles of heredity, and for his rediscovery of Gregor Mendel's earlier paper on that subject, which he achieved simultaneously but independently of the botanists Erich...

, and Erich von Tschermak
Erich von Tschermak
Erich von Tschermak-Seysenegg was an Austrian agronomist who developed several new disease-resistant crops, including wheat-rye and oat hybrids. He was a son of the Moravia-born mineralogist Gustav Tschermak von Seysenegg...

, who had reached similar conclusions from their own research. However, these scientists were not yet aware of the identity of the 'discrete units' on which genetic material resides.

The existence of genes was first suggested by Gregor Mendel
Gregor Mendel
Gregor Johann Mendel was an Austrian scientist and Augustinian friar who gained posthumous fame as the founder of the new science of genetics. Mendel demonstrated that the inheritance of certain traits in pea plants follows particular patterns, now referred to as the laws of Mendelian inheritance...

 (1822–1884), who, in the 1860s, studied inheritance in pea
Pea
A pea is most commonly the small spherical seed or the seed-pod of the pod fruit Pisum sativum. Each pod contains several peas. Peapods are botanically a fruit, since they contain seeds developed from the ovary of a flower. However, peas are considered to be a vegetable in cooking...

 plants (Pisum sativum) and hypothesized
Hypothesis
A hypothesis is a proposed explanation for a phenomenon. The term derives from the Greek, ὑποτιθέναι – hypotithenai meaning "to put under" or "to suppose". For a hypothesis to be put forward as a scientific hypothesis, the scientific method requires that one can test it...

 a factor that conveys traits from parent to offspring. His pea experiments occupied him for the better part of a decade. Although he did not use the term gene, he explained his results in terms of inherited characteristics. Mendel was also the first to hypothesize independent assortment, the distinction between dominant and recessive
Recessive
In genetics, the term "recessive gene" refers to an allele that causes a phenotype that is only seen in a homozygous genotype and never in a heterozygous genotype. Every person has two copies of every gene on autosomal chromosomes, one from mother and one from father...

 traits, the distinction between a heterozygote and homozygote, and the difference between what would later be described as genotype
Genotype
The genotype is the genetic makeup of a cell, an organism, or an individual usually with reference to a specific character under consideration...

 (the genetic material of an organism) and phenotype
Phenotype
A phenotype is an organism's observable characteristics or traits: such as its morphology, development, biochemical or physiological properties, behavior, and products of behavior...

 (the visible traits of that organism).

Charles Darwin
Charles Darwin
Charles Robert Darwin FRS was an English naturalist. He established that all species of life have descended over time from common ancestry, and proposed the scientific theory that this branching pattern of evolution resulted from a process that he called natural selection.He published his theory...

 used the term gemmule
Gemmules
Gemmules were imagined particles of inheritance proposed by Charles Darwin as part of his Pangenesis theory. This appeared in his book The Variation of Animals and Plants under Domestication, published in 1868, nine years after the publication of his famous book On the Origin of Species. Gemmules,...

 to describe a microscopic unit of inheritance, and what would later become known as chromosome
Chromosome
A chromosome is an organized structure of DNA and protein found in cells. It is a single piece of coiled DNA containing many genes, regulatory elements and other nucleotide sequences. Chromosomes also contain DNA-bound proteins, which serve to package the DNA and control its functions.Chromosomes...

s had been observed separating out during cell division by Wilhelm Hofmeister
Wilhelm Hofmeister
Wilhelm Friedrich Benedikt Hofmeister was a German biologist and botanist. He "stands as one of the true giants in the history of biology and belongs in the same pantheon as Darwin and Mendel." He was largely self-taught....

 as early as 1848. The idea that chromosomes are the carriers of inheritance was expressed in 1883 by Wilhelm Roux
Wilhelm Roux
Wilhelm Roux was a German zoologist and pioneer of experimental embryology.Roux was born and educated in Jena, Germany where he attended university and studied under Ernst Haeckel. He also attended university in Berlin and Strasbourg and studied under Gustav Schwalbe, Friedrich Daniel von...

. Darwin also coined the word pangenesis
Pangenesis
Pangenesis was Charles Darwin's hypothetical mechanism for heredity. He presented this 'provisional hypothesis' in his 1868 work The Variation of Animals and Plants under Domestication and felt that it brought 'together a multitude of facts which are at present left disconnected by any efficient...

by (1868). The word pangenesis is made from the Greek
Greek language
Greek is an independent branch of the Indo-European family of languages. Native to the southern Balkans, it has the longest documented history of any Indo-European language, spanning 34 centuries of written records. Its writing system has been the Greek alphabet for the majority of its history;...

 words pan (a prefix meaning "whole", "encompassing") and genesis ("birth") or genos ("origin").

Mendel's concept was given a name by Hugo de Vries in 1889, in his book Intracellular Pangenesis; although probably unaware of Mendel's work at the time, he coined the term "pangen" for "the smallest particle [representing] one hereditary characteristic". Danish botanist Wilhelm Johannsen coined the word "gene" ("gen" in Danish and German) in 1909 to describe the fundamental physical and functional units of heredity, while the related word genetics was first used by William Bateson in 1905. Johannsen derived the word from de Vries' "pangen". In the early 1900s, Mendel's work received renewed attention from scientists. In 1910, Thomas Hunt Morgan showed that genes reside on specific chromosomes. He later showed that genes occupy specific locations on the chromosome. With this knowledge, Morgan and his students began the first chromosomal map of the fruit fly Drosophila. In 1928, Frederick Griffith showed that genes could be transferred. In what is now known as Griffith's experiment, a mouse was injected with a heat-killed deadly strain of bacteria together with a live harmless strain of the same bacteria; the dead bacteria transferred genetic information to the harmless strain, and the mouse died.

A series of subsequent discoveries led to the realization decades later that chromosomes within cells are the carriers of genetic material, and that they are made of DNA (deoxyribonucleic acid), a polymeric molecule found in all cells on which the 'discrete units' of Mendelian inheritance are encoded. In 1941, George Wells Beadle and Edward Lawrie Tatum showed that mutations in genes caused errors in specific steps in metabolic pathways. This showed that specific genes code for specific proteins, leading to the "one gene, one enzyme" hypothesis. Oswald Avery, Colin Munro MacLeod, and Maclyn McCarty showed in 1944 that DNA holds the gene's information. In 1953, James D. Watson and Francis Crick demonstrated the molecular structure of DNA. Together, these discoveries established the central dogma of molecular biology, which states that proteins are translated from RNA, which is transcribed from DNA. This dogma has since been shown to have exceptions, such as reverse transcription in retroviruses.

In 1972, Walter Fiers and his team at the Laboratory of Molecular Biology of the University of Ghent (Ghent, Belgium) were the first to determine the sequence of a gene: the gene for the bacteriophage MS2 coat protein. Richard J. Roberts and Phillip Sharp discovered in 1977 that genes can be split into segments. This led to the idea that one gene can make several proteins. More recently (as of 2003–2006), biological results have made the notion of a gene appear more slippery. In particular, genes do not seem to sit side by side on DNA like discrete beads. Instead, regions of the DNA producing distinct proteins may overlap, so that the idea emerges that "genes are one long continuum". It was first hypothesized in 1986 by Walter Gilbert that neither DNA nor protein would have been required in a system as primitive as that of the very early Earth if RNA could serve both as a catalyst and as a store of genetic information.

The modern study of genetics at the level of DNA is known as molecular genetics, and the synthesis of molecular genetics with traditional Darwinian evolution is known as the modern evolutionary synthesis.

Mendelian inheritance and classical genetics

According to the theory of Mendelian inheritance, variations in phenotype (the observable physical and behavioral characteristics of an organism) are due to variations in genotype, or the organism's particular set of genes, each of which specifies a particular trait. Different forms of a gene, which may give rise to different phenotypes, are known as alleles. Organisms such as the pea plants Mendel worked on, along with many plants and animals, have two alleles for each trait, one inherited from each parent. Alleles may be dominant or recessive; dominant alleles give rise to their corresponding phenotypes when paired with any other allele for the same trait, whereas recessive alleles give rise to their corresponding phenotype only when paired with another copy of the same allele. For example, if the allele specifying tall stems in pea plants is dominant over the allele specifying short stems, then pea plants that inherit one tall allele from one parent and one short allele from the other parent will have tall stems. Mendel's work demonstrated that alleles assort independently in the production of gametes, or germ cells, ensuring variation in the next generation.
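The dominant/recessive rule described above can be illustrated with a small Punnett-square sketch. The allele letters (`T` for tall, `t` for short) and the function names `cross` and `phenotype` are illustrative assumptions, not notation taken from Mendel's work or from this article.

```python
from collections import Counter
from itertools import product


def cross(parent1, parent2):
    """Count offspring genotypes for a one-gene cross (a Punnett square).

    Each parent is a two-letter genotype such as "Tt"; every combination
    of one allele from each parent is treated as equally likely.
    """
    offspring = Counter()
    for a, b in product(parent1, parent2):
        # Sort so that "tT" and "Tt" are counted as the same genotype
        # (uppercase letters sort before lowercase ones).
        offspring["".join(sorted(a + b))] += 1
    return offspring


def phenotype(genotype, dominant="T"):
    """A single copy of the dominant allele is enough to show its trait."""
    return "tall" if dominant in genotype else "short"


# Cross two heterozygous tall plants (hypothetical example).
genotypes = cross("Tt", "Tt")
phenotypes = Counter()
for genotype, count in genotypes.items():
    phenotypes[phenotype(genotype)] += count

print(dict(genotypes))
print(dict(phenotypes))
```

Under these assumptions the cross of two `Tt` plants gives genotypes in a 1:2:1 ratio and phenotypes in a 3:1 ratio of tall to short, because the recessive trait appears only in `tt` offspring.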

Mutation

DNA replication is for the most part extremely accurate, with an error rate per site of around 10⁻⁶ to 10⁻¹⁰ in eukaryotes. Rare, spontaneous alterations in the base sequence of a particular gene arise from a number of sources, such as errors in DNA replication and the aftermath of DNA damage. These errors are called mutations. The cell contains many DNA repair mechanisms for preventing mutations and maintaining the integrity of the genome; however, in some cases, such as breaks in both DNA strands of a chromosome, repairing the physical damage to the molecule is a higher priority than producing an exact copy. Due to the degeneracy of the genetic code, some mutations in protein-coding genes are silent, or produce no change in the amino acid sequence of the protein for which they code; for example, the codons UCU and UCC both code for serine, so the U↔C mutation has no effect on the protein. Mutations that do have phenotypic effects are most often neutral or deleterious to the organism, but sometimes they confer benefits to the organism's fitness.
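As a rough sketch of how degeneracy makes some substitutions silent, the snippet below compares the amino acids encoded before and after a single-codon change. The table is only a small excerpt of the standard genetic code, and the function name `classify_substitution` and the labels "missense" and "nonsense" (standard genetics terms not used in the text above) are introduced here for illustration.

```python
# Excerpt of the standard genetic code (RNA codons); only the entries
# needed for this illustration are included.
CODON_TABLE = {
    "UCU": "Ser", "UCC": "Ser", "UCA": "Ser", "UCG": "Ser",
    "UUU": "Phe", "UUC": "Phe",
    "UAA": "Stop",
}


def classify_substitution(before, after):
    """Classify a codon change by its effect on the encoded amino acid."""
    aa_before = CODON_TABLE[before]
    aa_after = CODON_TABLE[after]
    if aa_before == aa_after:
        return "silent"      # same amino acid: the protein is unchanged
    if aa_after == "Stop":
        return "nonsense"    # premature stop codon truncates the protein
    return "missense"        # a different amino acid is incorporated


print(classify_substitution("UCU", "UCC"))  # third-position U -> C
print(classify_substitution("UCU", "UUU"))  # second-position C -> U
print(classify_substitution("UCA", "UAA"))  # second-position C -> A
```

The first change leaves serine in place and is silent, whereas the other two alter the protein, which is why a mutation's effect depends on which position of the codon is hit.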

Mutations propagated to the next generation lead to variations within a species' population. Variants of a single gene are known as alleles, and differences in alleles may give rise to differences in traits. Although it is rare for the variants in a single gene to have clearly distinguishable phenotypic effects, certain well-defined traits are in fact controlled by single genetic loci. A gene's most common allele is called the wild-type allele, and rare alleles are called mutants. However, this does not imply that the wild-type allele is the ancestor from which the mutants are descended.

Chromosomal organization

The total complement of genes in an organism or cell is known as its genome. In prokaryotes, the vast majority of genes are located on a single chromosome of circular DNA, while eukaryotes usually possess multiple individual linear DNA helices packed into dense DNA-protein complexes called chromosomes. Genes that appear together on one chromosome of one species may appear on separate chromosomes in another species. Many species carry more than one copy of their genome within each of their somatic cells. Cells or organisms with only one copy of each chromosome are called haploid; those with two copies are called diploid; and those with more than two copies are called polyploid. The copies of genes on the chromosomes are not necessarily identical. In sexually reproducing organisms, one copy is normally inherited from each parent.
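The haploid, diploid and polyploid terminology above amounts to a simple classification by copy number, sketched below. The function name `ploidy` and its integer argument are assumptions made for this illustration.

```python
def ploidy(copies_per_chromosome):
    """Name the ploidy level for a given number of copies of each chromosome."""
    if copies_per_chromosome < 1:
        raise ValueError("a cell needs at least one copy of each chromosome")
    if copies_per_chromosome == 1:
        return "haploid"
    if copies_per_chromosome == 2:
        return "diploid"
    return "polyploid"  # more than two copies


for copies in (1, 2, 4):
    print(copies, ploidy(copies))
```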

Number of genes

Early estimates of the number of human genes that used expressed sequence tag data put it at 50 000–100 000. Following the sequencing of the human genome and other genomes, it has been found that rather few genes (~20 000 in human, mouse and fly, ~13 000 in roundworm, >46 000 in rice) encode all the proteins in an organism. These protein-coding sequences make up 1–2% of the human genome. A large part of the genome is nevertheless transcribed, into introns, retrotransposons and seemingly a large array of noncoding RNAs. The total number of proteins (the Earth's proteome) is estimated to be 5 million sequences.

Genetic and genomic nomenclature

Gene nomenclature has been established by the HUGO Gene Nomenclature Committee (HGNC) for each known human gene in the form of an approved gene name and symbol (short-form abbreviation). All approved symbols are stored in the HGNC Database. Each symbol is unique and each gene is given only one approved gene symbol. This also facilitates electronic data retrieval from publications. Preferably, each symbol maintains parallel construction in different members of a gene family and can be used in other species, especially the mouse.

Evolutionary concept of a gene

George C. Williams first explicitly advocated the gene-centric view of evolution in his 1966 book Adaptation and Natural Selection. He proposed an evolutionary concept of the gene to be used when discussing natural selection favoring some genes. The definition is: "that which segregates and recombines with appreciable frequency." According to this definition, even an asexual genome could be considered a gene, insofar as it has an appreciable permanency through many generations.

The difference is that the molecular gene is transcribed as a unit, whereas the evolutionary gene is inherited as a unit.

Richard Dawkins' books The Selfish Gene (1976) and The Extended Phenotype (1982) defended the idea that the gene is the only replicator in living systems. This means that only genes transmit their structure largely intact and are potentially immortal in the form of copies, so genes should be the unit of selection. In The Selfish Gene, Dawkins attempts to redefine the word 'gene' to mean "an inheritable unit" instead of the generally accepted definition of "a section of DNA coding for a particular protein". In River Out of Eden, Dawkins further refined the idea of gene-centric selection by describing life as a river of compatible genes flowing through geological time. Scoop up a bucket of genes from the river, and we have an organism serving as a temporary body or survival machine. A river of genes may fork into two branches representing two non-interbreeding species as a result of geographical separation.

Gene targeting and implications

Gene targeting commonly refers to techniques for altering or disrupting mouse genes, and it provides mouse models for studying the roles of individual genes in embryonic development, human disorders, aging and disease. Mouse models in which one or more genes are deactivated or made inoperable are called knockout mice. Since the first reports in which homologous recombination in embryonic stem cells was used to generate gene-targeted mice, gene targeting has proven to be a powerful means of precisely manipulating the mammalian genome, producing at least ten thousand mutant mouse strains. It is now possible to introduce mutations that can be activated at specific time points, or in specific cells or organs, both during development and in the adult animal.

Gene targeting strategies have been expanded to cover all kinds of modifications, including point mutations, isoform deletions, mutant allele correction, insertion and deletion of large pieces of chromosomal DNA, and tissue-specific disruption combined with spatial and temporal regulation. It is predicted that the ability to generate mouse models with predictable phenotypes will have a major impact on studies of all phases of development, immunology, neurobiology, oncology, physiology, metabolism, and human diseases. Gene targeting is also in theory applicable to species from which totipotent embryonic stem cells can be established, and may therefore offer a route to the improvement of domestic animals and plants.

Changing concept

The concept of the gene has changed considerably (see history section). From the original definition of a "unit of inheritance", the term evolved to mean a DNA-based unit that can exert its effects on the organism through RNA or protein products. It was also previously believed that one gene makes one protein; this concept was overthrown by the discovery of alternative splicing and trans-splicing.
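As a rough illustration of why alternative splicing breaks the "one gene, one protein" rule, the sketch below enumerates the mature transcripts a single gene could yield if some of its exons may be skipped. The function name `splice_isoforms`, the toy exon sequences and the choice of which exon is optional are all hypothetical; real splicing is regulated and far more varied than simple exon skipping.

```python
from itertools import combinations

def splice_isoforms(exons, optional):
    """Return every transcript obtainable by skipping any subset of the
    optional (cassette) exons. `optional` holds exon indices; exon order
    is always preserved, as in ordinary cis-splicing."""
    isoforms = []
    candidates = sorted(optional)
    for r in range(len(candidates) + 1):
        for skipped in combinations(candidates, r):
            transcript = "".join(
                seq for i, seq in enumerate(exons) if i not in skipped
            )
            isoforms.append(transcript)
    return isoforms

# Hypothetical toy gene: three exons, of which the middle one may be skipped.
exons = ["AUGGCU", "GAAUUC", "UGGUAA"]
isoforms = splice_isoforms(exons, optional={1})
print(isoforms)
```

With one optional exon the same gene yields two distinct transcripts; each additional optional exon doubles the count in this simplified model.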

The definition of a gene is still changing. The first cases of RNA-based inheritance have been discovered in mammals. Evidence is also accumulating that the control regions of a gene do not necessarily have to be close to the coding sequence on the linear molecule, or even on the same chromosome. Spilianakis and colleagues discovered that the promoter region of the interferon-gamma gene on chromosome 10 and the regulatory regions of the T(H)2 cytokine locus on chromosome 11 come into close proximity in the nucleus, possibly to be jointly regulated.

The concept that genes are clearly delimited is also being eroded. There is evidence for fused proteins stemming from two adjacent genes that can also produce two separate protein products. While it is not clear whether these fusion proteins are functional, the phenomenon is more frequent than previously thought. Even more ground-breaking than the discovery of fused genes is the observation that some proteins can be composed of exons from far-away regions and even from different chromosomes. These new data have led to an updated, and probably tentative, definition of a gene as "a union of genomic sequences encoding a coherent set of potentially overlapping functional products". This new definition categorizes genes by functional products, whether they be proteins or RNA, rather than by specific DNA loci; all regulatory elements of DNA are therefore classified as gene-associated regions.
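One way to picture the "union of genomic sequences" definition is to treat each functional product as a list of genomic segments and merge them, chromosome by chromosome, into the smallest set of intervals that covers them all. The sketch below does this for invented data; the name `gene_union`, the product labels and all coordinates are hypothetical, and the code makes no attempt to decide which products form a "coherent set".

```python
from collections import defaultdict

def gene_union(products):
    """Merge the genomic segments of all products into a minimal list of
    non-overlapping (start, end) intervals per chromosome."""
    by_chrom = defaultdict(list)
    for segments in products.values():
        for chrom, start, end in segments:
            by_chrom[chrom].append((start, end))

    union = {}
    for chrom, spans in by_chrom.items():
        merged = []
        for start, end in sorted(spans):
            if merged and start <= merged[-1][1]:
                # Overlaps the previous interval: extend it.
                merged[-1] = (merged[-1][0], max(merged[-1][1], end))
            else:
                merged.append((start, end))
        union[chrom] = merged
    return union

# Hypothetical products sharing sequence, one with an exon on another chromosome.
products = {
    "isoform_a": [("chr1", 100, 200), ("chr1", 300, 400)],
    "isoform_b": [("chr1", 150, 250), ("chr2", 50, 120)],
}
gene = gene_union(products)
print(gene)
```

Under this reading the "gene" is the merged result rather than any single contiguous locus, which is why segments on a second chromosome can belong to it.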

See also

  • Big gene
  • Copy number variation
  • DNA
  • Epigenetics
  • Full genome sequencing
  • Gene-centric view of evolution
  • Gene dosage
  • Gene expression
  • Gene family
  • Gene patent
  • Gene pool
  • Gene redundancy
  • Gene therapy
  • Genetic algorithm
  • Genetic engineering
  • Genetics
  • Genome
  • Genomics
  • List of gene prediction software
  • List of notable genes
  • Meme
  • Predictive medicine
  • Pseudogene


Further reading

Google Book Search; first published 1976.

External links

The source of this article is wikipedia, the free encyclopedia.  The text of this article is licensed under the GFDL.
 