SET domain
The SET domain is a protein domain. It was originally identified as part of a larger conserved region present in the Drosophila Trithorax protein and was subsequently identified in the Drosophila Su(var)3-9 and 'Enhancer of zeste' proteins, from which the acronym SET is derived.

The SET domain generally appears as one part of a larger multidomain protein. Three structures of very different proteins with distinct domain compositions have been described: Neurospora crassa DIM-5, a member of the Su(var) family of HKMTs (histone lysine methyltransferases) that methylate histone H3 on lysine 9; human SET7 (also called SET9), which methylates H3 on lysine 4; and garden pea Rubisco LSMT, an enzyme that does not modify histones but instead methylates lysine 14 in the flexible tail of the large subunit of the enzyme Rubisco. The SET domain itself turned out to have an uncommon structure. Although electron density maps in all three studies revealed the location of the AdoMet or AdoHcy cofactor, the SET domain bears no similarity at all to the canonical AdoMet-dependent methyltransferase fold. A tyrosine that is strictly conserved in the C-terminal motif of the SET domain could be involved in abstracting a proton from the protonated amino group of the substrate lysine, promoting its nucleophilic attack on the sulphonium methyl group of the AdoMet cofactor. In contrast to the AdoMet-dependent protein methyltransferases of the classical type, which tend to bind their polypeptide substrates on top of the cofactor, the Rubisco LSMT structure shows that AdoMet seems to bind in a separate cleft, suggesting how a polypeptide substrate could be subjected to multiple rounds of methylation without having to be released from the enzyme. SET7/9, by comparison, is able to add only a single methyl group to its substrate. It has been demonstrated that association of SET domain and myotubularin-related proteins modulates growth control. The SET domain-containing Drosophila melanogaster (fruit fly) protein Enhancer of zeste has a function in segment determination, and its mammalian homologue may be involved in the regulation of gene transcription and chromatin structure.
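
The three structurally characterized proteins named above can be summarized as a small lookup table. The following is only an illustrative sketch: the class name `SetProtein`, its field names, and the helper `non_histone_methyltransferases` are hypothetical and chosen for this example, while the values are taken directly from the paragraph above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SetProtein:
    """One SET-domain protein as described in the text (field names are illustrative)."""
    name: str
    organism: str
    substrate: str
    target_lysine: int
    modifies_histone: bool


# Values transcribed from the article text; nothing here is new data.
STRUCTURES = [
    SetProtein("DIM-5", "Neurospora crassa", "histone H3", 9, True),
    SetProtein("SET7/9", "human", "histone H3", 4, True),
    SetProtein("Rubisco LSMT", "garden pea", "Rubisco large subunit", 14, False),
]


def non_histone_methyltransferases(proteins):
    """Return the names of proteins whose substrate is not a histone."""
    return [p.name for p in proteins if not p.modifies_histone]


if __name__ == "__main__":
    for p in STRUCTURES:
        print(f"{p.name} ({p.organism}): {p.substrate}, lysine {p.target_lysine}")
    print("Non-histone:", non_histone_methyltransferases(STRUCTURES))
```

Laid out this way, the table makes plain that two of the three enzymes target different lysines of the same histone, while the third acts on a non-histone substrate.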

Histone lysine methylation is part of the histone code that regulates chromatin function and epigenetic control of gene function. Histone lysine methyltransferases (HMTases) differ both in their substrate specificity for the various acceptor lysines and in their product specificity for the number of methyl groups (one, two, or three) they transfer. With just one exception, the HMTases belong to the SET family, which can be classified according to the sequences surrounding the SET domain. Structural studies on human SET7/9, a mono-methylase, have revealed the molecular basis for the specificity of the enzyme for its histone target and the roles of the invariant residues in the SET domain in determining the methylation specificities.

The pre-SET domain, as found in the SUV39 SET family, contains nine invariant cysteine residues that are grouped into two segments separated by a region of variable length. These nine cysteines coordinate three zinc ions to form a triangular cluster, in which each zinc ion is coordinated by four cysteines in a tetrahedral configuration. The function of this domain is structural: it holds together two long segments of random coil.
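
The numbers in this description imply that some cysteines must be shared between zinc ions: three ions with four ligands each need twelve zinc-sulfur contacts, but only nine cysteines are available. A minimal counting sketch follows, under the assumption (not stated in the text) that every cysteine binds either one zinc ion or bridges exactly two. The function name `bridging_cysteines` is hypothetical.

```python
def bridging_cysteines(zinc_ions, ligands_per_zinc, cysteines):
    """Count bridging and terminal cysteines in a zinc-cysteine cluster.

    Assumption (ours, for illustration): each cysteine contacts either one
    zinc ion (terminal) or exactly two (bridging).
    """
    contacts = zinc_ions * ligands_per_zinc   # total zinc-sulfur contacts needed
    bridging = contacts - cysteines           # each bridging cysteine supplies one extra contact
    terminal = cysteines - bridging
    if bridging < 0 or terminal < 0:
        raise ValueError("counts are inconsistent with the one-or-two-zinc assumption")
    return bridging, terminal


if __name__ == "__main__":
    # Numbers from the text: 3 zinc ions, tetrahedral (4 ligands each), 9 cysteines.
    bridging, terminal = bridging_cysteines(3, 4, 9)
    print(f"bridging cysteines: {bridging}, terminal cysteines: {terminal}")
```

Under that assumption, the nine cysteines split into three that bridge two zinc ions each and six that bind a single ion, which is consistent with a triangular arrangement of the three ions.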

The C-terminal region including the post-SET domain is disordered when not interacting with a histone tail and in the absence of zinc. The three conserved cysteines in the post-SET domain form a zinc-binding site when coupled to a fourth conserved cysteine in the knot-like structure close to the SET domain active site. The structured post-SET region brings in the C-terminal residues that participate in S-adenosylmethionine binding and histone tail interactions. The three conserved cysteine residues are essential for HMTase activity, as replacement with serine abolishes HMTase activity.

Examples

Human genes encoding proteins containing this domain include:
  • ASH1L
  • BAT8
  • EHMT1, EHMT2, EZH1, EZH2
  • FP13812
  • MLL, MLL2, MLL3, MLL5
  • NSD1
  • PRDM1, PRDM2, PRDM5
  • SETD1A, SETD2, SETD3, SETD4, SETD5, SETD6, SETD7, SETD8, SETDB1, SETDB2, SETMAR, SMYD1, SMYD3, SMYD4, SMYD5, SUV39H1, SUV39H2, SUV420H1, SUV420H2
  • WBP7, WHSC1, WHSC1L1

The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.
 