Short linear motif
In molecular biology, short linear motifs (also known as SLiMs, linear motifs or minimotifs) are short stretches of protein sequence that mediate protein-protein interactions.

The first definition was given by Tim Hunt:
“The sequences of many proteins contain short, conserved motifs that are involved in recognition and targeting activities, often separate from other functional properties of the molecule in which they occur. These motifs are linear, in the sense that three-dimensional organization is not required to bring distant segments of the molecule together to make the recognizable unit. The conservation of these motifs varies: some are highly conserved while others, for example, allow substitutions that retain only a certain pattern of charge across the motif.”

Attributes

SLiMs are generally situated in intrinsically disordered regions (over 80% of known SLiMs); however, upon interaction with a structured partner, secondary structure is often induced. The majority of annotated SLiMs consist of 3 to 11 contiguous amino acids, with an average of just over 6 residues. However, only a few hotspot residues (on average 1 hotspot for every 3 residues in the motif) contribute the majority of the free energy of binding and determine most of the affinity and specificity of the interaction. Although most motifs have no positional preference, several must be located at the protein termini in order to be functional.
The key defining attribute of SLiMs, having a limited number of residues that directly contact the binding partner, has two major consequences. First, only a few mutations, or even a single one, can generate a functional motif, with further mutations of flanking residues allowing affinity and specificity to be tuned. This gives SLiMs an increased propensity to evolve convergently, which facilitates their proliferation, as evidenced by their increased incidence in higher eukaryotes. It has been hypothesized that this might increase and restructure the connectivity of the interactome. Second, SLiMs have relatively low affinity for their interaction partners (generally between 1 and 150 μM), which makes these interactions transient and reversible, and thus ideal for mediating dynamic processes such as cell signaling. In addition, this means that these interactions can be easily modulated by post-translational modifications that change the structural and physicochemical properties of the motif. Also, regions of high functional density can mediate molecular switching by means of overlapping motifs (e.g. the C-terminal tails of integrin beta subunits), or they can allow high-avidity interactions through multiple low-affinity motifs (e.g. multiple AP2-binding motifs in Eps15).
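
To illustrate why micromolar affinities give transient, easily reversed interactions, the following minimal Python sketch computes equilibrium occupancy for dissociation constants spanning the 1 to 150 μM range quoted above. It assumes a simple one-to-one binding model with the partner in excess, and the partner concentration of 10 μM is a hypothetical value chosen only for illustration; neither assumption comes from the SLiM literature summarized here.

```python
def fraction_bound(free_partner_uM, kd_uM):
    """Equilibrium occupancy of a single site: [P] / ([P] + Kd).

    Assumes simple 1:1 binding with the partner concentration unaffected
    by binding (an illustrative simplification).
    """
    return free_partner_uM / (free_partner_uM + kd_uM)

# Hypothetical free partner concentration, in micromolar.
PARTNER_UM = 10.0

for kd in (1.0, 10.0, 150.0):
    print(f"Kd = {kd:6.1f} uM -> fraction bound = {fraction_bound(PARTNER_UM, kd):.3f}")
```

Under this model, occupancy is one half when the partner concentration equals the dissociation constant, so a modest change in affinity (for example through a post-translational modification) shifts a motif between mostly bound and mostly free.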

Function

The molecular function of a SLiM is to mediate specific interactions with other protein domains. In general, the SLiM itself serves as a specific mediator of information, whereas the outcome may affect the SLiM-bearing protein as a whole.

Consequently, in a cellular context, this may result in different functions depending on the kind of interaction domain involved. The most common mode of interaction is simple binding of the SLiM to an interaction domain, which makes the SLiM-bearing protein part of a protein complex, either as an effector or as the central hub of such a complex. A subset of these are targeting SLiMs, which enable the SLiM-bearing protein to form complexes with cellular transporters and thereby change cellular compartments.

In the case of modifying domains, the effect of SLiM recognition and interaction is a modification of the sequence, e.g. a post-translational modification (PTM) or a sequence cleavage event. In this modified state the SLiM-bearing protein may be involved in additional interactions with further downstream proteins of a pathway.

Overview of SLiM functions

Protein binding motifs

deliver binding specificity with domains of interacting proteins, making the SLiM-bearing protein part of a protein complex, either as an effector or as a central hub. They may also be involved in the co-operative assembly of scaffolds, a typical example being SLiMs with proline-rich sequences that are responsible for binding SH3 domains.

Targeting motifs

are recognized by domains of cellular transporters, leading to a switch in cellular compartmentalisation. Well-known examples are nuclear localisation signals (NLSs) and nuclear export signals (NESs), which together can confer the nuclear shuttling capabilities of tumor suppressor proteins in a fine-tuned fashion.

Posttranslational modifications

may result in phosphorylation, myristoylation, N-linked glycosylation or other PTMs, often as part of a larger signaling process.

Cleavage sites

are recognition sites of endopeptidases. The products may also bear specific informational content, e.g. constituting terminal-specific degrons.




From a cell-biological perspective, SLiMs are involved in almost every pathway because of their critical role in protein-protein interaction and signal transduction.

Role in disease

Several diseases have been linked to mutations in SLiMs. For instance, one cause of Noonan syndrome is a mutation in the protein Raf-1 that abrogates the interaction with 14-3-3 proteins mediated by the corresponding short linear motifs and thereby deregulates Raf-1 kinase activity. Usher syndrome is the most frequent cause of hereditary deaf-blindness in humans and can be caused by mutations in either the PDZ domains in Harmonin or the corresponding PDZ interaction motifs in the SANS protein. Finally, Liddle's syndrome has been linked to autosomal dominant activating mutations in the WW interaction motif in the β (SCNNB_HUMA) and γ (SCNNG_HUMA) subunits of the epithelial sodium channel (ENaC). These mutations abrogate binding to the ubiquitin ligase NEDD4, thereby inhibiting channel degradation and prolonging the half-life of ENaC, ultimately resulting in increased Na+ reabsorption, plasma volume expansion and hypertension.

Viruses often mimic human SLiMs to hijack and disrupt a host's cellular machinery, thereby adding functionality to their compact genomes without requiring new virally encoded proteins. In fact, many motifs were originally discovered in viruses, such as the Retinoblastoma-binding LxCxE motif and the UEV domain-binding PTAP late domain. The short generation times and high mutation rates of viruses, in association with natural selection, have led to multiple examples of mimicry of host SLiMs at every step of the viral life cycle (the Src-binding motif PxxP in Nef modulates replication, the WW domain-binding PPxY motif mediates budding in Ebola virus, and a dynein light chain-binding motif in Rabies virus is vital for host infection). The extent of human SLiM mimicry is surprising, with many viral proteins containing several functional SLiMs, for example the Adenovirus protein E1A.

Pathogenic bacteria also mimic host motifs (as well as having their own motifs), although not to the same extent as the obligate parasite viruses. E. coli injects a protein, EspF(U), that mimics an autoinhibitory element of N-WASP into the host cell to activate the actin-nucleating factor WASP. The KDEL motif of the bacteriophage-encoded cholera toxin mediates cell entry of the toxin.

Potential as leads for drug design

Linear motif-mediated protein-protein interactions have shown promise in recent years as novel drug targets. Success stories include the MDM2 motif analog Nutlin-3 and the integrin-targeting RGD-mimetic Cilengitide: Nutlin-3 antagonises the interaction of MDM2's SWIB domain with p53, thus stabilising p53 and inducing senescence in cancer cells. Cilengitide inhibits integrin-dependent signaling, causing the disassembly of the cytoskeleton, cellular detachment and the induction of apoptosis in endothelial and glioma cells. In addition, peptides targeting the Grb2 and Crk SH2/SH3 adaptor domains are also under investigation.

There are at present no drugs on the market specifically targeting phosphorylation sites; however, a number of drugs target the kinase domain. This tactic has shown promise in the treatment of various forms of cancer. For example, Sutent® is a receptor tyrosine kinase (RTK) inhibitor for treating gastrointestinal cancer, Gleevec® specifically targets Bcr-Abl, and Sprycel® is a broad-based tyrosine kinase inhibitor whose targets include Bcr-Abl and Src. Cleavage is another process directed by motif recognition, and the proteases responsible for cleavage are good drug targets. For example, Tritace®, Vasotec®, Accupril®, and Lotensin® are substrate-mimetic angiotensin-converting enzyme inhibitors. Other drugs that target post-translational modifications include Zovirax®, an antiviral myristoylation inhibitor, and farnesyl transferase inhibitors that block the lipidation modification of a CAAX-box motif.


Databases

SLiMs are usually described by regular expressions in the motif literature, with the important residues defined based on a combination of experimental, structural and evolutionary evidence. However, high-throughput screening such as phage display has greatly increased the available information for many motif classes, allowing them to be described with sequence logos. Several diverse repositories currently curate the available motif data. In terms of scope, the Eukaryotic Linear Motif resource (ELM) and MiniMotif Miner (MnM) represent the two largest motif databases, as they attempt to capture all motifs from the available literature. Several more specific and specialised databases also exist: PepCyber and ScanSite focus on smaller subsets of motifs, phosphopeptide binding and important signaling domains respectively. PDZBase focuses solely on PDZ domain ligands. MEROPS and CutDB curate available proteolytic event data, including protease specificity and cleavage sites. There has been a large increase in the number of publications describing motif-mediated interactions over the past decade, and as a result a large amount of the available literature remains to be curated. Recent work has created the tool MiMosa to expedite the annotation process and encourage semantically robust motif descriptions.
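
As a minimal sketch of how a regular-expression motif description can be applied to a sequence, the following Python snippet scans for three of the consensus patterns named in this article (LxCxE, PxxP and PPxY), writing "x" (any residue) as ".". The patterns are deliberately simplified illustrations rather than the curated expressions held by ELM or MnM, and the example sequence and the helper name `find_motif_instances` are invented for the demonstration.

```python
import re

# Simplified, illustrative patterns derived from the consensus names used in
# this article ("x" = any residue). Curated database entries are more specific.
MOTIF_PATTERNS = {
    "LxCxE": "L.C.E",
    "PxxP": "P..P",
    "PPxY": "PP.Y",
}

def find_motif_instances(sequence, pattern):
    """Return (start, matched_peptide) pairs, 1-based, including overlaps."""
    # Wrapping the pattern in a lookahead lets overlapping matches be reported.
    regex = re.compile("(?=(" + pattern + "))")
    return [(m.start() + 1, m.group(1)) for m in regex.finditer(sequence.upper())]

# Made-up sequence for demonstration only; not a real protein.
example = "MSDLTCHEAPPAPPLYKK"

for name, pattern in MOTIF_PATTERNS.items():
    print(name, find_motif_instances(example, pattern))
```

Because such patterns are short and degenerate, a scan of this kind reports every sequence match, not only functional instances; the matches are candidates that still need to be filtered with additional evidence.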

Discovery tools

SLiMs are short and degenerate, and as a result the proteome is littered with stochastically occurring peptides that resemble functional motifs. The biologically relevant cellular partners can easily distinguish functional motifs; however, computational tools have yet to reach a level of sophistication where motif discovery can be accomplished with high success rates.

Motif discovery tools can be split into two major categories, the discovery of novel instances of a known functional motif class and the discovery of novel functional motif classes; however, they all use a limited and overlapping set of attributes to discriminate true and false positives. The main discriminatory attributes used in motif discovery are:
  • Accessibility - the motif must be accessible to the binding partner. Intrinsic disorder prediction tools (such as IUPred or GlobPlot), domain databases (such as Pfam and SMART) and experimentally derived structural data (from sources such as PDB) can be used to check the accessibility of predicted motif instances.
  • Conservation - the conservation of a motif correlates strongly with functionality, and many experimental motifs are seen as islands of strong constraint in regions of weak conservation. Alignment of homologous proteins can be used to calculate a conservation metric for a motif.
  • Physicochemical properties - certain intrinsic properties of residues or stretches of amino acids are strong discriminators of functionality, for example the propensity of a region of disorder to undergo a disorder-to-order transition.
  • Enrichment in groupings of similar proteins - motifs often evolve convergently to carry out similar tasks in different proteins, such as mediating binding to a specific partner or targeting proteins to a particular subcellular localisation. In such groupings the motif often occurs more frequently than expected by chance and can be detected by searching for enriched motifs.
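
Two of the attributes above, conservation and enrichment, can be sketched in a few lines of Python. The conservation score used here (the fraction of aligned sequences sharing the most common residue in a column) and the binomial enrichment model are simplifying assumptions chosen for illustration; they are not the scoring schemes of any tool named in this article, and the alignment, function names and background rate are all invented.

```python
from math import comb

def column_conservation(alignment):
    """Fraction of sequences sharing the most common residue in each column.

    `alignment` is a list of equal-length aligned strings; gaps ('-') never
    count towards the most common residue. Toy metric for illustration.
    """
    n = len(alignment)
    scores = []
    for column in zip(*alignment):
        residues = [r for r in column if r != "-"]
        top = max((residues.count(r) for r in set(residues)), default=0)
        scores.append(top / n)
    return scores

def relative_conservation(alignment, start, end):
    """Mean conservation of columns [start, end) minus that of all other columns."""
    scores = column_conservation(alignment)
    motif = scores[start:end]
    flanks = scores[:start] + scores[end:]
    return sum(motif) / len(motif) - sum(flanks) / len(flanks)

def enrichment_p_value(hits, proteins, background_rate):
    """Binomial probability of at least `hits` motif-containing proteins among
    `proteins`, if each contains the motif by chance with `background_rate`."""
    return sum(
        comb(proteins, k) * background_rate**k * (1 - background_rate) ** (proteins - k)
        for k in range(hits, proteins + 1)
    )

# Invented alignment: a conserved PPxY-like island (columns 3-6) in variable flanks.
toy_alignment = [
    "ASTPPAYGKE",
    "GDSPPLYQ-R",
    "TTKPPSYANS",
    "SEAPPPYLDT",
]

print(column_conservation(toy_alignment))
print(relative_conservation(toy_alignment, 3, 7))
# Hypothetical numbers: 5 of 10 interaction partners contain a motif that is
# expected by chance in 10% of proteins.
print(enrichment_p_value(hits=5, proteins=10, background_rate=0.1))
```

A positive relative conservation score corresponds to the "island of strong constraint" picture described above, while a small enrichment probability indicates that the motif occurs in the protein grouping more often than the assumed background rate would predict.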

Novel functional motif instances

The Eukaryotic Linear Motif resource (ELM) and MiniMotif Miner (MnM) both provide servers to search for novel instances of known functional motifs in protein sequences. SLiMSearch allows similar searches on a proteome-wide scale.

Novel functional motif classes

More recently, computational methods have been developed that can identify new short linear motifs de novo. Interactome-based tools rely on identifying a set of proteins that are likely to share a common function, such as binding the same protein or being cleaved by the same peptidase. Two examples of such software are DILIMOT and SLiMFinder. ANCHOR and α-MoRF-Pred use physicochemical properties to search for motif-like peptides in disordered regions. ANCHOR identifies stretches of intrinsically disordered regions that cannot form favorable intrachain interactions to fold without additional stabilising energy contributed by a globular interaction partner. α-MoRF-Pred uses the inherent propensity of many SLiMs to undergo a disorder-to-order transition upon binding to discover α-helix-forming stretches within disordered regions.


The source of this article is wikipedia, the free encyclopedia.  The text of this article is licensed under the GFDL.
 