Generic Model Organism Database
The Generic Model Organism Database (GMOD) Project began as an effort to create reusable software tools for developing Model Organism Databases (MODs). MODs describe genome and other information about important experimental organisms in the life sciences. Also called organism-specific databases, these databases capture the large volumes of data and information being generated by modern biology.

Behind every MOD is a software system designed to help manage the data within the MOD and to help users query and access those data. In the past, every MOD project developed its own software tools. GMOD is a loose federation of software applications (components) that provide functionality needed by many or all MODs. Some of these components are linked by their use of a common database schema known as Chado. The project is funded by the United States National Institutes of Health, the National Science Foundation, and the USDA Agricultural Research Service.

Chado database schema

Chado makes extensive use of controlled vocabularies to type all entities in the database. For example, a single feature table stores genes, transcripts, exons, transposable elements, and so on, and the type of each is provided by the Sequence Ontology. When a new data type comes along, the feature table requires no modification, only an update of the data in the database. The same is largely true of the analysis data that can also be stored in Chado.
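
The idea of typing rows through a controlled vocabulary can be illustrated with a small sketch. The two tables below are a heavily simplified stand-in and not the real Chado schema: the names feature, cvterm and type_id echo Chado's conventions, but the column set, the helper functions and the sample identifiers are assumptions made for illustration, and SQLite is used only so the example is self-contained.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a simplified, Chado-like layout."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        -- Controlled-vocabulary terms (for example Sequence Ontology names).
        CREATE TABLE cvterm (
            cvterm_id INTEGER PRIMARY KEY,
            name      TEXT NOT NULL UNIQUE
        );
        -- One generic table for every kind of sequence feature.
        CREATE TABLE feature (
            feature_id INTEGER PRIMARY KEY,
            uniquename TEXT NOT NULL,
            type_id    INTEGER NOT NULL REFERENCES cvterm(cvterm_id)
        );
        """
    )
    return conn


def add_feature(conn, uniquename, type_name):
    """Insert a feature, registering its type term first if needed."""
    conn.execute("INSERT OR IGNORE INTO cvterm (name) VALUES (?)", (type_name,))
    (type_id,) = conn.execute(
        "SELECT cvterm_id FROM cvterm WHERE name = ?", (type_name,)
    ).fetchone()
    conn.execute(
        "INSERT INTO feature (uniquename, type_id) VALUES (?, ?)",
        (uniquename, type_id),
    )


def features_of_type(conn, type_name):
    """Return the unique names of all features with the given type term."""
    rows = conn.execute(
        """
        SELECT f.uniquename
        FROM feature AS f
        JOIN cvterm AS t ON t.cvterm_id = f.type_id
        WHERE t.name = ?
        ORDER BY f.uniquename
        """,
        (type_name,),
    )
    return [row[0] for row in rows]


conn = build_demo_db()
add_feature(conn, "gene0001", "gene")   # hypothetical identifiers
add_feature(conn, "exon0001", "exon")

# A new kind of data arrives: only rows are added, the tables are unchanged.
add_feature(conn, "rna0001", "ncRNA")

print(features_of_type(conn, "gene"))
print(features_of_type(conn, "ncRNA"))
```

In this sketch, supporting the new ncRNA type takes one extra row in cvterm and one in feature, with no change to the table definitions. That is the property the paragraph above describes.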

The existing core modules of Chado are:
  • sequence - for sequences/features
  • cv - for controlled vocabularies and ontologies
  • general - currently just dbxrefs (database cross-references)
  • organism - taxonomic data
  • pub - publication and references
  • companalysis - augments sequence module with computational analysis data
  • map - non-sequence maps
  • genetic - genetic and phenotypic data
  • expression - gene expression

Software

The full list of GMOD software components is found on the GMOD Components page. These components include:
  • GMOD Core (Chado database and tools)
    • Chado : the Chado schema and tools to install it.
    • XORT : a tool for loading and dumping Chado XML
    • GMODTools : extracts data from a Chado database into common genome bulk formats (GFF, FASTA, etc.)
  • MOD website
    • Tripal : a web front end based on Drupal
  • Genome Editing and Visualization
    • Apollo : a Java application for viewing and editing genome annotations
    • GBrowse : a CGI application for displaying genome annotations
    • JBrowse : a JavaScript application for displaying genome annotations
    • Pathway Tools : a genome browser with a comparative mode
  • Comparative Genomics
    • GBrowse_syn : a GBrowse-based synteny viewer
    • CMap : a CGI application for displaying comparative maps
  • Literature curation
    • Textpresso : a text mining system for scientific literature
  • Database querying tools
    • BioMart : a query-oriented data management system
    • InterMine : open source data warehouse system
  • Biological Pathways
    • Pathway Tools : tools for metabolic pathway information, and analysis of high-throughput functional genomics data
  • Regulatory Networks
    • Pathway Tools : supports definition of regulatory interactions and browsing of regulatory networks
  • Analysis
    • Galaxy
    • MAKER

Participating databases

The following organism databases are contributing to GMOD, adopting GMOD components, or both:
ANISEED, AntonosporaDB, ATIDB Arabidopsis,
BeeBase, BeetleBase, Bovine Genome Database (BGD),
BioHealthBase, Bovine QTL Viewer, Cattle EST Gene Family Database,
CGD, CGL, ChromDB,
Chromosome 7 Annotation Project, CSHLmpd, Database of Genomic Variants,
DictyBase, DroSpeGe, EcoCyc,
FlyBase, Fungal Comparative Genomics, Fungal Telomere Browser,
Gallus Genome Browser, GeneDB, GrainGenes,
Gramene, HapMap, Human 2q33,
Human Genome Segmental Duplication Database, IVDB, MAGI,
Marine Biological Lab Organism Databases, MGI, Non-Human Segmental Duplication Database,
OMAP, OryGenesDB, Oryza Chromosome 8,
Pathway Tools, ParameciumDB, PeanutMap,
PlantsDB, PlasmoDB, PseudoCAP,
PossumBase, PUMAdb, RGD,
SGD, SGD Lite, SmedDB,
Sol Genomics Network, Soybase, Soybean Gbrowse Database,
T1DBase, TAIR, TGD,
TGI, TIGR, TIGR Rice Genome Browser,
ToxoDB, TriAnnot BAC Viewer, VectorBase,
wFleaBase, WormBase,
XanthusBase, Xenbase

Related projects

  • BioPerl, BioJava, Biopython, BioRuby, etc.
  • Ensembl
  • Gene Ontology Software http://www.godatabase.org/dev/doc/www-intro.html
  • DAS http://biodas.org/
  • The Genomics Unified Schema http://www.gusdb.org/
  • Manatee: Manual Annotation Tool Etc, Etc... http://manatee.sourceforge.net/
  • Biocurator.org http://biocurator.org/
  • Open Biomedical Ontologies
  • The Sequence Ontology Project
The source of this article is wikipedia, the free encyclopedia.  The text of this article is licensed under the GFDL.
 