Microarray databases
The term microarray database is usually used to describe a repository containing microarray gene expression data. The key functions of a microarray database are to store the measurement data, manage a searchable index, and make the data available to other applications for analysis and interpretation (either directly or via user downloads).
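As an illustration only, the three functions named above (storing measurements, maintaining a searchable index, and exporting data for other applications) can be sketched in a few lines of Python. The class `MicroarrayStore`, its method names, and the sample experiment records are all hypothetical and do not reflect the design of any real repository.

```python
from collections import defaultdict
import csv
import io


class MicroarrayStore:
    """Toy in-memory repository: stores measurements, indexes them, exports them."""

    def __init__(self):
        self._experiments = {}          # experiment id -> record
        self._index = defaultdict(set)  # lowercase keyword -> experiment ids

    def add_experiment(self, exp_id, description, measurements):
        """Store one experiment; measurements maps (sample, gene) -> value."""
        self._experiments[exp_id] = {
            "description": description,
            "measurements": dict(measurements),
        }
        # Index every whole word of the description for keyword search.
        for word in description.lower().split():
            self._index[word].add(exp_id)

    def search(self, keyword):
        """Return sorted ids of experiments whose description has this word."""
        return sorted(self._index.get(keyword.lower(), set()))

    def export_csv(self, exp_id):
        """Serialize one experiment as CSV text for download or another tool."""
        buffer = io.StringIO()
        writer = csv.writer(buffer, lineterminator="\n")
        writer.writerow(["sample", "gene", "value"])
        measurements = self._experiments[exp_id]["measurements"]
        for (sample, gene), value in sorted(measurements.items()):
            writer.writerow([sample, gene, value])
        return buffer.getvalue()


# Hypothetical example data, invented for this sketch.
store = MicroarrayStore()
store.add_experiment(
    "EXP1", "heat shock yeast", {("s1", "geneA"): 2.5, ("s2", "geneA"): 0.7}
)
store.add_experiment("EXP2", "yeast growth curve", {("s1", "geneB"): 1.1})
matches = store.search("yeast")
exported = store.export_csv("EXP1")
```

A production repository would typically use persistent storage and a richer metadata model; the sketch only separates the three responsibilities so that each can be seen in isolation.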
Microarray databases can fall into two distinct classes:
- A peer-reviewed, public repository that adheres to academic or industry standards and is designed to be used by many analysis applications and groups. Good examples are the Gene Expression Omnibus (GEO) from the National Center for Biotechnology Information (NCBI) and ArrayExpress from the European Bioinformatics Institute (EBI).
- A specialized repository associated primarily with the brand of a particular entity (lab, company, university, consortium, group), an application suite, a topic, or an analysis method, whether it is commercial, non-profit, or academic. These databases might have one or more of the following characteristics:
  - A subscription or license may be needed to gain full access,
  - The content may come primarily from a specific group (e.g. SMD, or UPSC-BASE),
  - There may be constraints on who can use the data or for what purpose data can be used,
  - Special permission may be required to submit new data, or there may be no obvious process at all,
  - Only certain applications may be equipped to use the data, often also associated with the same entity (for example, caArray at NCI is specialized for caBIG),
  - Further processing or reformatting of the data may be required for standard applications or analysis,
  - They claim to address the 'urgent need' to have a standard, centralized repository for microarray data (see YMD, last updated in 2003, for example),
  - There is a claim to an incremental improvement over one of the public repositories,
  - A meta-analysis application, which incorporates studies from one or more public databases (e.g. Gemma primarily uses GEO studies; NextBio uses various sources).
Some of the best-known public, curated microarray databases are:
Database | Scope | Microarray experiment sets | Sample profiles | As of date |
Gene Expression Omnibus - NCBI | Any curated MIAME-compliant molecular abundance study | 25859 | 641770 | October 28, 2011 |
Stanford Microarray database | Private and published microarray and molecule abundance database | 82542 | ? | October 23, 2011 |
GeneNetwork system | Open access standard arrays, exon arrays, and RNA-seq data for genetic analysis (eQTL studies) with analysis suite | ~100 | ~10000 | July, 2010 |
Genevestigator database | Manually curated microarray data for expression meta-analysis | 1500 | 44000 | July, 2010 |
ArrayExpress at EBI | Any curated MIAME- or MINSEQE-compliant transcriptomics data | 24838 | 708914 | October 28, 2011 |
UPenn RAD database | MIAME-compliant public and private studies, associated with ArrayExpress | ~100 | ~2500 | Sept. 1, 2007 |
UNC Microarray database | Provides microarray data storage, retrieval, analysis, and visualization | ~31 | 2093 | April 1, 2007 |
UNC modENCODE Microarray database | Nimblegen customer 2.1 million array | ~6 | 180 | July 17, 2009 |
MUSC database | Repository for DNA microarray data generated by MUSC investigators as well as researchers in the global research community | ~45 | 555 | April 1, 2007 |
caArray at NCI | Cancer data, prepared for analysis on caBIG | 41 | 1741 | November 15, 2006 |
UPSC-BASE | Data generated by microarray analysis within Umeå Plant Science Centre (UPSC) | ~100 | ? | November 15, 2007 |
ArrayTrack | Hosts both public and private data, including MAQC benchmark data, with integrated analysis tools | 1497 | 43823 | July 26, 2011 |
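To give a rough sense of how the two count columns relate, the following sketch parses rows in the pipe-delimited layout of the table and computes the number of sample profiles per experiment set. The two rows are transcribed from the table (GEO and ArrayExpress, both as of October 28, 2011); the function name `parse_row` is hypothetical, and the parser assumes exact integer counts, so rows containing `~` or `?` would need extra handling.

```python
def parse_row(line):
    """Split one pipe-delimited table row into a dict with integer counts."""
    # Drop the trailing pipe, then split into the five expected columns.
    cells = [cell.strip() for cell in line.strip().rstrip("|").split("|")]
    name, scope, experiment_sets, sample_profiles, as_of = cells
    return {
        "database": name,
        "scope": scope,
        "experiment_sets": int(experiment_sets),
        "sample_profiles": int(sample_profiles),
        "as_of": as_of,
    }


rows = [
    "Gene Expression Omnibus - NCBI | Any curated MIAME-compliant molecular "
    "abundance study | 25859 | 641770 | October 28, 2011 |",
    "ArrayExpress at EBI | Any curated MIAME- or MINSEQE-compliant "
    "transcriptomics data | 24838 | 708914 | October 28, 2011 |",
]

# Map each database name to its samples-per-experiment-set ratio.
ratios = {}
for row in map(parse_row, rows):
    ratios[row["database"]] = row["sample_profiles"] / row["experiment_sets"]

for database, ratio in ratios.items():
    print(f"{database}: {ratio:.1f} sample profiles per experiment set")
```

Both repositories come out at roughly 25 to 29 sample profiles per experiment set for the dates listed.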
- For a directory of Microarray Databases, see:
See also
- Biological database
- DNA microarray
- DNA microarray#Data warehousing