Proteomics Identifications Database
PRIDE (the Proteomics Identifications Database) is one of the most prominent public data repositories of mass spectrometry (MS) based proteomics data, and is maintained by the European Bioinformatics Institute as part of the Proteomics Services Team.

PRIDE stores three different kinds of information: peptide and protein identifications derived from MS or MS/MS experiments, MS and MS/MS mass spectra as peak lists, and all associated metadata. Peptide sequences are captured as part of the identifications.
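The relationship between these three kinds of information can be pictured with a small data model. The following is only an illustrative sketch: the class and field names are hypothetical, chosen for this example, and do not reflect the actual PRIDE schema or PRIDE XML element names.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Spectrum:
    """An MS or MS/MS spectrum stored as a peak list (hypothetical model)."""
    spectrum_id: int
    ms_level: int                     # 1 for MS, 2 for MS/MS
    peaks: List[Tuple[float, float]]  # (m/z, intensity) pairs


@dataclass
class PeptideIdentification:
    """A peptide identification; the sequence is part of the identification."""
    sequence: str
    spectrum_id: Optional[int] = None  # spectrum supporting the identification
    modifications: Dict[int, str] = field(default_factory=dict)  # position -> name


@dataclass
class ProteinIdentification:
    """A protein identification together with its supporting peptides."""
    accession: str
    peptides: List[PeptideIdentification] = field(default_factory=list)


@dataclass
class Experiment:
    """One experiment: metadata, spectra and identifications."""
    accession: str
    metadata: Dict[str, str] = field(default_factory=dict)
    spectra: List[Spectrum] = field(default_factory=list)
    proteins: List[ProteinIdentification] = field(default_factory=list)

    def peptide_count(self) -> int:
        # Total number of peptide identifications across all proteins.
        return sum(len(protein.peptides) for protein in self.proteins)


# Made-up values, for illustration only.
example = Experiment(
    accession="EXAMPLE-1",
    metadata={"species": "Homo sapiens"},
    spectra=[Spectrum(spectrum_id=1, ms_level=2, peaks=[(445.12, 1200.0)])],
    proteins=[
        ProteinIdentification(
            accession="PROT-A",
            peptides=[
                PeptideIdentification("PEPTIDEK", spectrum_id=1),
                PeptideIdentification("SAMPLER", modifications={1: "Phospho"}),
            ],
        )
    ],
)
```

In this sketch a peptide identification may point back to the spectrum it was derived from, which mirrors the idea that identifications, spectra and metadata are stored together rather than as separate, unrelated records.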

By September 2010, PRIDE contained more than 13,000 experiments, 4 million protein identifications, 20 million peptide identifications and more than 104 million spectra. A typical PRIDE dataset or project contains more than one experiment (accession numbers or MS runs). As mass spectrometry is increasingly used to capture details of posttranslational modification, PRIDE also contains modification data for peptides that were chemically modified.

PRIDE was established as a production service in 2005. Several other proteomics databases have been established over the past few years, such as GPMDB, PeptideAtlas, Proteinpedia and the NCBI Peptidome. Together with the NCBI Peptidome, PRIDE constitutes an actual structured data repository: it stores the original experimental data from the researchers and does not assume any editorial control over submitted data. In total, PRIDE contains data from about 60 species, the largest fraction coming from human samples, followed by the fruit fly Drosophila melanogaster and mouse.

Formats & the submission process

Since detailed proteomics data currently cannot be curated from the existing literature, the sole source of PRIDE data is submissions by academic researchers.

PRIDE is a standards-compliant public repository, meaning that its own XML-based data exchange format for submissions, PRIDE XML, was built around the Proteomics Standards Initiative (PSI) mzData standard for mass spectrometry. PRIDE is committed to implementing relevant new PSI standards as soon as possible.

Because many different types of mass spectrometry instruments and software formats are currently on the market, wet-lab scientists without a strong bioinformatics background or informatics support had problems converting their data to PRIDE XML. PRIDE Converter was developed to address this situation. It is a tool, written in the Java programming language, that converts 15 different input mass spectrometry data formats into PRIDE XML via a wizard-like graphical user interface. It is freely available and is open source under the permissive Apache License.

Browsing, searching & data mining PRIDE

Currently, data can be queried from PRIDE via the PRIDE web and BioMart interfaces.

Additionally, complex queries can be built with the PRIDE BioMart, which is based on BioMart, a query-oriented data management system. The extensive use of controlled vocabularies (CVs) and ontologies for flexible yet context-sensitive annotation of data, along with the ability to perform intelligent queries on these annotations, are key features of PRIDE.
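As a rough illustration of what such a query looks like, BioMart services generally accept a small XML document that names a dataset, a set of filters and the attributes to return. The sketch below only builds that document as a string. The dataset, filter and attribute names are placeholders invented for this example, not the real PRIDE BioMart identifiers, and no request is sent to any server.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List


def build_biomart_query(dataset: str,
                        filters: Dict[str, str],
                        attributes: List[str]) -> str:
    """Build a BioMart-style XML query document (sketch, not sent anywhere)."""
    query = ET.Element("Query", {
        "virtualSchemaName": "default",
        "formatter": "TSV",        # tab-separated result rows
        "header": "1",             # include a header row
        "uniqueRows": "1",         # drop duplicate rows
        "datasetConfigVersion": "0.6",
    })
    dataset_el = ET.SubElement(
        query, "Dataset", {"name": dataset, "interface": "default"}
    )
    # Filters restrict which records are returned.
    for name, value in filters.items():
        ET.SubElement(dataset_el, "Filter", {"name": name, "value": value})
    # Attributes select which columns are returned.
    for name in attributes:
        ET.SubElement(dataset_el, "Attribute", {"name": name})
    return ET.tostring(query, encoding="unicode")


# Placeholder names: the real PRIDE BioMart dataset, filter and attribute
# identifiers would have to be looked up in the BioMart interface itself.
query_xml = build_biomart_query(
    dataset="example_pride_dataset",
    filters={"example_species_filter": "Homo sapiens"},
    attributes=["example_experiment_accession", "example_peptide_sequence"],
)
```

Keeping filters and attributes as separate arguments reflects the way BioMart distinguishes between restricting the result set and choosing the returned columns.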

The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.
 