DataONE
Data Observation Network for Earth (DataONE) is a project supported by the National Science Foundation under the DataNet program. DataONE will provide scientific data archiving for ecological and environmental data produced by scientists worldwide. DataONE's stated goal is to preserve and provide access to multi-scale, multi-discipline, and multi-national data. The community of users for DataONE includes scientists, ecosystem managers, policy makers, students, educators, and the public.

DataONE will link together existing cyberinfrastructure to provide a distributed framework, sound management, and robust technologies that enable long-term preservation of diverse multi-scale, multi-discipline, and multi-national observational data. The distributed framework will be composed of Coordinating Nodes, currently located at the Oak Ridge Campus, the University of California, Santa Barbara, and the University of New Mexico, and many Member Nodes located globally. DataONE will also provide an Investigator Tool Kit that gives the DataONE user community tools for accessing and using DataONE efficiently.

Coordinating Nodes

Coordinating Nodes will provide network-wide services to Member Nodes. They will be geographically replicated, with mirrored content and full copies of science metadata. The three Coordinating Nodes are:
  • University of New Mexico
  • Oak Ridge Campus (partnership of Oak Ridge National Laboratory (ORNL) and University of Tennessee)
  • University of California, Santa Barbara (UCSB)
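
As a rough illustration of the architecture described above, the following Python sketch models Member Nodes that hold data and Coordinating Nodes that each keep a full copy of the science metadata. All class and function names (MemberNode, CoordinatingNode, register, is_mirrored), the node labels, and the sample identifier are hypothetical and are not part of any DataONE software; this is a toy model under those assumptions, not the project's implementation.

```python
from dataclasses import dataclass, field


@dataclass
class MemberNode:
    """Hypothetical Member Node: stores data objects and their metadata."""
    name: str
    objects: dict = field(default_factory=dict)  # identifier -> (data, metadata)

    def store(self, identifier, data, metadata):
        self.objects[identifier] = (data, metadata)


@dataclass
class CoordinatingNode:
    """Hypothetical Coordinating Node: holds science metadata, not the data."""
    name: str
    metadata: dict = field(default_factory=dict)  # identifier -> metadata
    holders: dict = field(default_factory=dict)   # identifier -> Member Node name


def register(member, coordinating_nodes):
    """Copy every metadata record of a Member Node to all Coordinating Nodes."""
    for identifier, (_data, meta) in member.objects.items():
        for cn in coordinating_nodes:
            cn.metadata[identifier] = dict(meta)  # each node keeps its own copy
            cn.holders[identifier] = member.name


def is_mirrored(coordinating_nodes):
    """True when every Coordinating Node holds the same metadata records."""
    first = coordinating_nodes[0].metadata
    return all(cn.metadata == first for cn in coordinating_nodes)


# Illustrative usage with made-up labels and a made-up identifier.
cns = [CoordinatingNode("cn-1"), CoordinatingNode("cn-2"), CoordinatingNode("cn-3")]
mn = MemberNode("example-member-node")
mn.store("example:0001", b"site,temp\nA,12.5\n", {"title": "Example stream temperatures"})
register(mn, cns)
print(is_mirrored(cns))
```

In this sketch the data stay on the Member Node while only the metadata are copied to every Coordinating Node, which is one simple way to read "mirrored content and full copies of science metadata".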

Member Nodes

Member Nodes will consist of Earth observing institutions, projects, and networks. They will provide resources for their own data and replicated data, and focus on serving their specific constituencies. Member Nodes are geographically distributed and vary in their implementations. Current Member Nodes include:
  • Dryad
  • ORNL Distributed Active Archive Center
  • Knowledge Network for Biocomplexity

Investigator Tool Kit

The Tool Kit will provide tools for researchers to access DataONE. These will be both general-purpose and discipline-specific tools, and DataONE developers will adapt existing tools where possible. The Tool Kit will include Java and Python libraries, an R programming language plug-in for analysis, extensions for Excel, the VisTrails scientific workflow system, and the Kepler scientific workflow system.

Data Management

DataONE will provide a place for scientists to store data and its associated metadata. The metadata will then make this data searchable and accessible to other scientists. Data management practices include:
  • Data management planning
  • Data acquisition (techniques, protocols, methods)
  • Data protection (backing up)
  • Data entry and manipulation (naming files, organization)
  • Quality control on data
  • Data analysis
  • Workflow tools (VisTrails, Kepler scientific workflow system)
  • Data documentation (metadata)
  • Data sharing, citation, and discovery
  • Data preservation & curation
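
To make concrete how metadata can make stored data searchable, here is a minimal Python sketch of a keyword search over metadata records. The record fields (title, keywords), the identifiers, and the search function are invented for illustration and do not reflect DataONE's actual metadata standards or search interface.

```python
def search(records, term):
    """Return identifiers whose metadata mentions the term (case-insensitive)."""
    term = term.lower()
    hits = []
    for identifier, meta in records.items():
        # Search the title and keywords only; the data files are never opened.
        text = " ".join([meta.get("title", "")] + list(meta.get("keywords", [])))
        if term in text.lower():
            hits.append(identifier)
    return sorted(hits)


# Made-up metadata records keyed by made-up identifiers.
records = {
    "example:0001": {"title": "Stream temperature, 2009",
                     "keywords": ["hydrology", "temperature"]},
    "example:0002": {"title": "Soil carbon survey",
                     "keywords": ["soil", "carbon"]},
    "example:0003": {"title": "Forest plot inventory",
                     "keywords": ["vegetation", "soil moisture"]},
}

print(search(records, "soil"))
```

The point of the sketch is that discovery operates on the descriptive records alone, so a dataset without documentation would not be found by such a search.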

DataONE Community

The DataONE community includes research networks, professional societies, libraries, academic institutions, data centers, data repositories, environmental observatory networks, educators, scientists, policy makers, administrators, citizen scientists, international organizations, NGOs, ecosystem managers, students, private companies and the public.

External links

  • http://www.unm.edu/~market/cgi-bin/archives/004536.html
  • http://www.nature.com/news/specials/datasharing/index.html
  • http://www.nsf.gov/pubs/2007/nsf07601/nsf07601.htm
The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.
 