iPlant Collaborative
The iPlant Collaborative is a virtual organization created by a cooperative agreement funded by the US National Science Foundation (NSF) to create cyberinfrastructure for the plant sciences (botany). The NSF compared cyberinfrastructure to physical infrastructure, "... the distributed computer, information and communication technologies combined with the personnel and integrating components that provide a long-term platform to empower the modern scientific research endeavor."

The project develops computing systems and software that combine computing resources, such as those of TeraGrid, with bioinformatics and computational biology software. Its goal is easier collaboration among researchers with improved data access and processing efficiency. Primarily centered in the United States, it collaborates internationally.

History

Biology is relying more and more on computers, and plant biology is changing with the rise of new technologies. With the advent of bioinformatics, computational biology, DNA sequencing, geographic information systems and other technologies, computers can greatly assist researchers who study plant life and look for solutions to challenges in medicine, biofuels, biodiversity, and agriculture, and to problems like drought tolerance, plant breeding, and sustainable farming. Many of these problems cross traditional disciplines, so facilitating collaboration between plant scientists of diverse backgrounds and specialties is necessary.

In 2006, the NSF solicited proposals to create "a new type of organization – a cyberinfrastructure collaborative for plant science" through a program titled "Plant Science Cyberinfrastructure Collaborative" (PSCIC), with Christopher Greer as program director. A proposal was accepted (adopting the convention of using the word "Collaborative" as a noun) and iPlant was officially created on February 1, 2008.
Funding was estimated at $10 million per year over five years.

Richard Jorgensen led the team through the proposal stage and was the principal investigator (PI) from 2008 to 2009. Gregory Andrews, Vicki Chandler, Sudha Ram and Lincoln Stein served as Co-Principal Investigators (Co-PIs) from 2008 to 2009. In late 2009, Stephen Goff was named PI and Daniel Stanzione was added as a Co-PI.

The iPlant project supports what has been called e-Science, a use of information systems technology that is being adopted by the research community in efforts such as the National Center for Ecological Analysis and Synthesis (NCEAS), ELIXIR, and the Bamboo Technology Project that started in September 2010. iPlant is "designed to create the foundation to support the computational needs of the research community and facilitate progress toward solutions of major problems in plant biology."

The project works as a collaboration. It seeks input from the wider plant science community on what to build.
Based on that input, it has enabled easier use of large data sets, created a community-driven research environment for sharing existing data collections within and between research areas, and shared data with provenance tracking.
One model studied for collaboration was Wikipedia.

Several more recent National Science Foundation awards mentioned iPlant explicitly in their descriptions, as either a design pattern to follow or a collaborator with whom the recipient will work.

Institutions

The primary institution for the iPlant project is the University of Arizona, where the project is located within the BIO5 Institute in Tucson. Since the project's inception in 2008, personnel have also worked at other institutions, including Cold Spring Harbor Laboratory, the University of North Carolina Wilmington, and the Texas Advanced Computing Center at the University of Texas at Austin.
Purdue University and Arizona State University were part of the original project group.

Other collaborating institutions that received support from iPlant for their work on a Grand Challenge in phylogenetics, starting in March 2009, included Yale University, the University of Florida, and the University of Pennsylvania.

A trait evolution group was led at the University of Tennessee.
A visualization project added Virginia Polytechnic Institute and State University (Virginia Tech).

The NSF requires that funding subcontracts stay within the United States, but international collaboration started with the Technical University Munich in 2009 and the University of Toronto in 2010.
East Main Educational Consulting provides external oversight, advice, and assistance.

Services

The iPlant project makes its cyberinfrastructure available in several different ways and offers services to make it accessible to its primary audience. The design was meant to grow in response to the needs of the research community it serves.

The Discovery Environment

The Discovery Environment integrates community-recommended software tools into a system that can handle terabytes of data, using high-performance supercomputers to perform analyses much more quickly. Its interface is designed to hide the underlying complexity from the end user. The goal was to make the cyberinfrastructure available to non-technical end users who are less comfortable with a command-line interface.

iPlant Foundational APIs

A set of application programming interfaces (APIs) allows developers to access iPlant services, including authentication, data management, and high-performance supercomputing resources, from custom, locally produced software.

Atmosphere

Atmosphere is a cloud computing platform that provides easy access to pre-configured, frequently used analysis routines, relevant algorithms, and data sets, and accommodates computationally and data-intensive bioinformatics tasks.
It uses the Eucalyptus virtualization platform.

iPlant Semantic Web

The iPlant Semantic Web effort uses an iPlant-created architecture, protocol, and platform called the Simple Semantic Web Architecture and Protocol (SSWAP) for semantic web linking, based on a plant-science-focused ontology.

Taxonomic Name Resolution Service

The Taxonomic Name Resolution Service (TNRS) is a free utility for correcting and standardizing plant names. This is needed because plant names that are misspelled, out of date (because a newer synonym is preferred), or incomplete make it impossible to use computers to process large lists.
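
The kind of correction described above can be illustrated with a minimal sketch. This is not the TNRS implementation or its interface: the reference list, the synonym table, the function name resolve_name, and the 0.8 similarity cutoff are all assumptions chosen for illustration, and the fuzzy matching simply uses Python's standard difflib module.

```python
from difflib import get_close_matches

# Hypothetical reference data (assumption): a real service would consult
# large curated taxonomic databases rather than a hard-coded list.
ACCEPTED = ["Zea mays", "Arabidopsis thaliana", "Solanum lycopersicum"]
SYNONYMS = {"Lycopersicon esculentum": "Solanum lycopersicum"}


def resolve_name(raw, cutoff=0.8):
    """Return (accepted_name, status) for a submitted plant name.

    status is "exact" or "fuzzy", with ", synonym" appended when an
    out-of-date name was mapped to its preferred name, or "no match".
    """
    # Standardize whitespace and capitalization ("zea  MAYS" -> "Zea mays").
    name = " ".join(raw.split())
    if not name:
        return None, "no match"
    name = name[0].upper() + name[1:].lower()

    candidates = ACCEPTED + list(SYNONYMS)
    if name in candidates:
        matched, status = name, "exact"
    else:
        # Tolerate misspellings by picking the closest known name, if any.
        close = get_close_matches(name, candidates, n=1, cutoff=cutoff)
        if not close:
            return None, "no match"
        matched, status = close[0], "fuzzy"

    # Replace an out-of-date synonym with the preferred name.
    if matched in SYNONYMS:
        return SYNONYMS[matched], status + ", synonym"
    return matched, status
```

With this toy data, resolve_name("zea  mais") returns ("Zea mays", "fuzzy"), and resolve_name("Lycopersicon esculentum") returns ("Solanum lycopersicum", "exact, synonym"). Returning a status alongside the name lets a user review fuzzy corrections in a large list rather than accept them blindly.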

My-Plant

My-Plant.org is a social networking community for plant biologists, educators and others to come together to share information and research, collaborate, and track the latest developments in plant science.
The My-Plant network uses the term "clades" to group users, in a manner similar to the phylogenetics of the plants themselves.
It was implemented using Drupal as its content management system.

DNA Subway

The DNA Subway website uses a graphical user interface (GUI) to generate DNA sequence annotations, explore plant genomes for members of gene and transposon families, and conduct phylogenetic analyses. It makes high-level DNA analysis available to faculty and students by simplifying annotation and comparative genomics workflows.
It was developed for iPlant by the Dolan DNA Learning Center.

The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.
 