Nutritious Rice for the World
Nutritious Rice for the World is a World Community Grid research project in the field of agronomy led by the Samudrala Computational Biology Research Group at the University of Washington. It was launched on May 12, 2008. The objective of this project is to predict the structure of proteins of major strains of rice. The intent is to help farmers breed better rice strains with higher crop yields, promote greater disease and pest resistance, and utilize a full range of bioavailable nutrients that can benefit people around the world, especially in regions where malnutrition is a critical concern.
Determining the structure of proteins is an extremely difficult and expensive process. Though it is possible to computationally predict a protein's structure from its corresponding DNA sequence, there are thousands of distinct proteins found in rice. This presents a computational challenge that a single computer cannot solve within a reasonable timeframe.
Once the entire rice genome had been sequenced, the effort shifted to identifying genes that are involved in increased yield, disease resistance and nutritional value. This problem is made more difficult because very few cereal plants have been sequenced, and therefore many of the rice genes do not resemble any genes of known function. The Computational Biology Research Group at the University of Washington developed the Protinfo software, which can produce protein structures at a fraction of the cost and time of determining them experimentally.
Protinfo is being used to create three-dimensional models of the tens of thousands of rice proteins. These models are then used to predict the function of each protein and to understand the role of the gene that encodes it. The models, and any analysis resulting from examining them, will be housed at the Bioverse database and web server, which is a comprehensive framework to relate molecules such as proteins and DNA to an organism's pathways and systems.
Volunteers' computers on World Community Grid will run the Protinfo software to create models of all proteins encoded by the rice genome whose structure can be predicted reliably. These models will be analyzed to choose the best ones. From the resulting structures, prediction tools will determine the function of each protein and the role of the gene that encodes it. Using the power of Protinfo, World Community Grid will initially examine over 10,000 genes, and produce 100,000 models per gene.
Eventually, the structures of 30,000 to 60,000 proteins will be studied. Generating one billion models on the 320-CPU cluster at the Computational Biology Research Group was anticipated to take about 30 years; using World Community Grid, running at 167 TFLOPS, it took only about two years. The distributed computing function was suspended in April 2010 while in-house analysis of results continues. It will resume when funding is secured for further phases.
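The figures quoted above can be cross-checked with some simple arithmetic. The sketch below is only a back-of-the-envelope illustration, not part of the project's software: it assumes every CPU is fully utilized, a 365-day year, and that volunteer CPUs are roughly comparable to the cluster's CPUs. All function and constant names are hypothetical.

```python
# Back-of-the-envelope check of the figures quoted in the article.
# Assumptions (not from the project): full CPU utilization, 365-day years,
# and volunteer CPUs comparable to the cluster's CPUs.

GENES = 10_000              # genes examined initially
MODELS_PER_GENE = 100_000   # models produced per gene
CLUSTER_CPUS = 320          # size of the in-house cluster
CLUSTER_YEARS = 30          # estimated run time on the cluster
GRID_YEARS = 2              # approximate run time on World Community Grid
SECONDS_PER_YEAR = 365 * 24 * 3600


def total_models() -> int:
    """Total number of models: genes times models per gene."""
    return GENES * MODELS_PER_GENE


def cpu_seconds_per_model() -> float:
    """Average CPU time per model implied by the cluster estimate."""
    cpu_years = CLUSTER_CPUS * CLUSTER_YEARS
    return cpu_years * SECONDS_PER_YEAR / total_models()


def grid_speedup() -> float:
    """How many times faster the grid was than the cluster estimate."""
    return CLUSTER_YEARS / GRID_YEARS


def equivalent_cluster_cpus() -> float:
    """Cluster CPUs that would be needed to match the grid's two years."""
    return CLUSTER_CPUS * grid_speedup()


if __name__ == "__main__":
    print(f"total models:          {total_models():,}")
    print(f"CPU seconds per model: {cpu_seconds_per_model():.1f}")
    print(f"grid speedup:          {grid_speedup():.0f}x")
    print(f"equivalent CPUs:       {equivalent_cluster_cpus():,.0f}")
```

Under these assumptions, 10,000 genes at 100,000 models each gives the one billion models mentioned in the text, each model costs on the order of five CPU-minutes, and the grid delivered roughly the throughput of fifteen such clusters running continuously.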
The resulting knowledge base will hopefully lead to the development of improved hybrids of rice strains with higher yield, greater disease and pest resistance, and a full range of bioavailable nutrients. This knowledge can also be extended to other food crops such as wheat and maize.
System Requirements
The project has minimum system requirements that a computer doing calculations for the project must meet:
- At least 128 MB RAM (with virtual memory enabled)
- A 200 MB hard disk drive with at least 50 MB available for use
- The ability to display 8-bit graphics at 640x480 resolution
- An Internet connection with a minimum speed of 28 kbit/s
- Operating system: Windows 98, ME, 2000, XP, Vista, Linux, or Mac OS X using an Intel or PowerPC processor