Genome@home
Genome@home was a distributed computing project run by Stefan Larson of Stanford University, and a sister project to Folding@home. Its goal was protein design and its applications, which had implications in many fields including medicine. Genome@home was run by the Pande Lab at Stanford University, a non-profit institution dedicated to science research and education.
Function
Following the Human Genome Project, scientists needed to know the biological and medical implications of the resulting wealth of genetic information. Genome@home used spare processing power on personal computers to virtually design genes that match existing proteins, although it could also design new proteins that have not been found in nature. This process is computationally demanding, which made distributed computing a practical approach. Researchers can use the results from the project to gain a better understanding of the evolution of natural genomes and proteins, and of their functionality. The project had applications in medical therapy, new pharmaceuticals, and assigning functions to newly sequenced genes.
Genome@home directly studied genomes and proteins by virtually designing new sequences for existing 3-D protein structures, which other scientists obtained through X-ray crystallography or NMR techniques. By understanding the relationship between the sequences and specific protein structures, the Pande Lab tackled contemporary issues in structural biology, genetics, and medicine.
Specifically, the Genome@home project aided the understanding of why thousands of different amino acid sequences all form the same structures, and assisted the fields of proteomics and structural genomics by predicting the functions of newly discovered genes and proteins. It also had implications in medical therapy by designing and virtually creating new versions of existing proteins.
Genome@home's software was designed for uniprocessor systems. It began with a large set of potential sequences and repeatedly searched through and refined these sequences until a well-designed sequence was found. It then sent this sequence to the server and repeated the process.
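The search loop described above can be illustrated with a minimal Python sketch. It is not Genome@home's actual software: the scoring function is a toy stand-in that only checks whether hydrophobic residues sit at buried positions, and the names `score`, `mutate`, `design_sequence`, `run_client`, and the `submit` upload hook are all hypothetical. It only shows the general shape of the idea: start from a pool of candidate sequences, refine it repeatedly, report the best result, and start again.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids (one-letter codes)

# Toy stand-in for a structure-based scoring function (an assumption, not the
# project's method): buried sites ("B") prefer hydrophobic residues, exposed
# sites ("E") prefer the rest.
HYDROPHOBIC = set("AFILMVWY")


def score(sequence, burial_pattern):
    """Count positions whose residue type suits the burial pattern (higher is better)."""
    return sum(
        (residue in HYDROPHOBIC) == (site == "B")
        for residue, site in zip(sequence, burial_pattern)
    )


def mutate(sequence, rng):
    """Return a copy of the sequence with one randomly chosen position replaced."""
    position = rng.randrange(len(sequence))
    return sequence[:position] + rng.choice(AMINO_ACIDS) + sequence[position + 1:]


def design_sequence(burial_pattern, pool_size=50, max_rounds=1000, seed=0):
    """Refine a pool of candidates until one fits every position or rounds run out."""
    rng = random.Random(seed)
    length = len(burial_pattern)
    pool = [
        "".join(rng.choice(AMINO_ACIDS) for _ in range(length))
        for _ in range(pool_size)
    ]
    for _ in range(max_rounds):
        # Rank the pool, stop if the best candidate is a perfect fit.
        pool.sort(key=lambda s: score(s, burial_pattern), reverse=True)
        if score(pool[0], burial_pattern) == length:
            break
        # Keep the better half and refill the pool with mutated copies of it.
        survivors = pool[: pool_size // 2]
        pool = survivors + [mutate(s, rng) for s in survivors]
    return max(pool, key=lambda s: score(s, burial_pattern))


def run_client(burial_pattern, submit, work_units=3):
    """Design a sequence, pass it to `submit` (hypothetical upload hook), and repeat."""
    for unit in range(work_units):
        submit(design_sequence(burial_pattern, seed=unit))


if __name__ == "__main__":
    pattern = "BEEBBEEBEB"  # made-up burial pattern for a 10-residue structure
    results = []
    run_client(pattern, results.append)
    for sequence in results:
        print(sequence, score(sequence, pattern))
```

Here `results.append` stands in for the upload to the project server. A real design code would replace `score` with a physical energy function evaluated against the target 3-D structure.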
Conclusion
For financial reasons, the project was officially concluded on March 8, 2004, although data was still collected until April 15. Following its completion, users were asked to donate to Folding@home instead.
Results
The project accumulated a large database of protein sequences, which will be used for important scientific purposes for years by the Pande Lab and other scientists across the world. Four peer-reviewed scientific publications have resulted from Genome@home.