Storage Resource Broker
Storage Resource Broker (SRB) is a Data Grid Management System (DGMS) operating in many U.S. and international computational science research projects. SRB is a logical distributed file system based on a client-server architecture which presents users with a single global logical namespace or file hierarchy.
Depending on the "flavor" of the configuration, use patterns, and policies, the SRB creates what is called a data grid, a digital library, persistent archive, and/or distributed file system.
SRB provides a uniform interface to heterogeneous data storage resources over a network. As part of this, it implements a logical namespace (distinct from physical file names) and maintains metadata on data-objects (files), users, groups, resources, collections, and other items in an SRB Metadata Catalog (MCAT) stored in a relational database management system. System and user-defined metadata can be queried to locate files based on attributes as well as by name. SRB runs on various versions of Unix, Linux, and Microsoft Windows.
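The separation between logical names and physical locations, and the lookup of files by metadata attributes, can be illustrated with a minimal Python sketch. This is a toy model only: the class and method names (ToyCatalog, register, resolve, query) and the example paths are hypothetical and do not reflect the actual MCAT schema or any SRB API.

```python
from dataclasses import dataclass, field


@dataclass
class DataObject:
    """One logical file: its physical replicas plus descriptive metadata."""
    replicas: list = field(default_factory=list)   # (resource name, physical path) pairs
    metadata: dict = field(default_factory=dict)   # attribute name -> value


class ToyCatalog:
    """Hypothetical stand-in for a metadata catalog; not the real MCAT."""

    def __init__(self):
        self._objects = {}  # logical path -> DataObject

    def register(self, logical_path, resource, physical_path, **metadata):
        """Record a replica of a logical file and attach metadata to it."""
        obj = self._objects.setdefault(logical_path, DataObject())
        obj.replicas.append((resource, physical_path))
        obj.metadata.update(metadata)

    def resolve(self, logical_path):
        """Map a logical name to every physical location that holds it."""
        return list(self._objects[logical_path].replicas)

    def query(self, **attrs):
        """Find logical paths whose metadata matches all given attributes."""
        return sorted(
            path
            for path, obj in self._objects.items()
            if all(obj.metadata.get(k) == v for k, v in attrs.items())
        )


if __name__ == "__main__":
    catalog = ToyCatalog()
    catalog.register("/home/alice/run1.dat", "unix-disk", "/vault/0001", project="plasma")
    catalog.register("/home/alice/run1.dat", "tape-archive", "/arch/77/abc")
    catalog.register("/home/bob/notes.txt", "unix-disk", "/vault/0002", project="climate")
    print(catalog.query(project="plasma"))
    print(catalog.resolve("/home/alice/run1.dat"))
```

In this sketch users only ever see the logical path; where each replica physically lives is a catalog detail that can change without renaming the file.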
The SRB system is middleware in the sense that it is built on top of other major software packages (various storage systems, real-time data sources, a relational database management system, etc.) and it has callable library functions that can be utilized by higher level software. However, it is more complete than many middleware software systems as it implements a comprehensive distributed data management environment, including various end-user client applications. It has features to support the management and collaborative (and controlled) sharing, publication, replication, transfer, and preservation of distributed data collections.
SRB is sometimes used in conjunction with grid computing systems, such as Globus Alliance, and can utilize the Globus Alliance Grid Security Infrastructure (GSI) authentication system.
SRB can store and retrieve data in archival storage systems such as HPSS and SAM-FS, on disk file systems (Unix, Linux, or Windows), as Binary Large Objects or tabular data in relational database management systems, and on tape libraries.
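The idea of one interface in front of several kinds of storage can be sketched as a small driver abstraction in Python. This is an illustration under stated assumptions, not SRB's design: StorageDriver, DiskDriver, BlobDriver, and ToyBroker are hypothetical names, and an in-memory SQLite database stands in for a relational database holding Binary Large Objects.

```python
import sqlite3
import tempfile
from abc import ABC, abstractmethod
from pathlib import Path


class StorageDriver(ABC):
    """Common interface that every (hypothetical) storage backend implements."""

    @abstractmethod
    def put(self, key: str, data: bytes) -> None: ...

    @abstractmethod
    def get(self, key: str) -> bytes: ...


class DiskDriver(StorageDriver):
    """Stores each object as a file under a root directory."""

    def __init__(self, root):
        self.root = Path(root)

    def put(self, key, data):
        (self.root / key).write_bytes(data)

    def get(self, key):
        return (self.root / key).read_bytes()


class BlobDriver(StorageDriver):
    """Stores each object as a BLOB row in a relational database (SQLite here)."""

    def __init__(self):
        self.conn = sqlite3.connect(":memory:")
        self.conn.execute("CREATE TABLE blobs (key TEXT PRIMARY KEY, data BLOB)")

    def put(self, key, data):
        self.conn.execute("INSERT OR REPLACE INTO blobs VALUES (?, ?)", (key, data))

    def get(self, key):
        row = self.conn.execute("SELECT data FROM blobs WHERE key = ?", (key,)).fetchone()
        return row[0]


class ToyBroker:
    """Routes the same store/retrieve calls to whichever resource is named."""

    def __init__(self, drivers):
        self.drivers = dict(drivers)

    def store(self, resource, key, data):
        self.drivers[resource].put(key, data)

    def retrieve(self, resource, key):
        return self.drivers[resource].get(key)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        broker = ToyBroker({"disk": DiskDriver(tmp), "db": BlobDriver()})
        for resource in ("disk", "db"):
            broker.store(resource, "sample.bin", b"\x00\x01")
            print(resource, broker.retrieve(resource, "sample.bin"))
```

The calling code is identical for both resources; only the driver behind the resource name differs, which is the property a uniform storage interface is meant to provide.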
The SRB has been used in production since 1997. Globally, the SRB was estimated to be managing over two petabytes of data as of 2008. That number is expected to grow at a rate of one petabyte a year. UCSD is currently managing over one petabyte of data in over 200 million files. Many other computer centers and consortia are independently managing additional SRB data collections.
While licensed, SRB source distributions are freely available to academic and non-profit organizations. Nirvana SRB, a commercial version of SRB, features capabilities specifically adapted to government and commercial use.
The Integrated Rule-Oriented Data System (iRODS) is a follow-on project of the SDSC SRB team (which is now the Data Intensive Cyber Environments (DICE) group), and now largely replaces the use of SDSC SRB in research and academic communities. iRODS is based on SRB concepts but was completely rewritten, includes a highly configurable Rule Engine at its core, and is fully open source.
History
SRB development began in 1995, through the cooperative efforts of General Atomics, the Data Intensive Cyber Environments Group (DICE), and the San Diego Supercomputer Center (SDSC) at the University of California, San Diego (UCSD) with the support of the National Science Foundation (NSF).
SRB software, or middleware, builds on the work of Dr. Reagan Moore. Dr. Moore, who holds a doctorate in plasma physics from UCSD and is a former computational plasma physicist at General Atomics, has been with the San Diego Supercomputer Center since its inception.
In 2003, General Atomics was granted an exclusive license from UCSD to further develop the capabilities of SRB for use in commercial applications.