Shared nothing architecture
Encyclopedia
A shared nothing architecture (SN) is a distributed computing
Distributed computing
Distributed computing is a field of computer science that studies distributed systems. A distributed system consists of multiple autonomous computers that communicate through a computer network. The computers interact with each other in order to achieve a common goal...

 architecture in which each node is independent and self-sufficient, and there is no single point of contention across the system. More specifically, none of the nodes share memory or disk storage.

People typically contrast SN with systems that keep a large amount of centrally-stored state
State (computer science)
In computer science and automata theory, a state is a unique configuration of information in a program or machine. It is a concept that occasionally extends into some forms of systems programming such as lexers and parsers....

 information, whether in a database
Database
A database is an organized collection of data for one or more purposes, usually in digital form. The data are typically organized to model relevant aspects of reality , in a way that supports processes requiring this information...

, an application server
Application server
An application server is a software framework that provides an environment in which applications can run, no matter what the applications are or what they do...

, or any other similar single point of contention. While SN is best known in the context of web
World Wide Web
The World Wide Web is a system of interlinked hypertext documents accessed via the Internet...

 development, the concept predates the web: Michael Stonebraker
Michael Stonebraker
Michael Ralph Stonebraker is a computer scientist specializing in database research.Through a series of academic prototypes and commercial startups, Stonebraker's research and products are central to many relational database systems on the market today...

 at the University of California, Berkeley
University of California, Berkeley
The University of California, Berkeley , is a teaching and research university established in 1868 and located in Berkeley, California, USA...

 used the term in a 1986 database paper. In it he mentions existing commercial implementations of the architecture (although none are named explicitly). Teradata
Teradata
Teradata Corporation is a vendor specializing in data warehousing and analytic applications. Its products are commonly used by companies to manage data warehouses for analytics and business intelligence purposes. Teradata was formerly a division of NCR Corporation, with the spinoff from NCR on...

, which delivered its first system in 1983, was probably one of those commercial implementations.

Shared nothing is popular for web development because of its scalability
Scalability
In electronics scalability is the ability of a system, network, or process, to handle growing amount of work in a graceful manner or its ability to be enlarged to accommodate that growth...

. As Google
Google
Google Inc. is an American multinational public corporation invested in Internet search, cloud computing, and advertising technologies. Google hosts and develops a number of Internet-based services and products, and generates profit primarily from advertising through its AdWords program...

 has demonstrated, a pure SN system can scale almost infinitely simply by adding nodes in the form of inexpensive computers, since there is no single bottleneck to slow the system down. Google calls this sharding. An SN system typically partitions its data among many nodes on different databases (assigning different computers to deal with different users or queries), or may require every node to maintain its own copy of the application's data, using some kind of coordination protocol. This is often referred to as database sharding.

There is some doubt about whether a web application with many independent web nodes but a single, shared database (clustered or otherwise) should be counted as SN. One of the approaches to achieve SN architecture for stateful applications (which typically maintain state in a centralized database
Centralized database
A Centralized database is a database located and maintained in one location, unlike a distributed database. One main advantage is that all data is located in one place. The disadvantage is that bottlenecks may occur....

) is the use of a data grid, also known as distributed caching. This still leaves the centralized database as a single point of failure.

Shared nothing architectures have become prevalent in the data warehousing space. There is much debate as to whether the shared nothing approach is superior to shared Disk with sound arguments presented by both camps. Shared nothing architectures certainly take longer to respond to queries that involve joins over large data sets from different partitions (machines). However the potential for scaling is huge.

What Is Shared

While there is no single point of contention within the software/hardware components of SN systems, it should be noted that information from disparate nodes still needs to be reintegrated at some point. Such points occur wherever an information system that is outside the SN architecture queries information from disparate nodes within the SN architecture for a single purpose. Examples of such external nodes might be:
  1. persons (minds) who look at two SN nodes and decide that they hold or process data about the same thing (simply recognising that two nodes belong to the same SN system would be sufficient)
  2. any software/hardware system that is written to query different nodes within the SN architecture

See also

  • Oracle RAC
    Oracle RAC
    In database computing, Oracle Real Application Clusters — an option for the Oracle Database software produced by Oracle Corporation and introduced in 2001 with Oracle9i — provides software for clustering and high availability in Oracle database environments...

     (Shared Everything)
  • Ad hoc networking
  • Ambient network
    Ambient network
    Ambient Networks is a network integration design that seeks to solve problems relating to switching between networks to maintain contact with the outside world...

  • Byzantine fault tolerance
    Byzantine fault tolerance
    Byzantine fault tolerance is a sub-field of fault tolerance research inspired by the Byzantine Generals' Problem, which is a generalized version of the Two Generals' Problem....

  • Client–server model
  • Comparison of P2P applications
  • Computer cluster
  • Decentralized computing
    Decentralized computing
    Decentralized computing is the allocation of resources, both hardware and software, to each individual workstation, or office location. In contrast, centralized computing exists when the majority of functions are carried out, or obtained from a remote centralized location. Decentralized computing...

  • Distributed hash table
    Distributed hash table
    A distributed hash table is a class of a decentralized distributed system that provides a lookup service similar to a hash table; pairs are stored in a DHT, and any participating node can efficiently retrieve the value associated with a given key...

     (DHT)
  • File sharing
    File sharing
    File sharing is the practice of distributing or providing access to digitally stored information, such as computer programs, multimedia , documents, or electronic books. It may be implemented through a variety of ways...

  • Friend-to-friend
    Friend-to-friend
    A friend-to-friend computer network is a type of peer-to-peer network in which users only make direct connections with people they know. Passwords or digital signatures can be used for authentication....

     (F2F)
  • Friend-to-friend with third party storage
  • Greenplum
    Greenplum
    Greenplum is a database software company in San Mateo, California, specializing in enterprise data cloud solutions for large-scale data warehousing and analytics...

  • Grid computing
    Grid computing
    Grid computing is a term referring to the combination of computer resources from multiple administrative domains to reach a common goal. The grid can be thought of as a distributed system with non-interactive workloads that involve a large number of files...

  • IBM DB2
    IBM DB2
    The IBM DB2 Enterprise Server Edition is a relational model database server developed by IBM. It primarily runs on Unix , Linux, IBM i , z/OS and Windows servers. DB2 also powers the different IBM InfoSphere Warehouse editions...

     (EEE and DPF)
  • MySQL Cluster
    MySQL Cluster
    MySQL Cluster is a technology which provides shared-nothing clustering capabilities for the MySQL database management system. It was first included in the production release of MySQL 4.1 in November 2004. It is designed to provide high availability and high performance, while allowing for nearly...

  • Overlay network
    Overlay network
    An overlay network is a computer network which is built on the top of another network. Nodes in the overlay can be thought of as being connected by virtual or logical links, each of which corresponds to a path, perhaps through many physical links, in the underlying network...

  • Private peer-to-peer
  • Servent
    Servent
    In general a servent is a peer-to-peer network node, which has the functionalities of both a server and a client. This is a portmanteau derived from the terms server and client; it is a play on the word "servant"....

  • Swarm intelligence
    Swarm intelligence
    Swarm intelligence is the collective behaviour of decentralized, self-organized systems, natural or artificial. The concept is employed in work on artificial intelligence...

  • TREX
    TREX search engine
    TREX is a search engine in the SAP NetWeaver integrated technology platform produced by SAP AG. The TREX engine is a standalone component that can be used in a range of system environments but is used primarily as an integral part of such SAP products as Enterprise Portal, Knowledge Warehouse, and...

     (search engine in the SAP NetWeaver
    NetWeaver
    SAP NetWeaver is SAP's integrated technology computing platform and is the technical foundation for many SAP applications since the SAP Business Suite. SAP NetWeaver is marketed as a service-oriented application and integration platform...

     integrated technology platform produced by SAP AG
    SAP AG
    SAP AG is a German software corporation that makes enterprise software to manage business operations and customer relations. Headquartered in Walldorf, Baden-Württemberg, with regional offices around the world, SAP is the market leader in enterprise application software...

    )
The source of this article is wikipedia, the free encyclopedia.  The text of this article is licensed under the GFDL.
 
x
OK