Scalability
In electronics (including hardware, communication and software), scalability is the ability of a system, network, or process to handle a growing amount of work in a graceful manner, or its ability to be enlarged to accommodate that growth. For example, it can refer to the capability of a system to increase total throughput under an increased load when resources (typically hardware) are added. An analogous meaning is implied when the word is used in a commercial context, where scalability of a company implies that the underlying business model offers the potential for economic growth within the company.
Scalability, as a property of systems, is generally difficult to define, and in any particular case it is necessary to define the specific requirements for scalability on those dimensions that are deemed important. It is a highly significant issue in electronic systems, databases, routers, and networking. A system whose performance improves after adding hardware, proportionally to the capacity added, is said to be a scalable system. An algorithm, design, networking protocol, program, or other system is said to scale if it is suitably efficient and practical when applied to large situations (e.g. a large input data set or a large number of participating nodes in the case of a distributed system). If the design fails when the quantity increases, it does not scale.
The concept of scalability is desirable in technology as well as business settings. The base concept is consistent: the ability for a business or technology to accept increased volume without impacting the contribution margin (= revenue - variable costs). For example, a given piece of equipment may have a capacity of 1 to 1,000 users, and beyond 1,000 users additional equipment is needed, or performance will decline (variable costs will increase and reduce contribution margin).
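As a rough, purely hypothetical illustration of this arithmetic, the following Python sketch (with made-up prices, per-user costs, and a 1,000-user equipment capacity) shows how the contribution margin dips each time user volume crosses a capacity threshold and additional equipment must be bought:

import math

# Hypothetical figures only: contribution margin (= revenue - variable costs),
# treating each extra piece of equipment as a step cost, as in the passage above.
PRICE_PER_USER = 10.0          # assumed revenue per user
VARIABLE_COST_PER_USER = 4.0   # assumed per-user variable cost
EQUIPMENT_CAPACITY = 1000      # users one piece of equipment can handle
EQUIPMENT_COST = 3000.0        # assumed cost of each piece of equipment

def contribution_margin(users: int) -> float:
    revenue = users * PRICE_PER_USER
    variable_costs = users * VARIABLE_COST_PER_USER
    equipment_units = math.ceil(users / EQUIPMENT_CAPACITY)  # extra unit needed past capacity
    return revenue - variable_costs - equipment_units * EQUIPMENT_COST

for users in (500, 1000, 1001, 2000):
    print(f"{users:>5} users -> contribution margin {contribution_margin(users):>8.2f}")
# At 1,001 users a second piece of equipment is required, so the margin drops
# before recovering as the added capacity fills up.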
Measures
Scalability can be measured in various dimensions, such as:
- Administrative scalability: The ability for an increasing number of organizations to easily share a single distributed system.
- Functional scalability: The ability to enhance the system by adding new functionality at minimal effort.
- Geographic scalability: The ability to maintain performance, usefulness, or usability regardless of expansion from concentration in a local area to a more distributed geographic pattern.
- Load scalability: The ability for a distributed system to easily expand and contract its resource pool to accommodate heavier or lighter loads. Alternatively, the ease with which a system or component can be modified, added, or removed, to accommodate changing load.
Examples
- A routing protocol is considered scalable with respect to network size if the size of the necessary routing table on each node grows as O(log N), where N is the number of nodes in the network (see the sketch after this list).
- A scalable online transaction processing system or database management system is one that can be upgraded to process more transactions by adding new processors, devices and storage, and which can be upgraded easily and transparently without shutting it down.
- Some early peer-to-peer (P2P) implementations of Gnutella had scaling issues. Each node query flooded its requests to all peers. The demand on each peer would increase in proportion to the total number of peers, quickly overrunning the peers' limited capacity. Other P2P systems like BitTorrent scale well because the demand on each peer is independent of the total number of peers. There is no centralized bottleneck, so the system may expand indefinitely without the addition of supporting resources (other than the peers themselves).
- The distributed nature of the Domain Name System allows it to work efficiently even when all hosts on the worldwide Internet are served, so it is said to "scale well".
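To make the growth rates in these examples concrete, the following Python sketch (with arbitrary network sizes) contrasts a per-node cost that grows as O(log N), as with a scalable routing table, against a query-flooding cost that grows roughly in proportion to N:

import math

# Hypothetical per-node cost as the network grows: an O(log N) routing table
# scales gracefully, while query flooding's per-peer demand grows with N.
for n in (10, 100, 1_000, 10_000, 100_000):
    routing_table_entries = math.ceil(math.log2(n))  # O(log N): scales well
    flooded_messages_per_peer = n - 1                # O(N): does not scale
    print(f"N={n:>7}  routing table ~{routing_table_entries:>3} entries   "
          f"flooding ~{flooded_messages_per_peer:>6} messages per peer")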
Scale horizontally vs. vertically
Methods of adding more resources for a particular application fall into two broad categories:
Scale horizontally (scale out)
To scale horizontally (or scale out) means to add more nodes to a system, such as adding a new computer to a distributed software application. An example might be scaling out from one Web server system to three.

As computer prices drop and performance continues to increase, low-cost "commodity" systems can be used for high-performance computing applications such as seismic analysis and biotechnology workloads that could in the past only be handled by supercomputers. Hundreds of small computers may be configured in a cluster to obtain aggregate computing power that often exceeds that of single traditional RISC-processor-based scientific computers. This model has been further fueled by the availability of high-performance interconnects such as Myrinet and InfiniBand technologies. It has also led to demand for features such as remote maintenance and batch processing management previously not available for "commodity" systems.
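A minimal sketch of the scale-out idea, assuming a hypothetical web tier with placeholder node names: requests are spread round-robin over a list of server nodes, and capacity is added simply by appending another node.

import itertools

# Minimal round-robin dispatcher over a pool of (hypothetical) web servers.
# Scaling out means appending another node; no single node has to get faster.
class WebTier:
    def __init__(self, nodes):
        self.nodes = list(nodes)
        self._cycle = itertools.cycle(self.nodes)

    def add_node(self, node):
        """Scale out by adding one more server to the pool."""
        self.nodes.append(node)
        self._cycle = itertools.cycle(self.nodes)  # rebuild the rotation

    def route(self, request):
        node = next(self._cycle)
        return f"{node} handles {request}"

tier = WebTier(["web1"])   # one web server...
tier.add_node("web2")      # ...scaled out to three
tier.add_node("web3")
for i in range(6):
    print(tier.route(f"request-{i}"))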
The scale-out model has created an increased demand for shared data storage with very high I/O performance, especially where processing of large amounts of data is required, such as in seismic analysis. This has fueled the development of new storage technologies such as object storage devices.
Scale vertically (scale up)
To scale vertically (or scale up) means to add resources to a single node in a system, typically involving the addition of CPUs or memory to a single computer. Such vertical scaling of existing systems also enables them to use virtualization technology more effectively, as it provides more resources for the hosted set of operating system and application modules to share.

Taking advantage of such resources can also be called "scaling up", such as expanding the number of Apache daemon processes currently running.
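As a rough software-level sketch of scaling up, the following Python example (the task and the worker sizing are placeholders) sizes a process pool to however many CPUs the single node currently has, so adding CPUs to the machine lets the same program run more concurrent workers, much as a bigger box can run more Apache daemon processes:

import os
from multiprocessing import Pool

def handle(task: int) -> int:
    return task * task  # placeholder for real per-request work

if __name__ == "__main__":
    # The pool grows automatically when the node is scaled up with more CPUs.
    workers = os.cpu_count() or 1
    with Pool(processes=workers) as pool:
        results = pool.map(handle, range(16))
    print(f"{workers} workers handled {len(results)} tasks")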
Tradeoffs
There are tradeoffs between the two models. Larger numbers of computers mean increased management complexity, as well as a more complex programming model and issues such as throughput and latency between nodes; also, some applications do not lend themselves to a distributed computing model. In the past, the price difference between the two models has favored "scale out" computing for those applications that fit its paradigm, but recent advances in virtualization technology have blurred that advantage, since deploying a new virtual system over a hypervisor (where possible) is almost always less expensive than actually buying and installing a real one. Configuring an existing idle system has always been less expensive than buying, installing, and configuring a new one, regardless of the model.
Database scalability
A number of different approaches enable databases to grow to very large size while supporting an ever-increasing rate of transactions per second. Not to be discounted, of course, is the rapid pace of hardware advances in both the speed and capacity of mass storage devices, as well as similar advances in CPU and networking speed. Beyond that, a variety of architectures are employed in the implementation of very large-scale databases.
One technique supported by most of the major database management system (DBMS) products is the partitioning of large tables based on ranges of values in a key field. In this manner, the database can be scaled out across a cluster of separate database servers. Also, with the advent of 64-bit microprocessors, multi-core CPUs, and large SMP multiprocessors, DBMS vendors have been at the forefront of supporting multi-threaded implementations that substantially scale up transaction processing capacity.
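A minimal sketch of range-based partitioning of the kind described above, with hypothetical key ranges and server names; a real DBMS does this declaratively, but the routing idea is the same:

import bisect

# Hypothetical range partitioning: each row is routed to the database server
# whose key range contains the row's partition key.
RANGE_BOUNDS = [1_000_000, 2_000_000, 3_000_000]   # exclusive upper bound per range
SERVERS = ["dbserver1", "dbserver2", "dbserver3"]  # owner of each range

def server_for_key(key: int) -> str:
    """Pick the database server whose key range contains `key`."""
    idx = bisect.bisect_right(RANGE_BOUNDS, key)
    if idx >= len(SERVERS):
        raise ValueError(f"key {key} is outside all configured ranges")
    return SERVERS[idx]

for key in (42, 1_500_000, 2_999_999):
    print(key, "->", server_for_key(key))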
Network-attached storage (NAS) and storage area networks (SANs), coupled with fast local area networks and Fibre Channel technology, enable still larger, more loosely coupled configurations of databases and distributed computing power. The widely supported X/Open XA standard employs a global transaction monitor to coordinate distributed transactions among semi-autonomous XA-compliant database resources. Oracle RAC uses a different model to achieve scalability, based on a "shared-everything" architecture that relies upon high-speed connections between servers.
While DBMS vendors debate the relative merits of their favored designs, some companies and researchers question the inherent limitations of relational database management systems. GigaSpaces, for example, contends that an entirely different model of distributed data access and transaction processing, named space-based architecture, is required to achieve the highest performance and scalability. On the other hand, Base One makes the case for extreme scalability without departing from mainstream database technology. In either case, there appears to be no limit in sight to database scalability.
Design for scalability
It is often advised to focus system design on hardware scalability rather than on capacity. It is typically cheaper to add a new node to a system in order to achieve improved performance than to partake in performance tuning to improve the capacity that each node can handle. But this approach can have diminishing returns (as discussed in performance engineering). For example: suppose 70% of a program can be sped up if parallelized and run on multiple CPUs instead of one. If α is the fraction of a calculation that is sequential, and 1 − α is the fraction that can be parallelized, the maximum speedup that can be achieved by using P processors is given according to Amdahl's law: S(P) = 1 / (α + (1 − α)/P). Substituting the values for this example (α = 0.3), using 4 processors we get S(4) = 1 / (0.3 + 0.7/4) ≈ 2.1. If we double the compute power to 8 processors, we get S(8) = 1 / (0.3 + 0.7/8) ≈ 2.6. Doubling the processing power has only improved the speedup by roughly one-fifth. If the whole problem were parallelizable, we would, of course, expect the speedup to double as well. Therefore, throwing in more hardware is not necessarily the optimal approach.
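The arithmetic above can be checked with a few lines of Python implementing Amdahl's law for this example (sequential fraction alpha = 0.3):

# Amdahl's law: speedup S(P) = 1 / (alpha + (1 - alpha) / P),
# where alpha is the sequential fraction of the work.
def amdahl_speedup(alpha: float, processors: int) -> float:
    return 1.0 / (alpha + (1.0 - alpha) / processors)

ALPHA = 0.3  # 70% of the program is parallelizable
for p in (4, 8):
    print(f"P={p}: speedup ~ {amdahl_speedup(ALPHA, p):.3f}")
# P=4: speedup ~ 2.105
# P=8: speedup ~ 2.581  -> only about one-fifth better despite doubling the hardware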
Weak versus strong scaling
In the context of high performance computing there are two common notions of scalability. The first is strong scaling, which is defined as how the solution time varies with the number of processors for a fixed total problem size. The second is weak scaling, which is defined as how the solution time varies with the number of processors for a fixed problem size per processor.
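The two notions can be illustrated with a small Python sketch under an assumed, idealized timing model T = t_serial + t_parallel/P (real measurements would also include communication overhead): strong scaling holds the total problem size fixed, while weak scaling grows it with the processor count.

# Illustrative timing model, not measured data: T = t_serial + t_parallel / P.
T_SERIAL = 1.0              # assumed seconds of unavoidable serial work
T_PARALLEL_PER_UNIT = 10.0  # assumed seconds of parallelizable work per unit of problem

def strong_scaling_time(p: int) -> float:
    # Fixed total problem size (one unit), spread over p processors.
    return T_SERIAL + T_PARALLEL_PER_UNIT / p

def weak_scaling_time(p: int) -> float:
    # Problem size grows with p (one unit per processor): per-processor work is fixed.
    return T_SERIAL + (T_PARALLEL_PER_UNIT * p) / p

for p in (1, 4, 16, 64):
    speedup = strong_scaling_time(1) / strong_scaling_time(p)      # saturates (Amdahl)
    efficiency = weak_scaling_time(1) / weak_scaling_time(p)       # stays near 1 in this ideal model
    print(f"P={p:>3}  strong-scaling speedup={speedup:5.2f}  weak-scaling efficiency={efficiency:4.2f}")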
See also
- Amdahl's law
- Asymptotic complexity
- Gustafson's law
- List of system quality attributes
- Load balancing (computing)
- Lock (computer science)
- NoSQL
- Parallelism in computing
- Performance engineering
- Scalable Video Coding (SVC)
- Similitude (model)
- Space-based architecture (SBA)
External links
- Architecture of a Highly Scalable NIO-Based Server - an article about writing a scalable server in Java (java.net).
- Links to diverse learning resources - page curated by the memcached project.
- Scalable Definition - by The Linux Information Project (LINFO)