Gizzard (Scala framework)
Gizzard is an open-source sharding framework for building custom fault-tolerant, distributed databases. It was initially used by Twitter and emerged from a wide variety of data storage problems. Gizzard operates as a middleware networking service that runs on the Java Virtual Machine. It partitions data across arbitrary backend datastores so that the data can be accessed efficiently. The partitioning rules are stored in a forwarding table that maps key ranges to partitions. Each partition manages its own replication through a declarative replication tree. Gizzard handles both physical and logical shards: physical shards point to a physical database backend, whereas logical shards are trees of other shards. Gizzard also supports migrations and handles failures gracefully. The system is made eventually consistent by requiring that all write operations be idempotent and commutative, so that a write can be applied late, or more than once, without changing the final result. Operations that fail are retried at a later time. Gizzard is available on GitHub and is licensed under the Apache License.
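
The forwarding table and the shard tree described above can be illustrated with a small Python sketch. This is not Gizzard's API: the class names (PhysicalShard, ReplicatingShard, ForwardingTable) are hypothetical, a plain dict stands in for each backend datastore, and storing each key range by its lower bound is an assumption made for the example.

```python
from bisect import bisect_right


class PhysicalShard:
    """Leaf shard: stands in for one backend datastore (here, a dict)."""

    def __init__(self, name):
        self.name = name
        self.rows = {}

    def write(self, key, value):
        self.rows[key] = value

    def read(self, key):
        return self.rows.get(key)


class ReplicatingShard:
    """Logical shard: a tree node that forwards to its child shards."""

    def __init__(self, children):
        self.children = children

    def write(self, key, value):
        for child in self.children:  # fan the write out to every replica
            child.write(key, value)

    def read(self, key):
        for child in self.children:  # first replica that has the key answers
            value = child.read(key)
            if value is not None:
                return value
        return None


class ForwardingTable:
    """Maps key ranges to shards; each entry covers keys >= its lower bound."""

    def __init__(self, entries):
        # entries: iterable of (lower_bound, shard) pairs
        ordered = sorted(entries, key=lambda entry: entry[0])
        self.bounds = [lower for lower, _ in ordered]
        self.shards = [shard for _, shard in ordered]

    def lookup(self, key):
        index = bisect_right(self.bounds, key) - 1
        if index < 0:
            raise KeyError(f"no shard covers key {key}")
        return self.shards[index]


a1, a2, b1 = PhysicalShard("a1"), PhysicalShard("a2"), PhysicalShard("b1")
table = ForwardingTable([
    (0, ReplicatingShard([a1, a2])),   # keys 0..999, two replicas
    (1000, ReplicatingShard([b1])),    # keys 1000 and up, one replica
])

table.lookup(42).write(42, "hello")      # stored on both a1 and a2
table.lookup(1500).write(1500, "world")  # stored on b1 only
```

Because a logical shard only holds references to other shards, a replication tree can be nested to any depth, and the forwarding table never needs to know how a given partition is replicated.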

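The role of idempotent, commutative writes can be shown with a similar sketch. It assumes a "newest timestamp wins" write rule and an in-memory retry queue; both are illustrative choices, and the names Replica, write_all, and drain are hypothetical rather than taken from Gizzard.

```python
from collections import deque


class Replica:
    """Stores one (timestamp, value) pair per key; the larger pair wins."""

    def __init__(self):
        self.rows = {}
        self.available = True

    def apply(self, key, timestamp, value):
        if not self.available:
            raise ConnectionError("replica unavailable")
        current = self.rows.get(key)
        # Keep the larger (timestamp, value) pair, so applying a write twice,
        # or applying writes in a different order, gives the same final state.
        if current is None or (timestamp, value) > current:
            self.rows[key] = (timestamp, value)


def write_all(replicas, op, retry_queue):
    """Try the write on each replica; queue the failures for a later retry."""
    for replica in replicas:
        try:
            replica.apply(*op)
        except ConnectionError:
            retry_queue.append((replica, op))


def drain(retry_queue):
    """Retry each queued write once; writes that fail again are re-queued."""
    for _ in range(len(retry_queue)):
        replica, op = retry_queue.popleft()
        try:
            replica.apply(*op)
        except ConnectionError:
            retry_queue.append((replica, op))


r1, r2 = Replica(), Replica()
retries = deque()

r2.available = False
write_all([r1, r2], ("user:1", 1, "old"), retries)  # r2 misses this write
r2.available = True
write_all([r1, r2], ("user:1", 2, "new"), retries)  # r2 sees the newer write first
drain(retries)                                      # the older write arrives late
```

Here r2 receives the two writes in the opposite order from r1, yet both replicas end up holding the newer value, which is the property that lets failed operations be retried later without coordination.
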
See also

  • Distributed hash table (DHT)
  • Distributed database
  • FlockDB


The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.
 