A column of a distributed data store is a NoSQL object of the lowest level in a keyspace. It is a tuple (a key-value pair) consisting of three elements:
Unique name: Used to reference the column
Value: The content of the column. It can have different types, such as AsciiType, LongType, TimeUUIDType, and UTF8Type, among others.
Timestamp: The system timestamp used to determine the valid content.
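As a minimal sketch, the three elements can be modelled as an immutable record. The class name `Column`, the field names, and the microsecond timestamp unit are illustrative assumptions, not the API of any particular data store.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Column:
    """Hypothetical model of a column: a name, a value and a write timestamp."""
    name: str        # unique name used to reference the column
    value: Any       # content of the column (text, integer, UUID, ...)
    timestamp: int   # system timestamp of the write (assumed: microseconds since the epoch)


email = Column(name="email", value="alice@example.com", timestamp=1_700_000_000_000_000)
```

Making the record immutable reflects the idea that an update produces a new column with a newer timestamp rather than changing the old one in place.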
Usage
The column stores the value together with a timestamp that is used to distinguish valid content from stale content. According to the CAP theorem, a distributed data store cannot guarantee both consistency and availability while a network partition lasts, and many such stores favor availability. Copies of the same column can therefore differ between nodes, and the data store or the application programmer will use the timestamp to find out which of the values stored in the backup nodes is up to date.
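The timestamp comparison described above can be sketched as a "last write wins" rule. The `Column` class and the `latest` helper are hypothetical names chosen for illustration. The sketch assumes that all copies refer to the same column and that node clocks are comparable.

```python
from dataclasses import dataclass
from typing import Any, Iterable


@dataclass(frozen=True)
class Column:
    name: str
    value: Any
    timestamp: int


def latest(copies: Iterable[Column]) -> Column:
    """Return the copy with the highest timestamp (last write wins)."""
    return max(copies, key=lambda column: column.timestamp)


# Three copies of the same column, as read from three nodes (made-up data)
copies = [
    Column("email", "old@example.com", 100),
    Column("email", "new@example.com", 250),
    Column("email", "old@example.com", 100),
]
current = latest(copies)
```

Here `current` holds the value written at timestamp 250, and the two older copies are treated as stale. How ties between equal timestamps are broken is left open in this sketch.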
Some data stores, such as Riak, may use the more sophisticated vector clock instead of the timestamp to detect stale information.
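A vector clock keeps one counter per node instead of a single timestamp, which makes it possible to tell a stale version apart from a genuinely conflicting one. The following is a minimal sketch of the comparison. The function name `compare` and the dictionary representation are assumptions made for illustration, not the interface of any specific data store.

```python
def compare(a: dict, b: dict) -> str:
    """Compare two vector clocks given as dicts mapping node id -> counter.

    Returns "before", "after", "equal" or "concurrent".
    """
    nodes = set(a) | set(b)
    # A missing entry counts as 0: that node has not written yet
    a_le_b = all(a.get(n, 0) <= b.get(n, 0) for n in nodes)
    b_le_a = all(b.get(n, 0) <= a.get(n, 0) for n in nodes)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"      # a is stale relative to b
    if b_le_a:
        return "after"       # b is stale relative to a
    return "concurrent"      # neither version descends from the other


stale = compare({"n1": 1}, {"n1": 2, "n2": 1})
conflict = compare({"n1": 2}, {"n1": 1, "n2": 1})
```

In the first call the left-hand version is older on every node, so it can be discarded. In the second call each version is ahead on a different node, so the conflict has to be resolved by the data store or the application.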
In relational databases, a column is a part of a relational table and is present in every row of the table. This is not the case in distributed data stores, where the concept of a table exists only vaguely. A column can be part of a ColumnFamily that resembles at most a relational row, but it may appear in one row and not in the others. The number of columns may also vary from row to row, and later changes to the data store model may change it again. Keeping up with these changes is therefore left to the application programmer.
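The varying set of columns per row can be illustrated with nested dictionaries. The layout below (row key, then column name, then a value and timestamp pair) and the `read` helper are hypothetical. They only show why the application has to cope with missing columns.

```python
# Hypothetical column family: row key -> {column name -> (value, timestamp)}
users = {
    "alice": {
        "email": ("alice@example.com", 100),
        "city": ("Berlin", 100),
    },
    "bob": {
        "email": ("bob@example.com", 120),
    },
}

# A later write adds a column to one row only; no schema change is involved
users["bob"]["phone"] = ("555-0100", 130)


def read(row_key, column_name, default=None):
    """Return the column value, or `default` when the row lacks that column."""
    row = users.get(row_key, {})
    value, _timestamp = row.get(column_name, (default, None))
    return value


alice_phone = read("alice", "phone")
bob_phone = read("bob", "phone")
```

Reading `phone` succeeds for `bob` but falls back to the default for `alice`, so the calling code must decide what an absent column means.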