Triplestore
A triplestore is a purpose-built database for the storage and retrieval of Resource Description Framework (RDF) metadata.
As with a relational database, information is stored in a triplestore and retrieved via a query language. Unlike a relational database, a triplestore is optimized for the storage and retrieval of many short statements called triples, in the form of subject-predicate-object, like "Bob is 35" or "Bob knows Fred".
Some triplestores can store billions of triples. The performance of a particular triplestore can be measured with the Lehigh University Benchmark (LUBM) or with real data from UniProt.
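The subject-predicate-object form described above can be illustrated with a minimal in-memory sketch. The class `TinyTripleStore` and its `add` and `match` methods are hypothetical names chosen for illustration; they are not the API of any triplestore listed in this article, and a real store would use indexes rather than a linear scan.

```python
from typing import Iterator, Optional, Set, Tuple

Triple = Tuple[str, str, str]


class TinyTripleStore:
    """Hypothetical in-memory store, for illustration only."""

    def __init__(self) -> None:
        self._triples: Set[Triple] = set()

    def add(self, subject: str, predicate: str, obj: str) -> None:
        """Store one subject-predicate-object statement."""
        self._triples.add((subject, predicate, obj))

    def match(
        self,
        subject: Optional[str] = None,
        predicate: Optional[str] = None,
        obj: Optional[str] = None,
    ) -> Iterator[Triple]:
        """Yield every triple matching the pattern; None acts as a wildcard."""
        for s, p, o in self._triples:
            if subject is not None and s != subject:
                continue
            if predicate is not None and p != predicate:
                continue
            if obj is not None and o != obj:
                continue
            yield (s, p, o)


store = TinyTripleStore()
store.add("Bob", "age", "35")        # "Bob is 35"
store.add("Bob", "knows", "Fred")    # "Bob knows Fred"
store.add("Fred", "knows", "Alice")  # extra statement, assumed for the example

# Everything stated about Bob.
bob_facts = sorted(store.match(subject="Bob"))

# Every (subject, object) pair linked by "knows".
knows_pairs = sorted((s, o) for s, _, o in store.match(predicate="knows"))
```

Retrieval here is a pattern over the three positions of a triple, which is roughly the idea that triplestore query languages build on.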
Implementation
Some triplestores have been built as database engines from scratch, while others have been built on top of existing commercial relational database engines (i.e. SQL-based). As in the early development of OLAP databases, this intermediate approach allowed large and powerful database engines to be constructed with little programming effort in the initial phases of triplestore development. In the long term, though, it seems likely that native triplestores will have the performance advantage. A difficulty with implementing triplestores over SQL is that, although triples can be stored in relational tables, efficiently querying a graph-based RDF model (i.e. mapping SPARQL queries onto SQL queries) is difficult.
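The difficulty of mapping graph queries onto SQL can be sketched with Python's standard `sqlite3` module. The single three-column `triples` table and its column names `s`, `p` and `o` are assumptions made for this example, not the schema of any particular SQL-backed triplestore. The point of the sketch is that each additional triple pattern in a graph query becomes another self-join on the same table.

```python
import sqlite3

# Assumed layout: one table holding every triple as (subject, predicate, object).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE triples (s TEXT, p TEXT, o TEXT)")
conn.executemany(
    "INSERT INTO triples VALUES (?, ?, ?)",
    [
        ("Bob", "knows", "Fred"),
        ("Fred", "knows", "Alice"),
        ("Bob", "age", "35"),
    ],
)

# Graph pattern with two triple patterns (written informally):
#   ?x knows ?y .
#   ?y knows ?z .
# Each pattern needs its own alias of the table, joined on the shared variable ?y.
rows = conn.execute(
    """
    SELECT t1.s, t2.o
    FROM triples AS t1
    JOIN triples AS t2 ON t1.o = t2.s
    WHERE t1.p = 'knows' AND t2.p = 'knows'
    ORDER BY t1.s, t2.o
    """
).fetchall()
```

A query with n triple patterns would need n aliases of the same table in this layout, which is one reason a naive SQL mapping tends to become hard to optimize as graph patterns grow.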
List of triplestore implementations
Name | Language | Homepage |
---|---|---|
3store | C | http://www.aktors.org/technologies/3store/ |
4store | C | http://www.4store.org/ |
5store | C | http://4store.org/trac/wiki/5store |
AllegroGraph | Common Lisp | http://www.franz.com/agraph/allegrograph/ |
ARC | PHP | http://arc.semsol.org/ |
Ariadne Genomics | Java | http://www.ariadnegenomics.com/ |
Bigdata | Java | http://www.bigdata.com/ |
BigOWLIM | Java | http://www.ontotext.com/owlim/ |
Dydra | Common Lisp, C | http://www.dydra.com/ |
Jena | Java | http://jena.sourceforge.net/ |
Mulgara | Java | http://www.mulgara.org/ |
OpenAnzo | Java | http://www.openanzo.org/ |
OntoBroker | Java | http://www.ontoprise.de/en/home/products/ontobroker/ |
Oracle | Java, PL/SQL, SQL | http://www.oracle.com/technetwork/database/options/semantic-tech/whatsnew/index.html |
Parliament | Java/C++ | http://parliament.semwebcentral.org/ |
Pointrel System | Java/Python | http://sourceforge.net/projects/pointrel/ |
RAP | PHP | http://www4.wiwiss.fu-berlin.de/bizer/rdfapi/ |
RDF::Core | Perl | http://search.cpan.org/dist/RDF-Core/ |
RDF::Trine | Perl | http://www.perlrdf.org/ |
RDF-3X | C++ | http://www.mpi-inf.mpg.de/~neumann/rdf3x/ |
RDFBroker | Java | http://rdfbroker.opendfki.de/ |
Redland | C | http://librdf.org/ |
RedStore | C | http://www.aelius.com/njh/redstore/ |
Semantics Platform | C# | http://www.intellidimension.com/ |
SemWeb-DotNet | C# | http://razor.occams.info/code/semweb/ |
Sesame | Java | http://www.openrdf.org/ |
Soprano | C++ | http://soprano.sourceforge.net/ |
Stardog | Java | http://stardog.com/ |
StrixDB | C++/Lua | http://www.strixdb.com/ |
SwiftOWLIM | Java | http://www.ontotext.com/owlim/ |
Virtuoso | C | http://virtuoso.openlinksw.com/ |
YARS | Java | http://sw.deri.org/2004/06/yars/ |
Smart-M3 | Python/Java/C/C# | |
See also
- Freebase, uses a triplestore called graphd.
- Named graphs
External links
- A list of large triplestores
- Lehigh University Benchmark (LUBM)
- Semantic Systems Biology
- ARC's RDF Store is built using PHP with MySQL as the backend for the triplestore. It also provides a SPARQL endpoint for access and updating of stored triples.