Semantic publishing
Semantic publishing on the Web, or semantic web publishing, refers to publishing information on the web as documents accompanied by semantic markup. Semantic publication provides a way for computers to understand the structure and even the meaning of the published information, making information search and data integration more efficient.
Although semantic publishing is not specific to the Web, it has been driven by the rise of the semantic web. In the semantic web, published information is accompanied by metadata describing the information, providing a "semantic" context.
Although semantic publishing has the potential to change the face of web publishing, acceptance depends on the emergence of compelling applications. Web sites can already be built with all contents in both HTML format and semantic format. RSS 1.0 uses RDF (a semantic web standard) format, although it has become less popular than RSS 2.0 and Atom.
Semantic publishing has the potential to revolutionize scientific publishing. Tim Berners-Lee predicted in 2001 that the semantic web “will likely profoundly change the very nature of how scientific knowledge is produced and shared, in ways that we can now barely imagine”. Revisiting the semantic web in 2006, he and his colleagues believed the semantic web “could bring about a revolution in how, for example, scientific content is managed throughout its life cycle”. Researchers could directly self-publish their experiment data in "semantic" format on the web. Semantic search engines could then make these data widely available. The W3C interest group in healthcare and life sciences is exploring this idea.
Two approaches
- Publish information as data objects using semantic web languages like RDF and OWL. An ontology is usually developed for a specific information domain and can formally represent the data in that domain. Semantic publishing of more general information like product information, news, and job openings uses a so-called shallow ontology. The SWEO Linking Open Data Project maintains a list of data sources that follow this approach as well as a list of Semantic Publishing Tools.
- Embed formal metadata in documents using new markup languages like RDFa and Microformats.
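As a minimal sketch of the two approaches, the Python below publishes the same metadata record both ways: once as standalone RDF triples in N-Triples syntax, and once as RDFa attributes embedded in an HTML fragment. The article URL, title, and creator name are hypothetical; the only real identifiers are the Dublin Core terms namespace and the RDFa attributes `prefix`, `about`, and `property`. It handles only basic escaping, so it is an illustration rather than a complete serializer.

```python
from html import escape

DCTERMS = "http://purl.org/dc/terms/"  # Dublin Core terms namespace

# Hypothetical record: a subject IRI plus Dublin Core term -> literal value.
subject = "http://example.org/articles/42"
record = {"title": "Sample article", "creator": "A. Author"}


def nt_literal(value):
    """Quote a plain string literal for N-Triples (basic escapes only)."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    return '"' + escaped + '"'


def to_ntriples(subject, record):
    """Approach 1: publish the record as standalone RDF data."""
    lines = []
    for term, value in record.items():
        lines.append("<%s> <%s%s> %s ." % (subject, DCTERMS, term, nt_literal(value)))
    return "\n".join(lines)


def to_rdfa(subject, record):
    """Approach 2: embed the same statements in HTML using RDFa attributes."""
    spans = []
    for term, value in record.items():
        spans.append('  <span property="dcterms:%s">%s</span>' % (term, escape(value)))
    return '<div prefix="dcterms: %s" about="%s">\n%s\n</div>' % (
        DCTERMS, escape(subject), "\n".join(spans))


ntriples = to_ntriples(subject, record)
rdfa = to_rdfa(subject, record)
print(ntriples)
print(rdfa)
```

Both outputs carry the same two statements about the article. The first form suits data feeds and bulk downloads, while the second keeps the metadata inside the page that human readers see.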
Examples of ontologies and vocabularies for publishing
- Dublin Core
- SKOS
- FOAF
- SIOC
- RSS
- DOAP
- SPE
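Vocabularies like these are identified by namespace IRIs, and published data usually refers to their terms through short prefixes. The sketch below, with hypothetical example.org resources, shows how prefixed names from several vocabularies expand into full property IRIs. The prefix labels are conventional choices rather than requirements.

```python
# Namespace IRIs for some of the vocabularies listed above.
NAMESPACES = {
    "dcterms": "http://purl.org/dc/terms/",
    "skos": "http://www.w3.org/2004/02/skos/core#",
    "foaf": "http://xmlns.com/foaf/0.1/",
    "sioc": "http://rdfs.org/sioc/ns#",
    "doap": "http://usefulinc.com/ns/doap#",
}


def expand(curie):
    """Expand a compact name such as 'foaf:name' into a full IRI."""
    prefix, sep, local = curie.partition(":")
    if not sep or prefix not in NAMESPACES:
        raise ValueError("unknown prefix in %r" % curie)
    return NAMESPACES[prefix] + local


# Hypothetical statements that mix terms from several vocabularies.
statements = [
    ("http://example.org/people/alice", "foaf:name", "Alice"),
    ("http://example.org/posts/1", "dcterms:title", "Hello"),
    ("http://example.org/projects/demo", "doap:name", "Demo project"),
    ("http://example.org/topics/rdf", "skos:prefLabel", "RDF"),
]
expanded = [(s, expand(p), o) for s, p, o in statements]
for triple in expanded:
    print(triple)
```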
Examples of free or open source tools and services
- Ambra Project: Open source software designed to publish open access journals with RDF. Used by PLoS.
- Semantic MediaWiki: An extension to the wiki application MediaWiki that allows users to semantically annotate data on the wiki, and then republish it in formats such as RDF XML.
- Swoogle: A search engine for ontologies and instance data on the Web.
- Ufeed: Tool for publishing data resources and data feeds in RDF, including product information, news, events, jobs and studies.
- D2R Server: Tool for publishing relational databases on the Semantic Web as Linked Data and SPARQL endpoints.
- BigBlogZoo: Regularly crawls 60,000 XML sources and reaggregates articles under a Semantic URL. Adopts the DMOZ RDF classification schema.
- Utopia: Interactive documents
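To illustrate what a SPARQL endpoint offers a client, the sketch below builds the HTTP GET URL for a simple query, following the SPARQL Protocol convention of passing the query text in a `query` parameter. The endpoint address is a hypothetical local server and the query assumes the published data uses Dublin Core titles. Nothing is sent over the network here.

```python
from urllib.parse import urlencode

# Hypothetical endpoint; substitute the URL of a real SPARQL service.
ENDPOINT = "http://localhost:2020/sparql"

# Ask for up to ten resources that have a Dublin Core title.
QUERY = """
PREFIX dcterms: <http://purl.org/dc/terms/>
SELECT ?doc ?title
WHERE { ?doc dcterms:title ?title }
LIMIT 10
"""


def build_request_url(endpoint, query):
    """Encode a SPARQL query as an HTTP GET URL with a 'query' parameter."""
    return endpoint + "?" + urlencode({"query": query.strip()})


url = build_request_url(ENDPOINT, QUERY)
print(url)
```

A client would then fetch this URL, typically with an `Accept` header such as `application/sparql-results+json`, and parse the returned result bindings.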
See also
- Metadata
- Metadata publishing
- Semantic technology
- RDF feed
- Data feed
External Links
- Tutorial on How to publish Linked Data on the Web
- Resources for semantic publishing
- SePublica 2011, the first international workshop on semantic publishing
- Semantic MediaWiki Plus, Semantic MediaWiki from Ontoprise
- Semantic MediaWiki Plus User Forum
Further reading
- Attwood, T. K., Kell, D. B., McDermott, P., Marsh, J., Pettifer, S. R., Thorne, D., et al. (2009). Calling International Rescue: knowledge lost in literature and data landslide! Biochemical Journal, 424(3), 317-33. doi:10.1042/BJ20091474
- Batchelor, C. R., and Corbett, P. T. (2007). Semantic enrichment of journal articles using chemical named entity recognition. Proceedings of the ACL 2007 Demo and Poster Sessions, pages 45–48, Prague, June 2007.
- Shotton, D. (2009). Semantic publishing: the coming revolution in scientific journal publishing. Learned Publishing, 22(2), 85-94. doi:10.1087/2009202
- Shotton, D., Portwin, K., Klyne, G., and Miles, A. (2009). Adventures in Semantic Publishing: Exemplar Semantic Enhancements of a Research Article. 17 April 2009. doi:10.1371/journal.pcbi.1000361