Single source publishing
Single source publishing, also known as single sourcing, allows the same content to be used in different documents (deliverables) or in various formats. The labour-intensive and expensive work of editing need only be carried out once, on one document. Further transformations are carried out mechanically, by automated tools. You may also add new output formats in the future, as your organization's needs change.
Single source publishing offers benefits in at least three areas. Their relative emphasis depends on the application and the type of information being communicated, and will influence the choice of tools used:
- Technical publishing formats
- Content assembly, including translation and localization
- Different contexts of use
Technical publishing formats
Single sourcing allows the creation of documents in various technical formats from the same content. For example, a company might use the same content in online help, a printed document (PDF), a web page, and an Interactive Voice Response system. With a single source solution, the company only has to update the one source file for the content and regenerate the four outputs.
Where the differences are purely those of formatting, this is a simple process to implement.
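As a minimal sketch of this idea, the Python below keeps one procedure as structured data and regenerates several outputs from it. The source content, the function names, and the choice of three renderers are assumptions made for illustration; they do not describe any particular publishing tool.

```python
from html import escape

# Hypothetical single source: one procedure stored once as structured data.
SOURCE = {
    "title": "Resetting your password",
    "steps": [
        "Open the Settings page.",
        "Select Reset password.",
        "Enter a new password.",
    ],
}


def to_html(doc):
    """Render the source as a web page fragment."""
    items = "".join(f"<li>{escape(step)}</li>" for step in doc["steps"])
    return f"<h1>{escape(doc['title'])}</h1><ol>{items}</ol>"


def to_text(doc):
    """Render the source as plain text, e.g. as input to a print layout stage."""
    lines = [doc["title"]]
    lines += [f"{n}. {step}" for n, step in enumerate(doc["steps"], start=1)]
    return "\n".join(lines)


def to_ivr_prompts(doc):
    """Render the source as a list of spoken prompts for a voice system."""
    return [f"Step {n}: {step}" for n, step in enumerate(doc["steps"], start=1)]


# Adding an output format means adding one renderer here; the source is untouched.
OUTPUTS = {"html": to_html, "text": to_text, "ivr": to_ivr_prompts}


def publish(doc):
    """Regenerate every output from the single source."""
    return {name: render(doc) for name, render in OUTPUTS.items()}
```

Editing a step in SOURCE and calling publish(SOURCE) again updates every output at once, which is the behaviour described above for purely formatting-level differences.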
Content assembly
As an example, a company may have several products with individual user guides, all of which share a common procedure. Rather than maintain duplicate versions of this procedure (one in each guide), the guides can share the content, merging it into the document at the time of publication. Eliminating duplicate content reduces the cost of maintaining it, and also improves future consistency. A change to this procedure need only be made once; it will then appear in all of the outputs.
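The merge-at-publication step can be sketched as follows. The `{{include:...}}` marker syntax, the guide names, and the shared procedure are all hypothetical, chosen only to show the mechanism; real systems use their own reference syntax.

```python
import re

# Hypothetical shared content library: each reusable unit is stored once.
SHARED = {
    "battery-replacement": (
        "1. Power off the device.\n"
        "2. Open the rear cover.\n"
        "3. Swap the battery."
    ),
}

# Hypothetical guides that reference the shared unit instead of copying it.
GUIDES = {
    "Model A User Guide": "Model A overview.\n{{include:battery-replacement}}",
    "Model B User Guide": "Model B overview.\n{{include:battery-replacement}}",
}

# Matches an include marker and captures the key of the shared unit.
INCLUDE = re.compile(r"\{\{include:([a-z0-9-]+)\}\}")


def assemble(text, library):
    """Replace each include marker with the referenced shared unit."""
    return INCLUDE.sub(lambda match: library[match.group(1)], text)


def publish_all(guides, library):
    """Assemble every guide at publication time."""
    return {title: assemble(body, library) for title, body in guides.items()}
```

Because the guides hold only a reference, a change made once in SHARED appears in every guide the next time publish_all runs.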
Translation and localization
An increasing need in a globalised market is to localize products and their documentation to suit local markets. This most obviously encompasses translation, but it may also extend to simple formatting (dates or currency) or to the selection of culturally appropriate examples (more usually, the avoidance of culturally or religiously inappropriate content). When the text is separated from the overall structure of documents, the text units can be translated on their own, independently of that structure. This supports multiple localizations at lower cost, and the structure or formatting can later be modified without the translation work needing to be redone.
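A minimal sketch of this separation is shown below. The message keys, the two locales, and the date formats are assumptions for illustration; a production system would normally rely on an established localization library rather than hand-written tables.

```python
from datetime import date

# Document structure: ordered references to text units, with no wording in it.
STRUCTURE = ["heading.warranty", "body.warranty_start"]

# Hypothetical per-locale catalogs; only these would be sent to translators.
CATALOGS = {
    "en-US": {
        "heading.warranty": "Warranty",
        "body.warranty_start": "Your warranty starts on {start}.",
    },
    "de-DE": {
        "heading.warranty": "Garantie",
        "body.warranty_start": "Ihre Garantie beginnt am {start}.",
    },
}

# Locale-specific date formatting, kept apart from both structure and text.
DATE_FORMATS = {"en-US": "%m/%d/%Y", "de-DE": "%d.%m.%Y"}


def render(locale, start):
    """Fill the shared structure with one locale's text and date format."""
    catalog = CATALOGS[locale]
    formatted = start.strftime(DATE_FORMATS[locale])
    return [catalog[key].format(start=formatted) for key in STRUCTURE]
```

Reordering STRUCTURE or changing how its entries are laid out does not touch the catalogs, so existing translations remain valid.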
Context of use
A common mistake in single sourcing systems is to confuse transforms between technical formats (e.g. PDF vs. HTML) with different modalities of use. The technical transformation is mechanical and easily implemented, but the different ways in which a document may be used affect its authoring from the outset, and this is not easy to automate, or to automate well. For example, a user manual delivered as a PDF document may have a linear narrative of 50 pages, while its comparable online help system may present the same paragraphs structured as 100 pages, all of which must be usable as stand-alone topics and require extensive linking between them. A "tutorial" might present much of the same content, but in less depth and following a linear narrative. This requirement to support multiple contexts is not easily met by mere programming.
Choosing the tool
Ideally, the tools used for single-sourcing do not require human intervention to customize the formatting or content for the various outputs. There are many approaches to single-sourcing. The master information can be stored in any number of ways. These might include word processing documents, databases, XML files, content management systems, GUI builders, or spreadsheets. Various tools can then be used to extract the information from the master document and format it into various output formats or modalities.
A number of tools can be used to generate the various modalities from the master document. Programming languages provide the greatest flexibility, but also the least direct support. If the output can pass through an XML format during output, then XSLT and XSL-FO can be used to transform the document into various forms. Code generators such as CodeSmith, or a graphical stylesheet design tool such as Altova's StyleVision, can be used to build transforms more easily than with a general purpose programming language. Content management systems may support various output modalities directly.
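To illustrate the general-purpose-language end of that range, here is a small sketch that transforms an XML master into two outputs using Python's standard xml.etree.ElementTree module. It is a hand-written transform standing in for what an XSLT stylesheet would express declaratively, not XSLT itself, and the element names (topic, title, para) are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical master document in XML; element names are illustrative only.
MASTER = """<topic id="install">
  <title>Installing the widget</title>
  <para>Unpack the widget.</para>
  <para>Attach it to the bracket.</para>
</topic>"""


def to_html(topic):
    """Build an HTML fragment from the parsed topic element."""
    div = ET.Element("div", {"id": topic.get("id")})
    ET.SubElement(div, "h1").text = topic.findtext("title")
    for para in topic.findall("para"):
        ET.SubElement(div, "p").text = para.text
    return ET.tostring(div, encoding="unicode")


def to_text(topic):
    """Build a plain-text rendering from the same element."""
    title = topic.findtext("title")
    paras = [para.text for para in topic.findall("para")]
    return "\n\n".join([title.upper()] + paras)


# Parse the master once; every output is derived from this one tree.
topic = ET.fromstring(MASTER)
```

The flexibility is complete, but every rule that a stylesheet tool would supply (escaping, numbering, pagination) has to be written by hand, which is the trade-off described above.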
At the top end, in both price and functionality, there are programs specifically designed to support single sourcing. These will require less effort to configure and manage than more general purpose tools. Consider these programs carefully. At the very least, they can give you ideas about what sorts of features you would want in your in-house single sourcing solution. (See later section that lists several popular single-sourcing tools.)
Designing the master
One of the more difficult parts of single sourcing is designing the master formats. To do this, you need to notice whether you have a lot of similar things that all need to be documented: for example, menus and dialogs in an application, classes in an application programming interface (API), widgets in a line of similar products, objects in a museum collection, parts of complex machines, products for sale in an online store, and so forth. Once you have identified a set of similar items, ask what characteristics these items have in common. A computer programmer or analyst skilled at object-oriented programming can be helpful at this stage in identifying and organizing common attributes. Store information about each object and attribute in your master file. How to do this of course varies considerably depending upon the nature of your master file. In choosing the master file format, keep in mind those who will maintain the actual words. Often, it will be technical or professional writers. Choose a tool that they are familiar with or can learn quickly. Raw XML, for example, might be difficult for many writers to input accurately by hand.
Having fine and quantified granularity of your information can be helpful in enabling various methods of massaging the data for different output modalities. For example, you don't want your master file to consist of pages upon pages of unorganized text about the object. You generally want to know such things as its name, its category, a short description, a long description, perhaps how it is used in a given context. For a museum item, for example, you would want its catalog number, the collection it's from, its age, where it was collected from, how it was acquired, its value if known, its use, its provenance, its historical context, etc. With all of these individual pieces of information, you can output cards for use in displays, descriptions for the museum's web site, and printed manuals describing specific collections. If you just have pages of unorganized text, this becomes much more difficult to manage from a single source perspective.
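The museum example can be sketched as a record with one field per piece of information. The field names follow the attributes listed above, but the class, the subset of fields chosen, and the three output functions are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MuseumItem:
    """One master record; each attribute is stored separately."""
    catalog_number: str
    name: str
    collection: str
    age: str
    collected_from: str
    short_description: str
    long_description: str
    value: Optional[str] = None  # recorded only if known


def display_card(item):
    """Short card for use next to the object in a display."""
    return f"{item.name}\n{item.age}, {item.collected_from}\n{item.short_description}"


def web_entry(item):
    """Fields for a description page on the museum's web site."""
    return {
        "id": item.catalog_number,
        "heading": item.name,
        "collection": item.collection,
        "body": item.long_description,
    }


def collection_manual(items, collection):
    """Printed listing of every item in one collection, in catalog order."""
    lines = [collection]
    for item in sorted(items, key=lambda entry: entry.catalog_number):
        if item.collection == collection:
            lines.append(f"{item.catalog_number}  {item.name}: {item.short_description}")
    return "\n".join(lines)
```

Each output picks only the fields it needs, which would not be possible if the record were a single block of unorganized text.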
If you have a database containing information about the objects, study it. It will suggest many ideas about what might be interesting about the items you are documenting. A database programmer can also help you design your master files.
There may be multiple dimensions to the master data. For example, you might have the data translated into various languages. Every time you add a dimension, you multiply the amount of master data that must be maintained, since each object then has to be kept current along that dimension as well. However, if the problem you are solving warrants multiple dimensions, then it's also likely a good candidate for single sourcing.
Transformation
Once you have identified these objects and attributes, think about how they will be presented in each output modality. Create mock-ups of your data for several objects in each modality you are thinking about supporting. If you can translate the data by hand from your master format to each output modality with the control you are interested in having, then you're on the way to a successful system. If you demand ultimate control in one modality, you might consider that modality as your master, or part of your master. For example, if a PDF manual is the most important modality, you might consider FrameMaker as your master format, since many technical writers are familiar with it, it does a good job of creating attractive pages, and it has tags that are easily transformed into other modalities using the single source tools mentioned above. You might also keep in mind that translations to other languages are often outsourced, so a common format that can be easily used by translators is frequently important.
Whatever master format you choose, you should provide templates for each type of object you wish to document. This helps maintain consistency over a collection of master objects. It is advisable to design your master format very carefully before beginning to use it. Going through all of your master objects making changes to conform to a change you think of later can be very expensive, tedious, and error-prone. Planning ahead and thinking about your organization's needs are very important to this process.
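One way to picture such templates is as a list of required attributes per object type, which can both pre-fill a new master object and check an existing one for gaps. The object types and field names below are hypothetical, and the check is only a sketch of the idea.

```python
# Hypothetical templates: required attributes for each type of master object.
TEMPLATES = {
    "dialog": ["name", "purpose", "fields", "buttons"],
    "menu": ["name", "purpose", "items"],
}


def new_object(obj_type):
    """Create an empty master object with every template field present."""
    return {field: "" for field in TEMPLATES[obj_type]}


def missing_fields(obj_type, obj):
    """Return the template fields that are absent or empty in a master object."""
    return [field for field in TEMPLATES[obj_type] if not obj.get(field)]
```

Running a check like missing_fields over the whole collection before publication is one inexpensive way to keep master objects consistent; it does not remove the need to design the template carefully in the first place.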
Human considerations
Single sourcing can be accomplished successfully in many contexts where you have a large number of similar items to document. It requires considerable up-front planning and careful training of staff members. Alternatively, effort can be put into creating a program or set of online forms for inputting the raw data in a prompted way so that training can be minimized. Balancing these efforts in the context of maximizing your productivity is not a trivial task, but given a project of reasonable size, it can give great returns in flexibility and return on investment (ROI). Most single-sourcing projects, if they fail, fail because of inadequate training, poor planning, or resistance from staff members.
Popular tools
The following are commonly used as end-step publishing frameworks that support transformations to multiple formats.
- Apache Forrest: Based on the earlier Cocoon, Forrest can aggregate multiple sources as well as serving multiple targets.
- Apache Cocoon: An early example of pipelined processing and a framework for XSLT, Cocoon is still widely used.
The following are applications for designing and structuring multi-format output based on a single source.
- Adobe Technical Communication Suite: Adobe Technical Communication Suite 3 is a single-source authoring toolkit with multichannel, multidevice publishing capabilities. It combines Adobe FrameMaker 10 for developing standards-compliant content, Adobe RoboHelp 9 and Adobe Captivate 5 workflows for publishing in various formats, reviewable PDF files for collaboration, Adobe Photoshop CS5 for images, and Adobe Captivate 5 for demos and simulations.
- Altova StyleVision: Graphical stylesheet designer used for creating template-based designs for XML, XBRL, and database output to HTML, RTF, PDF, Office Open XML, and Authentic e-Forms.
- Arbortext dynamic information delivery system: Graphical authoring tool, stylesheet designer, and publishing engine used for creating template-based designs that automatically publish XML, SGML, and other input sources to HTML, RTF, PDF, text, or any other XML-based output format for further compilation (CHM, eBook, XSLT, etc.).
- AuthorIT
- MadCap Flare: Authoring tool for high-end digital print and online help.
- Help authoring tool, stylesheet designer, and publishing engine used for creating user manuals, knowledge bases, and online help for technical communication.
See Also
- DocBook
- Darwin Information Typing Architecture
- EPUB
- Markup language
- Content management