Document management system
A document management system (DMS) is a computer system (or set of computer programs) used to track and store electronic documents and/or images of paper documents. It is usually also capable of keeping track of the different versions created by different users (history tracking). The term has some overlap with the concepts of content management systems. It is often viewed as a component of enterprise content management (ECM) systems and related to digital asset management, document imaging, workflow systems and records management systems.

History

In the 1980s, a number of vendors began developing software systems to manage paper-based documents. These systems dealt with paper documents, which included not only printed and published documents but also photographs, prints, etc.

Later, developers began to write a second type of system which could manage electronic documents, i.e., all those documents, or files, created on computers and often stored on users' local file systems. The earliest electronic document management (EDM) systems managed either proprietary file types or a limited number of file formats. Many of these systems later became known as document imaging systems, because they focused on the capture, storage, indexing and retrieval of image file formats. These systems enabled an organization to capture faxes and forms, to save copies of the documents as images, and to store the image files in the repository for security and quick retrieval. Retrieval was possible because the system extracted the text from the document during capture, and the text-indexer function provided text-retrieval capabilities.

EDM systems evolved to a point where systems could manage any type of file format that could be stored on the network. The applications grew to encompass electronic documents, collaboration tools, security, workflow, and auditing capabilities.

While many EDM systems store documents in their native file format (Microsoft Word or Excel, PDF), some web-based document management systems are beginning to store content in the form of HTML. These policy management systems require content to be imported into the system. However, once content is imported, the software acts like a search engine so users can find what they are looking for faster. The HTML format allows for better application of search capabilities such as full-text searching and stemming.

Components

Document management systems commonly provide storage, versioning, metadata, security, as well as indexing and retrieval capabilities. Each component is described below, with the topic name followed by its description:
Metadata: Metadata is typically stored for each document. Metadata may, for example, include the date the document was stored and the identity of the user storing it. The DMS may also extract metadata from the document automatically or prompt the user to add metadata. Some systems also use optical character recognition on scanned images, or perform text extraction on electronic documents. The resulting extracted text can be used to assist users in locating documents by identifying probable keywords or providing for full text search capability, or can be used on its own. Extracted text can also be stored as a component of metadata, stored with the image, or separately as a source for searching document collections.
Integration: Many document management systems attempt to integrate document management directly into other applications, so that users may retrieve existing documents directly from the document management system repository, make changes, and save the changed document back to the repository as a new version, all without leaving the application. Such integration is commonly available for office suites and e-mail or collaboration/groupware software. Integration often uses open standards such as ODMA, LDAP, WebDAV and SOAP to allow integration with other software and compliance with internal controls.
Capture: Capture primarily involves accepting and processing images of paper documents from scanners or multifunction printers. Optical character recognition (OCR) software is often used, whether integrated into the hardware or as stand-alone software, in order to convert digital images into machine-readable text. Optical mark recognition (OMR) software is sometimes used to extract values of check-boxes or bubbles. Capture may also involve accepting electronic documents and other computer-based files.
Indexing: Indexing tracks electronic documents. It may be as simple as keeping track of unique document identifiers, but often it takes a more complex form, providing classification through the documents' metadata or even through word indexes extracted from the documents' contents. Indexing exists mainly to support retrieval. One area of critical importance for rapid retrieval is the creation of an index topology.
Storage: Storage holds the electronic documents. It often includes management of those same documents: where they are stored, for how long, migration of the documents from one storage medium to another (hierarchical storage management) and eventual document destruction.
Retrieval: Retrieval fetches the electronic documents from storage. Although the notion of retrieving a particular document is simple, retrieval in the electronic context can be quite complex and powerful. Simple retrieval of individual documents can be supported by allowing the user to specify the unique document identifier, and having the system use the basic index (or a non-indexed query on its data store) to retrieve the document. More flexible retrieval allows the user to specify partial search terms involving the document identifier and/or parts of the expected metadata. This would typically return a list of documents which match the user's search terms. Some systems provide the capability to specify a Boolean expression containing multiple keywords or example phrases expected to exist within the documents' contents. The retrieval for this kind of query may be supported by previously built indexes, or may perform more time-consuming searches through the documents' contents to return a list of the potentially relevant documents. See also Document retrieval.
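The index-backed keyword retrieval described above can be illustrated with a minimal sketch. It assumes a tiny in-memory repository and a plain word-level inverted index; the function names (tokenize, build_index, search_all) and the sample documents are hypothetical and do not come from any particular DMS product. Only the Boolean AND case is shown.

```python
from collections import defaultdict
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Map each word to the set of document identifiers that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in tokenize(text):
            index[word].add(doc_id)
    return index


def search_all(index, query):
    """Return identifiers of documents containing every query word (Boolean AND)."""
    words = tokenize(query)
    if not words:
        return set()
    # Copy the first posting set so the index itself is never modified.
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


# Hypothetical sample repository: identifier -> extracted text.
documents = {
    "DOC-001": "Invoice for scanner maintenance",
    "DOC-002": "Scanner purchase order",
    "DOC-003": "Invoice for office supplies",
}
index = build_index(documents)
matches = search_all(index, "invoice scanner")
```

Building the index once at capture time is what makes later queries fast; the alternative mentioned in the text, scanning every document's contents per query, needs no index but grows slower as the repository grows.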
Distribution: A published document for distribution has to be in a format that cannot be easily altered. As a common practice in law-regulated industries, an original master copy of the document is usually never used for distribution other than archiving. If a document is to be distributed electronically in a regulatory environment, then the equipment performing the task has to be quality endorsed and validated. Similarly, quality-endorsed electronic distribution carriers have to be used. This approach applies to both of the systems between which the document is exchanged, if the integrity of the document is critical.
Security: Document security is vital in many document management applications. Compliance requirements for certain documents can be quite complex depending on the type of documents. For instance, in the United States, the Health Insurance Portability and Accountability Act (HIPAA) dictates that medical documents meet certain security requirements. Some document management systems have a rights management module that allows an administrator to give access to documents based on type to only certain people or groups of people. Document marking at the time of printing or PDF creation is an essential element to preclude alteration or unintended use.
Workflow: Workflow is a complex problem and some document management systems have a built-in workflow module. There are different types of workflow. Usage depends on the environment the electronic document management system (EDMS) is applied to. Manual workflow requires a user to view the document and decide who to send it to. Rules-based workflow allows an administrator to create a rule that dictates the flow of the document through an organization: for instance, an invoice passes through an approval process and then is routed to the accounts-payable department. Dynamic rules allow for branches to be created in a workflow process. A simple example would be to enter an invoice amount and if the amount is lower than a certain set amount, it follows different routes through the organization. Advanced workflow mechanisms can manipulate content or signal external processes while these rules are in effect.
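The dynamic-rule example of an invoice branching on its amount can be sketched as follows. This is an illustration only: the threshold value, the step names and the route_invoice function are assumptions made for the example, not part of any specific EDMS.

```python
from dataclasses import dataclass


@dataclass
class Invoice:
    number: str
    amount: float


# Hypothetical threshold; a real system would let an administrator configure it.
APPROVAL_THRESHOLD = 1000.0


def route_invoice(invoice, threshold=APPROVAL_THRESHOLD):
    """Return the ordered list of steps the invoice passes through."""
    if invoice.amount < threshold:
        # Amounts below the threshold take the shorter route.
        return ["supervisor approval", "accounts payable"]
    # Amounts at or above the threshold get an extra (hypothetical) review step.
    return ["supervisor approval", "finance director approval", "accounts payable"]


small = route_invoice(Invoice("INV-1", 250.0))
large = route_invoice(Invoice("INV-2", 5000.0))
```

Here small follows the two-step route and large follows the three-step route. Keeping the rule in one function with a configurable threshold mirrors the idea that an administrator, rather than each user, decides how documents flow.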
Collaboration: Collaboration should be inherent in an EDMS. In its basic form, a collaborative EDMS should allow documents to be retrieved and worked on by an authorized user. Access should be blocked to other users while work is being performed on the document. Other advanced forms of collaboration allow multiple users to view and modify (or mark up) a document at the same time in a collaboration session. The resulting document should be viewable in its final shape, while also storing the markups done by each individual user during the collaboration session.
Versioning: Versioning is a process by which documents are checked in or out of the document management system, allowing users to retrieve previous versions and to continue work from a selected point. Versioning is useful for documents that change over time and require updating, where it may still be necessary to go back to or reference a previous copy.
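Check-out, check-in and access blocking, as described under Collaboration and Versioning, can be sketched with a small in-memory class. The class and method names (VersionedDocument, check_out, check_in, get_version) and the user names are hypothetical; a real system would persist versions and enforce permissions as well.

```python
class CheckedOutError(Exception):
    """Raised when a document is locked by another user."""


class VersionedDocument:
    def __init__(self, content, author):
        # Each version is stored as (author, content); list index 0 is version 1.
        self.versions = [(author, content)]
        self.locked_by = None

    def check_out(self, user):
        """Lock the document for one user and return the latest content."""
        if self.locked_by is not None and self.locked_by != user:
            raise CheckedOutError(f"checked out by {self.locked_by}")
        self.locked_by = user
        return self.versions[-1][1]

    def check_in(self, user, new_content):
        """Store new content as the next version and release the lock."""
        if self.locked_by != user:
            raise CheckedOutError("check out the document before checking in")
        self.versions.append((user, new_content))
        self.locked_by = None
        return len(self.versions)

    def get_version(self, number):
        """Return the content of a 1-based version number."""
        if not 1 <= number <= len(self.versions):
            raise IndexError(f"no version {number}")
        return self.versions[number - 1][1]


doc = VersionedDocument("Draft policy", author="alice")
doc.check_out("bob")
latest = doc.check_in("bob", "Revised policy")
```

Because earlier versions are never overwritten, doc.get_version(1) still returns the original draft after the check-in, which is what allows work to continue from a selected earlier point.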
Searching: Searching finds documents and folders using template attributes or full text search. Documents can be searched using various attributes and document content.
Publishing: Publishing a document involves the procedures of proofreading, peer or public reviewing, authorizing, printing, approving, etc. Those steps ensure prudence and logical thinking. Any careless handling may result in the inaccuracy of the document and therefore mislead or upset its users and readers. In law-regulated industries, some of the procedures have to be completed as evidenced by their corresponding signatures and the date(s) on which the document was signed. Refer to the ISO divisions of ICS 01.140.40 and 35.240.30 for further information. The published document should be in a format that is not easily altered without specific knowledge or tools, and yet it is read-only or portable.
Reproduction: Document and image reproduction is a key consideration when implementing a system. Being able to put content into the system is of little use unless it can be retrieved in a usable form. Building plans are an example: how will plans be scanned, and how will their scale be retained when printed?

Standardization

Many industry associations publish their own lists of particular document control standards that are used in their particular field. Following is a list of some of the relevant ISO documents (see ICS divisions 01.140.10 and 01.140.20). The ISO has also published a series of standards regarding technical documentation, covered by division 01.110.
  • ISO 2709 Information and documentation — Format for information exchange
  • ISO 15836 Information and documentation — The Dublin Core metadata element set
  • ISO 15489 Information and documentation — Records management
  • ISO 21127 Information and documentation — A reference ontology for the interchange of cultural heritage information
  • ISO 23950 Information and documentation — Information retrieval (Z39.50) — Application service definition and protocol specification
  • ISO 10244 Document management — Business process baselining and analysis
  • ISO 32000 Document management — Portable document format

Document control

Compliance adds requirements that transform document management into a document control issue. Document control is a regulatory requirement within accounting (e.g., 8th EU Directive, Sarbanes-Oxley), food safety (e.g., Food Safety Modernization Act), ISO (mentioned above), medical device manufacturing (FDA), healthcare (JCAHO), and information technology (ITIL). Your documents (procedures, work instructions, policy statements, etc.) provide evidence of documents under control. Failing to comply could result in fines, the loss of business, or damage to your business reputation.

The basic requirements for document control are that you establish and document a procedure for:

  • Reviewing and approving documents prior to release
  • Reviews and approvals
  • Ensuring changes and revisions are clearly identified
  • Ensuring that relevant versions of applicable documents are available at their “points of use”
  • Ensuring that documents remain legible and identifiable
  • Ensuring that external documents like customer supplied documents or supplier manuals are identified and controlled
  • Preventing “unintended” use of obsolete documents


See also

  • Construction collaboration technology
  • Content management system
  • Data proliferation
  • Document automation
  • Documentation
  • Information repository
  • Information science
  • Intelligent document
  • Library science
  • Revision control
  • Taxonomy
  • Enterprise content management

The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.
 