Document Layout Analysis
Encyclopedia
Document Layout Analysis is a part of Computer Vision
indicating the process of identifying and categorizing the regions of interest
in a document image, e.g. a scanned page. A reading system requires the segmentation of text zones from non-textual ones and the arrangement in their correct reading order. Detection and labeling of the different zones (or blocks) as text body, pictures, math symbols
, and tables embedded in a document is called geometric layout analysis. But text zones play different logical roles inside the document (titles, captions, footnotes, etc.) and this kind of semantic labeling is the scope of the logical layout analysis.
Document layout analysis is the union of geometric and logical labeling. It is typically performed before a document image is sent to an OCR
engine, but it can be used also to detect duplicate copies of the same document in large archives, or to index documents by their structure or pictorial content.
Document layout is formally defined in the international standard ISO 8613-1:1989.
Computer vision
Computer vision is a field that includes methods for acquiring, processing, analysing, and understanding images and, in general, high-dimensional data from the real world in order to produce numerical or symbolic information, e.g., in the forms of decisions...
indicating the process of identifying and categorizing the regions of interest
Region of interest
A Region of Interest, often abbreviated ROI, is a selected subset of samples within a dataset identified for a particular purpose.For example:* on a waveform , a time or frequency interval...
in a document image, e.g. a scanned page. A reading system requires the segmentation of text zones from non-textual ones and the arrangement in their correct reading order. Detection and labeling of the different zones (or blocks) as text body, pictures, math symbols
Mathematical notation
Mathematical notation is a system of symbolic representations of mathematical objects and ideas. Mathematical notations are used in mathematics, the physical sciences, engineering, and economics...
, and tables embedded in a document is called geometric layout analysis. But text zones play different logical roles inside the document (titles, captions, footnotes, etc.) and this kind of semantic labeling is the scope of the logical layout analysis.
Document layout analysis is the union of geometric and logical labeling. It is typically performed before a document image is sent to an OCR
Optical character recognition
Optical character recognition, usually abbreviated to OCR, is the mechanical or electronic translation of scanned images of handwritten, typewritten or printed text into machine-encoded text. It is widely used to convert books and documents into electronic files, to computerize a record-keeping...
engine, but it can be used also to detect duplicate copies of the same document in large archives, or to index documents by their structure or pictorial content.
Document layout is formally defined in the international standard ISO 8613-1:1989.