Hierarchical classifier
A hierarchical classifier is a classifier that maps input data into defined subsumptive output categories. Classification occurs first at a low level, on highly specific pieces of the input data. The classifications of the individual pieces are then systematically combined and classified at a higher level, and this is repeated until a single output is produced. This final output is the overall classification of the data. Depending on application-specific details, the output can be one of a set of predefined outputs, one of a set of outputs learned online, or even a novel classification that has not been seen before. Generally, such systems rely on relatively simple individual units that all apply one universal function to do the classification. In a sense, these machines rely on the power of the hierarchical structure itself rather than on the computational abilities of the individual components. This makes them relatively simple, easily expandable, and very powerful.
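
The bottom-up procedure described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the names `Unit` and `classify_hierarchically`, the fixed fan-in of two, the lookup-table form of the "universal function", and the way an unseen combination is given a fresh label.

```python
from typing import Dict, List, Sequence, Tuple

Pattern = Tuple[str, ...]


class Unit:
    """One generic unit (hypothetical): maps a tuple of lower-level labels to one label."""

    def __init__(self, known: Dict[Pattern, str]) -> None:
        self.known = dict(known)

    def classify(self, parts: Sequence[str]) -> str:
        key = tuple(parts)
        # Assumption: an unseen combination becomes a new, novel class
        # instead of being rejected.
        if key not in self.known:
            self.known[key] = "novel:" + "+".join(key)
        return self.known[key]


def classify_hierarchically(
    pieces: Sequence[str], levels: List[Unit], fan_in: int = 2
) -> str:
    """Group labels fan_in at a time and reclassify, level by level."""
    labels = list(pieces)
    for unit in levels:
        labels = [
            unit.classify(labels[i:i + fan_in])
            for i in range(0, len(labels), fan_in)
        ]
    if len(labels) != 1:
        raise ValueError("hierarchy is too shallow to reduce the input to one label")
    return labels[0]


# Made-up example data: four low-level pieces reduced to one label in two levels.
level_1 = Unit({("edge", "edge"): "corner", ("curve", "curve"): "arc"})
level_2 = Unit({("corner", "arc"): "letter D"})
result = classify_hierarchically(
    ["edge", "edge", "curve", "curve"], [level_1, level_2]
)
print(result)
```

Each unit does nothing more than a table lookup; whatever power the sketch has comes from stacking the levels, which is the point the paragraph makes.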

Application

Many applications can be implemented efficiently using hierarchical classifiers or variants thereof. The clearest example lies in the area of computer vision. Recognizing pictures is something that hierarchical processing can do well. The model fits this application so well because a picture can intuitively be viewed as a collection of components or objects. These objects can be viewed as collections of smaller components such as shapes, which can in turn be viewed as collections of lines, and so on. This coincides directly with the way hierarchical processing works. If a simple unit of the processing hierarchy can classify lines into shapes, then an equivalent unit can classify shapes into objects (there are intermediate steps between these, but the idea holds). Thus, if these generic classifying units are arranged hierarchically (using a directed acyclic graph), a full step-by-step classification can proceed from pixels of color all the way up to an abstract label of what is in the picture.
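
As a rough sketch of such an arrangement, the following evaluates a small directed acyclic graph of units in dependency order. The node names, the rule table, and the function `classify_dag` are hypothetical, and the example starts from already-labelled lines rather than raw pixels to stay short. It relies on `graphlib.TopologicalSorter` from the Python standard library (Python 3.9 or later).

```python
from graphlib import TopologicalSorter

# Hypothetical graph: each node maps to the nodes it reads from.
# line_b feeds two shapes, so the structure is a DAG rather than a tree.
inputs = {
    "line_a": [],
    "line_b": [],
    "line_c": [],
    "shape_1": ["line_a", "line_b"],
    "shape_2": ["line_b", "line_c"],
    "object": ["shape_1", "shape_2"],
}

# One shared rule table stands in for the generic classifying unit.
rules = {
    ("horizontal", "vertical"): "corner",
    ("vertical", "horizontal"): "corner",
    ("corner", "corner"): "rectangle",
}


def classify_dag(inputs, rules, leaf_labels):
    """Label every node, visiting predecessors before the nodes that use them."""
    labels = dict(leaf_labels)
    for node in TopologicalSorter(inputs).static_order():
        if inputs[node]:  # leaves keep the label they were given
            key = tuple(labels[parent] for parent in inputs[node])
            labels[node] = rules.get(key, "unknown")
    return labels


labels = classify_dag(
    inputs,
    rules,
    {"line_a": "horizontal", "line_b": "vertical", "line_c": "horizontal"},
)
print(labels["object"])
```

The same rule table is used at every level, mirroring the idea that a unit able to turn lines into shapes is equivalent to one that turns shapes into objects.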

Many similar applications can also be tackled by hierarchical classification, such as written text recognition and robot awareness. It is possible that mathematical models and problem-solving methods can also be represented in this fashion. If so, future research in this area could lead to very successful automated theorem provers that work across multiple domains. Such developments would be very powerful, but it is not yet clear exactly how these models apply.

Similar models

One similar model is the graphical model, in which an input space is systematically broken down into subspaces, and those into smaller subspaces, and so on, creating a hierarchy of input spaces. This allows predictions about the behavior of inputs in various regions using statistical methods such as Bayesian networks, which make conditional probabilities easy to compute. Recently, there has been a lot of research in this area with respect to vision systems. Hierarchical classifiers are extremely similar to these models, but do not have to depend on a statistical interpretation.
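
To make the remark about easily computable conditional probabilities concrete, here is a small sketch over a two-level split of an input space. The regions, the "edge" feature, and every number are invented for illustration; the functions `p_edge` and `p_region_given_edge` are hypothetical names and are not taken from any particular model.

```python
# Hypothetical two-level hierarchy: region -> subregion -> feature.
# All probabilities below are made-up illustration values.
p_region = {"left": 0.6, "right": 0.4}
p_sub_given_region = {
    "left": {"top": 0.5, "bottom": 0.5},
    "right": {"top": 0.25, "bottom": 0.75},
}
p_edge_given_sub = {
    ("left", "top"): 0.9,
    ("left", "bottom"): 0.1,
    ("right", "top"): 0.2,
    ("right", "bottom"): 0.4,
}


def joint_region_and_edge(region):
    """P(region, edge), summing over the subregions of that region."""
    return sum(
        p_region[region] * p_sub * p_edge_given_sub[(region, sub)]
        for sub, p_sub in p_sub_given_region[region].items()
    )


def p_edge():
    """P(edge), obtained by summing the joint over all regions."""
    return sum(joint_region_and_edge(region) for region in p_region)


def p_region_given_edge(region):
    """P(region | edge) by Bayes' rule."""
    return joint_region_and_edge(region) / p_edge()


print(round(p_edge(), 4))
print(round(p_region_given_edge("left"), 4))
```

Because each factor depends only on its parent in the hierarchy, the sums stay small and local, which is what makes the conditional probabilities cheap to evaluate.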

Another similar model is the simple neural network. Commonly, a neural network is a network of individual nodes, each of which tries to learn a function from input to output. The functionality of the network as a whole depends on the ability of the nodes to work together to yield the correct overall output. Neural networks can be trained to do many tasks and are often domain-specific. However, as in the case of graphical models, neural networks have shown great general-purpose behavior in computer vision even when tackling relatively general problems. Hierarchical classifiers can, in fact, be seen as a special case of neural networks in which discrete output classes are learned instead of functions. Learning is then a pattern match with an error threshold instead of an interpolation of an approximate function.
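
The contrast between threshold-based pattern matching and function interpolation can be sketched as follows. This is one possible reading, not a definitive implementation: the class `ThresholdMatcher`, the use of Euclidean distance as the error measure, and the rule that an unmatched input founds a new class are all assumptions.

```python
import math


class ThresholdMatcher:
    """Hypothetical unit that learns discrete classes by thresholded matching."""

    def __init__(self, threshold):
        self.threshold = threshold
        self.prototypes = {}  # label -> stored pattern

    def classify(self, x):
        # Find the stored pattern closest to the input.
        best_label, best_error = None, math.inf
        for label, prototype in self.prototypes.items():
            error = math.dist(x, prototype)
            if error < best_error:
                best_label, best_error = label, error
        # A match counts only if the error is within the threshold.
        if best_label is not None and best_error <= self.threshold:
            return best_label
        # Assumption: otherwise the input is stored as a new class.
        new_label = f"class_{len(self.prototypes)}"
        self.prototypes[new_label] = tuple(x)
        return new_label


matcher = ThresholdMatcher(threshold=0.5)
first = matcher.classify((0.0, 0.0))   # nothing stored yet, so a new class
near = matcher.classify((0.1, 0.1))    # within the threshold of the first
far = matcher.classify((3.0, 4.0))     # too far from anything stored
print(first, near, far)
```

The output is always one of a discrete set of labels; no attempt is made to approximate a continuous function between the stored patterns.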

Neuroscience's perspective on the workings of the human cortex also serves as a similar model. The generally accepted view today is that the brain is a generic pattern machine that abstracts information again and again until it relates to a broad stored concept. For instance, a familiar face is not stored as a collection of pixels, but rather as a combination of very specific eyes, nose, mouth, ears, and so on. Once the data has been classified into those components, the collection of components can in turn be classified as that face. Thus, neuroscience trends and data are very valuable to research in these areas, as they are highly relevant to the inner workings of these models. This is especially true because the human brain is inherently very good at applications, such as facial recognition, that these models strive to be good at. The brain is, in a sense, a benchmark of proficiency for hierarchical processing.

Notable researchers

Tomaso Poggio of MIT does research in the area of computer vision and has recently been developing a hierarchical vision system that is both relatively simple and empirically comparable to the human brain. The research combines neural networks with a large body of cognitive psychology and neuroscience data in an attempt to create the most realistic and human-like artificial vision system that exists.

Tom Dean of Brown University researches graphical models and Bayesian networks and is currently developing a hierarchical system that is claimed to achieve very good results on vision problems. This model can very simply produce properties, such as rotational and translational invariance, that a reliable vision system needs in order to yield non-trivial results.

Jeff Hawkins is the founder of Palm Computing and the Redwood Neuroscience Institute and the author of On Intelligence, in which he proposes theories on the workings of the brain that center on hierarchical processing and on the brain as a generic pattern machine that functions by continually abstracting and categorizing data.

Leslie Lamport is the author of the paper "How to write a proof", in which he proposes writing proofs in a hierarchical fashion, with main ideas, sub-ideas, sub-sub-ideas, and so on. The proofs are written so that they mirror the structure of a tree. Some automated theorem provers have attempted to capitalize on this formalization of the structure of proofs in order to solve problems more efficiently. However, none of these theorem provers can adequately solve problems across domains, as current vision systems are beginning to be able to do. Thus, it is quite possible, but still unknown, whether tactics similar to those used in vision systems, and hierarchical processing specifically, can dramatically improve automated theorem provers.
The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.
 