Hierarchical classifier

A hierarchical classifier is a classifier that maps input data into defined subsumptive output categories. Classification occurs first at a low level, on highly specific pieces of input data. The classifications of the individual pieces are then systematically combined and classified at a higher level, and this repeats until a single output is produced. This final output is the overall classification of the data. Depending on application-specific details, this output can be one of a set of predefined outputs, one of a set of outputs learned online, or even a novel classification that has not been seen before. Generally, such systems rely on relatively simple individual units in the hierarchy, each of which has only one universal function for doing the classification. In a sense, these machines rely on the power of the hierarchical structure itself rather than on the computational abilities of the individual components. This makes them relatively simple, easily expandable, and very powerful.
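The bottom-up process described above can be sketched in a few lines of Python. This is only an illustrative sketch, not a reference implementation: the names `make_unit`, `classify`, and `LEVELS`, the stroke and shape labels, and the choice of exact table lookup as the single "universal function" are all assumptions made for the example.

```python
from typing import Callable, Sequence

# A unit turns a group of lower-level labels into one higher-level label.
Unit = Callable[[Sequence[str]], str]


def make_unit(table: dict[tuple[str, ...], str]) -> Unit:
    """Build a generic unit that classifies a group of labels by table lookup."""
    def unit(labels: Sequence[str]) -> str:
        # Groups that are not in the table fall back to the label "unknown".
        return table.get(tuple(labels), "unknown")
    return unit


def classify(pieces: Sequence[str], levels: Sequence[tuple[int, Unit]]) -> str:
    """Classify bottom-up: at each level, group adjacent labels and apply the unit."""
    labels = list(pieces)
    for group_size, unit in levels:
        labels = [
            unit(labels[i:i + group_size])
            for i in range(0, len(labels), group_size)
        ]
    if len(labels) != 1:
        raise ValueError("hierarchy did not reduce the input to a single output")
    return labels[0]


# Hypothetical two-level hierarchy: strokes -> corners -> shape.
LEVELS = [
    (2, make_unit({("horizontal", "vertical"): "corner"})),
    (2, make_unit({("corner", "corner"): "rectangle"})),
]

print(classify(["horizontal", "vertical", "horizontal", "vertical"], LEVELS))
```

Every level uses the same kind of unit; only the lookup table differs. All of the discriminating power therefore comes from how the units are stacked, which is the point the paragraph above makes.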

Application

Many applications can be implemented efficiently using hierarchical classifiers or variants thereof. The clearest example lies in the area of computer vision. Recognizing pictures is something that hierarchical processing can do well. The model fits this application so well because pictures can intuitively be viewed as collections of components or objects. These objects can be viewed as collections of smaller components such as shapes, which can in turn be viewed as collections of lines, and so on. This corresponds directly to the way hierarchical processing works. If a simple unit of the processing hierarchy can classify lines into shapes, then an equivalent unit could classify shapes into objects (there are intermediate steps between these, but the idea holds). Thus, if these generic classifying units are arranged hierarchically (using a directed acyclic graph), a full step-by-step classification can proceed from pixels of color all the way up to an abstract label of what is in the picture.
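As a rough illustration of arranging generic units in a directed acyclic graph, the sketch below evaluates each unit once, in topological order, using Python's standard `graphlib` module. The node names, the rule tables, and the input label `"red-disc-pixels"` are hypothetical placeholders; a real vision system would operate on pixel arrays rather than string labels.

```python
from graphlib import TopologicalSorter

# Hypothetical DAG: each node maps to the list of nodes it reads from.
INPUTS = {
    "edges": ["pixels"],
    "color": ["pixels"],
    "shape": ["edges"],
    "object": ["shape", "color"],
}

# One rule table per node; every unit performs the same lookup operation.
RULES = {
    "edges": {("red-disc-pixels",): "closed-curve"},
    "color": {("red-disc-pixels",): "red"},
    "shape": {("closed-curve",): "circle"},
    "object": {("circle", "red"): "ball"},
}


def run_hierarchy(pixels: str) -> dict[str, str]:
    """Evaluate every unit once, in an order that respects the DAG's edges."""
    values = {"pixels": pixels}
    for node in TopologicalSorter(INPUTS).static_order():
        if node in values:  # the raw input has no unit of its own
            continue
        key = tuple(values[parent] for parent in INPUTS[node])
        values[node] = RULES[node].get(key, "unknown")
    return values


print(run_hierarchy("red-disc-pixels")["object"])
```

Because the graph has no cycles, a topological order always exists, so each unit sees the finished outputs of the units beneath it before it runs.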

Many similar applications can also be tackled by hierarchical classification, such as written text recognition and robot awareness. It is possible that mathematical models and problem-solving methods can also be represented in this fashion. If so, future research in this area could lead to very successful automated theorem provers that work across multiple domains. Such developments would be very powerful, but it is not yet clear exactly how these models would apply.

Similar models

One similar model is the graphical model, in which an input space is systematically broken down into subspaces, and those into smaller subspaces, and so on, creating a hierarchy of input spaces. This allows predictions about the behavior of inputs in various regions to be made with statistical methods such as Bayesian networks, which allow conditional probabilities to be computed easily. Recently, there has been a lot of research in this area with respect to vision systems. Hierarchical classifiers are extremely similar to these models, but do not have to depend on a statistical interpretation.
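To make the statistical interpretation concrete, here is a minimal sketch of a two-node Bayesian network (object, then shape) in which a conditional probability is computed with Bayes' rule. The variable names and every probability value are invented for illustration and are not drawn from any published model.

```python
# All probabilities below are invented for illustration.
P_OBJECT = {"ball": 0.5, "box": 0.5}
P_SHAPE_GIVEN_OBJECT = {
    "ball": {"circle": 0.9, "square": 0.1},
    "box": {"circle": 0.2, "square": 0.8},
}


def posterior_object(shape: str) -> dict[str, float]:
    """Return P(object | shape) using Bayes' rule over the two-node network."""
    # Joint probability P(object, shape) for each candidate object.
    joint = {
        obj: P_OBJECT[obj] * P_SHAPE_GIVEN_OBJECT[obj][shape]
        for obj in P_OBJECT
    }
    # Normalizing by P(shape) turns the joint into a conditional distribution.
    evidence = sum(joint.values())
    return {obj: p / evidence for obj, p in joint.items()}


print(posterior_object("circle"))
```

A hierarchical classifier in the sense of this article would replace the probability tables with plain category assignments, which is why it does not need the statistical reading.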

Another similar model is the simple neural network. Commonly, a neural network is a network of individual nodes, each of which tries to learn a function from input to output. The functionality of the network as a whole depends on the ability of the nodes to work together to yield the correct overall output. Neural networks can be trained to do many tasks and are often domain-specific. However, as in the case of graphical models, neural networks have shown great general-purpose behavior in computer vision, even when tackling relatively general problems. Hierarchical classifiers can, in fact, be seen as a special case of neural networks in which discrete output classes are learned instead of functions. Learning is then a pattern match with an error threshold rather than an interpolation of an approximate function.
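One way to picture "a pattern match with an error threshold" is a unit that stores a prototype for each class it has seen and creates a new class whenever no stored prototype is close enough. The sketch below assumes Euclidean distance as the error measure and one prototype per class; both choices, and the class name `ThresholdMatcher`, are assumptions made for the example rather than part of any specific published system.

```python
class ThresholdMatcher:
    """Stores one prototype per class and matches inputs within an error threshold."""

    def __init__(self, threshold: float) -> None:
        self.threshold = threshold
        self.prototypes: list[tuple[float, ...]] = []

    def classify(self, x: tuple[float, ...]) -> int:
        """Return the closest stored class, or learn a new class if none is close enough."""
        best_index, best_error = -1, float("inf")
        for index, proto in enumerate(self.prototypes):
            # Euclidean distance between the input and the stored prototype.
            error = sum((a - b) ** 2 for a, b in zip(x, proto)) ** 0.5
            if error < best_error:
                best_index, best_error = index, error
        if best_error <= self.threshold:
            return best_index
        # No match within the threshold: treat the input as a novel class.
        self.prototypes.append(x)
        return len(self.prototypes) - 1


matcher = ThresholdMatcher(threshold=0.5)
print([matcher.classify(p) for p in [(0.0, 0.0), (0.1, 0.1), (3.0, 3.0), (2.9, 3.2)]])
```

The output is always a discrete class index, never an interpolated value, and an input that matches nothing produces a new class rather than a forced fit.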

Neuroscience's perspective on the workings of the human cortex also serves as a similar model. The generally accepted view of the brain today is that it is a generic pattern machine that abstracts information again and again until it relates to a broad stored concept. For instance, a familiar face is not stored as a collection of pixels, but rather as a combination of very specific eyes, nose, mouth, ears, and so on. Once the data has been classified into those components, the collection of components can then be classified as that face. Thus, neuroscience trends and data are very valuable to research in these areas, as they are highly relevant to the inner workings of these models. This is especially true because the human brain is inherently very good at applications such as facial recognition, at which these models strive to excel. The brain is, in a sense, a benchmark of proficiency for hierarchical processing.

Notable researchers

Tomaso Poggio of MIT does research in the area of computer vision and has recently been developing a hierarchical vision system that is both relatively simple and empirically comparable to the human brain. The research combines neural networks with a large body of cognitive psychology and neuroscience data in an attempt to create the most realistic and human-like artificial vision system in existence.

Tom Dean of Brown University researches graphical models and Bayesian networks and is currently developing a hierarchical system that is claimed to achieve very good results on vision problems. The model can produce, very simply, properties such as rotational and translational invariance, which a reliable vision system needs in order to yield non-trivial results.

Jeff Hawkins is the founder of Palm Computing and the Redwood Neuroscience Institute and the author of On Intelligence, in which he proposes theories on the workings of the brain that center on hierarchical processing and on the brain as a generic pattern machine that functions by continually abstracting and categorizing data.

Leslie Lamport is the author of the paper "How to write a proof", in which he proposes writing proofs in a hierarchical fashion with main ideas, sub-ideas, sub-sub-ideas, and so on. The proofs are written so that they mirror the structure of a tree. Some automated theorem provers have attempted to capitalize on this formalization of proof structure in order to solve problems more efficiently. However, none of these theorem provers can adequately solve problems across domains, as current vision systems are beginning to be able to do. Thus, it is quite possible, though still unknown, that tactics similar to those used in vision systems, and hierarchical processing specifically, could dramatically improve automated theorem provers.

The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.