Maximum entropy Markov model
Encyclopedia
In machine learning
Machine learning
Machine learning, a branch of artificial intelligence, is a scientific discipline concerned with the design and development of algorithms that allow computers to evolve behaviors based on empirical data, such as from sensor data or databases...

, a maximum-entropy Markov model (MEMM), or conditional Markov model (CMM), is a graphical model
Graphical model
A graphical model is a probabilistic model for which a graph denotes the conditional independence structure between random variables. They are commonly used in probability theory, statistics—particularly Bayesian statistics—and machine learning....

 for sequence labeling
Sequence labeling
In machine learning, sequence labeling is a type of pattern recognition task that involves the algorithmic assignment of a categorical label to each member of a sequence of observed values. A common example of a sequence labeling task is part of speech tagging, which seeks to assign a part of...

 that combines features of hidden Markov model
Hidden Markov model
A hidden Markov model is a statistical Markov model in which the system being modeled is assumed to be a Markov process with unobserved states. An HMM can be considered as the simplest dynamic Bayesian network. The mathematics behind the HMM was developed by L. E...

s (HMMs) and maximum entropy
Maximum entropy probability distribution
In statistics and information theory, a maximum entropy probability distribution is a probability distribution whose entropy is at least as great as that of all other members of a specified class of distributions....

 (MaxEnt) models. An MEMM is a discriminative model
Discriminative model
Discriminative models are a class of models used in machine learning for modeling the dependence of an unobserved variable y on an observed variable x...

 that extends a standard maximum entropy classifier by assuming that the unknown values to be learned are connected in a Markov chain
Markov chain
A Markov chain, named after Andrey Markov, is a mathematical system that undergoes transitions from one state to another, between a finite or countable number of possible states. It is a random process characterized as memoryless: the next state depends only on the current state and not on the...

 rather than being conditionally independent of each other. MEMMs find applications in natural language processing
Natural language processing
Natural language processing is a field of computer science and linguistics concerned with the interactions between computers and human languages; it began as a branch of artificial intelligence....

, specifically in part-of-speech tagging
Part-of-speech tagging
In corpus linguistics, part-of-speech tagging , also called grammatical tagging or word-category disambiguation, is the process of marking up a word in a text as corresponding to a particular part of speech, based on both its definition, as well as its context—i.e...

 and information extraction
Information extraction
Information extraction is a type of information retrieval whose goal is to automatically extract structured information from unstructured and/or semi-structured machine-readable documents. In most of the cases this activity concerns processing human language texts by means of natural language...

.

MEMMs potentially suffer from the "label bias problem," where states with low-entropy transition distributions "effectively ignore their observations." Conditional random field
Conditional random field
A conditional random field is a statistical modelling method often applied in pattern recognition.More specifically it is a type of discriminative undirected probabilistic graphical model. It is used to encode known relationships between observations and construct consistent interpretations...

s were designed to overcome this weakness.
The source of this article is wikipedia, the free encyclopedia.  The text of this article is licensed under the GFDL.
 
x
OK