Pointwise Mutual Information
Pointwise mutual information (PMI), or point mutual information, is a measure of association used in information theory and statistics.
Definition
The PMI of a pair of outcomes x and y belonging to discrete random variables X and Y quantifies the discrepancy between the probability of their coincidence given their joint distribution and the probability of their coincidence given only their individual distributions, assuming independence. Mathematically:

$$\operatorname{pmi}(x;y) \equiv \log \frac{p(x,y)}{p(x)\,p(y)} = \log \frac{p(x \mid y)}{p(x)} = \log \frac{p(y \mid x)}{p(y)}$$
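As a concrete illustration, here is a minimal Python sketch of this definition (the function name and its probability arguments are illustrative, not from the original article):

```python
import math

def pmi(p_xy, p_x, p_y, base=2):
    """Pointwise mutual information of one outcome pair (x, y).

    p_xy -- joint probability p(x, y)
    p_x  -- marginal probability p(x)
    p_y  -- marginal probability p(y)
    """
    return math.log(p_xy / (p_x * p_y), base)

# This pair co-occurs half as often as independence would predict,
# so its PMI is log2(0.5) = -1.
print(pmi(0.1, 0.8, 0.25))  # -1.0
```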
The mutual information (MI) of the random variables X and Y is the expected value of the PMI over all possible outcomes.
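In symbols (a standard identity, stated here for completeness), with the sum taken over all outcome pairs:

$$\operatorname{I}(X;Y) = \sum_{x} \sum_{y} p(x,y)\,\operatorname{pmi}(x;y)$$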
The measure is symmetric ($\operatorname{pmi}(x;y) = \operatorname{pmi}(y;x)$). It can take positive or negative values, but is zero if X and Y are independent. PMI is maximized when X and Y are perfectly associated (i.e., $p(x \mid y) = 1$ or $p(y \mid x) = 1$), yielding the following bounds:

$$-\infty \le \operatorname{pmi}(x;y) \le \min\left(-\log p(x),\, -\log p(y)\right)$$

Finally, $\operatorname{pmi}(x;y)$ will increase if $p(x \mid y)$ is fixed but $p(x)$ decreases.
Here is an example to illustrate:

x   y   p(x, y)
0   0   0.1
0   1   0.7
1   0   0.15
1   1   0.05

Using this table we can marginalize to get the following additional table for the individual distributions:

    p(x)   p(y)
0   0.8    0.25
1   0.2    0.75
With this example, we can compute four values for $\operatorname{pmi}(x;y)$. Using base-2 logarithms:

pmi(x=0; y=0) = −1
pmi(x=0; y=1) ≈ 0.222392421
pmi(x=1; y=0) ≈ 1.584962501
pmi(x=1; y=1) ≈ −1.584962501

(For reference, the mutual information $\operatorname{I}(X;Y)$ would then be 0.214170945.)
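As a sanity check, the following short Python script (illustrative, not part of the original article) reproduces the values above from the joint table:

```python
import math

# Joint distribution p(x, y) from the example table.
p = {(0, 0): 0.1, (0, 1): 0.7, (1, 0): 0.15, (1, 1): 0.05}

# Marginal distributions obtained by summing out the other variable.
p_x = {x: sum(v for (xi, _), v in p.items() if xi == x) for x in (0, 1)}
p_y = {y: sum(v for (_, yi), v in p.items() if yi == y) for y in (0, 1)}

# Pointwise mutual information for each outcome pair, base-2 logs.
pmi = {(x, y): math.log2(p[x, y] / (p_x[x] * p_y[y])) for (x, y) in p}

for (x, y), value in sorted(pmi.items()):
    print(f"pmi(x={x}; y={y}) = {value:.9f}")

# Mutual information is the expectation of the PMI under p(x, y).
mi = sum(p[x, y] * pmi[x, y] for (x, y) in p)
print(f"I(X;Y) = {mi:.9f}")  # ~0.214170945
```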
Similarities to mutual information
Pointwise mutual information has many of the same relationships as the mutual information. In particular,

$$\operatorname{pmi}(x;y) = h(x) + h(y) - h(x,y) = h(x) - h(x \mid y) = h(y) - h(y \mid x)$$

where $h(x)$ is the self-information, or $-\log p(x)$.
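Continuing the numeric example above (an illustrative check, not from the original text): for x = 1, y = 0 we have $h(x) = -\log_2 0.2 \approx 2.3219$, $h(y) = -\log_2 0.25 = 2$, and $h(x,y) = -\log_2 0.15 \approx 2.7370$, so $h(x) + h(y) - h(x,y) \approx 1.5850$, matching pmi(x=1; y=0).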
Normalized pointwise mutual information (npmi)
Pointwise mutual information can be normalized to the interval [−1, +1], resulting in −1 (in the limit) for pairs that never occur together, 0 for independence, and +1 for complete co-occurrence.
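One standard normalization divides the PMI by the joint self-information:

$$\operatorname{npmi}(x;y) = \frac{\operatorname{pmi}(x;y)}{h(x,y)}, \qquad h(x,y) = -\log p(x,y)$$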
Chain-rule for pmi
Pointwise mutual information follows the chain-rule, that is,

$$\operatorname{pmi}(x;yz) = \operatorname{pmi}(x;y) + \operatorname{pmi}(x;z \mid y)$$

This is easily proven by substituting $p(x,z \mid y) = p(x,y,z)/p(y)$, $p(x \mid y) = p(x,y)/p(y)$, and $p(z \mid y) = p(y,z)/p(y)$:

$$\begin{aligned}
\operatorname{pmi}(x;y) + \operatorname{pmi}(x;z \mid y)
&= \log \frac{p(x,y)}{p(x)\,p(y)} + \log \frac{p(x,z \mid y)}{p(x \mid y)\,p(z \mid y)} \\
&= \log \left[ \frac{p(x,y)}{p(x)\,p(y)} \cdot \frac{p(x,y,z)\,p(y)}{p(x,y)\,p(y,z)} \right] \\
&= \log \frac{p(x,y,z)}{p(x)\,p(y,z)} \\
&= \operatorname{pmi}(x;yz)
\end{aligned}$$
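A quick numerical check of the chain rule in Python (the three-variable joint distribution below is made up purely for illustration):

```python
import math

# An arbitrary joint distribution p(x, y, z) over binary variables,
# chosen only to exercise the identity; the probabilities sum to 1.
p = {(0, 0, 0): 0.10, (0, 0, 1): 0.15, (0, 1, 0): 0.20, (0, 1, 1): 0.05,
     (1, 0, 0): 0.05, (1, 0, 1): 0.10, (1, 1, 0): 0.05, (1, 1, 1): 0.30}

def marg(keep):
    """Marginalize p onto the coordinate positions in `keep` (0=x, 1=y, 2=z)."""
    out = {}
    for triple, v in p.items():
        key = tuple(triple[i] for i in keep)
        out[key] = out.get(key, 0.0) + v
    return out

p_x, p_y = marg([0]), marg([1])
p_xy, p_yz = marg([0, 1]), marg([1, 2])

x, y, z = 1, 0, 1

# Left side: pmi(x; yz) = log p(x,y,z) / (p(x) p(y,z))
lhs = math.log2(p[x, y, z] / (p_x[(x,)] * p_yz[y, z]))

# Right side: pmi(x; y) + pmi(x; z | y), conditionals expanded via p(y)
pmi_xy = math.log2(p_xy[x, y] / (p_x[(x,)] * p_y[(y,)]))
pmi_xz_given_y = math.log2((p[x, y, z] / p_y[(y,)]) /
                           ((p_xy[x, y] / p_y[(y,)]) * (p_yz[y, z] / p_y[(y,)])))
rhs = pmi_xy + pmi_xz_given_y

print(lhs, rhs)  # both print the same value, up to floating-point error
```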
External links
- Demo at Rensselaer MSR Server (PMI values normalized to be between 0 and 1)