Lift (data mining)
In data mining, lift is a measure of the performance of a model at predicting or classifying cases, measured against a random-choice model.
For example, suppose a population has a predicted response rate of 5%, but a certain model has identified a segment with a predicted response rate of 20%. Then that segment would have a lift of 4.0 (20%/5%).
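The arithmetic above can be stated as a one-line function; a minimal sketch (the function name and rates are just the example's numbers, not part of any standard API):

```python
def lift(segment_rate: float, baseline_rate: float) -> float:
    """Ratio of a segment's response rate to the population baseline."""
    return segment_rate / baseline_rate

# The example above: a 20% segment rate against a 5% baseline.
print(lift(0.20, 0.05))  # → 4.0
```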
Typically, the modeller seeks to divide the population into quantiles, and rank the quantiles by lift. Organizations can then consider each quantile, and by weighing the predicted response rate (and associated financial benefit) against the cost, they can decide whether to market to that quantile.
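The quantile ranking described above can be sketched as follows. The records here (model score, whether the customer responded) are invented purely for illustration:

```python
# Hypothetical scored population: (model score, responded?) pairs.
records = [(0.9, True), (0.8, True), (0.7, False), (0.6, True),
           (0.5, False), (0.4, False), (0.3, False), (0.2, False),
           (0.1, False), (0.05, False)]

overall_rate = sum(resp for _, resp in records) / len(records)  # 3/10

# Rank by model score (highest first) and split into quintiles of two records.
ranked = sorted(records, key=lambda t: t[0], reverse=True)
size = 2
lifts = []
for i in range(0, len(ranked), size):
    bucket = ranked[i:i + size]
    rate = sum(resp for _, resp in bucket) / len(bucket)
    lifts.append(rate / overall_rate)

print(lifts)  # the top quintile's lift is 1.0 / 0.3 ≈ 3.33
```

An organization would market to the top quantiles, where lift (and hence expected response) is highest, and skip those whose lift does not justify the cost.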
Example
Assume the data set being mined is:

| Antecedent | Consequent |
|---|---|
| A | 0 |
| A | 0 |
| A | 1 |
| A | 0 |
| B | 1 |
| B | 0 |
| B | 1 |
where the antecedent is the input variable that we can control, and the consequent is the variable we are trying to predict. Real mining problems would typically have more complex antecedents, but usually focus on single-value consequents.
Most mining algorithms would determine the following rules:
- Rule 1: A implies 0
- Rule 2: B implies 1
because these are simply the most common patterns found in the data. A simple review of the above table should make these rules obvious.
The support for Rule 1 is 3/7 because three of the seven records have antecedent A and consequent 0. The support for Rule 2 is 2/7 because two of the seven records have antecedent B and consequent 1.
The confidence for Rule 1 is 3/4 because three of the four records that meet the antecedent of A meet the consequent of 0. The confidence for Rule 2 is 2/3 because two of the three records that meet the antecedent of B meet the consequent of 1.
Lift can be found by dividing the confidence by the probability of the consequent, or by dividing the support by the product of the probabilities of the antecedent and the consequent, so:
- The lift for Rule 1 is (3/4)/(4/7) = 1.3125
- The lift for Rule 2 is (2/3)/(3/7) = (2/3) × (7/3) = 14/9 ≈ 1.56
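The calculations above can be reproduced directly from the seven-record table; a minimal sketch (the helper function `rule_stats` is made up for this illustration):

```python
# The seven (antecedent, consequent) records from the table above.
data = [("A", 0), ("A", 0), ("A", 1), ("A", 0),
        ("B", 1), ("B", 0), ("B", 1)]
n = len(data)

def rule_stats(antecedent, consequent):
    """Support, confidence, and lift for the rule antecedent -> consequent."""
    both = sum(1 for a, c in data if a == antecedent and c == consequent)
    ante = sum(1 for a, c in data if a == antecedent)
    cons = sum(1 for a, c in data if c == consequent)
    support = both / n               # P(antecedent and consequent)
    confidence = both / ante         # P(consequent | antecedent)
    lift = confidence / (cons / n)   # confidence / P(consequent)
    return support, confidence, lift

print(rule_stats("A", 0))  # support 3/7, confidence 3/4, lift 21/16 = 1.3125
print(rule_stats("B", 1))  # support 2/7, confidence 2/3, lift 14/9 ≈ 1.56
```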
If a rule had a lift of 1, it would imply that the antecedent and consequent values occur independently of each other. When two events are independent, no useful rule can be drawn involving them.
If the lift is greater than 1, as it is for Rule 1 and Rule 2, that indicates the degree to which the two occurrences are dependent on one another, and makes those rules potentially useful for predicting the consequent in future data sets.
Observe that even though Rule 1 has higher confidence, it has lower lift. Intuitively, Rule 1 would seem more valuable because of its higher confidence—it seems more accurate. But the accuracy of a rule considered independently of the data set can be misleading. The value of lift is that it considers both the confidence of the rule and the overall data set.