Bayesian average
A Bayesian average is a method of estimating the mean of a population consistent with the Bayesian interpretation. Instead of estimating the mean strictly from the available data set, other existing information related to that data set may also be incorporated into the calculation, in order to minimize the impact of large deviations or to assert a default value when the data set is small.
For example, in a calculation of the average review score of a book where only two reviews are available, both giving scores of 10, a normal average score would be 10. However, as only two reviews are available, 10 may not represent the average that would have emerged had more reviews been available. The review site may instead calculate a Bayesian average of this score by adding the average review score of all books in the store to the calculation. For example, by adding five scores of 7 each, the Bayesian average becomes 7.86 instead of 10, which the review site would hope better represents the quality of the book.
Note that the additional information incorporated into the mean calculation does not have to be the true prior mean of the larger population; it may instead be a value that the person calculating the average subjectively judges to be relevant and to serve the purpose of the calculation. Therefore, the quality of the Bayesian average (in terms of representing the data set) depends on the judgment of the person doing the calculation.
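The arithmetic in the book-review example above can be checked with a short sketch. The function name `bayesian_average` and its parameter names are illustrative choices, not part of the article; the sketch assumes the adjustment is made exactly as described, by mixing a number of pseudo-scores equal to the prior value into the real scores.

```python
def bayesian_average(scores, prior_mean, prior_weight):
    """Mean of `scores` after adding `prior_weight` pseudo-scores,
    each equal to `prior_mean` (hypothetical helper for illustration)."""
    total = prior_mean * prior_weight + sum(scores)
    count = prior_weight + len(scores)
    return total / count


book_scores = [10, 10]  # the two actual reviews from the example

plain = sum(book_scores) / len(book_scores)
smoothed = bayesian_average(book_scores, prior_mean=7, prior_weight=5)

print(plain)               # 10.0
print(round(smoothed, 2))  # 7.86
```

With no reviews at all, the same function simply returns the prior value of 7, which is the "default value" behaviour mentioned above.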
Calculation
Calculating the Bayesian average uses the prior mean m and a constant C. C is assigned a value that is proportional to the typical data set size. The value is larger when the expected variation between data sets (within the larger population) is small, and smaller when the data sets are expected to vary substantially from one another.

In cases where the averages' relative values are the only result of importance, m can be replaced with zero. C can be calculated based on the priors regarding variance between data sets. However, in circumstances where that kind of rigor is desired, other more expressive measures of statistical power are likely to be used. As a result, C is usually assigned a value in an ad-hoc manner.
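The text above names m and C but does not state the formula itself. A commonly used form, which reproduces the figures in both the book-review example and the height example in this article, is given below. The symbols n (the number of observations in the data set) and x_i (the individual observations) are notation introduced here for illustration.

$$
\bar{x} = \frac{C\,m + \sum_{i=1}^{n} x_i}{C + n}
$$

Read this way, the Bayesian average is the ordinary mean of the data set after C additional observations, each equal to m, have been added to it. When n = 0 the result is m, and as n grows large relative to C the result approaches the ordinary mean of the data.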
Example
The goal is to calculate the Bayesian average of the heights of adult American men in various occupations. In the larger population of adult American men, the average height is 176 cm. A value of 10 is chosen for C. For the purpose of this example, the occupations used are "Basketball Players", "Actors" and "Students". For the basketball players, a group of 15 individuals is identified with an average height of 191 cm. For the students, a group of 10 individuals is identified with an average height of 179 cm. For the actors, only James Cromwell is available, for an average height of 201 cm.
Group | N | Group mean | Bayesian average |
---|---|---|---|
Basketball players | 15 | 191 cm | 185 cm |
Students | 10 | 179 cm | 177.5 cm |
Actors | 1 | 201 cm | 178 cm |
Here, the Bayesian average correctly reduces the effect of a single anomalously large value: the lone actor's height of 201 cm is pulled most of the way back toward the population mean of 176 cm. If the sample of basketball players had been similarly small, however, the Bayesian average would have mis-estimated basketball players as being far closer to average than they are.
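The figures in the table can be reproduced with the following sketch. It assumes the Bayesian average is computed as (C·m + sum of observations) / (C + n), which matches the tabulated values; the function and variable names are hypothetical.

```python
PRIOR_MEAN_CM = 176  # average height of adult American men, from the example
C = 10               # weight given to the prior, as chosen in the example

# (group, sample size N, group mean in cm), taken from the table
groups = [
    ("Basketball players", 15, 191),
    ("Students", 10, 179),
    ("Actors", 1, 201),
]


def bayesian_average_from_mean(n, group_mean, prior_mean=PRIOR_MEAN_CM, c=C):
    # n * group_mean equals the sum of the n individual observations
    return (c * prior_mean + n * group_mean) / (c + n)


results = {name: bayesian_average_from_mean(n, mean) for name, n, mean in groups}
for name, value in results.items():
    print(f"{name}: {value:.1f} cm")

# What the closing remark describes: a single basketball player at 191 cm
lone_player = bayesian_average_from_mean(1, 191)
print(f"One basketball player: {lone_player:.1f} cm")
```

This prints 185.0 cm, 177.5 cm and 178.3 cm for the three groups (the table rounds the last value to 178 cm). With only one basketball player in the sample, the result would be about 177.4 cm, barely above the population mean.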