Assumed mean
In statistics, the assumed mean is a method for calculating the arithmetic mean and standard deviation of a data set. It simplifies calculating accurate values by hand. Its interest today is chiefly historical, but it can still be used to estimate these statistics quickly. Other rapid calculation methods are better suited to computers and also give more accurate results than the obvious methods.
Example
The mean of the following numbers is sought:
- 219, 223, 226, 228, 231, 234, 235, 236, 240, 241, 244, 247, 249, 255, 262
Suppose we start with a plausible initial guess that the mean is about 240. Then the deviations from this "assumed" mean are the following:
- −21, −17, −14, −12, −9, −6, −5, −4, 0, 1, 4, 7, 9, 15, 22
In adding these up, one finds that:
- 22 and −21 almost cancel, leaving +1,
- 15 and −17 almost cancel, leaving −2,
- 9 and −9 cancel,
- 7 + 4 cancels −6 − 5,
and so on. We are left with a sum of −30. The average of these 15 deviations from the assumed mean is therefore −30/15 = −2. Therefore that is what we need to add to the assumed mean to get the correct mean:
- correct mean = 240 − 2 = 238.
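The arithmetic above can be checked with a few lines of Python. This is only a sketch of the worked example; the variable names are illustrative and not part of the method.

```python
# Worked example: mean via an assumed mean (the guess of 240 comes from the text).
data = [219, 223, 226, 228, 231, 234, 235, 236, 240, 241, 244, 247, 249, 255, 262]
assumed_mean = 240

# Deviations from the assumed mean are small numbers that are easy to add.
deviations = [x - assumed_mean for x in data]
total_deviation = sum(deviations)

# The average deviation is the correction to apply to the assumed mean.
correction = total_deviation / len(data)
mean = assumed_mean + correction

print(deviations)
print(total_deviation, correction, mean)  # -30 -2.0 238.0
```

Any other guess gives the same final mean; a guess close to the true mean just keeps the deviations small.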
Method
The method depends on estimating the mean and rounding it to a value that is easy to calculate with. This value is then subtracted from all the sample values. When the samples are grouped into classes of equal size, a central class is chosen and each sample's distance from it, counted in whole classes, is used in the calculations. For example, for people's heights a value of 1.75 m might be used as the assumed mean.
For a data set $x_1, \dots, x_N$ with assumed mean $x_0$ suppose:
- $d_i = x_i - x_0$
- $A = \sum_{i=1}^{N} d_i$
- $B = \sum_{i=1}^{N} d_i^2$
Then
- $\bar{x} = x_0 + \frac{A}{N}$
- $\sigma = \sqrt{\frac{B - A^2/N}{N}}$
or, for a sample standard deviation using Bessel's correction:
- $s = \sqrt{\frac{B - A^2/N}{N - 1}}$
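As a rough illustration, the method can be written as a small Python function. This is a sketch under stated assumptions: the function name `assumed_mean_stats` and the sample heights are invented for the example, and the result is cross-checked against Python's standard `statistics` module.

```python
import math
import statistics

def assumed_mean_stats(values, x0):
    """Return (mean, population_sd, sample_sd) from deviations about the assumed mean x0."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values for the sample standard deviation")
    deviations = [x - x0 for x in values]
    a = sum(deviations)                   # A: sum of the deviations
    b = sum(d * d for d in deviations)    # B: sum of the squared deviations
    mean = x0 + a / n
    # Sum of squares about the true mean; clamp tiny negatives caused by rounding.
    ss = max(b - a * a / n, 0.0)
    population_sd = math.sqrt(ss / n)
    sample_sd = math.sqrt(ss / (n - 1))   # Bessel's correction
    return mean, population_sd, sample_sd

# Hypothetical heights in metres, with 1.75 m as the assumed mean.
heights = [1.62, 1.70, 1.75, 1.81, 1.68]
mean, pop_sd, samp_sd = assumed_mean_stats(heights, x0=1.75)

print(mean, statistics.mean(heights))
print(pop_sd, statistics.pstdev(heights))
print(samp_sd, statistics.stdev(heights))
```

The result does not depend on the choice of `x0`; a value near the true mean only keeps the intermediate numbers small.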
Example using class ranges
Where there are a large number of samples, a quick and reasonable estimate of the mean and standard deviation can be obtained by grouping the samples into classes of equal size. This introduces a quantization error, but the result is normally accurate enough for most purposes if 10 or more classes are used. For instance, with the sample:
- 167.8 175.4 176.1 166 174.7 170.2 178.9 180.4 174.6 174.5 182.4 173.4 167.4 170.7 180.6 169.6 176.2 176.3 175.1 178.7 167.2 180.2 180.3 164.7 167.9 179.6 164.9 173.2 180.3 168 175.5 172.9 182.2 166.7 172.4 181.9 175.9 176.8 179.6 166 171.5 180.6 175.5 173.2 178.8 168.3 170.3 174.2 168 172.6 163.3 172.5 163.4 165.9 178.2 174.6 174.3 170.5 169.7 176.2 175.1 177 173.5 173.6 174.3 174.4 171.1 173.3 164.6 173 177.9 166.5 159.6 170.5 174.7 182 172.7 175.9 171.5 167.1 176.9 181.7 170.7 177.5 170.9 178.1 174.3 173.3 169.2 178.2 179.4 187.6 186.4 178.1 174 177.1 163.3 178.1 179.1 175.6
The minimum and maximum are 159.6 and 187.6, so the values can be grouped as follows, rounding the numbers down. The class size (CS) is 3. The assumed mean is the centre of the range from 174 to 177, which is 175.5. The differences are counted in classes.
Range | frequency | class diff | freq×diff | freq×diff² |
---|---|---|---|---|
159–161 | 1 | −5 | −5 | 25 |
162–164 | 6 | −4 | −24 | 96 |
165–167 | 10 | −3 | −30 | 90 |
168–170 | 13 | −2 | −26 | 52 |
171–173 | 16 | −1 | −16 | 16 |
174–176 | 25 | 0 | 0 | 0 |
177–179 | 16 | 1 | 16 | 16 |
180–182 | 11 | 2 | 22 | 44 |
183–185 | 0 | 3 | 0 | 0 |
186–188 | 2 | 4 | 8 | 32 |
Sum | N = 100 | | A = −55 | B = 371 |
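The mean is then estimated to be
- $x_0 + \mathrm{CS} \times \frac{A}{N} = 175.5 + 3 \times \frac{-55}{100} = 173.85$
which is very close to the actual mean of 173.846.
The standard deviation is estimated as
- $\mathrm{CS} \times \sqrt{\frac{B - A^2/N}{N - 1}} = 3 \times \sqrt{\frac{371 - 30.25}{99}} \approx 5.57$
This is about 3% higher than the actual sample standard deviation of 5.40.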
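The grouped calculation can be reproduced with a short Python sketch. It assumes the class boundaries described above (class size 3, central class from 174 up to 177) and the usual grouped-data formulas for the estimates; the variable names are illustrative only.

```python
import math
from collections import Counter

sample = [
    167.8, 175.4, 176.1, 166, 174.7, 170.2, 178.9, 180.4, 174.6, 174.5,
    182.4, 173.4, 167.4, 170.7, 180.6, 169.6, 176.2, 176.3, 175.1, 178.7,
    167.2, 180.2, 180.3, 164.7, 167.9, 179.6, 164.9, 173.2, 180.3, 168,
    175.5, 172.9, 182.2, 166.7, 172.4, 181.9, 175.9, 176.8, 179.6, 166,
    171.5, 180.6, 175.5, 173.2, 178.8, 168.3, 170.3, 174.2, 168, 172.6,
    163.3, 172.5, 163.4, 165.9, 178.2, 174.6, 174.3, 170.5, 169.7, 176.2,
    175.1, 177, 173.5, 173.6, 174.3, 174.4, 171.1, 173.3, 164.6, 173,
    177.9, 166.5, 159.6, 170.5, 174.7, 182, 172.7, 175.9, 171.5, 167.1,
    176.9, 181.7, 170.7, 177.5, 170.9, 178.1, 174.3, 173.3, 169.2, 178.2,
    179.4, 187.6, 186.4, 178.1, 174, 177.1, 163.3, 178.1, 179.1, 175.6,
]

CLASS_SIZE = 3
CENTRAL_LOWER_EDGE = 174                   # central class covers 174 up to 177
X0 = CENTRAL_LOWER_EDGE + CLASS_SIZE / 2   # assumed mean, 175.5

# Class difference: how many whole classes a value lies from the central class.
class_diffs = [math.floor((x - CENTRAL_LOWER_EDGE) / CLASS_SIZE) for x in sample]
freq = Counter(class_diffs)

n = sum(freq.values())
a = sum(f * d for d, f in freq.items())       # A: sum of freq x diff
b = sum(f * d * d for d, f in freq.items())   # B: sum of freq x diff squared

# Scale back from class units to the original units.
mean_est = X0 + CLASS_SIZE * a / n
sd_est = CLASS_SIZE * math.sqrt((b - a * a / n) / (n - 1))

print(n, a, b)                                # 100 -55 371
print(round(mean_est, 2), round(sd_est, 2))   # 173.85 5.57
```

Working in class units keeps every intermediate number a small integer, which is what makes the hand calculation quick.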