Seven-number summary
In descriptive statistics, the seven-number summary is a collection of seven summary statistics, and is a modification or extension of the five-number summary. There are two common forms. As with the five-number summary, it can be represented by a modified box plot, adding hatch-marks for two of the additional numbers.
(Parametric) Seven-number summary
The following numbers are parametric statistics for a normally distributed model:
- the 2nd percentile
- the 9th percentile
- the 25th percentile or lower quartile or first quartile
- the 50th percentile or median (middle value, or second quartile)
- the 75th percentile or upper quartile or third quartile
- the 91st percentile
- the 98th percentile
The middle three values (the lower quartile, median, and upper quartile) are the usual statistics from the five-number summary and are the standard values for the box in a box plot.
The two unusual percentiles at either end are used because the locations of all seven values will be approximately equally spaced if the data is normally distributed. Some statistical tests require normally distributed data, so the plotted values provide a convenient visual check on the validity of later tests: one simply scans to see whether the locations of those seven percentiles appear to be equally spaced.
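The equal-spacing property can be checked numerically. The following is a minimal sketch, assuming Python's standard `statistics` module; the helper names `normal_positions` and `gaps` are illustrative and not part of the original article. It converts each of the seven percentiles to its position on a standard normal distribution and measures the distance between neighbours.

```python
from statistics import NormalDist

# Percentiles of the parametric seven-number summary, as listed above.
PERCENTILES = [2, 9, 25, 50, 75, 91, 98]


def normal_positions(percentiles=PERCENTILES):
    """Return the standard-normal z-score at each percentile."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    return [std_normal.inv_cdf(p / 100) for p in percentiles]


def gaps(values):
    """Return the differences between consecutive values."""
    return [b - a for a, b in zip(values, values[1:])]


z = normal_positions()
spacing = gaps(z)

print("z-scores:", [round(v, 2) for v in z])
print("gaps:    ", [round(g, 2) for g in spacing])
```

All six gaps come out at roughly 0.67 to 0.71 standard deviations, so the spacing is close to uniform but not exact, because the percentiles are rounded to whole numbers.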
Note that whereas the five-number summary makes no assumptions about the distribution of the data, the (parametric) seven-number summary is based on the normal distribution, and is not especially appropriate when normal data is not expected. However, the non-parametric seven-number summary, discussed below, makes no such assumptions.
The values can be represented using a modified box plot. The 2nd and 98th percentiles are represented by the ends of the whiskers, and hatch-marks across the whiskers mark the 9th and 91st percentiles.
Bowley’s seven-figure summary
Arthur Bowley used a set of non-parametric statistics, called a "seven-figure summary", including the extremes, deciles and quartiles, along with the median.
Thus the numbers are:
- the sample minimum
- the 10th percentile (first decile)
- the 25th percentile or lower quartile or first quartile
- the 50th percentile or median (middle value, or second quartile)
- the 75th percentile or upper quartile or third quartile
- the 90th percentile (last decile)
- the sample maximum
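The seven figures listed above can be computed from a sample along the following lines. This is a sketch only, assuming Python's standard `statistics.quantiles`; the function name `seven_figure_summary` and the choice of `method="inclusive"` are assumptions made for illustration. Quantile conventions differ between textbooks and software, and the article does not say which one Bowley used.

```python
from statistics import quantiles


def seven_figure_summary(data):
    """Return (min, 1st decile, Q1, median, Q3, 9th decile, max)."""
    values = sorted(data)
    if len(values) < 2:
        raise ValueError("need at least two observations")
    # n=10 yields the nine deciles; n=4 yields the three quartiles.
    # method="inclusive" is an assumed convention: it interpolates
    # between the sample minimum and maximum.
    deciles = quantiles(values, n=10, method="inclusive")
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    return (values[0], deciles[0], q1, median, q3, deciles[-1], values[-1])


# Hypothetical sample: the integers 1 to 101.
sample = list(range(1, 102))
summary = seven_figure_summary(sample)
print(summary)
```

For this evenly spaced sample the deciles and quartiles fall exactly on data points; with real data the inner five values are generally interpolated.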