Student's t-statistic - AbsoluteAstronomy.com

Statistics

Statistics is the study of the collection, organization, analysis, and interpretation of data. It deals with all aspects of this, including the planning of data collection in terms of the design of surveys and experiments....

, the t-statistic is a ratio of the departure of an estimated parameter from its notional value and its standard error

Standard error (statistics)

The standard error is the standard deviation of the sampling distribution of a statistic. The term may also be used to refer to an estimate of that standard deviation, derived from a particular sample used to compute the estimate....

. It is used in hypothesis testing

Statistical hypothesis testing

A statistical hypothesis test is a method of making decisions using data, whether from a controlled experiment or an observational study . In statistics, a result is called statistically significant if it is unlikely to have occurred by chance alone, according to a pre-determined threshold...

, for example in the Student's t-test

Student's t-test

A t-test is any statistical hypothesis test in which the test statistic follows a Student's t distribution if the null hypothesis is supported. It is most commonly applied when the test statistic would follow a normal distribution if the value of a scaling term in the test statistic were known...

, in the augmented Dickey–Fuller test, and in bootstrapping

Bootstrapping (statistics)

In statistics, bootstrapping is a computer-based method for assigning measures of accuracy to sample estimates . This technique allows estimation of the sample distribution of almost any statistic using only very simple methods...

Definition

Let

\scriptstyle\hat\beta

be an estimator of parameter β in some statistical model
Statistical model
A statistical model is a formalization of relationships between variables in the form of mathematical equations. A statistical model describes how one or more random variables are related to one or more random variables. The model is statistical as the variables are not deterministically but...

. Then a t-statistic for this parameter is any quantity of the form

where β₀ is a non-random, known constant, and

\scriptstyle s.e.(\hat\beta)

is the standard error

Standard error (statistics)

of the estimator

\scriptstyle\hat\beta

. By default, statistical packages report t-statistic with (these t-statistics are used to test the significance of corresponding regressor). However, when t-statistic is needed to test the hypothesis of the form , then a non-zero β₀ may be used.

If

\scriptstyle\hat\beta

is an ordinary least squares

Ordinary least squares

In statistics, ordinary least squares or linear least squares is a method for estimating the unknown parameters in a linear regression model. This method minimizes the sum of squared vertical distances between the observed responses in the dataset and the responses predicted by the linear...

estimator in the classical linear regression model (that is, with normally distributed and homoskedastic

Homoscedasticity

In statistics, a sequence or a vector of random variables is homoscedastic if all random variables in the sequence or vector have the same finite variance. This is also known as homogeneity of variance. The complementary notion is called heteroscedasticity...

error terms), and if the true value of parameter β is equal to β₀, then the sampling distribution

Sampling distribution

In statistics, a sampling distribution or finite-sample distribution is the probability distribution of a given statistic based on a random sample. Sampling distributions are important in statistics because they provide a major simplification on the route to statistical inference...

of the t-statistic is the Student’s t-distribution with degrees of freedom, where n is the number of observations, and k is the number of regressors (including the intercept).

In the majority of models the estimator

\scriptstyle\hat\beta

is consistent for β and distributed asymptotically normally. If the true value of parameter β is equal to β₀ and the quantity

\scriptstyle s.e.(\hat\beta)

correctly estimates the asymptotic variance of this estimator, then the t-statistic will have asymptotically the standard normal distribution.

In some models the distribution of t-statistic is different from normal, even asymptotically. For example, when a time series

Time series

In statistics, signal processing, econometrics and mathematical finance, a time series is a sequence of data points, measured typically at successive times spaced at uniform time intervals. Examples of time series are the daily closing value of the Dow Jones index or the annual flow volume of the...

with unit root

Unit root

In time series models in econometrics , a unit root is a feature of processes that evolve through time that can cause problems in statistical inference if it is not adequately dealt with....

is regressed in the augmented Dickey–Fuller test, the test t-statistic will asymptotically have one of the Dickey–Fuller distributions (depending on the test setting).

Use

Most frequently, t-statistics are used in Student's t-test

Student's t-test

s, a form of statistical hypothesis testing

Statistical hypothesis testing

, and in the computation of certain confidence interval

Confidence interval

In statistics, a confidence interval is a particular kind of interval estimate of a population parameter and is used to indicate the reliability of an estimate. It is an observed interval , in principle different from sample to sample, that frequently includes the parameter of interest, if the...

s.

The key property of the t-statistic is that it is a pivotal quantity

Pivotal quantity

In statistics, a pivotal quantity or pivot is a function of observations and unobservable parameters whose probability distribution does not depend on unknown parameters....

– while defined in terms of the sample mean, its sampling distribution does not depend on the sample parameters, and thus it can be used regardless of what these may be.

One can also divide a residual

Errors and residuals in statistics

In statistics and optimization, statistical errors and residuals are two closely related and easily confused measures of the deviation of a sample from its "theoretical value"...

by the sample standard deviation:

to compute an estimate for the number of standard deviations a given sample is from the mean, as a sample version of a z-score, the z-score requiring the population parameters.

Prediction

Given a normal distribution

with unknown mean and variance, the t-statistic of a future observation

after one has made n observations, is an ancillary statistic

Ancillary statistic

In statistics, an ancillary statistic is a statistic whose sampling distribution does not depend on which of the probability distributions among those being considered is the distribution of the statistical population from which the data were taken...

– a pivotal quantity (does not depend on the values of μ and σ²) that is a statistic (computed from observations). This allows one to compute a frequentist prediction interval

Prediction interval

In statistical inference, specifically predictive inference, a prediction interval is an estimate of an interval in which future observations will fall, with a certain probability, given what has already been observed...

(a predictive confidence interval

Confidence interval

), via the following t-distribution:

Solving for

yields the prediction distribution

from which one may compute predictive confidence intervals – given a probability p, one may compute intervals such that 100p% of the time, the next observation

will fall in that interval.

History

The term "t-statistic" is abbreviated from "test statistic", while "Student" was the pen name of William Sealy Gosset

William Sealy Gosset

William Sealy Gosset is famous as a statistician, best known by his pen name Student and for his work on Student's t-distribution....

, who introduced the t-statistic and t-test in 1908, while working for the Guinness

Guinness

Guinness is a popular Irish dry stout that originated in the brewery of Arthur Guinness at St. James's Gate, Dublin. Guinness is directly descended from the porter style that originated in London in the early 18th century and is one of the most successful beer brands worldwide, brewed in almost...

brewery

Brewery

A brewery is a dedicated building for the making of beer, though beer can be made at home, and has been for much of beer's history. A company which makes beer is called either a brewery or a brewing company....

in Dublin, Ireland.

Related concepts

z-score: If the population parameters are known, then rather than computing the t-statistic, one can compute the z-score; analogously, rather than using a t-test, one uses a z-test

Z-test

A Z-test is any statistical test for which the distribution of the test statistic under the null hypothesis can be approximated by a normal distribution. Due to the central limit theorem, many test statistics are approximately normally distributed for large samples...

. This is rare outside of standardized testing.
Studentized residual

Studentized residual

In statistics, a studentized residual is the quotient resulting from the division of a residual by an estimate of its standard deviation. Typically the standard deviations of residuals in a sample vary greatly from one data point to another even when the errors all have the same standard...

: In regression analysis

Regression analysis

In statistics, regression analysis includes many techniques for modeling and analyzing several variables, when the focus is on the relationship between a dependent variable and one or more independent variables...

, the standard errors of the estimators at different data points vary (compare the middle versus endpoints of a simple linear regression

Simple linear regression

In statistics, simple linear regression is the least squares estimator of a linear regression model with a single explanatory variable. In other words, simple linear regression fits a straight line through the set of n points in such a way that makes the sum of squared residuals of the model as...

), and thus one must divide the different residuals by different estimates for the error, yielding what are called studentized residual

Studentized residual

External links

What is a t statistic? at The Children's Mercy Hospital

The source of this article is wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.

Definition

Use

Prediction

History

Related concepts

See also

External links