Bootstrap error-adjusted single-sample technique
In statistics, the bootstrap error-adjusted single-sample technique (BEST or the BEAST) is a non-parametric method intended to assess the validity of a single sample. It is based on estimating a probability distribution that represents what can be expected from valid samples. This is done using a statistical method called bootstrapping, applied to previous samples that are known to be valid.
Methodology
BEST provides advantages over other methods such as the Mahalanobis metric, because it does not assume that all spectral groups have equal covariances or that each group is drawn from a normally distributed population. A quantitative approach combines BEST with a nonparametric cluster analysis algorithm. Multidimensional standard deviations (MDSs) between clusters and spectral data points are calculated, with BEST treating each frequency as a separate dimension.
BEST is based on a population, P, in some hyperspace, R, that represents the universe of possible samples. P* is the set of realized values of P based on a calibration set, T, which is used to characterize the possible variation in P. P* is bounded by the parameters C and B. C is the expectation value of P, written E(P), and B is a bootstrap distribution called the Monte Carlo approximation. The standard deviation can be found using this technique. The values of B projected into hyperspace give rise to X. The hyperline from C to X gives rise to the skew-adjusted standard deviation, which is calculated in both directions along the hyperline.
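The calculation described above can be illustrated with a minimal Python sketch. It is not the published algorithm, and it rests on several assumptions that are not in the text: X is taken to be the spectrum of the sample under test, each bootstrap replicate in B is the mean of a resample of T, the spread along the hyperline is estimated as a one-sided root-mean-square of the projected replicates, and that spread is multiplied by the square root of the training-set size so it is comparable to single samples. The function names are hypothetical.

```python
import numpy as np

def bootstrap_centers(T, m=1000, rng=None):
    """Return m bootstrap replicates of the center of training set T (rows = samples)."""
    rng = np.random.default_rng(rng)
    n = T.shape[0]
    # Each replicate resamples n rows of T with replacement and averages them.
    idx = rng.integers(0, n, size=(m, n))
    return T[idx].mean(axis=1)

def best_distance(T, x, m=1000, rng=None):
    """Distance of x from the center C, in skew-adjusted standard deviations.

    Returns (distance, sigma_toward_x, sigma_away_from_x).
    """
    B = bootstrap_centers(T, m, rng)   # bootstrap distribution (assumed form)
    C = B.mean(axis=0)                 # estimate of the center C
    delta = x - C
    length = np.linalg.norm(delta)
    if length == 0.0:
        return 0.0, float("nan"), float("nan")
    u = delta / length                 # unit vector along the hyperline C -> x
    p = (B - C) @ u                    # replicates projected onto the hyperline
    scale = np.sqrt(T.shape[0])        # assumption: rescale spread of means

    def one_sided(v):
        # Root-mean-square spread on one side of C; NaN if that side is empty.
        return float(scale * np.sqrt(np.mean(v ** 2))) if v.size else float("nan")

    sigma_toward = one_sided(p[p > 0])  # spread in the direction of x
    sigma_away = one_sided(p[p < 0])    # spread in the opposite direction
    return float(length / sigma_toward), sigma_toward, sigma_away
```

Calling `best_distance(T, x)` with `T` as an array of training spectra (one row per sample, one column per frequency) and `x` as a single spectrum gives a distance that can be compared against a chosen limit. Because the two one-sided spreads are computed separately, a skewed training cluster yields different standard deviations in the two directions of the hyperline, which a symmetric metric would not capture.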
Application
BEST is used to detect sample tampering in pharmaceutical products. Valid (unaltered) samples are defined as those that fall inside the cluster of training-set points when BEST is trained with unaltered product samples. False (tampered) samples are those that fall outside the same cluster.
Methods such as inductively coupled plasma atomic emission spectroscopy (ICP-AES) require capsules to be emptied for analysis, so a nondestructive method is valuable. A method such as near-infrared reflectance analysis (NIRA) can be coupled to BEST in the following ways:
- Detect any tampered product by determining that it is not similar to the previously analyzed unaltered product.
- Quantitatively identify the contaminant from a library of known adulterants in that product.
- Provide quantitative indication of the amount of contaminant present.
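The first two uses listed above can be sketched in Python as a simple screening routine. This is an illustration under stated assumptions rather than the method used in practice: the limit of 3 standard deviations, the function names `sd_distance` and `screen`, and the one-sided root-mean-square spread scaled by the square root of the training-set size are all choices made for this example. Estimating the amount of contaminant (the third use) is not covered.

```python
import numpy as np

def sd_distance(train, x, m=500, seed=0):
    """Distance of x from the center of train, in directional standard deviations."""
    rng = np.random.default_rng(seed)
    n = train.shape[0]
    # Bootstrap replicates of the cluster center (means of resampled rows).
    centers = train[rng.integers(0, n, size=(m, n))].mean(axis=1)
    c = centers.mean(axis=0)
    delta = x - c
    length = np.linalg.norm(delta)
    if length == 0.0:
        return 0.0
    proj = (centers - c) @ (delta / length)
    toward = proj[proj > 0]                 # replicates on the side facing x
    spread = toward if toward.size else proj
    # Assumption: rescale by sqrt(n) so the spread refers to single samples.
    sigma = np.sqrt(n) * np.sqrt(np.mean(spread ** 2))
    return float(length / sigma)

def screen(unaltered, library, x, limit=3.0):
    """Label x as 'unaltered', the nearest library adulterant, or 'unknown'."""
    if sd_distance(unaltered, x) <= limit:
        return "unaltered"
    # x lies outside the unaltered cluster: compare it with each adulterant cluster.
    scores = {name: sd_distance(spectra, x) for name, spectra in library.items()}
    best = min(scores, key=scores.get)
    return best if scores[best] <= limit else "unknown"
```

Here `unaltered` is an array of training spectra from untampered product and `library` is a dictionary mapping each adulterant name to an array of spectra of product containing it. A sample that is far from every cluster is reported as `"unknown"` instead of being forced into the nearest class.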
Further reading
- Y. Zou, Robert A. Lodder (1993) "An Investigation of the Performance of the Extended Quantile BEAST in High Dimensional Hyperspace", paper #885 at the Pittsburgh Conference on Analytical Chemistry and Applied Spectroscopy, Atlanta, GA
- Y. Zou, Robert A. Lodder (1993) "The Effect of Different Data Distributions on the Performance of the Extended Quantile BEAST in Pattern Recognition", paper #593 at the Pittsburgh Conference on Analytical Chemistry and Applied Spectroscopy, Atlanta, GA