Negative multinomial distribution
In probability theory and statistics, the negative multinomial distribution is a generalization of the negative binomial distribution (NB(r, p)) to more than two outcomes.

Suppose we have an experiment that generates $m+1 \ge 2$ possible outcomes, $\{X_0, \ldots, X_m\}$, occurring with non-negative probabilities $\{p_0, \ldots, p_m\}$ respectively. If sampling proceeded until $n$ observations were made, then $\{X_0, \ldots, X_m\}$ would be multinomially distributed. However, if the experiment is stopped once $X_0$ reaches the predetermined value $k_0$, then the distribution of the m-tuple $\{X_1, \ldots, X_m\}$ is negative multinomial.
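The stopping rule described above can be sketched as a small simulation. This is a minimal illustration, not a reference implementation: the function name `sample_negative_multinomial` and the parameter values ($k_0 = 10$ and the four probabilities) are assumptions chosen for the example and are not taken from the article's data.

```python
import random


def sample_negative_multinomial(k0, probs, rng):
    """Draw one vector (X1, ..., Xm) by running trials until outcome 0 has occurred k0 times.

    probs = [p0, p1, ..., pm] must sum to 1 with p0 > 0.
    """
    counts = [0] * len(probs)
    outcomes = range(len(probs))
    while counts[0] < k0:
        # One trial: pick an outcome index with the given probabilities.
        outcome = rng.choices(outcomes, weights=probs)[0]
        counts[outcome] += 1
    return counts[1:]  # X0 is fixed at k0, so only X1..Xm are random


# Illustrative (assumed) parameters, not taken from the article's data.
rng = random.Random(0)
k0, probs = 10, [0.5, 0.2, 0.1, 0.2]
draws = [sample_negative_multinomial(k0, probs, rng) for _ in range(5000)]

# Compare simulated means with the theoretical means k0 * p_i / p0.
sample_means = [sum(d[i] for d in draws) / len(draws) for i in range(3)]
theoretical_means = [k0 * p / probs[0] for p in probs[1:]]
print(sample_means, theoretical_means)
```

With enough draws the simulated means should settle near $k_0 p_i / p_0$, which is a quick sanity check on the sampler.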

Negative multinomial distribution example

The table below shows an example of 400 melanoma (skin cancer) patients, where the type and site of the cancer are recorded for each subject.
                                 Site
Type                             Head and Neck   Trunk   Extremities   Totals
Hutchinson's melanomic freckle              22       2            10       34
Superficial                                 16      54           115      185
Nodular                                     19      33            73      125
Indeterminant                               11      17            28       56
Column Totals                               68     106           226      400



The sites (locations) of the cancer may be independent, but there may be positive dependencies among the types of cancer at a given location (site). For example, localized exposure to radiation implies that an elevated level of one type of cancer at a given location may indicate a higher level of another cancer type at the same location. The negative multinomial distribution may be used to model the cancer rates at each site and to help measure some of the dependencies between cancer types within each location.

If $x_{i,j}$ denote the cancer rates for each site ($0 \le j \le 2$) and each type of cancer ($0 \le i \le 3$), then for a fixed site ($j_0$) the cancer rates are independent negative multinomial distributed random variables. That is, for each column index (site) the column-vector $X$ has the following distribution:
$$X = \{X_1, X_2, X_3\} \sim NM(k_0, \{p_1, p_2, p_3\}).$$

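Different columns in the table (sites) are considered to be different instances of the negative multinomially distributed random vector $X$. Writing $x_{i,j}$ for the count of cancer type $i$ (row) at site $j$ (column), we have the following estimates of expected counts (frequencies of cancer):
$$\hat{\mu}_{i,j} = \frac{x_{i\cdot}\, x_{\cdot j}}{x_{\cdot\cdot}}, \qquad x_{i\cdot} = \sum_j x_{i,j}, \quad x_{\cdot j} = \sum_i x_{i,j}, \quad x_{\cdot\cdot} = \sum_{i,j} x_{i,j}.$$
Example: for Hutchinson's melanomic freckle at the Head and Neck site, $\hat{\mu}_{0,0} = x_{0\cdot}\, x_{\cdot 0} / x_{\cdot\cdot} = 34 \times 68 / 400 = 5.78$.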
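The expected counts for every cell of the table can be computed in a few lines. This sketch assumes the usual two-way-table estimate (row total times column total, divided by the grand total); the variable names are illustrative.

```python
# Observed counts from the table: rows are cancer types, columns are sites
# (Head and Neck, Trunk, Extremities).
counts = [
    [22, 2, 10],    # Hutchinson's melanomic freckle
    [16, 54, 115],  # Superficial
    [19, 33, 73],   # Nodular
    [11, 17, 28],   # Indeterminant
]

row_totals = [sum(row) for row in counts]
col_totals = [sum(col) for col in zip(*counts)]
grand_total = sum(row_totals)

# Expected count for cell (i, j): row total * column total / grand total.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

for row in expected:
    print([round(value, 2) for value in row])
```

The computed margins should agree with the Totals column and the Column Totals row of the table, which makes transcription errors easy to spot.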


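For the first site (Head and Neck, $j=0$), suppose that $X = \{X_1, X_2, X_3\}$ and $X \sim NM(k_0, \{p_1, p_2, p_3\})$. Then:
$$p_0 = 1 - \sum_{i=1}^{3} p_i, \qquad \mu_i = \frac{k_0\, p_i}{p_0}, \qquad \operatorname{cov}[X_i, X_l] = \frac{k_0\, p_i\, p_l}{p_0^2} \quad (i \ne l),$$
and therefore,
$$\operatorname{corr}[X_i, X_l] = \left( \frac{\mu_i\, \mu_l}{(k_0 + \mu_i)(k_0 + \mu_l)} \right)^{1/2}.$$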
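A worked numeric instance may make these quantities concrete. The values below ($k_0 = 10$, $p = (0.2, 0.1, 0.2)$, $x = (5, 1, 5)$) are hypothetical and chosen only for illustration; the helper `nm_pmf` is likewise an assumed name. The sketch uses the standard negative multinomial probability mass function, mean, covariance, and correlation formulas.

```python
import math


def nm_pmf(x, k0, p):
    """Negative multinomial pmf for counts x = (x1..xm) and probabilities p = (p1..pm)."""
    p0 = 1.0 - sum(p)
    log_pmf = (
        math.lgamma(k0 + sum(x)) - math.lgamma(k0)
        - sum(math.lgamma(xi + 1) for xi in x)
        + k0 * math.log(p0)
        + sum(xi * math.log(prob) for xi, prob in zip(x, p))
    )
    return math.exp(log_pmf)


# Hypothetical values chosen for illustration only.
k0 = 10
p = (0.2, 0.1, 0.2)
x = (5, 1, 5)

p0 = 1.0 - sum(p)                     # probability of the stopping outcome
mu = [k0 * prob / p0 for prob in p]   # means k0 * p_i / p0
cov_13 = k0 * p[0] * p[2] / p0 ** 2   # cov[X1, X3]
corr_23 = math.sqrt(mu[1] * mu[2] / ((k0 + mu[1]) * (k0 + mu[2])))  # corr[X2, X3]

print(p0, mu, cov_13, corr_23, nm_pmf(x, k0, p))
```

Working in log space with `math.lgamma` avoids overflow in the factorial terms when the counts are large.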


Notice that the pairwise NM correlations are always positive, whereas the correlations between multinomial counts are always negative. As the parameter $k_0$ increases, the paired correlations tend to zero. Thus, for large $k_0$, the negative multinomial counts $X_i$ behave as independent Poisson random variables with respect to their means $\mu_i = k_0\, p_i / p_0$.
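The decay of the correlation can be checked numerically. This is a small sketch that holds two means fixed (the values 2 and 4 are illustrative assumptions) and lets $k_0$ grow, using the standard correlation formula $\sqrt{\mu_i \mu_j / ((k_0 + \mu_i)(k_0 + \mu_j))}$.

```python
import math


def nm_correlation(k0, mu_i, mu_j):
    """Correlation of two negative multinomial counts with means mu_i and mu_j."""
    return math.sqrt(mu_i * mu_j / ((k0 + mu_i) * (k0 + mu_j)))


# Hold the means fixed (illustrative values) and let k0 grow.
k0_values = (1, 10, 100, 1000, 10000)
correlations = [nm_correlation(k0, 2.0, 4.0) for k0 in k0_values]

for k0, corr in zip(k0_values, correlations):
    print(k0, corr)
```

Every value is positive, and the sequence shrinks roughly like $1/k_0$, consistent with the Poisson limit described above.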

The marginal distribution of each of the $X_i$ variables is negative binomial, as the $X_i$ count (considered as success) is measured against all the other outcomes (failure). But jointly, the distribution of $X = \{X_1, \ldots, X_m\}$ is negative multinomial, i.e., $X \sim NM(k_0, \{p_1, \ldots, p_m\})$.
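The marginal claim can be made explicit for the two-variable case. The following derivation is a sketch under the standard parametrization of the negative multinomial probability mass function; it sums out $x_2$ with the negative binomial series and is not quoted from the article.

```latex
\begin{aligned}
P(X_1 = x_1)
  &= \sum_{x_2 \ge 0}
     \frac{\Gamma(k_0 + x_1 + x_2)}{\Gamma(k_0)\, x_1!\, x_2!}\,
     p_0^{k_0}\, p_1^{x_1}\, p_2^{x_2} \\
  &= \frac{\Gamma(k_0 + x_1)}{\Gamma(k_0)\, x_1!}\, p_0^{k_0}\, p_1^{x_1}
     \sum_{x_2 \ge 0}
     \frac{\Gamma(k_0 + x_1 + x_2)}{\Gamma(k_0 + x_1)\, x_2!}\, p_2^{x_2} \\
  &= \frac{\Gamma(k_0 + x_1)}{\Gamma(k_0)\, x_1!}\, p_0^{k_0}\, p_1^{x_1}\,
     (1 - p_2)^{-(k_0 + x_1)} \\
  &= \frac{\Gamma(k_0 + x_1)}{\Gamma(k_0)\, x_1!}
     \left( \frac{p_0}{p_0 + p_1} \right)^{k_0}
     \left( \frac{p_1}{p_0 + p_1} \right)^{x_1},
\end{aligned}
```

using $1 - p_2 = p_0 + p_1$ in the last step. This is a negative binomial probability mass function with $k_0$ stopping events and per-trial probabilities renormalized over the two outcomes that matter.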

Parameter estimation

  • Estimation of the mean (expected) frequency counts ($\mu_i$) of each outcome ($X_i$) using maximum likelihood is possible. If we have a single observation vector $\{x_1, \ldots, x_m\}$, then $\hat{\mu}_i = x_i$. If we have several observation vectors, as in this case, where we have the cancer type frequencies for 3 different sites, then the MLE estimates of the mean counts are $\hat{\mu}_i = \frac{1}{J} \sum_{j} x_{i,j}$, where $i$ is the cancer-type index and the summation is over the $J$ observed (sampled) vectors. For the cancer data above ($J = 3$), we have the following MLE estimates for the expectations of the frequency counts:
Hutchinson's melanomic freckle type of cancer ($X_0$) is $\hat{\mu}_0 = 34/3 \approx 11.33$.
Superficial type of cancer ($X_1$) is $\hat{\mu}_1 = 185/3 \approx 61.67$.
Nodular type of cancer ($X_2$) is $\hat{\mu}_2 = 125/3 \approx 41.67$.
Indeterminant type of cancer ($X_3$) is $\hat{\mu}_3 = 56/3 \approx 18.67$.
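The mean-count estimates are the row totals of the table divided by the number of sites. A minimal sketch, with illustrative variable names:

```python
# Row totals from the table: total count of each cancer type over the 3 sites.
type_totals = {
    "Hutchinson's melanomic freckle": 34,
    "Superficial": 185,
    "Nodular": 125,
    "Indeterminant": 56,
}
num_sites = 3  # number of observed vectors (table columns)

# Mean count per site for each cancer type.
mean_counts = {name: total / num_sites for name, total in type_totals.items()}

for name, mean in mean_counts.items():
    print(f"{name}: {mean:.2f}")
```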

  • There is no MLE for the NM parameter $k_0$. However, there are approximate protocols for estimating $k_0$ using the chi-squared goodness of fit statistic. In the usual chi-squared statistic:
$$\chi^2 = \sum_i \frac{(x_i - \mu_i)^2}{\mu_i},$$
we can replace the expected means ($\mu_i$) by their estimates, $\hat{\mu}_i$, and replace the denominators by the corresponding negative multinomial variances. Then we get the following test statistic for negative multinomial distributed data:
$$\chi^2(k_0) = \sum_i \frac{(x_i - \hat{\mu}_i)^2}{\hat{\mu}_i \left(1 + \hat{\mu}_i / k_0\right)}.$$
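The step of replacing the denominators by negative multinomial variances can be spelled out. The derivation below is a sketch that uses the standard variance of a negative multinomial count and the mean $\mu_i = k_0 p_i / p_0$:

```latex
\operatorname{Var}[X_i]
  = k_0 \frac{p_i}{p_0} + k_0 \frac{p_i^2}{p_0^2}
  = \mu_i + \frac{\mu_i^2}{k_0}
  = \mu_i \left( 1 + \frac{\mu_i}{k_0} \right),
\qquad
\chi^2(k_0) = \sum_i \frac{(x_i - \hat{\mu}_i)^2}{\hat{\mu}_i \left( 1 + \hat{\mu}_i / k_0 \right)} .
```

As $k_0 \to \infty$ the extra factor tends to 1 and the statistic reduces to the usual Poisson-type chi-squared form.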

Next, we can estimate the $k_0$ parameter by varying the values of $k_0$ in the expression $\chi^2(k_0)$ and matching the values of this statistic with the corresponding asymptotic chi-squared distribution. The following protocol summarizes these steps using the cancer data above.
DF: The degrees of freedom for the chi-squared distribution in this case are:
df = (# rows - 1)(# columns - 1) = (4 - 1)(3 - 1) = 6

Median: The median of a chi-squared random variable with 6 df is approximately 5.348; the protocol here uses the nearby value 5.261948.
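The median can be checked without a statistics library, because the chi-squared CDF has a closed form when the degrees of freedom are even. The sketch below (function names are illustrative) solves CDF(x) = 0.5 by bisection; it evaluates to roughly 5.348, so the value 5.261948 quoted in the text sits slightly below the median.

```python
import math


def chi2_cdf_even(x, df):
    """CDF of a chi-squared variable with an even number of degrees of freedom."""
    half = x / 2.0
    terms = sum(half ** i / math.factorial(i) for i in range(df // 2))
    return 1.0 - math.exp(-half) * terms


def chi2_median_even(df, lo=0.0, hi=100.0):
    """Solve cdf(x) = 0.5 by bisection (the CDF is increasing in x)."""
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if chi2_cdf_even(mid, df) < 0.5:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


median_6 = chi2_median_even(6)
print(round(median_6, 4), round(chi2_cdf_even(5.261948, 6), 4))
```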

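Mean Counts Estimates: The mean counts estimates ($\hat{\mu}_i$) for the 4 different cancer types are:
$\hat{\mu}_0 = 34/3 \approx 11.33$; $\hat{\mu}_1 = 185/3 \approx 61.67$; $\hat{\mu}_2 = 125/3 \approx 41.67$; and $\hat{\mu}_3 = 56/3 \approx 18.67$.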

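Thus, we can solve the equation above, $\chi^2(k_0) = 5.261948$, for the single variable of interest, the unknown parameter $k_0$. In the cancer example, suppose we observe $x = \{x_1, x_2, x_3\}$. Then the solution is an asymptotic chi-squared distribution driven estimate of the parameter $k_0$:
$$\chi^2(k_0) = \sum_{i=1}^{3} \frac{(x_i - \hat{\mu}_i)^2}{\hat{\mu}_i \left(1 + \hat{\mu}_i / k_0\right)} = 5.261948.$$
Solving this equation for $k_0$ provides the desired estimate for the last parameter.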
Mathematica provides 3 distinct ($k_0$) solutions to this equation: {-50.5466, -21.5204, 2.40461}. Because $\chi^2(k_0)$ is strictly increasing for $k_0 > 0$, at most one solution can be positive. Since $k_0 > 0$, the only candidate solution is $k_0 \approx 2.40461$.
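The root search can be reproduced with plain bisection. Two inputs in this sketch are assumptions rather than values stated in the text: the observation vector `x = (5, 1, 5)`, chosen because it reproduces the magnitudes of the quoted solutions, and the use of the rounded mean counts 61.67, 41.67, and 18.67. Between consecutive poles at $k_0 = -\hat{\mu}_i$ the statistic is increasing, so each interval holds at most one root.

```python
def chi2_nm(k0, x, mu):
    """Chi-squared statistic with negative multinomial variances.

    Each term (x_i - mu_i)^2 / (mu_i * (1 + mu_i / k0)) is written in the
    equivalent form (x_i - mu_i)^2 * k0 / (mu_i * (k0 + mu_i)) so that
    k0 = 0 does not cause a division by zero.
    """
    return sum((xi - mi) ** 2 * k0 / (mi * (k0 + mi)) for xi, mi in zip(x, mu))


def bisect_increasing(f, lo, hi, iterations=200):
    """Root of an increasing function f on (lo, hi), assuming f(lo) < 0 < f(hi)."""
    for _ in range(iterations):
        mid = (lo + hi) / 2.0
        if f(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


mu = (61.67, 41.67, 18.67)  # rounded mean-count estimates for X1, X2, X3
x = (5, 1, 5)               # ASSUMED observation vector (not stated in the text)
target = 5.261948           # chi-squared value quoted in the text


def shifted(k0):
    return chi2_nm(k0, x, mu) - target


# The statistic has poles at k0 = -mu_i; search each interval between them,
# plus the interval to the right of the last pole.
poles = sorted(-m for m in mu)
eps = 1e-9
brackets = [
    (poles[0] + eps, poles[1] - eps),
    (poles[1] + eps, poles[2] - eps),
    (poles[2] + eps, 1e6),
]
roots = [bisect_increasing(shifted, lo, hi) for lo, hi in brackets]
positive_roots = [r for r in roots if r > 0]
print([round(r, 4) for r in roots], positive_roots)
```

Bisection is used here instead of expanding the equation into a cubic because it needs only the monotonicity of the statistic on each interval and no external solver.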

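  • Estimates of probabilities: Assume $k_0$ is fixed at its estimate and $\hat{p}_i = \hat{\mu}_i\, p_0 / k_0$, then:
$$1 - p_0 = \sum_{i=1}^{3} \hat{p}_i = \frac{p_0}{k_0} \sum_{i=1}^{3} \hat{\mu}_i.$$
Hence, $p_0 = \dfrac{k_0}{k_0 + \sum_l \hat{\mu}_l}$ and $\hat{p}_i = \dfrac{\hat{\mu}_i}{k_0 + \sum_l \hat{\mu}_l}$ for $i = 1, 2, 3$.
Therefore, the best model distribution for the observed sample is $X \sim NM(k_0, \{\hat{p}_1, \hat{p}_2, \hat{p}_3\})$.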
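The probability estimates follow from $\mu_i = k_0 p_i / p_0$ together with $p_0 + \sum_i p_i = 1$. The sketch below plugs in the positive solution 2.40461 quoted in the text as the value of $k_0$; treating that root as the estimate is an assumption of this example, and the variable names are illustrative.

```python
mu = (185 / 3, 125 / 3, 56 / 3)  # mean-count estimates for X1, X2, X3
k0 = 2.40461                     # positive solution quoted in the text (assumed estimate)

# From mu_i = k0 * p_i / p0 and p0 + sum(p_i) = 1:
denominator = k0 + sum(mu)
p0 = k0 / denominator
p = [m / denominator for m in mu]

print(round(p0, 5), [round(prob, 5) for prob in p])
```

A quick consistency check is that the probabilities sum to 1 and that $k_0 \hat{p}_i / p_0$ recovers each mean count.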

The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.
 