Negative binomial distribution

{{Probability distribution|
 name =Negative binomial|
 type =mass|
 pdf_image =|
 cdf_image =|
 parameters =<math>r > 0</math> (real number), <math>0 < p < 1</math> (real)|
 support =<math>k \in \{0, 1, 2, \ldots\}</math>|
 pdf =<math>\frac{\Gamma(r+k)}{k!\,\Gamma(r)}\,p^r\,(1-p)^k</math>|
 cdf =<math>I_p(r, k+1)</math>|
 mean =<math>\frac{r(1-p)}{p}</math>|
 median =|
 mode =<math>\left\lfloor \frac{(r-1)(1-p)}{p} \right\rfloor</math> if <math>r > 1</math>, <math>0</math> if <math>r \le 1</math>|
 variance =<math>\frac{r(1-p)}{p^2}</math>|
 skewness =<math>\frac{2-p}{\sqrt{r(1-p)} }</math>|
 kurtosis =<math>\frac{6}{r} + \frac{p^2}{r(1-p)}</math>|
 entropy =|
 mgf =<math>\left(\frac{p}{1-(1-p)e^t}\right)^r</math>|
 char =<math>\left(\frac{p}{1-(1-p)e^{it} }\right)^r</math>
}}

In probability and statistics the negative binomial distribution is a discrete probability distribution. The Pascal distribution is a special case of the negative binomial.

==Specification of the negative binomial distribution==

===Probability mass function===

The family of negative binomial distributions is a two-parameter family; several parameterizations are in common use. One very common parameterization employs two real-valued parameters ''p'' and ''r'' with 0 < ''p'' < 1 and ''r'' > 0. Under this parameterization, the probability mass function of a random variable with a NegBin(''r'', ''p'') distribution takes the following form:

: <math>f(k; r, p) = \frac{\Gamma(r+k)}{k!\,\Gamma(r)}\,p^r\,(1-p)^k</math>

for ''k'' ≥ 0 (Γ is the gamma function).

Under an alternative parameterization, let

: <math>p = \frac{\omega}{\lambda+\omega}</math> and <math>r = \omega</math>,

and so the mass function becomes

: <math>g(k) = \frac{\lambda^k}{k!} \cdot \frac{\Gamma(\omega+k)}{\Gamma(\omega)\,(\lambda+\omega)^k} \cdot \frac{1}{(1+\lambda/\omega)^{\omega} }</math>

where ''λ'' and ''ω'' are positive real parameters. Under this parameterization, we have

: <math>\lim_{\omega\to\infty} g(k) = \frac{\lambda^k}{k!} \cdot 1 \cdot \frac{1}{\exp(\lambda)}</math>

which is precisely the mass function of a Poisson-distributed random variable with Poisson rate ''λ''. In other words, the alternatively parameterized negative binomial distribution converges to the Poisson distribution as ''ω'' → ∞, and ''ω'' controls the deviation from the Poisson. This makes the negative binomial distribution suitable as a robust alternative to the Poisson: it approaches the Poisson for large ''ω'', but has larger variance than the Poisson for small ''ω''.

The negative binomial distribution also arises as a continuous mixture of Poisson distributions where the mixing distribution of the Poisson rate is a gamma distribution.
Formally, this means that the mass function of the negative binomial distribution can also be written as

{|
|-
| <math>f(k)</math>
| <math>= \int_0^\infty \mathrm{Poisson}(k; \lambda)\,\mathrm{Gamma}\left(\lambda; r, \tfrac{1-p}{p}\right)\,d\lambda</math>
|-
|
| <math>= \int_0^\infty \frac{\lambda^k}{k!}\,e^{-\lambda} \cdot \frac{\lambda^{r-1}\,e^{-\lambda p/(1-p)} }{\Gamma(r)\,\left(\frac{1-p}{p}\right)^r}\,d\lambda</math>
|-
|
| <math>= \frac{p^r\,(1-p)^{-r} }{k!\,\Gamma(r)} \int_0^\infty \lambda^{r+k-1}\,e^{-\lambda/(1-p)}\,d\lambda</math>
|-
|
| <math>= \frac{p^r\,(1-p)^{-r} }{k!\,\Gamma(r)}\,(1-p)^{r+k}\,\Gamma(r+k)</math>
|-
|
| <math>= \frac{\Gamma(r+k)}{k!\,\Gamma(r)}\,p^r\,(1-p)^k</math>
|}

where Gamma(''λ''; ''r'', (1 − ''p'')/''p'') denotes the gamma density with shape ''r'' and scale (1 − ''p'')/''p''. Because of this, the negative binomial distribution is also known as the gamma-Poisson (mixture) distribution.

===Cumulative distribution function===

The cumulative distribution function can be expressed in terms of the regularized incomplete beta function:

: <math>F(k) = I_p(r, k+1)</math>

==Occurrence==

===Waiting time in a Bernoulli process===

The NegBin(''r'', ''p'') distribution is the probability distribution of a certain number of failures and successes in a series of independent, identically distributed Bernoulli trials. Specifically, for ''k''+''r'' Bernoulli trials with success probability ''p'', the negative binomial gives the probability of ''k'' failures and ''r'' successes, with success on the last trial. In other words, the negative binomial distribution is the probability distribution of the number of failures before the ''r''th success in a Bernoulli process.

Consider the following example. Suppose we repeatedly throw a die, and consider a "1" to be a "success". The probability of success on each trial is 1/6. The number of trials needed to get three successes belongs to the infinite set { 3, 4, 5, 6, ... }. That number of trials is a (displaced) negative-binomially distributed random variable. The number of failures before the third success belongs to the infinite set { 0, 1, 2, 3, ... }. That number of failures is also a negative-binomially distributed random variable.

A Bernoulli process is a discrete-time process, and so the numbers of trials, failures, and successes are integers. For the special case where ''r'' is an integer, the negative binomial distribution is known as the Pascal distribution.
When ''r'' is an integer, the gamma function is not needed to express the probability mass function, and factorials or binomial coefficients can be used instead:

: <math>f(k; r, p) = \frac{(k+r-1)!}{k!\,(r-1)!}\,p^r\,(1-p)^k = \binom{k+r-1}{r-1}\,p^r\,(1-p)^k</math>

A further specialization occurs when ''r'' = 1: in this case we get the probability distribution of failures before the first success (i.e. the probability of success on the (''k''+1)th trial), which is a geometric distribution. To wit:

: <math>f(k; 1, p) = p\,(1-p)^k</math>

===Overdispersed Poisson===

The negative binomial distribution, especially in its alternative parameterization described above, can be used as an alternative to the Poisson distribution. It is especially useful for discrete data over an unbounded positive range whose sample variance exceeds the sample mean. If a Poisson distribution is used to model such data, the model mean and variance are equal. In that case, the observations are ''overdispersed'' with respect to the Poisson model. Since the negative binomial distribution has one more parameter than the Poisson, the second parameter can be used to adjust the variance independently of the mean.

==Related distributions==

*The geometric distribution is a special case of the negative binomial distribution, with <math>\mathrm{Geometric}(p) = \mathrm{NegBin}(1, p)</math>.
*The negative binomial distribution converges in distribution to the Poisson distribution in the following sense: <math>\mathrm{Poisson}(\lambda) = \lim_{r\to\infty} \mathrm{NegBin}\left(r, \frac{r}{\lambda+r}\right)</math>.

==Properties==

===Relation to other distributions===

If ''X''<sub>''r''</sub> is a random variable following the negative binomial distribution with parameters ''r'' and ''p'', where ''r'' is a positive integer, then ''X''<sub>''r''</sub> is a sum of ''r'' independent random variables following the geometric distribution with parameter ''p''. As a result of the central limit theorem, ''X''<sub>''r''</sub> is therefore approximately normally distributed for sufficiently large ''r''.

Furthermore, if ''Y''<sub>''s''+''r''</sub> is a random variable following the binomial distribution with parameters ''s'' + ''r'' and ''p'', then

{|
| <math>\Pr(X_r \le s)</math>
| <math>= I_p(r, s+1)</math>
|-
|
| <math>= 1 - I_{1-p}(s+1, r)</math>
|-
|
| <math>= 1 - I_{1-p}((s+r)-(r-1), (r-1)+1)</math>
|-
|
| <math>= 1 - \Pr(Y_{s+r} \le r-1)</math>
|-
|
| <math>= \Pr(Y_{s+r} \ge r)</math>
|-
|
| <math>= \Pr(\mbox{after } s+r \mbox{ trials, there are at least } r \mbox{ successes})</math>
|}

In this sense, the negative binomial distribution is the "inverse" of the binomial distribution.
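The "inverse" relation above, Pr(''X''<sub>''r''</sub> ≤ ''s'') = Pr(''Y''<sub>''s''+''r''</sub> ≥ ''r''), can be checked numerically. The following is a minimal sketch in Python, restricted to integer ''r'' (the Pascal case) so that only binomial coefficients are needed. The function names <code>negbin_pmf</code>, <code>negbin_cdf</code> and <code>binom_at_least</code> are hypothetical helpers introduced for this illustration, and the values ''r'' = 3, ''p'' = 1/6 are taken from the die example only as sample inputs.

```python
from math import comb


def negbin_pmf(k: int, r: int, p: float) -> float:
    """Pr(X_r = k): exactly k failures before the r-th success (integer r)."""
    return comb(k + r - 1, k) * p**r * (1 - p) ** k


def negbin_cdf(s: int, r: int, p: float) -> float:
    """Pr(X_r <= s): at most s failures before the r-th success."""
    return sum(negbin_pmf(k, r, p) for k in range(s + 1))


def binom_at_least(r: int, n: int, p: float) -> float:
    """Pr(Y_n >= r): at least r successes in n Bernoulli(p) trials."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(r, n + 1))


if __name__ == "__main__":
    # Illustrative values only: a "1" on a fair die counts as a success.
    r, p = 3, 1 / 6
    for s in (0, 5, 20):
        lhs = negbin_cdf(s, r, p)          # negative binomial side
        rhs = binom_at_least(r, s + r, p)  # binomial side
        print(f"s={s:2d}  Pr(X_r <= s)={lhs:.12f}  Pr(Y_(s+r) >= r)={rhs:.12f}")
```

The two printed columns should agree up to floating-point rounding for every ''s''. Setting ''r'' = 1 in <code>negbin_pmf</code> gives the geometric mass function ''p''(1 − ''p'')<sup>''k''</sup>.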
The sum of independent negative-binomially distributed random variables with the same value of the parameter ''p'' but the "''r''-values" ''r''<sub>1</sub> and ''r''<sub>2</sub> is negative-binomially distributed with the same ''p'' but with "''r''-value" ''r''<sub>1</sub> + ''r''<sub>2</sub>.

The negative binomial distribution is infinitely divisible, i.e., if ''X'' has a negative binomial distribution, then for any positive integer ''n'', there exist independent identically distributed random variables ''X''<sub>1</sub>, ..., ''X''<sub>''n''</sub> whose sum has the same distribution that ''X'' has. Each summand has "''r''-value" ''r''/''n'', so these will not be Pascal-distributed (integer ''r'') unless ''n'' is a divisor of ''r'' (more on this below).

===Relation to the binomial theorem===

Suppose ''X'' is a random variable with a negative binomial distribution with parameters ''r'' and ''p''. The statement that the sum from ''x'' = 0 to infinity of the probability Pr[''X'' = ''x''] is equal to 1 can be shown by a bit of algebra to be equivalent to the statement that (1 − ''q'')<sup>−''r''</sup>, where ''q'' = 1 − ''p'', is what the binomial series says it should be.

Suppose ''Y'' is a random variable with a binomial distribution with parameters ''n'' and ''p''. The statement that the sum from ''y'' = 0 to ''n'' of the probability Pr[''Y'' = ''y''] is equal to 1 says that 1 = (''p'' + (1 − ''p''))<sup>''n''</sup> is what the strictly finitary binomial theorem of rudimentary algebra says it should be.

Thus the negative binomial distribution bears the same relationship to the negative-integer-exponent case of the binomial theorem that the binomial distribution bears to the positive-integer-exponent case.

Assume ''p'' + ''q'' = 1. Then the binomial theorem of elementary algebra implies that

: <math>1 = 1^n = (p+q)^n = \sum_{x=0}^n \binom{n}{x}\,p^x\,q^{n-x}.</math>

This can be written in a way that may at first appear to some to be incorrect, and perhaps perverse even if correct:

: <math>(p+q)^n = \sum_{x=0}^\infty \binom{n}{x}\,p^x\,q^{n-x},</math>

in which the upper bound of summation is infinite.
If the binomial coefficient is defined by

: <math>\binom{n}{x} = \frac{n!}{x!\,(n-x)!}</math>

then it does not make sense when ''x'' > ''n'', since factorials of negative numbers are not defined. But one may also read it as

: <math>\binom{n}{x} = \frac{n(n-1)\cdots(n-x+1)}{x!}.</math>

In that case it is defined even when ''n'' is negative or is not an integer. But in our case of the binomial distribution it is zero when ''x'' > ''n''. So ''why'' would we write the result in that form, with a seemingly needless sum of infinitely many zeros? The answer comes when we generalize the binomial theorem of elementary algebra to Newton's binomial theorem. Then we can say, for example

: <math>(p+q)^{8.3} = \sum_{x=0}^\infty \binom{8.3}{x}\,p^x\,q^{8.3-x}.</math>

Now suppose ''r'' > 0 and we use a negative exponent:

: <math>1 = p^r \cdot p^{-r} = p^r\,(1-q)^{-r} = p^r \sum_{x=0}^\infty \binom{-r}{x}\,(-q)^x.</math>

Then all of the terms are positive, and the term

: <math>p^r \binom{-r}{x}\,(-q)^x</math>

is just the probability that the number of failures before the ''r''th success is equal to ''x'', provided ''r'' is an integer. (If ''r'' is a negative non-integer, so that the exponent is a positive non-integer, then some of the terms in the sum above are negative, so we do not have a probability distribution on the set of all nonnegative integers.)

Now we also allow non-integer values of ''r''. Then we have a proper negative binomial distribution, which is a generalization of the Pascal distribution and coincides with it when ''r'' happens to be a positive integer.

Recall from above that

:The sum of independent negative-binomially distributed random variables with the same value of the parameter ''p'' but the "''r''-values" ''r''<sub>1</sub> and ''r''<sub>2</sub> is negative-binomially distributed with the same ''p'' but with "''r''-value" ''r''<sub>1</sub> + ''r''<sub>2</sub>.

This property persists when the definition is thus generalized, and affords a quick way to see that the negative binomial distribution is infinitely divisible.

==Examples==

''(After a problem by Dr. Diane Evans, professor of mathematics at Rose-Hulman Institute of Technology)''

Pat is required to sell candy bars to raise money for the 6th grade field trip.
There are thirty houses in the neighborhood, and Pat is not supposed to return home until five candy bars have been sold. So the child goes door to door, selling candy bars. At each house, there is a 0.4 probability of selling one candy bar and a 0.6 probability of selling nothing.

''What's the probability mass function for selling the last candy bar at the ''n''th house?''

Recall that the NegBin(''r'', ''p'') distribution describes the probability of ''k'' failures and ''r'' successes in ''k''+''r'' Bernoulli(''p'') trials with success on the last trial. Selling five candy bars means getting five successes. The number of trials (i.e. houses) this takes is therefore ''k''+5 = ''n''. The random variable we are interested in is the number of houses, so we substitute ''k'' = ''n'' − 5 into a NegBin(5, 0.4) mass function and obtain the following mass function of the distribution of houses (for ''n'' ≥ 5):

: <math>f(n) = \binom{(n-5)+5-1}{n-5}\,0.4^5\,0.6^{n-5} = \binom{n-1}{n-5}\,\frac{2^5\,3^{n-5} }{5^n}</math>

''What's the probability that Pat finishes on the tenth house?''

: <math>f(10) = \binom{9}{5}\,0.4^5\,0.6^5 \approx 0.10033</math>

''What's the probability that Pat finishes on or before reaching the eighth house?''

To finish on or before the eighth house, Pat must finish at the fifth, sixth, seventh, or eighth house. Sum those probabilities:

: <math>f(5) = 0.01024</math>
: <math>f(6) = 0.03072</math>
: <math>f(7) = 0.055296</math>
: <math>f(8) = 0.0774144</math>
: <math>\sum_{j=5}^{8} f(j) = 0.1736704</math>

''What's the probability that Pat exhausts all 30 houses in the neighborhood?''

This is the probability that Pat does not finish on any of the fifth through thirtieth houses:

: <math>1 - \sum_{j=5}^{30} f(j) = 1 - I_{0.4}(5, 30-5+1) \approx 1 - 0.99849 = 0.00151</math>

Probability distributions

Negative binomial distribution

I have reverted the most recent edit to negative binomial distribution for the following reason.
* Sometimes one defines the ''negative binomial distribution'' to be the distribution of the number of failures before the ''r''th success. In that case, the statement that the expected value is ''r''(1 − ''p'')/''p'' is correct.
* But sometimes, and in particular in the present article, one defines it to be the distribution of the number of trials needed to get ''r'' successes. In that case, the statement is wrong.
If you're going to edit one part of the article to be consistent with the former definition, you need to be consistent and change the definition. User:Michael Hardy 17:40, 7 Jul 2004 (UTC)

==Equivalence?==

If ''X''<sub>''r''</sub> is the number of trials needed to get ''r'' successes, and ''Y''<sub>''s''</sub> is the number of successes in ''s'' trials, then

: <math>\Pr(X_r \le s) = \Pr(Y_s \ge r).</math>

The article went from there to say the following:

::Every question about probabilities of negative binomial variables can be translated into an equivalent one about binomial variables.

I removed it. I tentatively propose this as a counterexample: Suppose ''W''<sub>''r''</sub> is the number of failures before the ''r'' successes have been achieved. Then ''W''<sub>''r''</sub> has a negative binomial distribution according to the second convention in this article, and it is clear that this distribution is just the negative binomial distribution according to the first convention, translated ''r'' units to the left. This probability distribution is infinitely divisible, a fact ''now'' explained in the article. That means that for any positive integer ''m'', no matter how big, there is some probability distribution ''F'' such that if ''U''<sub>1</sub>, ..., ''U''<sub>''m''</sub> are independent random variables distributed according to ''F'', then ''U''<sub>1</sub> + ... + ''U''<sub>''m''</sub> has the same distribution that ''W''<sub>''r''</sub> has. So how can the question of whether the negative binomial distribution is infinitely divisible be "translated into an equivalent one about binomial variables"? User:Michael Hardy 01:43, 27 Aug 2004 (UTC)

: Removing the bit about "every question" seems OK to me; the important point is the relation between binomial and negative binomial probabilities. But Mike, it wasn't put in there for the purpose of annoying you. You might consider using the edit summary to say something about the edit rather than your state of mind -- how about ''rm questionable claim about "every question"'' instead of ''I am removing a statement that has long irritated me''. User:Wile E. Heresiarch 15:23, 6 Nov 2004 (UTC)

==Major reorganization==

Trying to be bold, I've just committed several major changes. I found the previous version somewhat confusing, since it talked about three slightly different but closely related "conventions" for the negative binomial, and it never became fully clear to me which convention was in use at which point in the subsequent discussion. I've replaced the definition with what I consider to be the most natural version (the previous convention #3). The reason that definition is "natural" is that it arises naturally as the Gamma-Poisson mixture, converges in distribution to the Poisson, etc. The shifted negative binomial (previous convention #1) can still be derived (see the worked example of the candy pusher). Now we have a single, consistent (hopefully!) definition of the negative binomial instead of three similar-yet-different conventions. I'm painfully aware that all of the previous three conventions are in use and sometimes referred to as the negative binomial; but then again, that doesn't even begin to exhaust the variations on this distribution that can be found in the wild, so why not pick one reasonable definition and stick to that here? --User:MarkSweep 12:04, 5 Nov 2004 (UTC)

:Well, if we were writing a textbook, we would certainly want to pick one defn and stick to it. However, we're here to document stuff as it is used by others. If there are multiple defns in common use, I don't see that we have the option to pick and choose. Sometimes multiple defns can be collapsed by saying "#2 is a special case of #1 with ''A'' always a blurfle" and then describing only #1. I don't know if that's feasible here. Regards & happy editing, User:Wile E. Heresiarch 15:09, 6 Nov 2004 (UTC)

::Yes, that was basically the case here. The previous "convention #2" was the Pascal distribution, which is a special case of the general negative binomial (previous "convention #3"). This didn't become fully clear in the previous revision, where the discussion of the Pascal distribution seemed more like an afterthought. The previous "convention #1" appeared to be simply a Pascal distribution shifted by a fixed amount. There is still a discussion of that in the worked example, but that could arguably be moved to the front and made more explicit. --User:MarkSweep 23:25, 6 Nov 2004 (UTC)
These materials are based on Wikipedia and licensed under the GNU FDL