

Poisson distribution



{{Probability distribution
| name =Poisson
| type =mass
| pdf_image =The horizontal axis is the index ''k''. (Note that the function is only defined at integer values of ''k''. The connecting lines do not indicate continuity.)
| cdf_image =The horizontal axis is the index ''k''. (Note that the function is only defined at integer values of ''k''. The connecting lines do not indicate continuity.)
| parameters =\lambda \ge 0
| support =k \in \{0,1,2,\ldots\}
| pdf =\frac{e^{-\lambda} \lambda^k}{k!}\!
| cdf =\frac{\Gamma(k+1, \lambda)}{k!}\!
| mean =\lambda\,
| median =
| mode =\lfloor\lambda\rfloor
| variance =\lambda\,
| skewness =\lambda^{-1/2}\,
| kurtosis =\lambda^{-1}\,
| entropy =
| mgf =\exp(\lambda (e^t-1))\,
| char =\exp(\lambda (e^{it}-1))\,
}}
In probability theory and statistics, the Poisson distribution is a discrete probability distribution, discovered by Siméon Poisson (1781–1840) and published, together with his probability theory, in 1838 in his work ''Recherches sur la probabilité des jugements en matière criminelle et en matière civile''. It describes random variables ''N'' that count, among other things, the number of discrete occurrences (sometimes called "arrivals") that take place during a time interval of given length. The probability that there are exactly ''k'' occurrences (''k'' being a non-negative integer, ''k'' = 0, 1, 2, ...) is

:P(N=k)=\frac{e^{-\lambda} \lambda^k}{k!},\,\!

where
* ''e'' is the base of the natural logarithm (''e'' = 2.71828...),
* ''k''! is the factorial of ''k'',
* λ is a positive real number, equal to the expected number of occurrences during the given interval. For instance, if the events occur on average every 2 minutes, and you are interested in the number of events occurring in a 10-minute interval, you would use as a model a Poisson distribution with λ = 5.

==Poisson processes==
Sometimes λ is taken to be the ''rate'', i.e., the average number of occurrences per unit time. In that case, if ''N''<sub>''t''</sub> is the number of occurrences before time ''t'', then we have

:P(N_t=k)=\frac{e^{-\lambda t} (\lambda t)^k}{k!},\,\!
and the waiting time ''T'' until the first occurrence is a ''continuous'' random variable with an exponential distribution (with parameter λ). This probability distribution may be deduced from the fact that

:P(T>t)=P(N_t=0)=e^{-\lambda t}.\,

When time becomes involved, we have a 1-dimensional Poisson process, which involves both the discrete Poisson-distributed random variables that count the number of arrivals in each time interval, and the continuous Erlang-distributed waiting times. There are also Poisson processes of dimension higher than 1.

==Related distributions==
* If X_m \sim \mathrm{Poisson}(\lambda_m) for m = 1, \ldots, N are statistically independent and Y = \sum_{m=1}^N X_m, then Y \sim \mathrm{Poisson}(\bar{\lambda}) with \bar{\lambda} = \sum_{m=1}^N \lambda_m.

== Occurrence ==
The Poisson distribution arises in connection with Poisson processes. It applies to various phenomena of discrete nature (that is, those that may happen 0, 1, 2, 3, ... times during a given period of time or in a given area) whenever the probability of the phenomenon happening is constant in time or space. Examples include:
* The number of unstable atomic nuclei that decay within a given period of time in a piece of radioactive substance.
* The number of cars that pass through a certain point on a road during a given period of time.
* The number of spelling mistakes a secretary makes while typing a single page.
* The number of phone calls at a call center per minute.
* The number of times a web server is accessed per minute.
** For instance, the number of edits per hour recorded on Wikipedia's special:Recentchanges page follows an approximately Poisson distribution.
* The number of road fauna found per unit length of road.
* The number of mutations in a given stretch of DNA after a certain amount of radiation.
* The number of pine trees per unit area of mixed forest.
* The number of stars in a given volume of space.
* The number of soldiers killed by horse-kicks each year in each corps in the Prussian cavalry. This example was made famous by a book of Ladislaus Bortkiewicz (1868–1931).
* The number of bombs falling on each unit area of London during a German air raid in the early part of World War II. Statistician Roger Mexico uses the Poisson distribution to study this topic in ''Gravity's Rainbow''.
* The distribution of visual receptor cells in the retina of the human eye.

== How does this distribution arise? – The ''law of rare events'' ==
The binomial distribution with parameters ''n'' and λ/''n'', i.e., the probability distribution of the number of successes in ''n'' trials, with probability λ/''n'' of success on each trial, approaches the Poisson distribution with expected value λ as ''n'' approaches infinity. This is sometimes known as the law of rare events.

Here are the details. First, recall from calculus that

:\lim_{n\to\infty}\left(1-{\lambda \over n}\right)^n=e^{-\lambda}.

Let ''p'' = λ/''n''. Then we have

:\lim_{n\to\infty} P(X=k)=\lim_{n\to\infty}{n \choose k} p^k (1-p)^{n-k} =\lim_{n\to\infty}{n! \over (n-k)!k!} \left({\lambda \over n}\right)^k \left(1-{\lambda\over n}\right)^{n-k}

:=\lim_{n\to\infty} \underbrace{\left({n \over n}\right)\left({n-1 \over n}\right)\left({n-2 \over n}\right) \cdots \left({n-k+1 \over n}\right)}\ \underbrace{\left({\lambda^k \over k!}\right)}\ \underbrace{\left(1-{\lambda \over n}\right)^n}\ \underbrace{\left(1-{\lambda \over n}\right)^{-k}}.

As ''n'' approaches ∞, the expression over the first of the four underbraces approaches 1, since it is a product of ''k'' factors each of which approaches 1; the expression over the second underbrace remains constant since ''n'' does not appear in it at all; the expression over the third underbrace approaches ''e''<sup>−λ</sup>; and the one over the fourth underbrace approaches 1. Consequently the limit is

:{\lambda^k e^{-\lambda} \over k!}.\,\!
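
As a rough numerical illustration of the limit above, the following Python sketch compares the binomial probability with parameters ''n'' and λ/''n'' against the Poisson probability for growing ''n''. The function names and the sample values λ = 3 and ''k'' = 2 are chosen here purely for illustration and are not part of the article's derivation.

```python
import math


def binomial_pmf(n, k, p):
    """P(X = k) for a binomial distribution with n trials and success probability p."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def poisson_pmf(k, lam):
    """P(N = k) for a Poisson distribution with expected value lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)


# Illustrative values (assumptions, not taken from the article).
lam = 3.0
k = 2
sizes = (10, 100, 1000, 10000)

# Absolute gap between Binomial(n, lam/n) and Poisson(lam) at the point k.
errors = [abs(binomial_pmf(n, k, lam / n) - poisson_pmf(k, lam)) for n in sizes]

for n, err in zip(sizes, errors):
    print(f"n = {n:>5}: |binomial - Poisson| = {err:.2e}")
```

With these sample values the printed gaps should shrink as ''n'' grows, in line with the law of rare events.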
More generally, whenever a sequence of binomial random variables with parameters ''n'' and ''p''<sub>''n''</sub> is such that

:\lim_{n\rightarrow\infty} np_n = \lambda,

the sequence converges in distribution to a Poisson random variable with mean λ (see, e.g., [http://planetmath.org/?op=getobj&from=objects&id=6252 law of rare events]).

== Properties ==
The expected value of a Poisson-distributed random variable is equal to λ and so is its variance. The higher moments of the Poisson distribution are Touchard polynomials in λ, whose coefficients have a combinatorial meaning.

The mode of a Poisson-distributed random variable with non-integer ''λ'' is equal to \lfloor \lambda \rfloor, which is the largest integer less than or equal to ''λ'' (the floor function of ''λ''). When ''λ'' is a positive integer, the modes are ''λ'' and ''λ'' − 1.

For sufficiently large values of ''λ'' (''λ'' > 1000, say), the normal distribution with mean ''λ'' and variance ''λ'' is an excellent approximation to the Poisson distribution. If ''λ'' is greater than about 10, then the normal distribution is a good approximation if an appropriate continuity correction is performed, i.e., P(''X'' ≤ ''x''), where (lower-case) ''x'' is a non-negative integer, is replaced by P(''X'' ≤ ''x'' + 0.5).

If ''N'' and ''M'' are two statistically independent random variables, following Poisson distributions with parameters ''λ'' and ''μ'', respectively, then ''N'' + ''M'' follows a Poisson distribution with parameter ''λ'' + ''μ''.

The moment-generating function of the Poisson distribution with expected value ''λ'' is

:E\left(e^{tX}\right)=\sum_{k=0}^\infty e^{tk} P(X=k)=\sum_{k=0}^\infty e^{tk} {\lambda^k e^{-\lambda} \over k!} =e^{\lambda(e^t-1)}.

All of the cumulants of the Poisson distribution are equal to the expected value ''λ''. The ''n''th factorial moment of the Poisson distribution is ''λ''<sup>''n''</sup>.
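
The continuity correction described in the properties above can be checked numerically. The sketch below, a minimal illustration using only the Python standard library, compares the exact Poisson probability P(''X'' ≤ ''x'') with the normal approximation evaluated at ''x'' and at ''x'' + 0.5. The helper names and the sample values ''λ'' = 20 and ''x'' = 22 are assumptions made for this example.

```python
import math


def poisson_cdf(x, lam):
    """Exact P(X <= x) for a Poisson variable with mean lam (x a non-negative integer)."""
    term = math.exp(-lam)  # P(X = 0)
    total = term
    for k in range(1, x + 1):
        term *= lam / k  # P(X = k) obtained from P(X = k - 1)
        total += term
    return total


def normal_cdf(x, mean, variance):
    """P(Z <= x) for a normal variable with the given mean and variance."""
    return 0.5 * (1.0 + math.erf((x - mean) / math.sqrt(2.0 * variance)))


# Illustrative values (assumptions, not taken from the article).
lam = 20.0
x = 22

exact = poisson_cdf(x, lam)
plain = normal_cdf(x, lam, lam)            # no continuity correction
corrected = normal_cdf(x + 0.5, lam, lam)  # with continuity correction

print(f"exact     = {exact:.4f}")
print(f"plain     = {plain:.4f}")
print(f"corrected = {corrected:.4f}")
```

For these sample values the corrected approximation should land noticeably closer to the exact probability than the uncorrected one.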
The Poisson distributions are infinitely divisible probability distributions.

== Parameter estimation ==
Given a sample of ''N'' measured values k_i, we wish to estimate the value of the parameter \lambda of the Poisson population from which the sample was drawn. To calculate the maximum likelihood value, we form the likelihood function

:L(\lambda)=\prod_{i=1}^N f(k_i) = \prod_{i=1}^N \frac{e^{-\lambda}\lambda^{k_i}}{k_i!} = \frac{e^{-N\lambda}\lambda^{\Sigma k_i}}{\prod k_i!}

where the sums and products are from i=1 to N. Taking the logarithm of ''L'', then the derivative with respect to \lambda, and equating it to zero yields the maximum likelihood estimate of \lambda:

:\lambda_\mathrm{MLE}=\frac{1}{N}\sum_{i=1}^N k_i

From the properties of characteristic functions, it is seen that the characteristic function of the distribution of \lambda_\mathrm{MLE} is

:\varphi_\mathrm{MLE}(t)=\left(\prod_{i=1}^N \varphi(t/N)\right)=\varphi^N(t/N)=\exp(N\lambda(e^{it/N}-1))

The mean value of \lambda_\mathrm{MLE} is then found to be:

:\langle \lambda_\mathrm{MLE}\rangle = -i\left(\frac{d}{dt}\,\varphi_\mathrm{MLE}(t)\right)_{t=0}=\lambda

Since the average value of \lambda_\mathrm{MLE} is equal to \lambda, it is an unbiased estimator of \lambda.

==The "law of small numbers"==
The word ''law'' is sometimes used as a synonym of ''probability distribution'', and ''convergence in law'' means ''convergence in distribution''. Accordingly, the Poisson distribution is sometimes called the law of small numbers because it is the probability distribution of the number of occurrences of an event that happens rarely but has very many opportunities to happen. ''The Law of Small Numbers'' is a book by Ladislaus Bortkiewicz about the Poisson distribution, published in 1898. Some historians of mathematics have argued that the Poisson distribution should have been called the Bortkiewicz distribution.
==See also==
* Compound Poisson distribution
* Poisson process
* Erlang distribution, which describes the waiting time until ''n'' events have occurred. For events distributed in time, the Poisson distribution is the probability distribution of the number of events that would occur within a preset time, while the Erlang distribution is the probability distribution of the amount of time until the ''n''th event.
* Skellam distribution, the distribution of the difference of two Poisson variates, not necessarily from the same parent distribution.

==External links==
* [http://www.eventhelix.com/RealtimeMantra/CongestionControl/queueing_theory.htm Queueing Theory Basics]
* [http://www.eventhelix.com/RealtimeMantra/CongestionControl/m_m_1_queue.htm M/M/1 Queueing System]

Poisson distribution



The limit of the binomial distribution isn't so much how the Poisson distribution arises as one example of a physical situation that the Poisson distribution can model fairly well. It far more often arises as the limit of a large number of independent processes, which can in turn be modelled by the binomial distribution - but the model isn't the thing. As it happens, it's a lot more illuminating and a better look at the causality to examine this limit of a large number of independent processes using differential equations and generating functions, but it's simpler to use the binomial distribution approach. PML.
----
The comment above definitely could bear elaboration! User:Michael Hardy 01:45 Feb 5, 2003 (UTC)
----
Well, for instance consider how many breaks a power line of length ''l'' might have after a storm. Suppose there is an independent probability λ δ''l'' of a break in any stretch of length δ''l''. (We know this is crawling with assumptions; if we do this right - like the better sort of economist - in any real case we will check the theory back to outcomes to see if it was really like that in the first place.) Anyhow, we pretend we already have a general formula and put it in the form of a probability generating function P(λ, ''l'', ''x''). Then we get an expression for P(λ, ''l'' + δ''l'', ''x'') in terms of P(λ, ''l'', ''x'') and P(λ, δ''l'', ''x''). When we take the limit of this we get a differential equation which we can solve to get the Poisson distribution. If people already know the slightly more advanced concept of a cumulant generating function we can rearrange the problem in that form, and then the result almost jumps out at you without needing to solve anything (a cumulant generating function is what you get when you take the logarithm of a probability generating function).
:Actually, the cumulant-generating function is the logarithm of the moment-generating function.
User:Michael Hardy 22:05, 2 Apr 2004 (UTC)

I have heard that the empirical data that was first used for this formula was the annual number of deaths of German soldiers from horse kicks in the 19th century. PML.

*I'm not sure that this isn't just the same as what is on the page, just with different maths. I disagree with PML (but am open to being convinced otherwise) and think the binomial is a great place to start a derivation of the Poisson distribution from. It is exactly the appropriate approximation for nuclear decay, phones ringing, et cetera. I would also use it for the above example. --User:Pdbailey 13:21, 31 Aug 2004 (UTC)
----
Concerning the source of the horse-kick data, see Ladislaus Bortkiewicz; it was his book ''The Law of Small Numbers'' that made that data-set famous. User:131.183.84.78 02:25 Feb 5, 2003 (UTC)
----
I've seen this approach via differential equations before, but I don't think it's a reason not to include the limit theorem. For that matter, I still think an account of the limit theorem should appear earlier in the article than anything about differential equations or cumulant-generating functions. User:Michael Hardy 02:31 Feb 5, 2003 (UTC)
----
The word "arise" really only tells us that we can do the algebra this way, not that the process is itself like this. My concern was that the wording suggests that it all somehow comes out of the binomial distribution, when that is simply yet another thing that can describe/model the same sort of underlying processes. You would expect the limit of the binomial distribution to work, but only because it is itself modelling the same processes; but it only does that when you plug the right things in, i.e. taking the limit while you keep the expected values where you want them. You can have a binomial distribution that converges to other limits under other constraints. PML.
----
None of which looks to me like a reason why the limit theorem should not be given prominence before cumulants or differential equations are mentioned. I agree that the "constraints" do need to be emphasized. User:Michael Hardy 02:41 Feb 5, 2003 (UTC)
----
I think you're missing my point. I'm not saying you shouldn't mention these things early on. Only, you shouldn't make them look like where the Poisson distribution comes from, the underlying mechanism. You could easily use these things to show how to calculate it, to get to the algebraic formula, while stating that these are merely applying underlying things which will be brought out later. It's the word "arise" in the subtopic introduction I'm uncomfortable with, not what you're doing after that. An analogy: it's a lot easier to state a formula for Fibonacci numbers, and prove that the formula works with mathematical induction, than to derive it in the first place - and it was probably derived in the first place by using generating functions. So you introduce the subject with the easy bit but you don't make it look like where you're coming from. PML.
----
I don't know the history, but to me it is plausible that the limit theorem I stated on this page is how the distribution was first discovered. And if you talk about phone calls arriving at a switchboard, it's not so implausible to think of each second that passes as having many opportunities for a phone call to arrive and few opportunities actually realized, so that limit theorem does seem to describe the mechanism. User:Michael Hardy 17:20 Feb 5, 2003 (UTC)
----
I am a dunce, but wouldn't the number of mutations in a given stretch of DNA be a binomial distribution, since you have discrete units? You couldn't very well have a nice Poisson process with a DNA stretch of only 4 base pairs... on the other hand maybe I don't know what I'm talking about...
User:Graft 21:14, 2 Apr 2004 (UTC)
:It would be well-approximated by a Poisson distribution if the number of "discrete units" is large, and using a Poisson distribution is simpler. User:Michael Hardy 21:23, 2 Apr 2004 (UTC)

== Waiting time to next event ==
In the waiting time to the next event

:P(T>t)=P(N_t=0)=e^{-\lambda t}.\,

This looks like it isn't normalized, since there should be a \lambda out in front. Am I wrong? User:Pdbailey 03:47, 11 Jan 2005 (UTC)
:Yes; you're wrong. The normalizing constant should appear in the probability density function, but not in ''this'' expression, which is 1 minus the cumulative distribution function. User:Michael Hardy 03:50, 11 Jan 2005 (UTC)

==Parameter estimation==
I'm confused about the recent edits to the MLE section. I'm under the distinct impression that the sample mean is the minimum-variance unbiased estimator for λ, but a combination of ignorance and laziness prevents me from investigating this myself. Could someone please enlighten me? --User:MarkSweep 07:07, 15 May 2005 (UTC)
:Evidently when I wrote it, I was also confused. I think it's right this time, please check the derivation. I didn't put in the part about "minimum variance" because I can't prove it quickly, and I haven't got a source that says that, but it would be a good thing to add. User:PAR 14:07, 15 May 2005 (UTC)
::''This'' MLE is unbiased, and is the MVUE. MLEs generally are often biased. User:Michael Hardy 22:42, 15 May 2005 (UTC)

== Poisson Distribution for Crime Analysis? ==
Is a Poisson distribution the best one for describing the frequency of crime? Before I add it as an example on the main page, I’d like to post this for discussion. Recently, I've been trying to use the normal distribution to approximate the monthly statistics of the eight "Part I" crimes in the ten police districts of San Francisco. But the normal distribution is continuous and not discrete like the Poisson.
It also doesn't seem appropriate for situations where the value of a crime like homicide is zero for several weeks. My goal is to approximate the occurrences of crime with the appropriate distribution, and then use this distribution to determine whether a change in crime from one week to the next is statistically significant or not. Distinguishing between significant change and predictable variations might help deploy police resources more effectively. Knowing the mean and standard deviation of the historical crime data, I can compare a new week’s data to the mean, and - given the correct distribution - assess the significance of any change that has occurred. But is the Poisson distribution the one to use? Also, how do I take into account trends? Does the Poisson distribution assume that the underlying process does not change? This may be a problem because crime has been going down for years. - Tom Feledy
:Well, IANAS, but my advice would be to first set up a simple Poisson model and assess its goodness of fit. My guess is there could easily be several problems with a simple Poisson model: First of all, it has only a single parameter, so you cannot adjust the mean independently of the variance; you may want to look into a Poisson mixture like the negative binomial distribution as an alternative with more parameters. Second, as you point out yourself, zero counts (fortunately) dominate for many types of crimes. This suggests that you need a zero-inflated or "adjusted" distribution, like a zero-inflated Poisson model in the simplest case. Finally, if you have independent variables that could potentially explain differences in the frequency of certain crimes, then a conditional model (e.g. Poisson regression analysis) will be more appropriate than a model that ignores background information and trends. --User:MarkSweep 02:26, 31 May 2005 (UTC)
::You might also look at a non-constant rate parameter. But estimating that might be delicate.
User:Michael Hardy 02:52, 31 May 2005 (UTC)




These materials are based on Wikipedia and licensed under the GNU FDL


