In probability theory and statistics, the Poisson binomial distribution is the discrete probability distribution of a sum of independent Bernoulli trials that are not necessarily identically distributed. The concept is named after Siméon Denis Poisson.
In other words, it is the probability distribution of the number of successes in a sequence of n independent yes/no experiments with success probabilities . The ordinary binomial distribution is a special case of the Poisson binomial distribution, when all success probabilities are the same, that is .
YouTube Encyclopedic

1/5Views:59 915199 274917 775136 19139 626

✪ The Relationship Between the Binomial and Poisson Distributions

✪ Statistics  Binomial & Poisson Distributions

✪ Binomial distribution  Probability and Statistics  Khan Academy

✪ Binomial distribution probability (solve with easy steps)

✪ Proof that the Binomial Distribution tends to the Poisson Distribution
Transcription
Let's take a look at the relationship between the binomial and Poisson distributions. The binomial distribution tends toward the Poisson distribution as n tends to infinity, p goes to zero and lambda = np stays constant. Here we have the binomial formula and the Poisson formula Now we know the mean of a binomial random variable is np and we know that the mean of a Poisson random variable is lambda, so if we set n times p equal to lambda, and then p equals lambda over n, and we took this value and we substituted it in for p up here, then, it might not be obvious, but as n gets really large, as n goes off to infinity, this binomial formula tends towards this Poisson so as n goes off to infinity and p goes to 0 because lambda equals np is a constant as that happens the binomial tends toward the Poisson. Now it's not obvious, but I do have a mathematical proof that in another video. An implication of this is that the Poisson distribution can be used to provide a reasonable approximation to the binomial distribution if n is large and p is small. Let's look at an example to illustrate. Albinism is a rare genetic disorder that affects approximately 1 in 20,000 Europeans. This 1 in 20,000 is just an approximate value but let's take it to be exact for the purposes of this question. People with albinism produce little or none of the pigment melanin, but this manifests itself in different ways. Overall they tend to have very fair skin and very light colored hair, and things along these lines. In a random sample of 1,000 Europeans, what is the probability that exactly 2 have albiniism? Well we're looking at 1,000 people, each individual person either has Albinism or they do not, and if we are sampling randomly and independently, then this is really going to follow a binomial distribution, the number that have albinism is going to have a binomial distribution with n of 1000 and p is the probability any one individual person has it. And that was given on the last page as 1/20,000. And we are interested in the probability that X is equal to 2. And so we just use a binomial formula, n choose x, so 1000 choose 2, times p to the x, 1/20,000 squared, times 1p, 1 1/20,000 to the nx, so 10002. And if you put that into your calculator we get 0.001187965. Now let's use the Poisson approximation in this case, just to see how well it works in this particular scenario. We would let lambda equal np, which is one thousand times 1/20,000, which works out to 1 in 20, or 0.05. Then our Poisson formula is lambda to the x, e to the minus lambda, over x! [factorial] So if we want the probability that X is equal to 2 using our Poisson approximation, this is going to be approximately 0.05, lambda, to the x, e to the minus lambda over x factorial. Put that into your calculator or computer and you'd see that this is equal to 0.001189037. And the value we get from the Poisson approximation is very close to the true value from the binomial distribution. That's true in this case because we had such a large n and a very small p, so the Poisson approximation is going to be very reasonable in this particular case. Here's a very rough guideline. The Poisson approximation is reasonable if n is greater than 50 and np is less than 5. But this is just a rough guideline, the approximation is going to work best when n is a really big and p is really close to zero. So whether it's a reasonable approximation or not depends on your needs, but this is one rough guideline one could use. Why use this approximation? If something truly has a binomial distribution then we should calculate probabilities based on the binomial distribution. But sometimes those factorials and exponentials in the binomial formula can become problematic to calculate. If n is very large we might end up getting some round off error or a computer or calculator might actually just give us an error and say it can't calculate it. That's a little bit less of an issue for the Poisson distribution. Also a problem maybe binomial conceptually, but n and p may not be known. In order to calculate binomial probabilities, we need to know n and p. Now if we happen to know the mean number of occurrences, we could call that lambda and use that in a Poisson formula, provided we knew we had a very large n and a very small p, But we don't need to know what those actual values are, we just need to know the mean number of occurrences.
Contents
Mean and variance
Since a Poisson binomial distributed variable is a sum of n independent Bernoulli distributed variables, its mean and variance will simply be sums of the mean and variance of the n Bernoulli distributions:
For fixed values of the mean () and size (n), the variance is maximal when all success probabilities are equal and we have a binomial distribution. When the mean is fixed, the variance is bounded from above by the variance of the Poisson distribution with the same mean which is attained asymptotically^{[citation needed]} as n tends to infinity.
Probability mass function
The probability of having k successful trials out of a total of n can be written as the sum ^{[1]}
where is the set of all subsets of k integers that can be selected from {1,2,3,...,n}. For example, if n = 3, then . is the complement of , i.e. .
will contain elements, the sum over which is infeasible to compute in practice unless the number of trials n is small (e.g. if n = 30, contains over 10^{20} elements). However, there are other, more efficient ways to calculate .
As long as none of the success probabilities are equal to one, one can calculate the probability of k successes using the recursive formula ^{[2]} ^{[3]}
where
The recursive formula is not numerically stable, and should be avoided if is greater than approximately 20. Another possibility is using the discrete Fourier transform.^{[4]}
where and .
Still other methods are described in ^{[5]} .
Entropy
There is no simple formula for the entropy of a Poisson binomial distribution, but the entropy is bounded above by the entropy of a binomial distribution with the same number parameter and the same mean. Therefore, the entropy is also bounded above by the entropy of a Poisson distribution with the same mean.^{[6]}
The Shepp–Olkin concavity conjecture, due to Lawrence Shepp and Ingram Olkin in 1981, states that the entropy of a Poisson binomial distribution is a concave function of the success probabilities .^{[7]} This conjecture was proved by Erwan Hillion and Oliver Johnson in 2015.^{[8]} The SheppOlkin monotonicity conjecture, also from the same 1981 paper, is that the entropy is monotone increasing in , if all . This conjecture was also proved by Hillion and Johnson, in 2019 ^{[9]}
Chernoff bound
The probability that a Poisson binomial distribution gets large, can be bounded using its moment generating function:
where we took . This is similar to the tail bounds of a binomial distribution.
See also
References
 ^ Wang, Y. H. (1993). "On the number of successes in independent trials" (PDF). Statistica Sinica. 3 (2): 295–312.
 ^ Shah, B. K. (1994). "On the distribution of the sum of independent integer valued random variables". American Statistician. 27 (3): 123–124. JSTOR 2683639.
 ^ Chen, X. H.; A. P. Dempster; J. S. Liu (1994). "Weighted finite population sampling to maximize entropy" (PDF). Biometrika. 81 (3): 457. doi:10.1093/biomet/81.3.457.
 ^ Fernandez, M.; S. Williams (2010). "ClosedForm Expression for the PoissonBinomial Probability Density Function". IEEE Transactions on Aerospace and Electronic Systems. 46 (2): 803–817. Bibcode:2010ITAES..46..803F. doi:10.1109/TAES.2010.5461658.
 ^ Chen, S. X.; J. S. Liu (1997). "Statistical Applications of the PoissonBinomial and conditional Bernoulli distributions". Statistica Sinica. 7: 875–892.
 ^ Harremoës, P. (2001). "Binomial and Poisson distributions as maximum entropy distributions" (PDF). IEEE Transactions on Information Theory. 47 (5): 2039–2041. doi:10.1109/18.930936.
 ^ Shepp, Lawrence; Olkin, Ingram (1981). "Entropy of the sum of independent Bernoulli random variables and of the multinomial distribution". In Gani, J.; Rohatgi, V.K. (eds.). Contributions to probability: A collection of papers dedicated to Eugene Lukacs. New York: Academic Press. pp. 201–206. ISBN 0122744608. MR 0618689.
 ^ Hillion, Erwan; Johnson, Oliver (20150305). "A proof of the SheppOlkin entropy concavity conjecture". Bernoulli. 23: 3638–3649. arXiv:1503.01570. doi:10.3150/16BEJ860.
 ^ Hillion, Erwan; Johnson, Oliver (20191109). "A proof of the SheppOlkin entropy monotonicity conjecture". Electronic Journal of Probability. 24 (126): 1–14. doi:10.1214/19EJP380.