Binomial Distribution#
The binomial distribution is the discrete probability distribution of the number of successes in a fixed number \(n\) of independent Bernoulli trials that all share the same success probability \(p\), such as repeated coin tosses.
Parameters | |
---|---|
Notation | \(\text{B}(n, p)\) |
Support | \(k \in \{0, 1, \dots, n\}\) |
Mean | \(n \cdot p\) |
Variance | \(np(1-p)\) |
PMF | \(f(k, n, p) = \binom{n}{k} p^k (1-p)^{n-k}\) |
Probability Mass Function#
\[f(k, n, p) = \Pr(X = k) = \binom{n}{k} p^k (1-p)^{n-k}\]
with the number of trials \(n\), the number of successes \(k\), and the success probability \(p\).
Explanation of the Terms
The term \(p^k\) is the probability that \(k\) specific trials all succeed. The term \((1-p)^{n-k}\) is the probability that the remaining \(n-k\) trials all fail. Together, \(p^k (1-p)^{n-k}\) is the probability of one particular sequence with \(k\) successes and \(n-k\) failures. Since the successes can appear anywhere among the \(n\) trials, we multiply by the binomial coefficient \(\binom{n}{k}\), the number of ways to choose which \(k\) of the \(n\) trials are the successes (combinations, not permutations, because the order of the successes does not matter).
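The three factors described above can be written out one by one. This is a minimal sketch, assuming Python 3.8 or later for `math.comb`; the function name `binomial_pmf` is illustrative and not part of any library.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n independent trials."""
    ways = comb(n, k)                # ways to choose which k trials succeed
    p_successes = p ** k             # the k chosen trials all succeed
    p_failures = (1 - p) ** (n - k)  # the remaining n - k trials all fail
    return ways * p_successes * p_failures


# The probabilities over the whole support should sum to 1 (up to rounding).
total = sum(binomial_pmf(k, 5, 0.3) for k in range(6))
print(total)
```

Keeping the factors in separate variables mirrors the explanation; in practice a library routine such as `scipy.stats.binom.pmf` is usually preferable for large \(n\).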
Example Coin Tosses
Imagine we toss a fair coin 10 times. The outcome of each toss is either heads or tails. The binomial distribution gives us the probability of getting a given number of heads (or tails).
For example, we can ask: What is the probability of getting exactly 7 heads? Answer:
\[\Pr(X = 7) = \binom{10}{7} \cdot 0.5^{7} \cdot 0.5^{3} = \frac{120}{1024} \approx 0.117\]
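The coin-toss question can also be checked numerically. The sketch below assumes Python 3.8 or later and tabulates the probability of every possible number of heads; the variable names are illustrative.

```python
from math import comb

n, p = 10, 0.5  # 10 tosses of a fair coin

# Probability of exactly k heads for every possible k
pmf = {k: comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)}

print(pmf[7])                 # 120 / 1024 = 0.1171875
print(max(pmf, key=pmf.get))  # most likely number of heads: 5
```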
Cumulative Distribution Function#
The cumulative distribution function gives the probability of getting at most \(k\) successes.
\[F(k;n,p)=\Pr(X\leq k)=\sum_{i=0}^{\lfloor k\rfloor }\binom{n}{i}p^{i}(1-p)^{n-i}\]
where \(\lfloor k\rfloor\) is the greatest integer less than or equal to \(k\). The probability of getting at least \(k\) successes follows from the complement: \(\Pr(X \geq k) = 1 - F(k-1;n,p)\).
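As a rough illustration, the sum in the cumulative distribution function can be evaluated directly. This sketch assumes Python 3.8 or later; `binomial_cdf` is a hypothetical helper name, and the 10-toss fair coin is used as sample input.

```python
from math import comb, floor


def binomial_cdf(k: float, n: int, p: float) -> float:
    """P(X <= k) for X ~ B(n, p); k may be a non-integer."""
    upper = min(floor(k), n)  # the sum never runs past n
    if upper < 0:
        return 0.0            # no outcomes lie below 0
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(upper + 1))


at_most_7 = binomial_cdf(7, 10, 0.5)       # 968 / 1024
at_least_7 = 1 - binomial_cdf(6, 10, 0.5)  # 176 / 1024, via the complement
print(at_most_7, at_least_7)
```

The second line shows how an "at least" probability is obtained from the cumulative distribution function by taking the complement.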
Urn Model#
Consider an urn with \(N\) balls, of which \(pN\) are red and \((1-p)N\) are black. The binomial distribution gives the probability of drawing exactly \(k\) red balls in \(n\) draws when each drawn ball is put back into the urn before the next draw (drawing with replacement).
Example Urn
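Imagine an urn with 20 balls, of which 8 are red and 12 are black. We draw 15 times from the urn, putting the ball back after each draw. What is the probability of getting exactly 9 black balls? Here a success is drawing a black ball, so \(p = 12/20 = 0.6\), \(n = 15\), and \(k = 9\). Answer:
\[\Pr(X = 9) = \binom{15}{9} \cdot 0.6^{9} \cdot 0.4^{6} \approx 0.207\]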
Note that in the urn model, \(n\) is not the number of balls in the urn but the number of draws.
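The urn example can be cross-checked by simulation. The sketch below assumes Python 3.8 or later; the seed, the number of simulated runs, and all variable names are arbitrary choices made for illustration. `random.choices` draws with replacement, which matches the urn model described here.

```python
import random
from math import comb

red, black = 8, 12            # urn contents from the example
draws, target_black = 15, 9   # number of draws and black balls asked for
p_black = black / (red + black)

# Exact probability from the binomial PMF
exact = (comb(draws, target_black)
         * p_black**target_black
         * (1 - p_black)**(draws - target_black))

# Monte Carlo estimate: repeat the 15 draws many times and count the hits
rng = random.Random(0)        # fixed seed so the run is repeatable
urn = ["red"] * red + ["black"] * black
runs = 20_000
hits = 0
for _ in range(runs):
    sample = rng.choices(urn, k=draws)  # sampling WITH replacement
    if sample.count("black") == target_black:
        hits += 1
estimate = hits / runs

print(exact, estimate)
```

The estimate should land close to the exact value, with the gap shrinking as `runs` grows.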
Properties#
- Sum of Binomials: The sum of two independent binomial variables with the same success probability is again binomially distributed. If \(X \sim \text{B}(n, p)\) and \(Y \sim \text{B}(m, p)\) are independent, then \(X + Y \sim \text{B}(n+m, p)\). This does not hold in general if the two success probabilities differ.
- Normal Approximation: If \(n\) is large enough, \(\text{B}(n, p)\) can be approximated by the normal distribution \(\mathcal{N}(np,\; np(1-p))\), with mean \(np\) and variance \(np(1-p)\). As a rule of thumb, \(n\) is large enough if \(n \gt 9\left(\frac{1-p}{p}\right)\ \text{and}\ n \gt 9\left(\frac{p}{1-p}\right)\).
- The binomial distribution generalizes the Bernoulli distribution, which describes a single Bernoulli trial and is the special case \(\text{B}(1, p)\).
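The first two properties can be checked numerically. This is a minimal sketch assuming Python 3.8 or later; the helper names `pmf`, `sum_pmf`, and `normal_approx_ok` are hypothetical, and the parameter values are arbitrary examples.

```python
from math import comb


def pmf(k: int, n: int, p: float) -> float:
    """Binomial PMF: P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)


def sum_pmf(k: int, n: int, m: int, p: float) -> float:
    """P(X + Y = k) for independent X ~ B(n, p), Y ~ B(m, p), by convolution."""
    lo, hi = max(0, k - m), min(n, k)  # keep both i and k - i inside their supports
    return sum(pmf(i, n, p) * pmf(k - i, m, p) for i in range(lo, hi + 1))


def normal_approx_ok(n: int, p: float) -> bool:
    """Rule of thumb from the text: n > 9(1-p)/p and n > 9p/(1-p)."""
    return n > 9 * (1 - p) / p and n > 9 * p / (1 - p)


# Sum of binomials: the convolution should match B(n + m, p) at every k.
n, m, p = 4, 6, 0.3
max_diff = max(abs(sum_pmf(k, n, m, p) - pmf(k, n + m, p))
               for k in range(n + m + 1))
print(max_diff)  # close to 0, up to floating-point rounding

# Normal approximation: n = 30 passes the rule for p = 1/2 but not for p = 1/6.
print(normal_approx_ok(30, 0.5), normal_approx_ok(30, 1 / 6))  # True False
```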
Implementations#
Python#
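import numpy as np
import matplotlib.pyplot as plt
import scipy.stats as st

n = 30                    # number of trials
p1, p2 = 1 / 6, 0.5       # two success probabilities to compare
k = np.arange(0, n + 1)   # support: 0, 1, ..., n

plt.plot(k, st.binom.pmf(k, n, p1), 'o-', label='n = 30, p = 1/6')
plt.plot(k, st.binom.pmf(k, n, p2), 'o-', label='n = 30, p = 1/2')
plt.xlabel('Number of successes k')
plt.ylabel('Probability')
plt.legend()  # without this call the labels above are not shown
plt.show()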
Matlab#
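N = 10;               % number of trials
p = 0.5;              % success probability
x = 0:N;              % support: 0, 1, ..., N
y = binopdf(x,N,p);   % PMF (requires the Statistics and Machine Learning Toolbox)
figure
bar(x,y,1)
xlabel('Number of successes')
ylabel('Probability')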
R#
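N <- 20   # number of trials
p <- 0.2  # success probability
x <- 0:N  # support: 0, 1, ..., N
y <- dbinom(x, size=N, prob=p)  # PMF; use p instead of a hard-coded value
plot(x, y, type="h", xlab="Number of successes", ylab="Probability")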