Binomial distributions

This is about the probability of having exactly, at most, or at least $k$ successes in $n$ trials.

When to Use

If the situation exhibits the following symptoms, a binomial distribution may be for you:

  • you are counting the number of times something happened out of $n$ tries
  • each trial or measurement is independent of the rest

When picking items from a large pool (roughly in the thousands), selection without replacement can still be approximated by a binomial distribution, as long as the number of items picked is small compared with the pool.
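As a rough check of this approximation, the sketch below compares the exact without-replacement probability (the hypergeometric formula, which is not named in these notes) with the binomial one. The pool of 5000 items with 1000 marked ones and 10 draws is a made-up example, and the function names are illustrative.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def hypergeom_pmf(k, pool, marked, draws):
    """P(exactly k marked items) when drawing `draws` items without replacement."""
    return comb(marked, k) * comb(pool - marked, draws - k) / comb(pool, draws)

# Hypothetical numbers: 5000 items, 1000 of them marked, 10 items drawn.
pool, marked, draws = 5000, 1000, 10
p = marked / pool  # chance that a single draw is marked

for k in range(4):
    exact = hypergeom_pmf(k, pool, marked, draws)
    approx = binom_pmf(k, draws, p)
    print(k, round(exact, 4), round(approx, 4))
```

The two columns should agree to about two or three decimal places; the agreement gets worse as the number of draws becomes a larger fraction of the pool.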

Definition

A random variable $X$ that is binomially distributed is written as

$$X \sim \text{B}(n, p)$$

with $n$ trials or attempts, and probability $p$ of success for each trial. It has the probability distribution

$$\text{P}(X = x) = \binom nx p^x(1-p)^{n-x}$$

where $\displaystyle \binom nx$ is the binomial coefficient, or ${}^nC_x$.

This formula is not in the formula booklet, but it is very similar to the one for binomial expansion.

Note: In $n$ tries, $x$ can take on $n + 1$ values, from $0$ to $n$.
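The probability formula translates almost directly into code. The following is a minimal sketch using only Python's standard library; `binom_pmf` is an illustrative name, and the values 10 and 0.52 are simply sample inputs.

```python
from math import comb

def binom_pmf(x, n, p):
    """P(X = x) for X ~ B(n, p): nCx * p^x * (1 - p)^(n - x)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

n, p = 10, 0.52
probs = [binom_pmf(x, n, p) for x in range(n + 1)]  # x = 0, 1, ..., n

print(len(probs))                    # n + 1 possible values of x
print(abs(sum(probs) - 1) < 1e-12)   # the probabilities add up to 1
```

Listing every value from 0 to n also gives a quick sanity check: the probabilities of all possible outcomes must sum to 1.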

Working with a binomial distribution comes down to defining what counts as a success for a single trial, and treating each different kind of success with its own random variable.

Calculator

All approved graphing calculators support the probability $\text{P}(X = x)$ and the cumulative probability $\text{P}(X \leq x)$.

Some calculators also allow an interval of probability, such as $\text{P}(3 \leq X \leq 6)$. Others do not, and require computing $\text{P}(X \leq 6) - \text{P}(X \leq 2)$ instead.
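The subtraction trick for intervals can be sketched in code as follows. This is not how any particular calculator works internally; `binom_cdf` and `binom_between` are illustrative names, and B(10, 0.5) is a made-up distribution chosen so the result is easy to verify by hand.

```python
from math import comb

def binom_pmf(x, n, p):
    """P(X = x) for X ~ B(n, p)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def binom_cdf(x, n, p):
    """Cumulative probability P(X <= x), summed from 0 to x."""
    return sum(binom_pmf(k, n, p) for k in range(x + 1))

def binom_between(lo, hi, n, p):
    """P(lo <= X <= hi) = P(X <= hi) - P(X <= lo - 1)."""
    return binom_cdf(hi, n, p) - binom_cdf(lo - 1, n, p)

# Hypothetical example: X ~ B(10, 0.5)
print(binom_between(3, 6, 10, 0.5))  # 792/1024 = 0.7734375
```

Note that the lower cumulative probability stops at one below the lower bound, which is why 3 to 6 uses 2 rather than 3.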

The built-in probability and cumulative probability functions can be graphed when one of $n$, $p$, $x$ is unknown, and can also be tabulated.

Example: A coin is manufactured to land on heads 52% of the time and on tails the rest of the time.

a) 800 people each flip such a coin 10 times in a row. Find the probability that exactly one person gets all tails or all heads.

b) Find the smallest number of people needed for a 95% chance that at least two of them each get at least eight tails or at least eight heads in ten flips.


a) Let $X$ be the number of heads in ten flips, such that ${X \sim \text{B}(10, 0.52)}$. Here a success is one person getting all tails or all heads.

$$\begin{align*} \text{P}(\text{success}) &= \text{P}(X = 0) + \text{P}(X = 10) \\ &= \binom{10}{0}0.52^0 \cdot 0.48^{10} + \binom{10}{10}0.52^{10} \cdot 0.48^0 \\ &\approx 0.0020948 \dots \end{align*}$$

Then let $Y$ be the number of people, out of $800$, who get all tails or all heads in ten flips, so that ${Y \sim \text{B}(800, 0.0020948)}$.

$$\begin{align*} \text{P}(Y = 1) &= \binom{800}{1}0.0020948^1 \cdot (1 - 0.0020948)^{799} \\ &\approx 0.314 \qed \end{align*}$$
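Part a) can be double-checked without a graphing calculator. The sketch below repeats the same two steps in Python; the variable names `p_streak` and `answer` are illustrative.

```python
from math import comb

def binom_pmf(x, n, p):
    """P(X = x) for X ~ B(n, p)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

# Step 1: one person gets all tails (0 heads) or all heads (10 heads).
p_streak = binom_pmf(0, 10, 0.52) + binom_pmf(10, 10, 0.52)

# Step 2: exactly one of the 800 people manages this.
answer = binom_pmf(1, 800, p_streak)

print(round(p_streak, 7))  # 0.0020948
print(round(answer, 3))    # 0.314
```

The output of the first distribution becomes the success probability of the second, which is the key idea of this example.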

b) Using the $X$ from the previous part, at least 8 tails is the same as at most 2 heads, so at least 8 tails or at least 8 heads means

$$\text{P}(X \leq 2) + \text{P}(X \geq 8)$$

Many calculators can only compute cumulative binomial probabilities from $0$ to $x$, so this is evaluated as

$$\text{P}(X \leq 2) + 1 - \text{P}(X \leq 7) = 0.11218749\dots$$

Now we can define ${Z \sim \text{B}(n, 0.11218749)}$ for the number of people getting at least 8 tails or at least 8 heads, and solve

$$\text{P}(Z \geq 2) > 0.95$$

or equivalently

$$\text{P}(Z \leq 1) < 0.05$$

As $n$ is discrete, taking on only positive integer values, this cannot be solved with a numerical solver; use a table of values instead.

| $n$ | $\text{P}(Z \leq 1)$ |
| --- | --- |
| $39$ | $0.0572$ |
| $40$ | $0.0519$ |
| $41$ | $0.0470$ |
| $42$ | $0.0426$ |

As $n = 41$ is the first value for which the probability drops below $0.0500$, the answer is $n = 41 \qed$
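The table search in part b) amounts to increasing the number of people until the cumulative probability falls below 0.05. A minimal sketch of that loop, with illustrative names such as `p_person`, could look like this:

```python
from math import comb

def binom_pmf(x, n, p):
    """P(X = x) for X ~ B(n, p)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def binom_cdf(x, n, p):
    """Cumulative probability P(X <= x)."""
    return sum(binom_pmf(k, n, p) for k in range(x + 1))

# One person succeeds with at most 2 heads or at least 8 heads in 10 flips.
p_person = binom_cdf(2, 10, 0.52) + 1 - binom_cdf(7, 10, 0.52)

# Scan upward from 2 people, since two successes need at least two people.
n = 2
while binom_cdf(1, n, p_person) >= 0.05:
    n += 1

print(round(p_person, 8))
print(n)  # 41
```

A simple upward scan is enough here because adding more people can only make "at most one success" less likely.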

Expected value

The expected value is

$$\text{E}(X) = np$$

This is why

$$\text{expected count} = \text{number of trials}\times\text{probability}$$
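This shortcut can be checked against the general definition of expected value for a discrete random variable, which sums each value times its probability. The sketch below does so for B(10, 0.52), chosen only as a sample distribution.

```python
from math import comb

def binom_pmf(x, n, p):
    """P(X = x) for X ~ B(n, p)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

n, p = 10, 0.52

# Definition of expected value: sum of x * P(X = x) over all x.
mean = sum(x * binom_pmf(x, n, p) for x in range(n + 1))

print(round(mean, 6), round(n * p, 6))  # both values should match
```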

Variance and standard deviation

$$\text{Var}(X) = np(1-p)$$

$$\sigma = \sqrt{np(1-p)}$$

One way these could come up on an exam is being given the expected value and the variance, and having to solve for $n$ and $p$.
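Dividing the variance by the expected value leaves 1 - p, which gives p, and then n follows from the expected value. The sketch below applies this to made-up exam values (expected value 6, variance 2.4); `solve_n_p` is an illustrative name.

```python
def solve_n_p(mean, variance):
    """Recover (n, p) from E(X) = np and Var(X) = np(1 - p)."""
    p = 1 - variance / mean   # Var(X) / E(X) = 1 - p
    n = mean / p              # E(X) = np
    return round(n), p        # n must be a whole number of trials

n, p = solve_n_p(6, 2.4)      # hypothetical exam values
print(n, round(p, 3))         # 10 0.6
```

Rounding n assumes the given values really do come from a binomial distribution; if n is far from a whole number, the values should be rechecked.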

Mode and median

The mode and median are both very close to $np$. Precise formulas are beyond the scope of the course. If asked on an exam, use the methods discussed in discrete random variables to investigate.

The median is not necessarily equal to the mode(s).
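One way to investigate is to tabulate the whole distribution and read off the mode and median. The sketch below assumes the median is taken as the smallest value whose cumulative probability reaches 0.5; other conventions exist. The name `mode_and_median` is illustrative.

```python
from math import comb

def binom_pmf(x, n, p):
    """P(X = x) for X ~ B(n, p)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def mode_and_median(n, p):
    """Return (list of modes, median) by tabulating X ~ B(n, p)."""
    probs = [binom_pmf(x, n, p) for x in range(n + 1)]
    peak = max(probs)
    # Keep every x tied for the highest probability (there can be two modes).
    modes = [x for x, q in enumerate(probs) if abs(q - peak) < 1e-12]
    cumulative = 0.0
    for x, q in enumerate(probs):
        cumulative += q
        if cumulative >= 0.5:  # assumed convention for the median
            return modes, x

print(mode_and_median(10, 0.52))  # ([5], 5), close to np = 5.2
print(mode_and_median(3, 0.5))    # ([1, 2], 1): two modes, one median
```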

Tips