Discrete random variables

Yes, RV is used to mean random variable.

Random Variables

A regular variable $x$ can take on different values. A random variable $X$ also takes on different values, but each value occurs with a specific probability. For example, $\text{P}(X = 3)$ represents the probability that $X$ is measured to be $3$. The notation $\text{P}(X = x)$ represents the probability of measuring some general value $x$.

This means we can expect $x$ to appear $\text{P}(X = x) \times 100\%$ of the time.

“Random” simply means different values can appear with different probability or frequency. The word does not retain the colloquial meaning of “arbitrary” or “chaotic”. In fact, our discussion of a random variable usually requires it to have a fixed distribution of values.

The probabilities of all possible values of a random variable sum to one.

$$\sum_{x} \text{P}(X = x) = 1$$

A random variable is discrete when its possible values can be listed one by one (a finite or countably infinite set), and continuous when its possible values consist of an interval or multiple intervals of real numbers.

Example: Random variable $X$ is defined by

$$\text{P}(X = x) = \begin{cases} ax^{-1} & \text{if } x = 3, 4, 6, 8 \\ 0 & \text{otherwise} \end{cases}$$

Find $a$.

$$\begin{align*}a\left(\frac 13 + \frac 14 + \frac 16 + \frac 18\right) &= 1 \\ a\left(\frac 78\right) &= 1 \\ a &= \frac 87 \qed \end{align*}$$
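
The normalisation step above can be checked with exact fractions. This is only a minimal sketch, assuming Python and its standard `fractions` module; the names `values` and `a` are illustrative and not part of the notes.

```python
from fractions import Fraction

# Values that X can take in the example above
values = [3, 4, 6, 8]

# P(X = x) = a / x and the probabilities must sum to 1,
# so a = 1 / (1/3 + 1/4 + 1/6 + 1/8)
a = 1 / sum(Fraction(1, x) for x in values)

print(a)  # 8/7
```

Using `Fraction` rather than floating-point numbers keeps the result in the same exact form as the algebraic answer.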

For the following measures, be able to find them algebraically and on a graphing calculator using one-variable statistics.

Probability and cumulative probability built-in functions can be graphed when one of $n$, $p$, $x$ is unknown, and can also be tabulated.

Expected value, $\text{E}(X)$

The expected value is useful to find the sum or mean of all values measured.

$$\text{mean} = \frac{\text{sum}}{\text{number of measurements}}$$

and $\text{E}(X)$ is the expected mean.

The expected value for a discrete random variable is

$$\text{E}(X) = \mu = \sum_{x} x\cdot\text{P}(X = x)$$

Example: Random variable $X$ is defined by

$$\text{P}(X = x) = \begin{cases} \frac 87x^{-1} & \text{if } x = 3, 4, 6, 8 \\ 0 & \text{otherwise} \end{cases}$$

Find $\text{E}(X)$.

The expected value is

$$\begin{align*}\text{E}(X) &= \frac 87 \left(3 (3)^{-1} + 4 (4)^{-1} + 6 (6)^{-1} + 8 (8)^{-1} \right) \\ &= \frac {32}{7} \qed\end{align*}$$
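
The same weighted sum can be evaluated programmatically. The following is a rough sketch, assuming Python; the dictionary name `pmf` and the variable `expected` are hypothetical names chosen for illustration.

```python
from fractions import Fraction

# Distribution from the example: P(X = x) = (8/7) / x for x in {3, 4, 6, 8}
pmf = {x: Fraction(8, 7) / x for x in [3, 4, 6, 8]}

# E(X) is the sum of each value multiplied by its probability
expected = sum(x * p for x, p in pmf.items())

print(expected)  # 32/7
```

Each term $x \cdot \text{P}(X = x)$ equals $\frac 87$ here, so the four terms add to $\frac{32}{7}$.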

Mode

Mode is the value with the highest probability.

Example: Random variable $X$ is defined by

$$\text{P}(X = x) = \begin{cases} \frac 87x^{-1} & \text{if } x = 3, 4, 6, 8 \\ 0 & \text{otherwise} \end{cases}$$

Identify the mode.

The probability distribution is

| value $x$ | probability $\text{P}(X = x)$ |
| --- | --- |
| $3$ | $0.381$ |
| $4$ | $0.286$ |
| $6$ | $0.190$ |
| $8$ | $0.143$ |

The mode is the value with the highest probability. It is $3 \qed$
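
Picking the mode amounts to finding the value with the largest probability. A minimal sketch of that lookup, assuming Python (the names `pmf` and `mode` are illustrative only):

```python
from fractions import Fraction

# Distribution from the example: P(X = x) = (8/7) / x for x in {3, 4, 6, 8}
pmf = {x: Fraction(8, 7) / x for x in [3, 4, 6, 8]}

# The mode is the key whose probability is the largest
mode = max(pmf, key=pmf.get)

print(mode)  # 3
```

Note that `max` returns only one value; a distribution with several equally likely peaks would need all tied values to be reported.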

A value $x_m$ is a candidate for a mode if in some consecutive values of $x$, it is the first (least) value that satisfies

$$\frac{\text{P}(X = x_{m+1})}{\text{P}(X = x_m)} < 1$$

or the last (greatest) value that satisfies

$$\frac{\text{P}(X = x_m)}{\text{P}(X = x_{m-1})} > 1$$

or that

$$\frac{\text{P}(X = x_m)}{\text{P}(X = x_{m-1})} = 1$$

The candidates for mode should be compared to select the one with the highest probability.

Median

The median is the value at which the cumulative probability $\text{P}(X \leq x)$ crosses $0.5$.

Example: Random variable $X$ is defined by

$$\text{P}(X = x) = \begin{cases} \frac 87x^{-1} & \text{if } x = 3, 4, 6, 8 \\ 0 & \text{otherwise} \end{cases}$$

Identify the median.

The median is the first value whose cumulative probability crosses $0.500$.

| value $x$ | cumulative probability $\text{P}(X \leq x)$ |
| --- | --- |
| $3$ | $0.381$ |
| $4$ | $0.667$ |
| $6$ | $0.857$ |
| $8$ | $1.000$ |

The median is $4 \qed$
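
The cumulative table can be built as a running total. This is a sketch only, assuming Python, and it assumes that "crosses" means the cumulative probability reaches or exceeds $0.5$; the names `values`, `probs`, `cumulative` and `median` are illustrative.

```python
from fractions import Fraction
from itertools import accumulate

# Values in increasing order, with P(X = x) = (8/7) / x
values = [3, 4, 6, 8]
probs = [Fraction(8, 7) / x for x in values]

# Running totals give P(X <= x) for each value
cumulative = list(accumulate(probs))

# Median: first value whose cumulative probability is at least 1/2
# (assumed convention for "crosses 0.5")
median = next(x for x, c in zip(values, cumulative) if c >= Fraction(1, 2))

print(median)  # 4
```

The values must be sorted in increasing order for the running total to represent $\text{P}(X \leq x)$.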

HL: Variance and standard deviation

Variance is $\sigma^2$ and standard deviation is $\sigma$.

$$\text{Var}(X) = \sigma^2 = \sum_{x} (x - \mu)^2\cdot \text{P}(X = x)$$

Variance is typically easier to calculate, especially in calculations beyond the syllabus. Standard deviations are easier to interpret, as the “average” distance to the mean. In reality the average is weighted by $(x - \mu)^2$, so values that are further from the mean are given more weight.

In practice, variance is often calculated by

$$\text{Var}(X) = \text{E}(X^2) - \mu^2$$

Example: Random variable $X$ is defined by

$$\text{P}(X = x) = \begin{cases} \frac 87x^{-1} & \text{if } x = 3, 4, 6, 8 \\ 0 & \text{otherwise} \end{cases}$$

a) Find the distribution of $X^2$.

b) Find the variance and standard deviation of $X$.


a) $X^2$ is distributed as

$$\text{P}(X^2 = x^2) = \begin{cases} \frac 87\cdot 3^{-1} & \text{if } x^2 = {\color{blue}9}\\ \frac 87\cdot 4^{-1} & \text{if } x^2 = {\color{blue}16} \\ \frac 87\cdot 6^{-1} & \text{if } x^2 = {\color{blue}36} \\ \frac 87\cdot 8^{-1} & \text{if } x^2 = {\color{blue}64} \\ 0 & \text{otherwise} \end{cases}$$

b) Earlier it was shown that $\mu = \frac{32}{7}$.

The variance is

$$\begin{align*}\text{Var}(X) &= \text{E}(X^2) - \mu^2 \\ &= \sum_x {\color{blue}x^2}\cdot\text{P}(X^2 = x^2) - \mu^2 \\ &= \frac 87 \left(\frac {\color{blue}9}3 + \frac{\color{blue}16}{4} + \frac{\color{blue}36}{6} + \frac{\color{blue}64}{8}\right) - \left(\frac{32}{7}\right)^2 \\ &= \frac{152}{49} \approx 3.10 \qed \end{align*}$$

The standard deviation is

$$\sigma = \sqrt{\frac{152}{49}} = \frac{2\sqrt{38}}{7} \approx 1.76 \qed$$
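
Both variance formulas can be compared numerically to confirm that they agree. The following is a hedged sketch in Python, not a prescribed method; `pmf`, `mu`, `var_definition`, `var_shortcut` and `sigma` are names invented for this illustration.

```python
from fractions import Fraction
from math import sqrt

# Distribution from the example: P(X = x) = (8/7) / x for x in {3, 4, 6, 8}
pmf = {x: Fraction(8, 7) / x for x in [3, 4, 6, 8]}

# Mean, E(X)
mu = sum(x * p for x, p in pmf.items())

# Variance from the definition: sum of (x - mu)^2 * P(X = x)
var_definition = sum((x - mu) ** 2 * p for x, p in pmf.items())

# Variance from the shortcut: E(X^2) - mu^2
var_shortcut = sum(x**2 * p for x, p in pmf.items()) - mu**2

# Standard deviation is the square root of the variance
sigma = sqrt(var_shortcut)

print(var_definition, var_shortcut)  # 152/49 152/49
print(round(sigma, 2))               # 1.76
```

The shortcut needs only $\text{E}(X^2)$ and $\mu$, which is why it tends to be quicker by hand than summing squared deviations.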

See also

Probability formulas

Example on TI-84 Plus