Data lists and frequency distributions
This page looks at data presented in frequency tables with precise values (not intervals). We view enumerated (listed out) data as a special case where every frequency is $1$.
Major re-write: 2026-03-24.
Contents
- Frequency table data
- Formulas
- Technology
- Median
- Quartiles
- Interquartile range and outliers
- Box and whisker diagram
Frequency table data
Example: An IBDP Year 1 math class has students. Here is their grade distribution.
| Grade, | ||||||
|---|---|---|---|---|---|---|
| Frequency, |
Formulas
The formulas are not necessarily there for you to compute values by hand in Paper 1, but they may allow you to set up an equation and solve for an unknown value with the help of a calculator in Paper 2. The purpose of the examples is to illustrate the formulas.
mean
$\bar{x} = \dfrac{\sum f_i x_i}{n}$, where $n = \sum f_i$ is the total frequency.
For our example
standard deviation and variance
SL students should be able to compute standard deviation and variance when given a GDC.
HL students need to know the following formulas.
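Formula for variance:
$\sigma^2 = \dfrac{\sum f_i (x_i - \bar{x})^2}{n}$
but usually we use
$\sigma^2 = \dfrac{\sum f_i x_i^2}{n} - \bar{x}^2$
Here $\bar{x}$ is the mean and $n = \sum f_i$ is the total frequency. Note that the $\bar{x}^2$ is outside the summation.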
A proof for the special case of $f_i = 1$ is provided at Sigma notation, sequences, series, finance.
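For a frequency table in general, the two variance formulas can be shown to agree by expanding the square. The following is a sketch of that expansion, assuming the notation $n = \sum f_i$ and $\bar{x} = \frac{1}{n}\sum f_i x_i$ (the symbols are chosen here for illustration and may differ from those in your formula booklet).

```latex
\begin{aligned}
\sigma^2 &= \frac{1}{n}\sum f_i (x_i - \bar{x})^2 \\
         &= \frac{1}{n}\sum f_i x_i^2 \;-\; \frac{2\bar{x}}{n}\sum f_i x_i \;+\; \frac{\bar{x}^2}{n}\sum f_i \\
         &= \frac{\sum f_i x_i^2}{n} \;-\; 2\bar{x}^2 \;+\; \bar{x}^2 \\
         &= \frac{\sum f_i x_i^2}{n} \;-\; \bar{x}^2
\end{aligned}
```

The third line uses $\sum f_i x_i = n\bar{x}$ and $\sum f_i = n$.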
For our example
The standard deviation, in our case would be
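To see the mean and variance formulas at work outside a calculator, here is a minimal Python sketch. The frequency table in it is hypothetical (it is not the class data from the example above), and the variable names are chosen only for illustration. It computes the variance both from the definition and from the shortcut form and checks that they agree.

```python
from math import sqrt, isclose

# Hypothetical frequency table, NOT the class data from the example.
values = [3, 4, 5, 6, 7]   # x_i
freqs = [2, 5, 8, 4, 1]    # f_i

n = sum(freqs)                                            # total frequency
sum_fx = sum(f * x for x, f in zip(values, freqs))        # sum of f_i * x_i
sum_fx2 = sum(f * x * x for x, f in zip(values, freqs))   # sum of f_i * x_i^2

mean = sum_fx / n

# Variance from the definition: weighted mean of squared deviations.
var_definition = sum(f * (x - mean) ** 2 for x, f in zip(values, freqs)) / n

# Variance from the shortcut: the mean squared is subtracted outside the sum.
var_shortcut = sum_fx2 / n - mean ** 2

sd = sqrt(var_shortcut)

print(n, mean, var_definition, var_shortcut, sd)
print(isclose(var_definition, var_shortcut))
```

The shortcut form only needs the three totals `n`, `sum_fx` and `sum_fx2`, which is why it is convenient when a question leaves one value or frequency unknown.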
Technology
In general we use lists and 1-variable statistics on our GDC to compute statistics. Provide a list of values followed by a list of frequencies.
TI-84: In stat 1:Edit, enter the values in L1 and the frequencies in L2.
In stat CALC, choose 1:1-Var Stats. L1 can be retrieved from 2nd 1.
To retrieve statistics such as σx, use vars 5:Statistics.
Do not use 2nd stat Math 7:stdDev(, as that is for Sx.
TI-Nspire: Using the Lists & Spreadsheet app, store the list of values in A and the list of frequencies in B.
In menu 4:Statistics > 1:Stat Calculations > 1:One-Variable Statistics, set Num of lists to 1.
Set X1 list, Frequency list, 1st Result Column. Use square brackets from ctrl (
To use any of these values eg standard deviation, use var and find stat.σx.
Tip: You can also just ctrl C, ctrl V. I like to ctrl var the copied value in a variable before using.
Casio: In the Statistics (STAT) app, enter the values in List 1 and the frequencies in List 2.
In CALC, SET add the suitable lists. To enter list 1, press LIST and enter 1.
Exit and use 1-VAR to see the statistics.
To use one of the values, in VARS, STAT, X and choose for instance σx.
HP Prime: In the Statistics 1Var app, enter the values in D1 and the frequencies in D2.
In Symb ✗ , H1, enter D1 and D2 together in that row. They are available from Column button on screen.
Go back to Num ⊞ , press Stats to view the statistics.
Press OK to go back. To use a particular statistic, you can use Vars App Statistics 1Var Results and choose for example σx.
Tip: You can also just Shift ⧉View , Shift Menu . I like to ▶ the copied value in a variable before using.
NumWorks: In the Statistics app, enter the values in V1 and the frequencies in N1.
Use arrow keys to the Stats tab to see the statistics.
To use a statistic, copy-paste using shift var, shift ⊟. I like to shift xʸ the copied value in a variable before using.
In addition to the mean x̄, the standard deviation σx, and the quartiles, notice the relevance of $\sum f_i x_i$ as Σx, and $\sum f_i x_i^2$ as Σx², in the above calculations.
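The same one-variable statistics can be cross-checked in Python. This is only a sketch with a hypothetical frequency table (not the class data): it expands the table into a plain list, then uses the standard library's `statistics.pstdev` for the population standard deviation (the calculator's σx) and `statistics.stdev` for the sample standard deviation (the calculator's Sx).

```python
import statistics

# Hypothetical frequency table, NOT the class data from the example.
values = [3, 4, 5, 6, 7]
freqs = [2, 5, 8, 4, 1]

# Expand the table into a plain list: each value repeated f times.
data = [x for x, f in zip(values, freqs) for _ in range(f)]

n = len(data)
sum_x = sum(data)                    # what the calculator reports as Σx
sum_x2 = sum(x * x for x in data)    # what the calculator reports as Σx²

mean = statistics.mean(data)
sigma_x = statistics.pstdev(data)    # population standard deviation, σx
s_x = statistics.stdev(data)         # sample standard deviation, Sx

print(n, sum_x, sum_x2, mean, sigma_x, s_x)
```

Sx divides by n − 1 instead of n, so it is always slightly larger than σx for the same data.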
Median
Sort the list.
Let $n$ be the number of data.
For an odd number of data, the median is the value at the $\frac{n+1}{2}$th position.
For an even number of data, the median is the mean of the values at the $\frac{n}{2}$th and $\left(\frac{n}{2}+1\right)$st positions.
Calculating medians by hand for frequency tables is required for SL and HL.
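The by-hand method for a frequency table amounts to walking through the cumulative frequencies until the middle position (or the two middle positions) is reached. A minimal Python sketch of that idea follows; the function name `median_from_table` and the sample table are hypothetical, chosen only for illustration.

```python
def median_from_table(values, freqs):
    """Median of a frequency table, found via cumulative frequencies."""
    pairs = sorted(zip(values, freqs))      # sort by value first
    n = sum(f for _, f in pairs)            # total number of data

    def value_at(position):
        # Return the value sitting at a 1-based position in the sorted data.
        running = 0
        for x, f in pairs:
            running += f                    # cumulative frequency so far
            if running >= position:
                return x
        raise ValueError("position is larger than the number of data")

    if n % 2 == 1:
        return value_at((n + 1) // 2)       # single middle position
    # Even n: mean of the two middle positions.
    return (value_at(n // 2) + value_at(n // 2 + 1)) / 2


# Hypothetical table with 20 data: the middle positions are 10 and 11.
print(median_from_table([3, 4, 5, 6, 7], [2, 5, 8, 4, 1]))
```

In the hypothetical table the cumulative frequencies are 2, 7, 15, 19, 20, so positions 10 and 11 both fall on the value 5.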
Quartiles
Q1, or 25th percentile, is greater than 25% of the data.
Median, or Q2, or 50th percentile, is greater than 50% of the data.
Q3, or 75th percentile, is greater than 75% of the data.
Q1 and Q3 should be found using 1-variable statistics on the calculator, as shown above.
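For intuition about what the calculator is doing, here is a sketch of one common convention: Q1 is the median of the lower half of the sorted data and Q3 is the median of the upper half, leaving out the middle value when the number of data is odd. This convention is an assumption here; calculators and software packages do not all use the same rule, so results can differ slightly. The function name `quartiles` and the data are hypothetical.

```python
from statistics import median


def quartiles(data):
    """Return (Q1, Q2, Q3) using the median-of-halves convention."""
    s = sorted(data)
    n = len(s)
    half = n // 2
    lower = s[:half]                               # values below the middle
    upper = s[half + 1:] if n % 2 else s[half:]    # skip the middle value if n is odd
    return median(lower), median(s), median(upper)


# Hypothetical data: the table 3, 4, 5, 6, 7 with frequencies 2, 5, 8, 4, 1.
data = [3] * 2 + [4] * 5 + [5] * 8 + [6] * 4 + [7] * 1
print(quartiles(data))
```

In an exam, the values reported by your own calculator's 1-variable statistics are the ones to quote.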
Interquartile range and outliers
The interquartile range is $\text{IQR} = Q_3 - Q_1$.
Outliers lie more than $1.5 \times$ the interquartile range away from the nearest quartile.
Outliers don’t necessarily need to be thrown away. On the contrary, they can be central in many scientific studies. Besides indicating a potential mistake, an outlier could come from measuring a value that has a very small chance of occurring, or from observing an effect that only shows up under extreme conditions.
Presence of an outlier does not require recomputing any statistical value or measure, unless otherwise specified in the question.
For our example, . We want to check if is an outlier. It is closest to .
Therefore, is an outlier.
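The outlier check can be written as two fences, one below Q1 and one above Q3. The sketch below assumes the usual rule of 1.5 × IQR beyond the nearest quartile; the function names and the quartile values are hypothetical and are not taken from the class example.

```python
def outlier_fences(q1, q3):
    """Return the (lower, upper) fences: 1.5 * IQR beyond each quartile."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


def is_outlier(x, q1, q3):
    """True if x lies strictly beyond either fence."""
    lower, upper = outlier_fences(q1, q3)
    return x < lower or x > upper


# Hypothetical quartiles: Q1 = 4 and Q3 = 5.5, so IQR = 1.5.
print(outlier_fences(4, 5.5))
print(is_outlier(7, 4, 5.5), is_outlier(8, 4, 5.5))
```

A value exactly on a fence is not counted as an outlier here, because the rule asks for more than 1.5 × IQR.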
Box and whisker diagram
The boxes are drawn from Q1 to Q2, and Q2 to Q3. Lines extend from the minimum to Q1, and from Q3 to the max. Outliers are shown as a cross.
Here in our example, the box and whisker diagram has a box from to and from to . There are lines (whiskers) from minimum to and from to maximum. There is an outlier at marked with a cross.
Outliers only affect the min and max values drawn, and not the boxes.
Mode and mean cannot be determined from a box and whisker plot; instead they require access to the original, individual values.