Statistics and probability definitions
This is a list of statistics and probability terms.
Data that we collect can be discrete, meaning it takes integer values or falls into specific categories, or continuous, meaning it can take any value on a number line.
sample size, $n$: number of measurements taken
mean, $\mu$: sum of all measurements divided by the sample size
mode: the most frequent discrete value(s) measured; a value has to appear more than once to count as a mode.
median, aka $Q_2$: the middle value when the data are ranked, or the mean of the middle values if there is more than one
quartiles, $Q_1$, $Q_2$, $Q_3$: the 25%, 50%, and 75% marks when ordering values from least to greatest
inter-quartile range, IQR: $Q_3 - Q_1$
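As a minimal sketch of the definitions so far, the following Python computes the sample size, mean, mode, and median of a small data set. The data values are hypothetical, and the mode is computed by hand so that it follows the "appears more than once" rule above.

```python
from collections import Counter
import statistics

data = [2, 3, 3, 5, 7, 10]  # hypothetical measurements

n = len(data)               # sample size
mean = sum(data) / n        # sum of all measurements divided by the sample size

# Mode: the most frequent value(s), kept only if they appear more than once.
counts = Counter(data)
top = max(counts.values())
modes = [value for value, count in counts.items() if count == top and count > 1]

# Median: middle value when ranked; with an even sample size,
# statistics.median returns the mean of the two middle values.
median = statistics.median(data)

print(n, mean, modes, median)
```

If no value repeats, modes is an empty list, which matches the definition of mode given here.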
outlier: a value that is more than $1.5 \times$ IQR away from the nearest quartile. Only extreme values are potential outliers. The presence of an outlier does not retroactively change other metrics such as the median or the inter-quartile range.
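The quartile, inter-quartile range, and outlier definitions can be sketched in Python as follows. The data are hypothetical. Two assumptions are made here: the outlier cutoff is the conventional 1.5 times the IQR, and the quartiles are computed with the "inclusive" method of statistics.quantiles. Other quartile conventions exist and can give slightly different values.

```python
import statistics

data = [1, 3, 4, 5, 6, 7, 8, 9, 30]  # hypothetical measurements

# Quartiles: the 25%, 50%, and 75% marks of the ranked data.
# Assumption: the "inclusive" interpolation method.
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")

iqr = q3 - q1  # inter-quartile range

# Assumption: a value is an outlier if it lies more than 1.5 * IQR
# below the first quartile or above the third quartile.
low_fence = q1 - 1.5 * iqr
high_fence = q3 + 1.5 * iqr
outliers = [x for x in data if x < low_fence or x > high_fence]

print(q1, q2, q3, iqr, outliers)
```

The outlier 30 is reported, but the quartiles and IQR are still computed from the full data set, in line with the note that an outlier does not retroactively change the other metrics.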
percentile: one of 100 equal-sized divisions of the ranked data, where the $1$st percentile is higher than $1\%$ of the data, and the $k$th percentile is higher than $k\%$ of the data.
skew: left skew means the lower-end values are sparse and lie further from the bulk of the data. Graphically, left skew looks like a tail extending to the left. Left skew also means some of the lower values may be outliers. The opposite is true for right skew.
standard deviation, $\sigma$: “average” distance to the mean. Here “average” refers to the root mean square, or quadratic mean.
variance, $\sigma^2$: square of the standard deviation
Technically speaking, $\mu$, $\sigma$, and $\sigma^2$ are the population mean, standard deviation, and variance. The sample counterparts are $\bar{x}$, $s$, and $s^2$.
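To make the population and sample distinction concrete, here is a short Python sketch with hypothetical data. It computes the standard deviation as a root mean square of the distances to the mean, then the sample versions, which by the usual convention divide by one less than the sample size. That convention is not stated in the definitions above and is noted here as an assumption.

```python
import math
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical measurements
n = len(data)
mean = sum(data) / n

# Population standard deviation: root mean square of the distances to the mean.
squared_distances = [(x - mean) ** 2 for x in data]
pop_sd = math.sqrt(sum(squared_distances) / n)
pop_var = pop_sd ** 2  # variance is the square of the standard deviation

# Sample counterparts (assumption: divide by n - 1 instead of n).
sample_var = sum(squared_distances) / (n - 1)
sample_sd = math.sqrt(sample_var)

# The standard library offers both: pstdev/pvariance for a population,
# stdev/variance for a sample.
print(pop_sd, statistics.pstdev(data))
print(sample_sd, statistics.stdev(data))
```

The sample values come out slightly larger than the population values, and the gap shrinks as the sample size grows.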