Correlation and regression
Loosely speaking, regression gives a model of how two variables are related, and correlation measures how strongly the data follow that model.
Linear regression
In $y$ on $x$ regression ($y = ax + b$), we assume $x$ is very certain and known, and we are looking to predict $y$.
In $x$ on $y$ regression ($x = cy + d$), we assume the opposite, and try to predict $x$ using a $y$ value that we trust or are given.
The student should make a judgement on which variable is the known value and which is the unknown.
On some calculators, only $y$ on $x$ regression is supported. In such cases, for $x$ on $y$ regression, swap $x$ and $y$ in the calculator inputs, then write the final answer as $x = cy + d$.
Both $y$ on $x$ and $x$ on $y$ regression lines pass through the point $(\bar{x}, \bar{y})$, i.e. the average $x$ value and the average $y$ value.
See an example on the TI-84 Plus.
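The swap trick and the mean-point property can be checked with a short sketch. This is only an illustration in plain Python: the helper name `fit_line` and the data values are made up for the example, and the least-squares formulas used are the standard ones rather than anything specific to a calculator.

```python
def fit_line(inputs, outputs):
    """Least-squares line: outputs = slope * inputs + intercept."""
    n = len(inputs)
    mean_in = sum(inputs) / n
    mean_out = sum(outputs) / n
    s_io = sum((p - mean_in) * (q - mean_out) for p, q in zip(inputs, outputs))
    s_ii = sum((p - mean_in) ** 2 for p in inputs)
    slope = s_io / s_ii
    intercept = mean_out - slope * mean_in
    return slope, intercept


# Made-up data, for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

a, b = fit_line(xs, ys)  # y on x: y = a*x + b
c, d = fit_line(ys, xs)  # x on y: same routine with the inputs swapped, x = c*y + d

x_bar = sum(xs) / len(xs)
y_bar = sum(ys) / len(ys)

print("y on x:", a, b)
print("x on y:", c, d)
# Both lines pass through (x_bar, y_bar), up to rounding error.
print("y on x at x_bar:", a * x_bar + b, "vs y_bar:", y_bar)
print("x on y at y_bar:", c * y_bar + d, "vs x_bar:", x_bar)
```

Note that the $x$ on $y$ line is generally not the $y$ on $x$ line rearranged for $x$; the two only coincide when the data are perfectly linear.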
Pearson’s correlation coefficient, $r$
This coefficient only measures how well a linear model fits; it does not detect non-linear relationships.
The coefficient takes on values between $-1$ and $1$. Values near $0$ mean no linear correlation. A value near $1$ is a strong indication that $y$ increases linearly as $x$ increases, while a value near $-1$ means $y$ decreases linearly as $x$ increases.
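As a rough illustration of these three cases, the sketch below computes the coefficient from its standard definition, $r = S_{xy} / \sqrt{S_{xx} S_{yy}}$. The function name `pearson_r` and the data sets are made up for the example.

```python
import math


def pearson_r(xs, ys):
    """Pearson's correlation coefficient from the sums of squares."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)


xs = [1.0, 2.0, 3.0, 4.0, 5.0]

increasing = [2 * x + 1 for x in xs]    # exactly linear, rising
decreasing = [10 - 3 * x for x in xs]   # exactly linear, falling
symmetric = [(x - 3) ** 2 for x in xs]  # a parabola: related to x, but not linearly

r_up = pearson_r(xs, increasing)
r_down = pearson_r(xs, decreasing)
r_curve = pearson_r(xs, symmetric)

print(r_up, r_down, r_curve)
```

The parabola case shows why the coefficient only speaks about linear models: $y$ is completely determined by $x$, yet the coefficient comes out at zero.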
If the absolute value of the coefficient is above a critical value, the correlation is statistically significant. The critical value depends on the number of data points, and will be given on exams; otherwise it would require a $t$-test, which is beyond the syllabus.
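The $r$ between $x$ and $y$ is the same as that between $\alpha x + \beta$ and $\gamma y + \delta$ (for positive $\alpha$ and $\gamma$), meaning that it is unaffected by linear transformations of the data. If exactly one of the scale factors is negative, $r$ only changes sign.
The effect on the regression line itself, however, can be worked out using function transformations.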
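A minimal sketch of both statements, assuming made-up data and made-up scale and shift constants: the coefficient is unchanged after rescaling both variables, while the new $y$ on $x$ line is the old one with the same transformations applied. The helper name `summary` is hypothetical.

```python
import math


def summary(xs, ys):
    """Return (slope, intercept, r) for the least-squares line ys = slope*xs + intercept."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    slope = s_xy / s_xx
    return slope, y_bar - slope * x_bar, s_xy / math.sqrt(s_xx * s_yy)


# Made-up data and transformation constants, for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
x_scale, x_shift = 2.0, 3.0
y_scale, y_shift = 5.0, -1.0

new_xs = [x_scale * x + x_shift for x in xs]
new_ys = [y_scale * y + y_shift for y in ys]

slope, intercept, r = summary(xs, ys)
new_slope, new_intercept, new_r = summary(new_xs, new_ys)

# Prediction from function transformations: substitute
# x = (X - x_shift) / x_scale into Y = y_scale * (slope * x + intercept) + y_shift.
expected_slope = y_scale * slope / x_scale
expected_intercept = y_scale * intercept + y_shift - expected_slope * x_shift

# A negative scale factor on one variable flips the sign of r.
flipped_r = summary(xs, [-y for y in ys])[2]

print("r before and after:", r, new_r)
print("slope, predicted slope:", new_slope, expected_slope)
print("intercept, predicted intercept:", new_intercept, expected_intercept)
print("r with y negated:", flipped_r)
```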