Equation of the Line I
- Dishwasher Example
- response - Height of suds (mm)
- explanatory - Amount of soap (g)
- slope is 12.4
- For each additional gram of soap, the height of suds increases by 12.4 mm, on average.
- For 0 grams of soap, the height of suds is -20.2 mm, on average. [This is an extrapolation.]
- Asked to predict height of suds for an amount of soap between 3.5 and 8.0 grams. Example … “What is the predicted height of suds for 5 g of soap?” Answer … -20.2+12.4×5=41.8 mm.
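- Asked to predict height of suds for an amount of soap not between 3.5 and 8.0 grams. Example … “What is the predicted height of suds for 10 g of soap?” Answer … This prediction should not be computed, because 10 g of soap is outside the range of the data (i.e., this is an extrapolation).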
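The two prediction cases above can be sketched in Python. This is only a minimal illustration: the intercept, slope, and 3.5 to 8.0 g range come from the dishwasher example, while the function and constant names are made up for this sketch.

```python
# Values taken from the dishwasher example; all names are illustrative.
INTERCEPT_MM = -20.2                 # predicted suds height at 0 g of soap
SLOPE_MM_PER_G = 12.4                # change in suds height per extra gram
SOAP_MIN_G, SOAP_MAX_G = 3.5, 8.0    # range of soap amounts in the data


def predict_suds(soap_g):
    """Return the predicted suds height (mm), refusing to extrapolate."""
    if not SOAP_MIN_G <= soap_g <= SOAP_MAX_G:
        raise ValueError(
            f"{soap_g} g is outside {SOAP_MIN_G}-{SOAP_MAX_G} g (extrapolation)"
        )
    return INTERCEPT_MM + SLOPE_MM_PER_G * soap_g


print(round(predict_suds(5), 1))  # 41.8
```

Calling `predict_suds(10)` raises a `ValueError` rather than returning a number, which mirrors the rule that predictions outside the range of the data should not be computed.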
Note: In the applied questions below …
- The words “response”, “explanatory”, “y”, and “x” are not used at all. These words should be replaced with the actual variable names used in the problem.
- The x-axis (explanatory) variable is increased by 1 unit when describing the slope. If the slope is positive then the y-axis (response) variable increases by the slope amount for each 1 unit increase of the explanatory variable. If the slope is negative then the y-axis (response) variable decreases by the slope amount (with the negative sign removed) for each 1 unit increase of the explanatory variable.
- Interpret the y-intercept value even if it does not make sense, which will happen often as the y-intercept is often an extreme extrapolation.
- When computing the residual, subtract the predicted y-axis (response) value from the observed y-axis (response) value. About half the time this will result in a negative value. That is okay; it just means that the observed value was less than the predicted value (i.e., the observed point was below the best-fit line).
- The words “proportion of variability explained” always imply using r² as the answer.
- The correlation coefficient is always the value of r. If you compute r by taking the square root of r², then remember to put a negative sign on it if the slope (i.e., direction of association) is negative (your calculator will always return a positive value).
- When asked whether anything “concerns” you about the regression, address whether the form looks linear and whether there is homoscedasticity. Non-linear forms will look obviously curved; heteroscedastic forms will look like a funnel (usually from narrow on the left to wider on the right). If you do not see a curve or a funnel, then explicitly say that you do not have any concerns about the lack of linearity or homoscedasticity.
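Two of the notes above (the order of subtraction for a residual and the sign of r) are easy to get wrong, so here is a short Python sketch of both. The function names and the example numbers passed to them are hypothetical and chosen only for illustration.

```python
import math


def residual(observed, predicted):
    """Residual = observed response minus predicted response."""
    return observed - predicted


def signed_r(r_squared, slope):
    """Recover r from r-squared, giving it the same sign as the slope."""
    r = math.sqrt(r_squared)          # the square root is never negative
    return -r if slope < 0 else r


# Hypothetical values: an observed point below the line gives a negative residual.
print(round(residual(30.0, 41.8), 1))   # -11.8
# Hypothetical values: a negative slope forces a negative r.
print(round(signed_r(0.81, -2.0), 3))   # -0.9
```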
Beach Sand
- The equation of the best-fit line is SAND = 0.16+0.053×ANGLE
- For each 1 degree increase in beach angle, the median sand diameter increases by 0.053 mm, on average.
- If there is 0 beach angle, then the median sand diameter is 0.16 mm, on average.
- This prediction should not be computed, because 15 degrees of beach angle is outside of the domain of the data used to define the relationship (i.e., this is an extrapolation).
- The predicted median sand diameter for a beach angle of 4 degrees is 0.372 mm.
- The residual for an observed median sand diameter of 0.2 mm and beach angle of 5 degrees is -0.225 mm.
- The correlation coefficient between median sand diameter and beach angle is 0.954.
- The proportion of variability in median sand diameter that is explained by knowing the angle of the beach is 0.910.
- I would expect the median sand diameter to increase by four slopes or 0.212 mm, on average, if the beach angle increased by 4 degrees.
- The relationship between median sand diameter and beach angle appears to be nonlinear (i.e., curved). I am not concerned about homoscedasticity as no funnel shape is apparent.
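The arithmetic behind the Beach Sand answers can be checked with a few lines of Python. This is a sketch that uses only the intercept and slope from the best-fit line above; the names are illustrative, and no range check is included because the range of beach angles in the data is not stated here.

```python
# Best-fit line from the Beach Sand answers: SAND = 0.16 + 0.053 x ANGLE
INTERCEPT_MM = 0.16
SLOPE_MM_PER_DEG = 0.053


def predict_sand(angle_deg):
    """Predicted median sand diameter (mm) for a beach angle in degrees."""
    return INTERCEPT_MM + SLOPE_MM_PER_DEG * angle_deg


predicted_at_4 = predict_sand(4)        # prediction for 4 degrees
residual_at_5 = 0.2 - predict_sand(5)   # observed minus predicted
change_for_4_deg = 4 * SLOPE_MM_PER_DEG # "four slopes"

print(round(predicted_at_4, 3))    # 0.372
print(round(residual_at_5, 3))     # -0.225
print(round(change_for_4_deg, 3))  # 0.212
```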
Everest Temperatures
- The equation of the best-fit line is AIRTEMP = 29.8-0.0059×ALTLAPSERATE
- If the altitude lapse rate is 0, then the mean air temperature is 29.8°C, on average.
- For each 1°C/km increase in altitude lapse rate, the mean air temperature decreases by 0.0059°C, on average.
- The predicted mean air temperature for an altitude lapse rate of 4000°C/km is 6.20°C.
- The residual for an observed mean air temperature of 0°C and altitude lapse rate of 3500°C/km is -9.15°C.
- The correlation coefficient between mean air temperature and altitude lapse rate is -0.961.
- I would expect the mean air temperature to decrease by 1000 slopes or 5.900°C, on average, if the altitude lapse rate increased by 1000°C/km.
- This prediction should not be computed, because 1000°C/km of altitude lapse rate is outside of the domain of the data used to define the relationship (i.e., this is an extrapolation).
- The proportion of variability in mean air temperature that is explained by knowing the altitude lapse rate is 0.923.
- I have no concerns as the relationship between mean air temperature and altitude lapse rate appears to be linear (no evident curve) and homoscedastic (no evident funnel).
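Because the Everest slope is negative, the signs in these answers deserve a second look. The following Python sketch reproduces the numeric answers from the best-fit line above. The names are illustrative, and the units are carried over as they are written in the answers.

```python
# Best-fit line from the Everest answers: AIRTEMP = 29.8 - 0.0059 x ALTLAPSERATE
INTERCEPT_C = 29.8
SLOPE_C_PER_UNIT = -0.0059   # negative slope: temperature falls as x rises


def predict_air_temp(alt_lapse_rate):
    """Predicted mean air temperature for a given altitude lapse rate."""
    return INTERCEPT_C + SLOPE_C_PER_UNIT * alt_lapse_rate


predicted_at_4000 = predict_air_temp(4000)
residual_at_3500 = 0 - predict_air_temp(3500)   # observed minus predicted
change_for_1000 = 1000 * SLOPE_C_PER_UNIT       # "1000 slopes"

print(round(predicted_at_4000, 2))  # 6.2
print(round(residual_at_3500, 2))   # -9.15
print(round(change_for_1000, 2))    # -5.9
```

The negative `change_for_1000` is reported in words as a decrease of 5.9, with the negative sign dropped.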