Chapter 2 Simple Linear Regression
2.1 Getting started
Putting text here
2.2 Foundation
Putting text here
2.3 Inference
Putting text here
2.4 Prediction
2.5 Checking conditions
2.6 Partitioning variability
2.7 Derivation for slope and intercept
This section contains the mathematical details for deriving the least-squares estimates for the slope ($\beta_1$) and intercept ($\beta_0$). We obtain the estimates, $\hat{\beta}_1$ and $\hat{\beta}_0$, by finding the values that minimize the sum of squared residuals (SSR):
$$
\text{SSR} = \sum_{i=1}^{n} [y_i - \hat{y}_i]^2 = \sum_{i=1}^{n} [y_i - (\hat{\beta}_0 + \hat{\beta}_1 x_i)]^2 = \sum_{i=1}^{n} [y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i]^2
$$
Recall that we can find the values of $\hat{\beta}_1$ and $\hat{\beta}_0$ that minimize the SSR by taking its partial derivatives with respect to each estimate and setting them to 0. Because the SSR is a convex quadratic function of $\hat{\beta}_0$ and $\hat{\beta}_1$, the values at which both partial derivatives equal zero also minimize the sum of squared residuals. The partial derivatives are
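$$
\begin{aligned}
\frac{\partial \text{SSR}}{\partial \hat{\beta}_1} &= -2 \sum_{i=1}^{n} x_i (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i) \\
\frac{\partial \text{SSR}}{\partial \hat{\beta}_0} &= -2 \sum_{i=1}^{n} (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i)
\end{aligned}
$$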
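As a sanity check on the two partial derivatives above, the following minimal Python sketch compares them with central finite-difference approximations. The toy data, the function names (`ssr`, `ssr_gradient`, `numeric_gradient`), and the use of NumPy are assumptions made for illustration; they are not part of the derivation.

```python
import numpy as np

# Hypothetical toy data, made up for illustration only.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])


def ssr(b0, b1):
    """Sum of squared residuals for the candidate line b0 + b1 * x."""
    return np.sum((y - b0 - b1 * x) ** 2)


def ssr_gradient(b0, b1):
    """Analytic partial derivatives of SSR, returned as (d/db0, d/db1)."""
    resid = y - b0 - b1 * x
    return -2 * np.sum(resid), -2 * np.sum(x * resid)


def numeric_gradient(b0, b1, h=1e-6):
    """Central finite-difference approximation, returned as (d/db0, d/db1)."""
    d_b0 = (ssr(b0 + h, b1) - ssr(b0 - h, b1)) / (2 * h)
    d_b1 = (ssr(b0, b1 + h) - ssr(b0, b1 - h)) / (2 * h)
    return d_b0, d_b1


# Evaluate both versions at an arbitrary candidate (b0, b1).
analytic = ssr_gradient(0.5, 1.5)
numeric = numeric_gradient(0.5, 1.5)
print("analytic:", analytic)
print("numeric: ", numeric)
```

The two printed pairs should agree up to rounding error, since SSR is quadratic in both arguments.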
Let’s begin by deriving $\hat{\beta}_0$.
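$$
\begin{aligned}
\frac{\partial \text{SSR}}{\partial \hat{\beta}_0} &= -2 \sum_{i=1}^{n} (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i) = 0 \\
&\Rightarrow -\sum_{i=1}^{n} (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i) = 0 \\
&\Rightarrow -\sum_{i=1}^{n} y_i + n\hat{\beta}_0 + \hat{\beta}_1 \sum_{i=1}^{n} x_i = 0 \\
&\Rightarrow n\hat{\beta}_0 = \sum_{i=1}^{n} y_i - \hat{\beta}_1 \sum_{i=1}^{n} x_i \\
&\Rightarrow \hat{\beta}_0 = \frac{1}{n}\Big(\sum_{i=1}^{n} y_i - \hat{\beta}_1 \sum_{i=1}^{n} x_i\Big) \\
&\Rightarrow \hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}
\end{aligned}
$$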
Now, we can derive $\hat{\beta}_1$ using the $\hat{\beta}_0$ we just derived:
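$$
\begin{aligned}
\frac{\partial \text{SSR}}{\partial \hat{\beta}_1} &= -2 \sum_{i=1}^{n} x_i (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i) = 0 \\
&\Rightarrow -\sum_{i=1}^{n} x_i y_i + \hat{\beta}_0 \sum_{i=1}^{n} x_i + \hat{\beta}_1 \sum_{i=1}^{n} x_i^2 = 0 \\
\text{(Fill in } \hat{\beta}_0 \text{)} \quad &\Rightarrow -\sum_{i=1}^{n} x_i y_i + (\bar{y} - \hat{\beta}_1 \bar{x}) \sum_{i=1}^{n} x_i + \hat{\beta}_1 \sum_{i=1}^{n} x_i^2 = 0 \\
&\Rightarrow (\bar{y} - \hat{\beta}_1 \bar{x}) \sum_{i=1}^{n} x_i + \hat{\beta}_1 \sum_{i=1}^{n} x_i^2 = \sum_{i=1}^{n} x_i y_i \\
&\Rightarrow \bar{y} \sum_{i=1}^{n} x_i - \hat{\beta}_1 \bar{x} \sum_{i=1}^{n} x_i + \hat{\beta}_1 \sum_{i=1}^{n} x_i^2 = \sum_{i=1}^{n} x_i y_i \\
&\Rightarrow n\bar{y}\bar{x} - \hat{\beta}_1 n \bar{x}^2 + \hat{\beta}_1 \sum_{i=1}^{n} x_i^2 = \sum_{i=1}^{n} x_i y_i \\
&\Rightarrow \hat{\beta}_1 \sum_{i=1}^{n} x_i^2 - \hat{\beta}_1 n \bar{x}^2 = \sum_{i=1}^{n} x_i y_i - n\bar{y}\bar{x} \\
&\Rightarrow \hat{\beta}_1 \Big(\sum_{i=1}^{n} x_i^2 - n\bar{x}^2\Big) = \sum_{i=1}^{n} x_i y_i - n\bar{y}\bar{x} \\
&\Rightarrow \hat{\beta}_1 = \frac{\sum_{i=1}^{n} x_i y_i - n\bar{y}\bar{x}}{\sum_{i=1}^{n} x_i^2 - n\bar{x}^2}
\end{aligned}
$$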
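The closed-form estimates derived so far can be evaluated directly. The sketch below computes the slope from the ratio of sums and the intercept from $\bar{y} - \hat{\beta}_1 \bar{x}$, then compares them with a degree-1 fit from `numpy.polyfit` as an independent reference. The toy data and variable names are hypothetical, and NumPy is an assumed dependency.

```python
import numpy as np

# Hypothetical toy data, made up for illustration only.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

n = len(x)
x_bar, y_bar = x.mean(), y.mean()

# Slope: (sum x_i y_i - n * y_bar * x_bar) / (sum x_i^2 - n * x_bar^2)
b1 = (np.sum(x * y) - n * y_bar * x_bar) / (np.sum(x ** 2) - n * x_bar ** 2)

# Intercept: y_bar - b1 * x_bar
b0 = y_bar - b1 * x_bar

# Reference fit; polyfit returns coefficients from highest degree to lowest.
slope_ref, intercept_ref = np.polyfit(x, y, 1)

print("closed form:", b0, b1)
print("polyfit:    ", intercept_ref, slope_ref)
```

Both approaches minimize the same sum of squared residuals, so the two printed pairs should match up to floating-point error.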
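To write $\hat{\beta}_1$ in a form that’s more recognizable, we will use the following:
$$
\begin{aligned}
\sum_{i=1}^{n} x_i y_i - n\bar{y}\bar{x} &= \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y}) = (n-1)\,\text{Cov}(x,y) \\
\sum_{i=1}^{n} x_i^2 - n\bar{x}^2 &= \sum_{i=1}^{n} (x_i - \bar{x})^2 = (n-1)\,s_x^2
\end{aligned}
$$
where $\text{Cov}(x,y)$ is the sample covariance of $x$ and $y$, and $s_x^2$ is the sample variance of $x$ ($s_x$ is the sample standard deviation).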
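The two identities can be checked numerically. This is a minimal sketch on hypothetical toy data, assuming NumPy; `np.cov` and `np.var(..., ddof=1)` both use the $n-1$ denominator, matching the sample covariance and sample variance used here.

```python
import numpy as np

# Hypothetical toy data, made up for illustration only.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

n = len(x)
x_bar, y_bar = x.mean(), y.mean()

# First identity: three expressions for the numerator of the slope.
num_raw = np.sum(x * y) - n * y_bar * x_bar
num_centered = np.sum((x - x_bar) * (y - y_bar))
num_cov = (n - 1) * np.cov(x, y)[0, 1]  # np.cov divides by n - 1 by default

# Second identity: three expressions for the denominator of the slope.
den_raw = np.sum(x ** 2) - n * x_bar ** 2
den_centered = np.sum((x - x_bar) ** 2)
den_var = (n - 1) * np.var(x, ddof=1)  # ddof=1 gives the sample variance

print(num_raw, num_centered, num_cov)
print(den_raw, den_centered, den_var)
```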
Thus, applying the two identities above, we have
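$$
\hat{\beta}_1 = \frac{\sum_{i=1}^{n} x_i y_i - n\bar{y}\bar{x}}{\sum_{i=1}^{n} x_i^2 - n\bar{x}^2} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2} = \frac{(n-1)\,\text{Cov}(x,y)}{(n-1)\,s_x^2} = \frac{\text{Cov}(x,y)}{s_x^2}
$$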
The correlation between $x$ and $y$ is $r = \frac{\text{Cov}(x,y)}{s_x s_y}$. Thus, $\text{Cov}(x,y) = r s_x s_y$. Plugging this into the expression for $\hat{\beta}_1$ above, we have
$$
\hat{\beta}_1 = \frac{\text{Cov}(x,y)}{s_x^2} = \frac{r s_y s_x}{s_x^2} = r\,\frac{s_y}{s_x}
$$
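To illustrate the final result, the sketch below computes the slope both as $\text{Cov}(x,y)/s_x^2$ and as $r\,s_y/s_x$ and confirms that the two agree. The toy data and variable names are hypothetical, and NumPy is an assumed dependency.

```python
import numpy as np

# Hypothetical toy data, made up for illustration only.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

# Sample statistics, all using the n - 1 denominator.
cov_xy = np.cov(x, y)[0, 1]
s_x = np.std(x, ddof=1)
s_y = np.std(y, ddof=1)
r = np.corrcoef(x, y)[0, 1]  # sample correlation between x and y

# Two equivalent expressions for the least-squares slope.
slope_from_cov = cov_xy / s_x ** 2
slope_from_r = r * s_y / s_x

print(slope_from_cov, slope_from_r)
```

The second form shows that the slope is the correlation rescaled by the ratio of the standard deviations, so it always has the same sign as $r$.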