Derivation for slope and intercept
This document contains the mathematical details for deriving the least-squares estimates for the slope ($\beta_1$) and intercept ($\beta_0$). We obtain the estimates, $\hat{\beta}_1$ and $\hat{\beta}_0$, by finding the values that minimize the sum of squared residuals (SSR).
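$$
\text{SSR} = \sum_{i=1}^{n} [y_i - \hat{y}_i]^2 = \sum_{i=1}^{n} [y_i - (\hat{\beta}_0 + \hat{\beta}_1 x_i)]^2 = \sum_{i=1}^{n} [y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i]^2
$$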
Recall that we can find the values of $\hat{\beta}_1$ and $\hat{\beta}_0$ that minimize the SSR by taking the partial derivatives of the SSR with respect to each estimate and setting them to 0. Because the SSR is a convex quadratic function of $\hat{\beta}_0$ and $\hat{\beta}_1$, the values at which both partial derivatives equal 0 are the values that minimize the sum of squared residuals. The partial derivatives are
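$$
\begin{aligned}
\frac{\partial\,\text{SSR}}{\partial \hat{\beta}_1} &= -2 \sum_{i=1}^{n} x_i (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i) \\
\frac{\partial\,\text{SSR}}{\partial \hat{\beta}_0} &= -2 \sum_{i=1}^{n} (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i)
\end{aligned}
$$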
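As a sanity check on these two partial derivatives, the sketch below compares them with central finite-difference approximations. The data values, the evaluation point, and the function names (`ssr`, `grad_ssr`, `numeric_grad`) are hypothetical and chosen only for illustration.

```python
import numpy as np

# Hypothetical toy data, used only for illustration.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])


def ssr(b0, b1):
    """Sum of squared residuals for intercept b0 and slope b1."""
    return np.sum((y - b0 - b1 * x) ** 2)


def grad_ssr(b0, b1):
    """Analytic partial derivatives, returned as (d/db0, d/db1)."""
    resid = y - b0 - b1 * x
    return -2.0 * np.sum(resid), -2.0 * np.sum(x * resid)


def numeric_grad(b0, b1, h=1e-6):
    """Central finite-difference approximation of the same partials."""
    d_b0 = (ssr(b0 + h, b1) - ssr(b0 - h, b1)) / (2 * h)
    d_b1 = (ssr(b0, b1 + h) - ssr(b0, b1 - h)) / (2 * h)
    return d_b0, d_b1


# Evaluate both versions at an arbitrary point (b0, b1) = (0.5, 1.5).
analytic = grad_ssr(0.5, 1.5)
numeric = numeric_grad(0.5, 1.5)
print("analytic:", analytic)
print("numeric: ", numeric)
```

The two printed pairs should agree up to floating-point error. Since the SSR is quadratic in both arguments, the central difference carries no truncation error here.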
Let's begin by deriving $\hat{\beta}_0$.
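$$
\begin{aligned}
\frac{\partial\,\text{SSR}}{\partial \hat{\beta}_0} &= -2 \sum_{i=1}^{n} (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i) = 0 \\
&\Rightarrow -\sum_{i=1}^{n} (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i) = 0 \\
&\Rightarrow -\sum_{i=1}^{n} y_i + n\hat{\beta}_0 + \hat{\beta}_1 \sum_{i=1}^{n} x_i = 0 \\
&\Rightarrow n\hat{\beta}_0 = \sum_{i=1}^{n} y_i - \hat{\beta}_1 \sum_{i=1}^{n} x_i \\
&\Rightarrow \hat{\beta}_0 = \frac{1}{n} \Big( \sum_{i=1}^{n} y_i - \hat{\beta}_1 \sum_{i=1}^{n} x_i \Big) \\
&\Rightarrow \hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}
\end{aligned}
$$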
Now, we can derive $\hat{\beta}_1$ using the $\hat{\beta}_0$ we just derived.
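$$
\begin{aligned}
\frac{\partial\,\text{SSR}}{\partial \hat{\beta}_1} &= -2 \sum_{i=1}^{n} x_i (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i) = 0 \\
&\Rightarrow -\sum_{i=1}^{n} x_i y_i + \hat{\beta}_0 \sum_{i=1}^{n} x_i + \hat{\beta}_1 \sum_{i=1}^{n} x_i^2 = 0 \\
(\text{Fill in } \hat{\beta}_0) \quad &\Rightarrow -\sum_{i=1}^{n} x_i y_i + (\bar{y} - \hat{\beta}_1 \bar{x}) \sum_{i=1}^{n} x_i + \hat{\beta}_1 \sum_{i=1}^{n} x_i^2 = 0 \\
&\Rightarrow (\bar{y} - \hat{\beta}_1 \bar{x}) \sum_{i=1}^{n} x_i + \hat{\beta}_1 \sum_{i=1}^{n} x_i^2 = \sum_{i=1}^{n} x_i y_i \\
&\Rightarrow \bar{y} \sum_{i=1}^{n} x_i - \hat{\beta}_1 \bar{x} \sum_{i=1}^{n} x_i + \hat{\beta}_1 \sum_{i=1}^{n} x_i^2 = \sum_{i=1}^{n} x_i y_i \\
&\Rightarrow n\bar{y}\bar{x} - \hat{\beta}_1 n \bar{x}^2 + \hat{\beta}_1 \sum_{i=1}^{n} x_i^2 = \sum_{i=1}^{n} x_i y_i \\
&\Rightarrow \hat{\beta}_1 \sum_{i=1}^{n} x_i^2 - \hat{\beta}_1 n \bar{x}^2 = \sum_{i=1}^{n} x_i y_i - n\bar{y}\bar{x} \\
&\Rightarrow \hat{\beta}_1 \Big( \sum_{i=1}^{n} x_i^2 - n\bar{x}^2 \Big) = \sum_{i=1}^{n} x_i y_i - n\bar{y}\bar{x} \\
&\Rightarrow \hat{\beta}_1 = \frac{\sum_{i=1}^{n} x_i y_i - n\bar{y}\bar{x}}{\sum_{i=1}^{n} x_i^2 - n\bar{x}^2}
\end{aligned}
$$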
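The closed-form expressions for the slope and intercept can be checked numerically. The following is a minimal sketch on hypothetical toy data (the values of `x` and `y` are made up for illustration); it uses NumPy's `np.polyfit` with degree 1 as an independent reference fit.

```python
import numpy as np

# Hypothetical toy data, used only for illustration.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

n = len(x)
x_bar, y_bar = x.mean(), y.mean()

# Slope: (sum x_i y_i - n * y_bar * x_bar) / (sum x_i^2 - n * x_bar^2)
b1 = (np.sum(x * y) - n * y_bar * x_bar) / (np.sum(x ** 2) - n * x_bar ** 2)

# Intercept: y_bar - b1 * x_bar
b0 = y_bar - b1 * x_bar

# Reference fit; np.polyfit returns coefficients from highest degree down,
# so a degree-1 fit gives (slope, intercept).
slope_ref, intercept_ref = np.polyfit(x, y, deg=1)

print("closed form:", b0, b1)
print("np.polyfit: ", intercept_ref, slope_ref)
```

Both lines should report the same intercept and slope up to floating-point error.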
To write $\hat{\beta}_1$ in a form that's more recognizable, we will use the following:
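$$
\sum_{i=1}^{n} x_i y_i - n\bar{y}\bar{x} = \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y}) = (n-1)\,\text{Cov}(x, y)
$$

$$
\sum_{i=1}^{n} x_i^2 - n\bar{x}^2 = \sum_{i=1}^{n} (x_i - \bar{x})^2 = (n-1)\,s_x^2
$$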
where $\text{Cov}(x, y)$ is the sample covariance of $x$ and $y$, and $s_x^2$ is the sample variance of $x$ ($s_x$ is the sample standard deviation).
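Both identities are easy to confirm numerically. The sketch below assumes hypothetical toy data and relies on `np.cov` (which divides by n - 1 by default) and `np.var` with `ddof=1` for the sample covariance and sample variance.

```python
import numpy as np

# Hypothetical toy data, used only for illustration.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

n = len(x)
x_bar, y_bar = x.mean(), y.mean()

# First identity: three equivalent forms of the numerator.
raw_xy = np.sum(x * y) - n * y_bar * x_bar
centered_xy = np.sum((x - x_bar) * (y - y_bar))
cov_form = (n - 1) * np.cov(x, y)[0, 1]  # off-diagonal entry is Cov(x, y)

# Second identity: three equivalent forms of the denominator.
raw_xx = np.sum(x ** 2) - n * x_bar ** 2
centered_xx = np.sum((x - x_bar) ** 2)
var_form = (n - 1) * np.var(x, ddof=1)  # ddof=1 gives the sample variance

print(raw_xy, centered_xy, cov_form)
print(raw_xx, centered_xx, var_form)
```

Each printed row should contain three equal values, up to floating-point error.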
Thus, applying these two identities to the numerator and denominator of $\hat{\beta}_1$, we have
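$$
\hat{\beta}_1 = \frac{\sum_{i=1}^{n} x_i y_i - n\bar{y}\bar{x}}{\sum_{i=1}^{n} x_i^2 - n\bar{x}^2} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2} = \frac{(n-1)\,\text{Cov}(x, y)}{(n-1)\,s_x^2} = \frac{\text{Cov}(x, y)}{s_x^2}
$$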
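The correlation between $x$ and $y$ is $r = \frac{\text{Cov}(x, y)}{s_x s_y}$, where $s_y$ is the sample standard deviation of $y$. Thus, $\text{Cov}(x, y) = r s_x s_y$. Plugging this into the covariance form of $\hat{\beta}_1$, we have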
$$
\hat{\beta}_1 = \frac{\text{Cov}(x, y)}{s_x^2} = \frac{r\, s_y s_x}{s_x^2} = r\,\frac{s_y}{s_x}
$$
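To illustrate this last form, the sketch below computes the slope as the correlation times the ratio of standard deviations and compares it with the covariance-over-variance form. The data are hypothetical, and the variable names are chosen only for this example.

```python
import numpy as np

# Hypothetical toy data, used only for illustration.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

# Sample correlation and sample standard deviations (ddof=1).
r = np.corrcoef(x, y)[0, 1]
s_x = np.std(x, ddof=1)
s_y = np.std(y, ddof=1)

# Slope written as r * s_y / s_x.
b1_from_r = r * s_y / s_x

# Slope written as Cov(x, y) / s_x^2.
b1_from_cov = np.cov(x, y)[0, 1] / np.var(x, ddof=1)

print(b1_from_r, b1_from_cov)
```

The two values should match up to floating-point error, since the $s_x$ in the correlation cancels one factor of $s_x$ in the variance.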