Cost Function
We can measure the accuracy of our hypothesis function by using a cost function. This takes an average (actually a slightly modified version of an average) of the squared differences between the hypothesis's predictions on the inputs x and the actual outputs y.
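$$J(\theta_0, \theta_1) = \frac{1}{2m} \sum_{i=1}^{m} \left(\hat{y}_i - y_i\right)^2 = \frac{1}{2m} \sum_{i=1}^{m} \left(h_\theta(x_i) - y_i\right)^2$$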
To break it apart, it is $\frac{1}{2}\bar{x}$, where $\bar{x}$ is the mean of the squares of $h_\theta(x_i) - y_i$, the difference between the predicted value and the actual value.
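As a minimal sketch of the computation described above, the cost can be evaluated directly from its definition. The linear form of the hypothesis, $h_\theta(x) = \theta_0 + \theta_1 x$, is an assumption here, as are the function names `hypothesis` and `cost` and the sample data; none of them come from the text.

```python
def hypothesis(theta0, theta1, x):
    # Assumed linear hypothesis: h_theta(x) = theta0 + theta1 * x
    return theta0 + theta1 * x


def cost(theta0, theta1, xs, ys):
    # J(theta0, theta1) = 1/(2m) * sum over i of (h_theta(x_i) - y_i)^2
    m = len(xs)
    if m == 0 or m != len(ys):
        raise ValueError("xs and ys must be non-empty and of equal length")
    squared_errors = [
        (hypothesis(theta0, theta1, x) - y) ** 2 for x, y in zip(xs, ys)
    ]
    # Half of the mean of the squared errors
    return sum(squared_errors) / (2 * m)


# Illustrative data lying exactly on the line y = x
xs = [1.0, 2.0, 3.0]
ys = [1.0, 2.0, 3.0]

perfect_fit = cost(0.0, 1.0, xs, ys)  # predictions match the targets exactly
poor_fit = cost(0.0, 0.5, xs, ys)     # slope too small, so the cost is larger
```

With this data, parameters that reproduce the targets exactly give a cost of zero, and the cost grows as the predictions move away from the targets.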
This function is otherwise called the "squared error function" or "mean squared error". The mean is halved (multiplied by $\frac{1}{2}$) as a convenience for computing gradient descent: differentiating the squared term produces a factor of 2, which cancels the $\frac{1}{2}$. The following image summarizes what the cost function does: