
Tuesday, 4 September 2018

Math of Intelligence : Linear Regression

Linear Regression

Some JavaScript to enable auto-numbering of mathematical equations. Reference

In [33]:
%%javascript
// Enable AMS-style automatic equation numbering in MathJax
MathJax.Hub.Config({
    TeX: { equationNumbers: { autoNumber: "AMS" } }
});

// Reset the counters and re-render so numbering is applied to equations already on the page
MathJax.Hub.Queue(
  ["resetEquationNumbers", MathJax.InputJax.TeX],
  ["PreProcess", MathJax.Hub],
  ["Reprocess", MathJax.Hub]
);

Let $x$ be the input feature and $y$ be the output that we are interested in.

For linear regression, we need a hypothesis function that predicts $y$, given the input feature $x$.

Let us assume that $y$ is linearly dependent on $x$, so our hypothesis function is:

\begin{equation} h_\theta(x) = \theta_0 + \theta_1 x \end{equation}

Here the $\theta_i$'s are the parameters (or weights). To simplify the notation, we will drop the $\theta$ in the subscript of $h_\theta(x)$ and write it simply as $h(x)$.
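As a quick illustration, here is a minimal Python sketch of this hypothesis (the function and parameter names are my own, not from the original notebook):

In [ ]:
def h(x, theta0, theta1):
    """Hypothesis h(x) = theta0 + theta1 * x: a line with intercept theta0 and slope theta1."""
    return theta0 + theta1 * x

# With theta0 = 1.0 and theta1 = 2.0, the prediction at x = 3 is 1 + 2*3 = 7
print(h(3, 1.0, 2.0))  # 7.0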

Now, we need to find a way to measure the error between our predicted output $h(x)$ and the actual value $y$ for all our training examples.

One way to measure this error is the ordinary least squares method. TODO: Explore other cost functions

So, the cost function (or loss function)* $J(\theta)$ according to the ordinary least squares method will be as follows:

*There's some debate about whether they are the same or not, but for now we'll assume they are.

\begin{equation} J(\theta) = \frac{1}{2} \left(h(x) - y\right)^2 \end{equation}

On expanding $h(x)$, we get

\begin{equation} J(\theta) = \frac{1}{2} \left(\theta_0 + \theta_1 x - y\right)^2 \end{equation}
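As a sketch, the cost for a single training example $(x, y)$ can be computed like this (again with illustrative names of my own):

In [ ]:
def cost(x, y, theta0, theta1):
    """Ordinary least squares cost J(theta) for one training example (x, y)."""
    error = (theta0 + theta1 * x) - y  # h(x) - y
    return 0.5 * error ** 2

# Zero cost when the line passes exactly through the point:
print(cost(3, 7, 1.0, 2.0))  # 0.0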

Our objective is to find the values of $\theta_0$ and $\theta_1$ that minimize the loss function.

One way to do this is by using the gradient descent method. TODO: Explore other methods to find the global minimum of a function

\begin{equation} \theta_j := \theta_j - \alpha \frac{\partial}{\partial \theta_j} J(\theta) \end{equation}

In this method, we first initialize $\theta_j$ randomly and then update it according to the above rule, coming closer to the minimum with each update.

Here, $\alpha$ is the learning rate.

Hence, in order to update $\theta_j$, we need to find the partial derivative of $J(\theta)$ w.r.t. $\theta_j$. In our case, $j = 0$ and $1$.

w.r.t. $\theta_0$

\begin{equation} \frac{\partial}{\partial \theta_0} J(\theta) = 2 \cdot \frac{1}{2} \left(\theta_0 + \theta_1 x - y\right) \cdot (1) \end{equation}

\begin{equation} \frac{\partial}{\partial \theta_0} J(\theta) = \theta_0 + \theta_1 x - y \end{equation}

w.r.t. $\theta_1$

\begin{equation} \frac{\partial}{\partial \theta_1} J(\theta) = 2 \cdot \frac{1}{2} \left(\theta_0 + \theta_1 x - y\right) \cdot (x) \end{equation}

\begin{equation} \frac{\partial}{\partial \theta_1} J(\theta) = \left(\theta_0 + \theta_1 x - y\right) x \end{equation}
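One way to sanity-check these two derivatives is to compare them against numerical finite differences. Here is a minimal sketch, with an arbitrary example point and parameter values of my own choosing:

In [ ]:
def J(theta0, theta1, x, y):
    return 0.5 * ((theta0 + theta1 * x) - y) ** 2

x, y, theta0, theta1 = 2.0, 5.0, 0.5, 1.5
error = (theta0 + theta1 * x) - y  # h(x) - y = -1.5

# Analytic gradients from the derivation above
grad0 = error       # dJ/d(theta0)
grad1 = error * x   # dJ/d(theta1)

# Central finite differences for comparison
eps = 1e-6
num0 = (J(theta0 + eps, theta1, x, y) - J(theta0 - eps, theta1, x, y)) / (2 * eps)
num1 = (J(theta0, theta1 + eps, x, y) - J(theta0, theta1 - eps, x, y)) / (2 * eps)

print(grad0, num0)  # both approximately -1.5
print(grad1, num1)  # both approximately -3.0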

Combining equations (4) and (6), as well as (4) and (8), we get:

\begin{equation} \theta_0 := \theta_0 - \alpha \left(\theta_0 + \theta_1 x - y\right) \end{equation}

\begin{equation} \theta_1 := \theta_1 - \alpha \left(\theta_0 + \theta_1 x - y\right) x \end{equation}

The above equations can be used to update the weights and hence improve the hypothesis function with every training example.
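Putting both update rules together gives a simple per-example gradient descent loop. Below is a minimal sketch; the toy data, learning rate, and epoch count are illustrative assumptions of mine, not values from the post:

In [ ]:
# Toy data generated roughly along y = 1 + 2x (illustrative assumption)
data = [(0.0, 1.1), (1.0, 2.9), (2.0, 5.2), (3.0, 6.8)]

theta0, theta1 = 0.0, 0.0  # initial guess for the parameters
alpha = 0.05               # learning rate

for epoch in range(200):
    for x, y in data:
        error = (theta0 + theta1 * x) - y  # h(x) - y
        # Update both parameters simultaneously, one training example at a time
        theta0, theta1 = theta0 - alpha * error, theta1 - alpha * error * x

print(theta0, theta1)  # should end up near 1.0 and 2.0

Updating one example at a time like this is stochastic gradient descent; summing the per-example gradients over the whole dataset before each update would give batch gradient descent instead.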
