Linear Regression Without Any Libraries

Saumya Rajen Shah

19/08/2017

I am here to help you understand and implement Linear Regression from scratch, without any libraries. I will implement it in Python, but you can implement the algorithm in any programming language of your choice by writing 4-5 simple functions.

So now, first of all, what exactly is Linear Regression?

As discussed in my blog on Machine Learning, Linear Regression is used to identify linear relationships between the input features $x^{(i)}$ and the output labels $y^{(i)}$ of the training set, and thus form a function $F(x^{(i)}, \theta)$ that helps in predicting future values.

This function is called the hypothesis and is usually denoted by $h_\theta(x^{(i)})$ (sometimes written $h(x^{(i)}, \theta)$). Note that lowercase $x^{(i)}$ denotes the i-th training example as a whole, whereas $X_{i,j}$ points to the j-th feature of the i-th training example. A bit confusing? Let's simplify it!

For example, the whole feature set of the first training example is written $x^{(1)}$. The first feature of the third example in the training set can be written as $X_{3,1}$ or $x_1^{(3)}$.
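
To make the notation concrete, here is a small sketch of how these symbols could map onto a Python list of lists. The values and the variable name `X` are made up for illustration. Note that the math counts from 1 while Python indexes from 0.

```python
# Hypothetical training set: 3 examples, 2 features each (values are made up).
X = [
    [2.0, 7.0],   # x^(1)
    [4.0, 1.0],   # x^(2)
    [6.0, 5.0],   # x^(3)
]

x_1 = X[0]        # x^(1): the whole feature set of the first example
x_3_1 = X[2][0]   # X_{3,1}, i.e. x_1^(3): first feature of the third example

print(x_1)    # [2.0, 7.0]
print(x_3_1)  # 6.0
```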

For simplicity, let's assume there is only one feature in our dataset of 10 examples, and picture those examples plotted as points on a graph.

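So, let's say the equation of the line we want to fit to the data is $h_\theta(x^{(i)}) = \theta_0 + \theta_1 x_1$.

Or, for simplicity, let's say we have one more feature $x_0$ which is always equal to 1, so that we can rewrite it as $h_\theta(x^{(i)}) = \theta_0 x_0 + \theta_1 x_1$.

For multiple features, we can write it as $h_\theta(x^{(i)}) = \theta_0 x_0 + \theta_1 x_1 + \theta_2 x_2 + \theta_3 x_3 + \dots + \theta_n x_n$ for n features.

Therefore, in short, $h_\theta(x^{(i)}) = \sum_{k=0}^{n} \theta_k x_k$, where the sum starts at 0 so that it includes the term $\theta_0 x_0$.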

Now, the code for our hypothesis will be as follows:

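def HypoThe(Theta,xi):
    # Hypothesis: sum of Theta[i]*xi[i] for one example xi
    # (xi must already include the constant feature x0 = 1).
    if len(Theta)!=len(xi):
        # Returning False here would silently count as 0 in later arithmetic.
        raise ValueError("Theta and xi must have the same length")
    total=0  # named total to avoid shadowing the built-in sum()
    for i in range(len(xi)):
        total+=(Theta[i]*xi[i])
    return total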
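
As a quick sanity check, the hypothesis can be evaluated by hand for a single example. The sketch below uses a compact stand-in named `hypothesis` (a name introduced here for illustration) together with made-up parameter values.

```python
def hypothesis(theta, xi):
    # Same computation as the hypothesis above: sum of theta_k * x_k.
    return sum(t * x for t, x in zip(theta, xi))

theta = [1.0, 2.0]   # made-up parameters: theta_0 = 1, theta_1 = 2
xi = [1.0, 3.0]      # one example with x_0 = 1 prepended and x_1 = 3

prediction = hypothesis(theta, xi)
print(prediction)    # 1*1 + 2*3 = 7.0
```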

Now that we have defined a mathematical model for prediction, the next problem we face is finding the best set of parameters Theta (θ).

To find Theta, let's break it into basic steps:

1. Assume some initial values for Theta.

2. Compute the prediction error.

3. Adjust Theta so that the prediction error decreases, and repeat.

Sounds complicated? Well, it's quite simple actually. Suppose we have selected some particular values for the parameters θ. We now define a cost function, which measures the prediction error for that θ.

$J(\theta) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2$

What this means is that we predict the output using $h_\theta(x^{(i)})$, take the difference from the actual label $y^{(i)}$, square it, and add it up over all training examples (1 … m). We then average it by dividing by 2m. Why 2m instead of m? We'll see soon: the 2 cancels out when we take the derivative of the square, and it does not change which θ minimizes the cost. Apart from that factor, this is the mean squared error, which we want to minimize.

The cost function is not needed for the gradient descent updates themselves and is mainly useful for debugging (for example, checking that the error decreases from one iteration to the next), but we'll still code it out.

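def RegCostFunc(Theta,X,Y):
    # Cost J(Theta): sum of squared prediction errors over all examples,
    # divided by 2*m, where m = len(X) is the number of examples.
    sum1=0
    for i in range(len(X)):
        sum1+=((HypoThe(Theta, X[i])-Y[i])**2)
    return sum1/(2*len(X))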
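
To see the cost function at work, here is a small worked example on made-up data where the labels follow y = 2x exactly. The helper names `hypothesis` and `cost` are compact stand-ins introduced for this sketch.

```python
def hypothesis(theta, xi):
    return sum(t * x for t, x in zip(theta, xi))

def cost(theta, X, Y):
    # J(theta) = sum of squared errors / (2 * m)
    m = len(X)
    return sum((hypothesis(theta, X[i]) - Y[i]) ** 2 for i in range(m)) / (2 * m)

X = [[1, 1], [1, 2], [1, 3]]   # x0 = 1 already prepended to every example
Y = [2, 4, 6]

cost_poor = cost([0, 1], X, Y)    # errors -1, -2, -3 give (1 + 4 + 9) / 6
cost_exact = cost([0, 2], X, Y)   # perfect fit, so the cost is zero
print(cost_poor, cost_exact)
```

A poorer guess for θ gives a larger cost, which is the quantity gradient descent tries to drive down.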

Now, what we need to do next is find the values of Theta for which J(θ) reaches its minimum. If you are familiar with calculus, this means finding the minimum of the function J(θ). To solve this optimization problem, we'll use the gradient descent algorithm.

So what exactly does it do?

Let's write down the algorithm first.

repeat until convergence: {

for all values of θ simultaneously update,

$\theta_j := \theta_j - \alpha \, \frac{\partial J(\theta)}{\partial \theta_j}$

}

Let's understand this equation first.

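$\frac{\partial J(\theta)}{\partial \theta_j}$ means: take the partial derivative of J(θ) with respect to $\theta_j$ while the rest of the θ are held constant.

Therefore

$\frac{\partial J(\theta)}{\partial \theta_j} = \frac{\partial}{\partial \theta_j} \left[ \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2 \right]$.

If you solve this using calculus, you find that the square drops out:

$\frac{\partial J(\theta)}{\partial \theta_j} = \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)}$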
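
For readers who want the intermediate steps, here is a sketch of the derivation using the chain rule. It relies on the hypothesis defined earlier, $h_\theta(x^{(i)}) = \sum_k \theta_k x_k^{(i)}$, whose partial derivative with respect to $\theta_j$ is simply $x_j^{(i)}$.

```latex
\begin{aligned}
\frac{\partial J(\theta)}{\partial \theta_j}
  &= \frac{1}{2m} \sum_{i=1}^{m} \frac{\partial}{\partial \theta_j}
     \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2 \\
  &= \frac{1}{2m} \sum_{i=1}^{m} 2 \left( h_\theta(x^{(i)}) - y^{(i)} \right)
     \frac{\partial h_\theta(x^{(i)})}{\partial \theta_j} \\
  &= \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)}
\end{aligned}
```

The 2 produced by the power rule in the second line is exactly what the 2 in 2m cancels, leaving a plain average in the last line.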

This basically means: while keeping the rest constant, move $\theta_j$ a small step against the slope of J(θ), i.e. decrease $\theta_j$ where the slope is positive and increase it where the slope is negative.

α is the learning rate, which scales the size of each step. If α is too large, the updates can overshoot the minimum and fail to converge, or even diverge. If α is too small, convergence becomes very slow.
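
The effect of the learning rate can be seen on a toy problem. The sketch below does not use the regression cost; it assumes a made-up one-parameter cost J(θ) = θ², whose slope is 2θ and whose minimum is at θ = 0, purely to illustrate the behaviour.

```python
def run_gradient_descent(alpha, steps=20, theta=1.0):
    # Toy one-parameter cost J(theta) = theta**2, whose slope is 2 * theta.
    for _ in range(steps):
        theta = theta - alpha * 2 * theta
    return theta

small = run_gradient_descent(alpha=0.1)   # each step multiplies theta by 0.8
large = run_gradient_descent(alpha=1.1)   # each step multiplies theta by -1.2

print(abs(small) < 0.02)  # True: theta has moved close to the minimum at 0
print(abs(large) > 30)    # True: theta has moved far away from the minimum
```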

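Note that you should not overwrite each $\theta_j$ as soon as it is computed. Compute the new values for all $\theta_j$ from the old θ first, and then update the whole θ simultaneously,

i.e. after one complete loop.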
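
The difference between a simultaneous and an in-place update can be shown on a tiny made-up dataset. The helper names `hypothesis` and `gradient` are introduced for this sketch only.

```python
def hypothesis(theta, xi):
    return sum(t * x for t, x in zip(theta, xi))

def gradient(theta, X, Y, j):
    # (1/m) * sum over examples of (prediction - label) * x_j
    m = len(X)
    return sum((hypothesis(theta, X[i]) - Y[i]) * X[i][j] for i in range(m)) / m

X = [[1, 1], [1, 2], [1, 3]]   # made-up data, x0 = 1 already prepended
Y = [2, 4, 6]
alpha = 0.1
theta = [0.0, 0.0]

# Simultaneous update: every gradient is computed from the old theta.
simultaneous = [theta[j] - alpha * gradient(theta, X, Y, j) for j in range(len(theta))]

# In-place update: the gradient for theta[1] already sees the new theta[0].
sequential = list(theta)
for j in range(len(sequential)):
    sequential[j] = sequential[j] - alpha * gradient(sequential, X, Y, j)

print(simultaneous)
print(sequential)
```

The two results agree in the first parameter but differ in the second, because the in-place version mixes old and new parameter values within a single step.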

Rewriting the Algorithm, we get.

repeat until convergence: {

for all values of θ simultaneously update,

$\theta_j := \theta_j - \alpha \, \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)}$

}

Let's code this now,

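def GradTerm(X,Y,Theta,i):
    # Sum over all examples of (prediction - label) * value of feature i.
    # Here i is the feature index (j in the formula) and j runs over the
    # examples. The division by m happens in GradDesc.
    sum1=0
    for j in range(len(X)):
        sum1+=((HypoThe(Theta,X[j])-Y[j])*X[j][i])
    return sum1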

Note that I have separated the gradient term above from the gradient descent update below, to keep the code easy to follow.

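def GradDesc(Theta,alpha,Xfeature,Ylabels):
    # One gradient descent step. The new values are collected in Theta_
    # so that every gradient is computed from the old Theta.
    Theta_=[]
    for i in range(0,len(Theta)):
        Theta_.append(Theta[i]-(alpha* GradTerm(Xfeature,Ylabels,Theta,i)/len(Xfeature)))
    return Theta_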
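
One common way to debug a gradient implementation is to compare it with a finite-difference estimate of the cost function's slope. The sketch below does this on made-up data. The helper names and the step size `eps` are assumptions chosen for illustration.

```python
def hypothesis(theta, xi):
    return sum(t * x for t, x in zip(theta, xi))

def cost(theta, X, Y):
    m = len(X)
    return sum((hypothesis(theta, X[i]) - Y[i]) ** 2 for i in range(m)) / (2 * m)

def gradient(theta, X, Y, j):
    # Analytic gradient: (1/m) * sum of (prediction - label) * x_j
    m = len(X)
    return sum((hypothesis(theta, X[i]) - Y[i]) * X[i][j] for i in range(m)) / m

def numerical_gradient(theta, X, Y, j, eps=1e-6):
    # Central finite difference of the cost along theta_j.
    up = list(theta)
    up[j] += eps
    down = list(theta)
    down[j] -= eps
    return (cost(up, X, Y) - cost(down, X, Y)) / (2 * eps)

X = [[1, 1], [1, 2], [1, 3]]   # made-up data, x0 = 1 already prepended
Y = [2, 4, 6]
theta = [0.5, 1.5]             # arbitrary point at which to compare

differences = [abs(gradient(theta, X, Y, j) - numerical_gradient(theta, X, Y, j))
               for j in range(len(theta))]
print(max(differences) < 1e-6)
```

If the two disagree noticeably, the gradient code (or the cost code) most likely contains a mistake.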

Now we have every piece we need. So how do we run all these functions together?

Basically, here is how we'll proceed:

1. We'll call LinearRegression and provide it with the features and labels.

2. We'll add the feature $x_0 = 1$ to every example in the data set.

3. We'll initialize Theta=[0]*len(Features[0]) (i.e. the number of parameters should equal the number of features, including $x_0$).

4. We'll select a value of alpha and a number of iterations, and run the gradient descent algorithm to find the parameters θ that minimize J(θ).

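Choose alpha small enough that the cost decreases from one iteration to the next. With large, unscaled feature values this can mean an alpha on the order of 0.000001 or less and around 100,000-1,000,000 iterations. With small or scaled feature values, a much larger alpha and far fewer iterations are usually enough.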

def LinearRegression(Xfeature,Ylabels,alpha,iterations):
    if len(Xfeature)!=len(Ylabels):
        print("Missing Data")
        return False
    else:
        # Prepend x0 = 1 to a copy of every example, so the caller's list
        # is not modified (calling the function twice on the same data
        # would otherwise add x0 twice).
        X=[[1]+list(row) for row in Xfeature]
        Theta=[0]*len(X[0])
        for i in range(iterations):
            print("\nIteration Number ",i)
            print(Theta)
            Theta=GradDesc(Theta, alpha, X, Ylabels)
        print(Theta)
        return Theta
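
Putting everything together, here is a compact, self-contained sketch of the whole procedure run on made-up data. The names `fit` and `gradient_step` are introduced here for illustration, the per-iteration printing is left out, and alpha = 0.1 with 2000 iterations is an assumption that happens to suit this tiny, small-valued dataset.

```python
def hypothesis(theta, xi):
    return sum(t * x for t, x in zip(theta, xi))

def gradient_step(theta, alpha, X, Y):
    # One simultaneous gradient descent update of all parameters.
    m = len(X)
    return [
        theta[j] - alpha * sum((hypothesis(theta, X[i]) - Y[i]) * X[i][j] for i in range(m)) / m
        for j in range(len(theta))
    ]

def fit(features, labels, alpha, iterations):
    X = [[1] + list(row) for row in features]   # prepend x0 = 1
    theta = [0.0] * len(X[0])
    for _ in range(iterations):
        theta = gradient_step(theta, alpha, X, labels)
    return theta

# Made-up data generated from y = 1 + 2 * x, so theta should approach [1, 2].
features = [[0], [1], [2], [3]]
labels = [1, 3, 5, 7]

theta = fit(features, labels, alpha=0.1, iterations=2000)
print([round(t, 4) for t in theta])   # [1.0, 2.0]
```

Recovering the parameters that generated the data is a useful first test before trying the code on a real dataset.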

Now, I know you can easily implement this using the scikit-learn library, but then it's just a black-box module for machine learning, and you won't see what is happening inside it. If you have studied the algorithm in detail once, you'll have a better understanding when using scikit-learn and can tweak the values accordingly for better results.
