
In this project you will implement gradient descent for linear regression on Spark using Scala.

The gradient descent update for linear regression is:

w_{i+1} = w_i - α_i Σ_j (w_i^T x_j - y_j) x_j

Part 1 (20 points)

First, implement a function that computes the summand (w^T x - y) x, and test this function on two examples. Use Vectors to create a dense vector w and use LabeledPoint to create a training dataset with 3 features. You can also use Breeze to compute the dot product.
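A minimal sketch of the summand, using plain Scala Arrays in place of MLlib dense vectors so it runs without a Spark installation; in the assignment itself you would build w with Vectors.dense and could use Breeze for the dot product. The helper names here are illustrative, not part of any library.

```scala
// Stand-in for the Breeze/MLlib dot product (illustrative helper).
def dot(a: Array[Double], b: Array[Double]): Double =
  a.zip(b).map { case (ai, bi) => ai * bi }.sum

// Part 1: the gradient summand (w^T x - y) x for one observation.
def gradientSummand(w: Array[Double], x: Array[Double], y: Double): Array[Double] = {
  val residual = dot(w, x) - y
  x.map(_ * residual)
}

// Example with 3 features: w = (1, 1, 1), x = (1, 2, 3), y = 5
// w^T x - y = 6 - 5 = 1, so the summand is (1, 2, 3).
val s = gradientSummand(Array(1.0, 1.0, 1.0), Array(1.0, 2.0, 3.0), 5.0)
```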

Part 2 (20 points)

Implement a function that takes in a vector w and an observation's LabeledPoint and returns a (label, prediction) tuple. Note that we can predict by computing the dot product between the weights and an observation's features. Test this function on a LabeledPoint RDD.
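One way this might look, again with a plain case class standing in for MLlib's LabeledPoint so the sketch runs without Spark; the function name is illustrative.

```scala
// Stand-in for org.apache.spark.mllib.regression.LabeledPoint.
case class LabeledPoint(label: Double, features: Array[Double])

// Part 2: return a (label, prediction) tuple, where the prediction is
// the dot product of the weights and the observation's features.
def getLabeledPrediction(w: Array[Double], lp: LabeledPoint): (Double, Double) = {
  val prediction = w.zip(lp.features).map { case (wi, xi) => wi * xi }.sum
  (lp.label, prediction)
}
```

On an actual RDD of LabeledPoint this would be applied as `trainData.map(lp => getLabeledPrediction(w, lp))`.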

Part 3 (20 points)

Implement a function to compute RMSE given an RDD of (label, prediction) tuples:

RMSE = √( (1/n) Σ_{i=1}^{n} (y_i - ŷ_i)^2 )

Test this function on an example RDD.
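A sketch of the RMSE computation, with a Seq standing in for the RDD; on Spark the same map over (label, prediction) pairs works, with the average taken via mean() on the resulting RDD of squared errors.

```scala
// Part 3: RMSE over a collection of (label, prediction) tuples.
def calcRMSE(labelsAndPreds: Seq[(Double, Double)]): Double = {
  val n = labelsAndPreds.size.toDouble
  val sumSq = labelsAndPreds.map { case (y, yHat) => (y - yHat) * (y - yHat) }.sum
  math.sqrt(sumSq / n)
}

// Example: errors are 2, 1, 0  =>  RMSE = sqrt((4 + 1 + 0) / 3)
val rmse = calcRMSE(Seq((3.0, 1.0), (1.0, 2.0), (2.0, 2.0)))
```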

Part 4 (40 points)

Implement a gradient descent function for linear regression:

The function will take trainData (an RDD of LabeledPoint) as an argument and return a tuple of weights and training errors. Reuse the code that you have written in Parts 1 and 2.
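One way the full loop might look, with plain Scala collections standing in for the RDD and the Part 1–3 helpers repeated inline so the sketch is self-contained. It applies the assignment's update w_{i+1} = w_i - α_i Σ_j (w_i^T x_j - y_j) x_j with step size α_i = α / (n √i); the function name is illustrative.

```scala
case class LabeledPoint(label: Double, features: Array[Double])  // MLlib stand-in

def dot(a: Array[Double], b: Array[Double]): Double =
  a.zip(b).map { case (ai, bi) => ai * bi }.sum

def calcRMSE(labelsAndPreds: Seq[(Double, Double)]): Double =
  math.sqrt(labelsAndPreds.map { case (y, p) => (y - p) * (y - p) }.sum / labelsAndPreds.size)

// Part 4: gradient descent; returns (weights, per-iteration training RMSE).
def linregGradientDescent(trainData: Seq[LabeledPoint],
                          numIters: Int): (Array[Double], Array[Double]) = {
  val n = trainData.size
  val d = trainData.head.features.length
  var w = Array.fill(d)(0.0)       // w initialized to all zeros
  val alpha = 1.0                  // initial step size
  val trainingErrors = Array.fill(numIters)(0.0)
  for (i <- 0 until numIters) {
    // Training error under the current weights (reuses Parts 2 and 3).
    trainingErrors(i) = calcRMSE(trainData.map(lp => (lp.label, dot(w, lp.features))))
    // Full gradient: sum of the per-observation summands (w^T x_j - y_j) x_j.
    val gradient = trainData
      .map(lp => lp.features.map(_ * (dot(w, lp.features) - lp.label)))
      .reduce((a, b) => a.zip(b).map { case (x, y) => x + y })
    val alphaI = alpha / (n * math.sqrt(i + 1))  // alpha_i = alpha / (n * sqrt(i)), 1-indexed
    w = w.zip(gradient).map { case (wj, gj) => wj - alphaI * gj }
  }
  (w, trainingErrors)
}
```

On Spark, the map/reduce over trainData would be the corresponding RDD operations, with w broadcast to the workers.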

Initialize the elements of vector w to 0 and set α = 1. In the i-th iteration, update the weights using:

w_{i+1} = w_i - α_i Σ_j (w_i^T x_j - y_j) x_j

and update the value of α using the formula:

α_i = α / (n √i)

Run the function for 5 iterations and print the results.

Bonus (20 points)

Implement the closed-form solution:

w = (X^T X)^{-1} X^T y

You can assume X is a DenseMatrix. Test the function on an example RDD.
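A plain-Scala sketch of the normal equations w = (X^T X)^{-1} X^T y: it forms X^T X and X^T y from the rows of X and solves the small d×d system by Gaussian elimination. In the assignment you would instead assemble X as a Breeze DenseMatrix and use its built-in solver; the function name here is illustrative.

```scala
// Bonus: closed-form linear regression via the normal equations.
def solveNormalEquations(xRows: Array[Array[Double]], y: Array[Double]): Array[Double] = {
  val d = xRows.head.length
  // Accumulate A = X^T X (d x d) and b = X^T y (d) in one pass over the rows.
  val a = Array.fill(d, d)(0.0)
  val b = Array.fill(d)(0.0)
  for ((row, i) <- xRows.zipWithIndex; j <- 0 until d) {
    b(j) += row(j) * y(i)
    for (k <- 0 until d) a(j)(k) += row(j) * row(k)
  }
  // Gaussian elimination with partial pivoting on the augmented system [A | b].
  for (col <- 0 until d) {
    val pivot = (col until d).maxBy(r => math.abs(a(r)(col)))
    val tmpRow = a(col); a(col) = a(pivot); a(pivot) = tmpRow
    val tmpB = b(col); b(col) = b(pivot); b(pivot) = tmpB
    for (r <- col + 1 until d) {
      val f = a(r)(col) / a(col)(col)
      for (k <- col until d) a(r)(k) -= f * a(col)(k)
      b(r) -= f * b(col)
    }
  }
  // Back substitution.
  val w = Array.fill(d)(0.0)
  for (r <- d - 1 to 0 by -1)
    w(r) = (b(r) - (r + 1 until d).map(k => a(r)(k) * w(k)).sum) / a(r)(r)
  w
}
```

For data generated exactly by y = 2·x1 + 3·x2, the solver recovers the weights (2, 3).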