Machine Learning


Questions & Answers

3. (a) What is regularization in the context of machine learning? (b) Define and briefly discuss ridge regression. (c) What is the LASSO in regression analysis? What is the main difference between ridge regression and the LASSO? [10 marks] (d) What is "subset selection"? Discuss the two stepwise selection methods in the context of linear regression. (e) What is meant by a maximal margin classifier? Name one example of a maximal margin classifier. (f) What is the "kernel trick" in machine learning?
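
For part (c), the contrast between the two penalties can be illustrated with a minimal sketch. It assumes the simplified textbook setting of an orthonormal design (n = p, identity design matrix, no intercept), where the ridge objective is sum((y_j - b_j)^2) + lam * sum(b_j^2) and the LASSO objective is sum((y_j - b_j)^2) + lam * sum(|b_j|). The function names and the example coefficients are hypothetical and chosen only for illustration.

```python
def ridge_shrink(beta_ols, lam):
    """Ridge solution in the orthonormal case: every coefficient is
    scaled by the same factor 1 / (1 + lam), so none becomes exactly zero."""
    return [b / (1.0 + lam) for b in beta_ols]


def lasso_shrink(beta_ols, lam):
    """LASSO solution in the orthonormal case (soft-thresholding):
    coefficients move towards zero by lam / 2, and any coefficient
    with |b| <= lam / 2 is set exactly to zero."""
    half = lam / 2.0
    shrunk = []
    for b in beta_ols:
        if b > half:
            shrunk.append(b - half)
        elif b < -half:
            shrunk.append(b + half)
        else:
            shrunk.append(0.0)
    return shrunk


# Hypothetical least-squares coefficients, for illustration only.
beta_ols = [3.0, -0.4, 0.2, -2.0]
lam = 1.0

ridge = ridge_shrink(beta_ols, lam)
lasso = lasso_shrink(beta_ols, lam)

print("ridge:", ridge)
print("lasso:", lasso)
```

In this setting ridge keeps all four coefficients non-zero, while the LASSO zeroes the two small ones, which is why the LASSO is usually described as performing variable selection.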


Q1 Consider the problem where we want to predict the gender of a person from a set of input parameters, namely height, weight, and age. a) Using Cartesian (Euclidean) distance, Manhattan distance and Minkowski distance of order 3 as the similarity measurements, show the results of the gender prediction for the evaluation data listed below, using the generated training data, for values of K of 1, 3, and 7. Include the intermediate steps (i.e., distance calculation, neighbor selection, and prediction). b) Implement the KNN algorithm for this problem. Your implementation should work with different training data sets as well as different values of K, and should allow a data point to be input for the prediction. c) To evaluate the performance of the KNN algorithm (using the Euclidean distance metric), implement a leave-one-out evaluation routine for your algorithm. In leave-one-out validation, we repeatedly evaluate the algorithm by removing one data point from the training set, training the algorithm on the remaining data set, and then testing it on the point we removed to see whether the label matches. Repeating this for each of the data points gives an estimate of the percentage of erroneous predictions the algorithm makes, and thus a measure of the accuracy of the algorithm for the given data. Apply your leave-one-out validation with your KNN algorithm to the dataset for Question 1 c) for values of K of 1, 3, 5, 7, 9, and 11 and report the results. For which value of K do you get the best performance? d) Repeat the prediction and validation you performed in Question 1 c) using KNN when the age data is removed (i.e. when only the height and weight features are used as part of the distance calculation in the KNN algorithm). Report the results and compare the performance without the age attribute with the results from Question 1 c). Discuss the results. What do the results tell you about the data?
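
The KNN prediction and leave-one-out routine described in this question can be sketched as follows. This is a minimal illustration, not the assignment's reference solution: the function names are hypothetical, the six data points are made up because the original training and evaluation tables are not reproduced here, and the features are left unscaled. Manhattan, Euclidean, and order-3 Minkowski distance correspond to p = 1, 2, and 3.

```python
from collections import Counter


def minkowski(a, b, p):
    """Minkowski distance of order p (p=1 Manhattan, p=2 Euclidean)."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


def knn_predict(train, query, k, p=2):
    """Majority vote among the k nearest (features, label) pairs.
    On a tied vote, the label seen first among the sorted neighbours wins."""
    neighbours = sorted(train, key=lambda row: minkowski(row[0], query, p))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


def leave_one_out_accuracy(data, k, p=2):
    """Hold out each point in turn, predict it from the rest,
    and return the fraction of correct predictions."""
    correct = 0
    for i, (features, label) in enumerate(data):
        rest = data[:i] + data[i + 1:]
        if knn_predict(rest, features, k, p) == label:
            correct += 1
    return correct / len(data)


# Hypothetical toy data: (height in m, weight in kg, age in years) -> label.
toy = [
    ((1.60, 55.0, 30), "W"),
    ((1.62, 58.0, 25), "W"),
    ((1.58, 52.0, 41), "W"),
    ((1.80, 82.0, 33), "M"),
    ((1.78, 79.0, 27), "M"),
    ((1.83, 88.0, 45), "M"),
]

for k in (1, 3):
    print("K =", k, "leave-one-out accuracy:", leave_one_out_accuracy(toy, k))

# Part d) style variant: drop the age feature before computing distances.
toy_no_age = [(features[:2], label) for features, label in toy]
print("without age, K = 1:", leave_one_out_accuracy(toy_no_age, 1))
```

Because weight and age are on much larger numeric scales than height, they dominate the unscaled distance; standardising the features first is a common design choice, but it is not something the question asks for.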


1. Introduction In this assignment you will build on your knowledge of classification; in particular, you will solve an image classification problem using a convolutional neural network. This assignment aims to guide you through the process by following the four fundamental principles. • Data: Data import, preprocessing, and augmentation. • Model: Designing a convolutional neural network model for classifying the images of the parts. • Fitting: Training the model using stochastic gradient descent. • Validation: Checking the model's accuracy on the reserved test data set and investigating where the most improvement could be found. Additionally, looking into the uncertainty in the predictions. This is not necessarily a linear process: after you have fit and/or validated your model, you may need to go back to earlier steps and adjust your processing of the data or your model structure. This may need to be done several times to achieve a satisfactory result. This assignment is worth 35% of your course grade and is graded from 0 to 35 marks. An additional two bonus marks are available to the student whose model performs best on a previously unseen data set.


For this programming assignment you will implement the LeNet-5 CNN using either PyTorch or TensorFlow, but not Keras. You may look at other implementations on the internet, but when coding please use your own coding style and add references to your sources.
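
Before writing the network in either framework, it can help to check the layer sizes by hand. The sketch below is plain Python, not PyTorch or TensorFlow code. It assumes the classic LeNet-5 layout (32x32 single-channel input, 5x5 convolutions without padding, 2x2 subsampling with stride 2, then fully connected layers of 120, 84, and 10 units); the helper name `conv_out` is hypothetical.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution or pooling layer."""
    return (size + 2 * padding - kernel) // stride + 1


# Assumed classic LeNet-5 configuration.
input_size = 32
c1 = conv_out(input_size, kernel=5)      # C1: 6 feature maps
s2 = conv_out(c1, kernel=2, stride=2)    # S2: 2x2 subsampling
c3 = conv_out(s2, kernel=5)              # C3: 16 feature maps
s4 = conv_out(c3, kernel=2, stride=2)    # S4: 2x2 subsampling
flat = 16 * s4 * s4                      # inputs to the first dense layer

dense_units = [120, 84, 10]              # C5, F6, output

print("C1:", c1, "S2:", s2, "C3:", c3, "S4:", s4, "flattened:", flat)
print("dense layers:", dense_units)
```

If the images in your data set are 28x28 rather than 32x32, a padding of 2 on the first convolution gives the same sizes from C1 onwards.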


(a) Machine learning is being increasingly used in many applications. i. When should machine learning be used? ii. Discuss the two stages of supervised learning. iii. What is the goal of unsupervised learning? [3 marks] (b) Consider the regression problem with the following training set containing 5 examples and apply the K-nearest neighbours (KNN) algorithm with Euclidean distance. i. Predict the label of the test object (−1, 1) for k = 2. ii. What is the prediction for the test object (−1, 1) for k = 3? iii. What are the advantages of KNN? (c) Briefly discuss two approaches to directly estimating the test error associated with fitting a particular machine learning method on a set of observations. [6 marks] (d) Briefly explain generalised additive models (GAMs). What are the advantages and disadvantages of GAMs? [6 marks]


Task 0: Naïve Logistic Regression Fit a logistic regression model and report its accuracy. Task 1: Train Data Transformation Perform pre-processing to transform the original data into a new feature space by feature engineering, so that the features are linear in the new space. Confirm the four assumptions required for a linear classifier. Task 2: Linear Parametric Classification Implement a logistic regression model using Scikit-learn. Optimize the model using GridSearchCV. 1. Make a logistic regression model. Report the weights and the accuracy of the model. 2. Using GridSearchCV with 100 different α values from 10^-5 to 10, build a logistic regression model. Visualize how the model accuracy behaves. Then report the best model. If the accuracy is 100%, then the model is overfitted. In this case, the model should be regularized. 3. Using the best model, classify the test data set. Task 3: Transformation using Kernel Method Kernelize the original data into a kernel space using five different valid kernel functions. Then repeat Task 2. Task 4: Non-parametric KNN Classification 1. Classify the original data with K values from 1 to 200. Then report the accuracy with visualization. 2. Repeat step 1 with the final train data sets from Tasks 1 and 3. Report: Write a report summarizing the work. In the report, all steps must be explicitly explained with visualizations.


Problem 3. Neural Network Consider classifying a single hidden layer neural network for the


3. (a) Give two possible advantages of discarding irrelevant attributes before performing linear regression. [4 marks] (b) Write pseudocode for the algorithm of best subset selection. (c) For a given training set of size n = 100, Model A selects d = 10 attributes giving the Residual Sum of Squares RSS = 100 with the estimate σ̂ = 3 of the standard deviation of noise, whereas Model B selects d = 6 attributes giving RSS = 200 with σ̂ = 2. Answer the following questions showing all your calculations (if any): i. Which model is better according to C_p? ii. Which model is better according to BIC? iii. Which model is better according to adjusted R² when the Total Sum of Squares is TSS = 600? (d) Describe how best subset selection is modified to obtain forward stepwise selection and backward stepwise selection. [8 marks] (e) How many models need to be fitted for a regression problem with p = 6 attributes when using: i. best subset selection? ii. forward stepwise selection?
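
The arithmetic in parts (c) and (e) can be checked with a short sketch. It assumes one common textbook convention: C_p = (RSS + 2·d·σ̂²)/n, BIC = (RSS + log(n)·d·σ̂²)/n, and adjusted R² = 1 − (RSS/(n − d − 1))/(TSS/(n − 1)), with the null model counted among the fitted models. Other courses normalise these criteria differently, so the formulas should be checked against the course notes; the function names are hypothetical.

```python
import math


def mallows_cp(rss, d, sigma_hat, n):
    """C_p under the assumed convention (lower is better)."""
    return (rss + 2 * d * sigma_hat ** 2) / n


def bic(rss, d, sigma_hat, n):
    """BIC under the assumed convention (lower is better)."""
    return (rss + math.log(n) * d * sigma_hat ** 2) / n


def adjusted_r2(rss, tss, d, n):
    """Adjusted R^2 (higher is better)."""
    return 1 - (rss / (n - d - 1)) / (tss / (n - 1))


def n_best_subset(p):
    """Every subset of p attributes, including the empty model."""
    return 2 ** p


def n_forward_stepwise(p):
    """Null model plus p, p-1, ..., 1 candidate fits at successive steps."""
    return 1 + p * (p + 1) // 2


n, tss = 100, 600
model_a = {"rss": 100, "d": 10, "sigma_hat": 3}
model_b = {"rss": 200, "d": 6, "sigma_hat": 2}

for name, m in (("A", model_a), ("B", model_b)):
    print(
        name,
        "Cp =", round(mallows_cp(m["rss"], m["d"], m["sigma_hat"], n), 4),
        "BIC =", round(bic(m["rss"], m["d"], m["sigma_hat"], n), 4),
        "adj R2 =", round(adjusted_r2(m["rss"], tss, m["d"], n), 4),
    )

print("best subset fits for p = 6:", n_best_subset(6))
print("forward stepwise fits for p = 6:", n_forward_stepwise(6))
```

Under these assumed formulas, C_p and BIC both favour Model B (smaller values), while adjusted R² favours Model A (larger value).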

