Assignment Writing | STA314 Fall 2022 Homework 3

This is an **assignment-writing** job from Canada that involves answering Problems 1-4; Problems 3 and 4 require R code and R output.

**Submission:** Read the submission instructions carefully! There are 4 questions in this assignment.

You need to submit two files through Quercus for this assignment.

You need to ensure that this file has the exact name as indicated. DO NOT set or modify the working directory within this file.

**Neatness Point:** You will be deducted one point if we have a hard time reading your solutions or understanding the structure of your code.

**Late Submission:** 10% of the total possible marks will be deducted for each day late, up to a maximum of 3 days. After that, no submissions will be accepted.

**Problem 1 (3 pts)**

Consider the classification problem in which the label $Y$ belongs to $C := \{1, 2, \ldots, K\}$ and $x$ is any realization of $X \in \mathbb{R}^p$. Let $f$ be any classifier that maps any $x \in \mathbb{R}^p$ to a label in $C$.

- (**2 pts**) Prove that the best function $f^*$ (i.e. the Bayes classifier)

  $$f^* := \operatorname*{argmin}_{f:\, \mathbb{R}^p \to C} \; \mathbb{E}\big[\, 1\{Y \neq f(X)\} \mid X = x \,\big]$$

  satisfies

  $$f^*(x) = \operatorname*{argmax}_{k \in C} \; \mathbb{P}(Y = k \mid X = x). \tag{0.1}$$

- (**1 pt**) Argue that the Bayes error equals

  $$\mathbb{E}\big[\, 1\{Y \neq f^*(X)\} \mid X = x \,\big] = 1 - \max_{k \in C} \mathbb{P}(Y = k \mid X = x).$$

**Problem 2 (3 pts)**

Consider a classification problem. Assume that the response variable $Y$ can only take values in $C = \{1, 2, 3\}$. For a fixed $x_0$, assume that the conditional probability of $Y$ given $X = x_0$ follows

$$\mathbb{P}(Y = 1 \mid X = x_0) = 0.6; \quad \mathbb{P}(Y = 2 \mid X = x_0) = 0.3; \quad \mathbb{P}(Y = 3 \mid X = x_0) = 0.1.$$

Consider a naive classifier $\hat{f}$, called random guessing, which randomly picks one label from $C = \{1, 2, 3\}$ with equal probability.

- (**2 pts**) Compute the expected test error rate of $\hat{f}$ at $X = x_0$.
- (**1 pt**) Compute the Bayes error rate at $X = x_0$ and compare it with that of $\hat{f}$.
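The arithmetic behind both error rates can be checked numerically. The following is a minimal sketch, assuming that the random guess is drawn independently of $Y$; the variable names are illustrative and not part of the assignment.

```r
# Conditional class probabilities at X = x0, as given in Problem 2
p <- c(`1` = 0.6, `2` = 0.3, `3` = 0.1)

# Random guessing picks each label with probability 1/K, independently of Y
K <- length(p)
q <- rep(1 / K, K)

# P(error) = 1 - sum_k P(guess = k) * P(Y = k | X = x0)
err_random <- 1 - sum(q * p)

# The Bayes classifier always predicts the most probable label
err_bayes <- 1 - max(p)

print(c(random = err_random, bayes = err_bayes))
```

Changing `p` to other probability vectors shows that the random-guessing error depends only on the number of classes, while the Bayes error depends on the largest conditional probability.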

**Problem 3 (21 pts)**

In this problem, you will implement logistic regression by completing the provided code in penalized_logistic_regression.R and hw3_starter.R, and experiment with the completed code.

Throughout this homework, you will be working with a subset of hand-written digits, 2's and 3's, represented as 16 × 16 pixel arrays. We show the example digits in Figure 1. The pixel intensities are between 0 and 1, and were read into the vectors in a raster-scan manner. You are given one training set, train, which contains 300 examples of each class. You can load this training set as follows:

source("hw3_starter/utils.R")

data_train <- Load_data("hw3_starter/data/train.csv")

x_train <- data_train$x

y_train <- data_train$y

y_train contains the labels of these 300 images, while x_train contains their 256 pixel values. You are also given a validation set that you should use for tuning and a test set that you should use for reporting the final performance. Optionally, the code for visualizing the dataset is located in utils.R.

Figure 1: Example digits. Top and bottom show digits of 2s and 3s, respectively.

You need to implement the penalized logistic regression model by minimizing the cost

$$J(\beta, \beta_0) := -\frac{1}{n} \sum_{i=1}^{n} \Big\{ y_i \log p(x_i; \beta, \beta_0) + (1 - y_i) \log\big(1 - p(x_i; \beta, \beta_0)\big) \Big\} + \frac{\lambda}{2} \|\beta\|_2^2$$

over $(\beta, \beta_0) \in (\mathbb{R}^p, \mathbb{R})$, where

$$p(x_i; \beta, \beta_0) = \frac{e^{\beta_0 + x_i^\top \beta}}{1 + e^{\beta_0 + x_i^\top \beta}}.$$

Here $n$ is the total number of data points, $p$ is the number of features in $x_i$, $\lambda \ge 0$ is the regularization parameter, and $\beta$ and $\beta_0$ are the parameters to optimize over.
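To make the cost concrete, here is a minimal vectorized sketch of how $J(\beta, \beta_0)$ could be evaluated. It assumes the labels are coded as 0 and 1. The helper names `prob_sketch` and `loss_sketch` and the toy data are hypothetical; the starter code's own function signatures may differ.

```r
# Hypothetical helper names; the starter code's own signatures may differ.

# Logistic probability p(x_i; beta, beta0) for every row of x at once
prob_sketch <- function(x, beta, beta0) {
  z <- beta0 + as.vector(x %*% beta)
  1 / (1 + exp(-z))  # algebraically equal to exp(z) / (1 + exp(z))
}

# Penalized cost: mean cross-entropy plus (lambda / 2) * ||beta||_2^2
loss_sketch <- function(x, y, beta, beta0, lambda) {
  p <- prob_sketch(x, beta, beta0)
  -mean(y * log(p) + (1 - y) * log(1 - p)) + lambda / 2 * sum(beta^2)
}

# Toy data (made up for illustration): 4 points, 2 features, labels in {0, 1}
x <- matrix(c(0.1, 0.9, 0.4, 0.7,
              0.8, 0.2, 0.5, 0.3), ncol = 2)
y <- c(0, 1, 0, 1)

# With all parameters at zero every p_i is 0.5, so the cost equals log(2)
loss_at_zero <- loss_sketch(x, y, beta = c(0, 0), beta0 = 0, lambda = 0.1)
print(loss_at_zero)
```

Note that only $\beta$ enters the penalty term; the intercept $\beta_0$ is left unpenalized, as in the cost above.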

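- (**2 pts**) Verify that the gradients of $J(\beta, \beta_0)$ at any $(\bar\beta, \bar\beta_0)$ have the following expressions:

  $$\frac{\partial J(\beta, \beta_0)}{\partial \beta}\bigg|_{(\bar\beta, \bar\beta_0)} = \frac{1}{n} \sum_{i=1}^{n} \big( p(x_i; \bar\beta, \bar\beta_0) - y_i \big)\, x_i + \lambda \bar\beta,$$

  $$\frac{\partial J(\beta, \beta_0)}{\partial \beta_0}\bigg|_{(\bar\beta, \bar\beta_0)} = \frac{1}{n} \sum_{i=1}^{n} \big( p(x_i; \bar\beta, \bar\beta_0) - y_i \big).$$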

- (**4 pts**) Implement the functions Evaluate, Predict_logis, Comp_gradient and Comp_loss located in penalized_logistic_regression.R. While implementing the functions, remember to vectorize the operations; you should not have any for-loops in these functions. Include your code in the report.

*Important note: carefully read the provided code in penalized_logistic_regression.R. You should understand the code and its structure instead of using it as a black box!*

- (**2 pts**) Complete the missing parts in the function Penalized_Logistic_Reg located in penalized_logistic_regression.R. This function should train the penalized logistic regression model using gradient descent on the given training set. You may use the functions implemented in step 2. Include your code in the report.

*For parts 2 and 3, your completed penalized_logistic_regression.R should NOT import other R packages.*

- (**4 pts**) Complete part (a) in hw3_starter.R.

In this part, you need to fix your regularization parameter, lbd = 0, and experiment with the hyperparameters stepsize (the learning rate) and max_iter (the number of iterations).

[Hints: (1) You only need to use the training data for this part. (2) A learning rate that is too small takes longer to converge. (3) A learning rate that is too large is also problematic.]

  - In the write-up, report and briefly explain which hyperparameter settings you found worked the best.

  - For this choice of hyperparameters, generate and report a plot that shows how the training loss changes (iteration counter on the x-axis and training loss on the y-axis).

  - For this choice of hyperparameters, generate and report a plot for the training 0-1 error (iteration counter on the x-axis and training error on the y-axis).

  - Did the training 0-1 error have the same pattern as the training loss? Is your finding aligned with your expectation? State your reasoning.

- (**7 pts**) Complete part (b) in hw3_starter.R.

Using the hyperparameter setting (learning rate and number of iterations) that you identified in step 4, fit the model using $\lambda \in \{0, 0.01, 0.05, 0.1, 0.5, 1\}$.

  - (**1 pt**) Does your selected hyperparameter setting guarantee convergence for all $\lambda$'s? If not, re-identify hyperparameters for those $\lambda$'s for which convergence is not guaranteed. Report the hyperparameter setting(s) you used for each $\lambda$.

  - (**2 pts**) Generate and report one plot that shows how the training 0-1 error changes as you train with different values of $\lambda$.

  - (**2 pts**) Generate and report one plot that shows how the validation 0-1 error changes as you train with different values of $\lambda$.

  - (**2 pts**) Comment on the effects of $\lambda$ based on these two plots. Which is the best value of $\lambda$ based on your experiment?

- (**2 pts**) Complete part (c) in hw3_starter.R.

Fit the model using the best value of $\lambda$ identified in step 5 and report its test 0-1 error. Compare your test error with that of the model fitted using glmnet with the same $\lambda$.
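The training procedure described in the steps above (gradient descent on the penalized cost, tracking the training loss and the 0-1 error per iteration) can be sketched as follows. This is an illustration under stated assumptions, not the starter code: the function names (`sigmoid`, `grad_sketch`, `loss_sketch`, `gd_sketch`), the 0.5 classification threshold, the zero initialization, and the synthetic data are all hypothetical, and labels are assumed to be coded as 0 and 1. The gradient used is the one obtained by differentiating the cost $J$ directly.

```r
# Hypothetical names throughout; not the starter code's functions.
sigmoid <- function(z) 1 / (1 + exp(-z))

# Gradient of J at (beta, beta0), obtained by differentiating the cost:
#   dJ/dbeta  = t(x) %*% (p - y) / n + lambda * beta
#   dJ/dbeta0 = mean(p - y)
grad_sketch <- function(x, y, beta, beta0, lambda) {
  p <- sigmoid(beta0 + as.vector(x %*% beta))
  list(beta  = as.vector(crossprod(x, p - y)) / nrow(x) + lambda * beta,
       beta0 = mean(p - y))
}

# Penalized cost J(beta, beta0)
loss_sketch <- function(x, y, beta, beta0, lambda) {
  p <- sigmoid(beta0 + as.vector(x %*% beta))
  -mean(y * log(p) + (1 - y) * log(1 - p)) + lambda / 2 * sum(beta^2)
}

# Plain gradient descent with a fixed step size, starting from zero and
# recording the training loss and the training 0-1 error at every iteration
gd_sketch <- function(x, y, lambda, stepsize, max_iter) {
  beta <- rep(0, ncol(x))
  beta0 <- 0
  loss <- numeric(max_iter)
  err <- numeric(max_iter)
  for (t in seq_len(max_iter)) {
    g <- grad_sketch(x, y, beta, beta0, lambda)
    beta  <- beta  - stepsize * g$beta
    beta0 <- beta0 - stepsize * g$beta0
    loss[t] <- loss_sketch(x, y, beta, beta0, lambda)
    pred <- as.numeric(sigmoid(beta0 + as.vector(x %*% beta)) > 0.5)
    err[t] <- mean(pred != y)
  }
  list(beta = beta, beta0 = beta0, loss = loss, error = err)
}

# Synthetic two-class data (made up for illustration)
set.seed(1)
n <- 200
x <- matrix(rnorm(n * 2), ncol = 2)
y <- as.numeric(x[, 1] + 0.5 * x[, 2] + rnorm(n, sd = 0.5) > 0)

fit <- gd_sketch(x, y, lambda = 0.1, stepsize = 0.5, max_iter = 200)
```

The recorded curves can then be plotted with, for example, `plot(fit$loss, type = "l", xlab = "iteration", ylab = "training loss")` and likewise for `fit$error`. The loop over iterations is inherent to gradient descent; the per-iteration computations themselves stay vectorized.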

**Problem 4 (10 pts)**

In this problem, you will develop a model to predict whether a given car gets high or low gas mileage based on the Auto data set.

- (**1 pt**) Create a binary variable, **mpg01**, that contains a 1 if **mpg** contains a value above its median, and a 0 if **mpg** contains a value below its median. You can compute the median using the median() function.

Split the data into a training set (70%) and a test set (30%). (Use set.seed(0) to ensure reproducibility.)

- (**2 pts**) Perform LDA on the training data in order to classify **mpg01** using the variables **cylinders**, **displacement**, **horsepower**, **weight**, **acceleration**, and **year**. What is the test error of the model obtained?

- (**2 pts**) Perform QDA on the training data in order to classify **mpg01** using the same variables as in part 3. What is the test error of the model obtained?

- (**2 pts**) Perform logistic regression on the training data in order to classify **mpg01** using the same variables as in part 3. What is the test error of the model obtained?

- (**3 pts**) Draw the ROC curves of LDA, QDA and logistic regression on the test data. Compute their AUCs and comment on which classifier you would choose. (You may find the R package pROC useful.)
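The workflow in this problem (build **mpg01**, split 70/30, fit a classifier, report test error and AUC) can be sketched in base R as follows. Several things here are assumptions made to keep the example self-contained: the data frame `cars` is synthetic and only stands in for the Auto data set, only two predictors are used, and the AUC is computed with the rank (Mann-Whitney) formula in a hypothetical helper `auc_rank` rather than with pROC. LDA and QDA fits would follow the same train/predict pattern with a suitable package.

```r
# Synthetic stand-in for the Auto data (made up; not the real data set)
set.seed(0)
n <- 300
weight <- runif(n, 1600, 5000)
year   <- sample(70:82, n, replace = TRUE)
mpg    <- 60 - 0.008 * weight + 0.4 * (year - 70) + rnorm(n, sd = 3)
cars   <- data.frame(mpg, weight, year)

# Binary response: 1 if mpg is above its median, 0 otherwise
cars$mpg01 <- as.integer(cars$mpg > median(cars$mpg))

# 70% / 30% train-test split, seeded for reproducibility
set.seed(0)
train_idx <- sample(nrow(cars), size = round(0.7 * nrow(cars)))
train <- cars[train_idx, ]
test  <- cars[-train_idx, ]

# Logistic regression fitted on the training set only
fit <- glm(mpg01 ~ weight + year, data = train, family = binomial)
prob <- predict(fit, newdata = test, type = "response")

# Test 0-1 error with a 0.5 threshold on the predicted probability
test_error <- mean(as.integer(prob > 0.5) != test$mpg01)

# AUC via the rank (Mann-Whitney) formula: the probability that a randomly
# chosen positive case receives a higher score than a randomly chosen negative
auc_rank <- function(score, label) {
  r <- rank(score)
  n1 <- sum(label == 1)
  n0 <- sum(label == 0)
  (sum(r[label == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
}
test_auc <- auc_rank(prob, test$mpg01)

print(c(test_error = test_error, auc = test_auc))
```

An AUC close to 1 indicates that the scores rank positives above negatives almost perfectly, whereas 0.5 corresponds to random ranking, which is why AUC is a useful complement to the 0-1 error when comparing the three classifiers.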
