Load the MNIST data, pre-split into training and test sets.
library(keras)
mnist <- dataset_mnist()
X_train <- mnist$train$x
X_test <- mnist$test$x
y_train <- mnist$train$y
y_test <- mnist$test$y
# Flatten each 28x28 image into a length-784 vector and rescale pixels to [0, 1]
X_train <- array_reshape(X_train, c(nrow(X_train), 784)) / 255
X_test <- array_reshape(X_test, c(nrow(X_test), 784)) / 255
# We also assign each digit to a class
y_train <- to_categorical(y_train, num_classes = 10)
y_test <- to_categorical(y_test, num_classes = 10)
Penalization
In keras, we add ridge, LASSO, or elastic net penalization to the parameters of each layer. For instance, to add elastic net penalization to the weights of the first layer of a NN, inside the layer_dense() call we would add the argument kernel_regularizer = regularizer_l1_l2(l1 = 1e-4, l2 = 1e-4). To penalize the biases, the corresponding argument is bias_regularizer. The available options are regularizer_l1, regularizer_l2, and regularizer_l1_l2.
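For instance, a single penalized hidden layer would look like the following (the 1e-4 penalty strengths are arbitrary values chosen for illustration, not recommendations):
layer_dense(units = 256, activation = "relu", input_shape = c(784),
            kernel_regularizer = regularizer_l1_l2(l1 = 1e-4, l2 = 1e-4))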
1. Modify the following code to create a NN that uses elastic net regularization in the two hidden layers to penalize the weights. Do not penalize the output layer. (One possible solution is sketched after the code.)
nn_model <- keras_model_sequential() %>%
  layer_dense(units = 256, activation = "relu",
              input_shape = c(784)) %>%
  layer_dense(units = 128, activation = "relu") %>%
  layer_dense(units = 10, activation = "softmax") # This is the output layer
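One possible solution is sketched below, reusing the 1e-4 penalty strengths from above (these are illustrative choices, not values prescribed by the exercise):
nn_model <- keras_model_sequential() %>%
  layer_dense(units = 256, activation = "relu",
              input_shape = c(784),
              kernel_regularizer = regularizer_l1_l2(l1 = 1e-4, l2 = 1e-4)) %>%
  layer_dense(units = 128, activation = "relu",
              kernel_regularizer = regularizer_l1_l2(l1 = 1e-4, l2 = 1e-4)) %>%
  layer_dense(units = 10, activation = "softmax") # Output layer: left unpenalized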
2. Fit the model and report the test set accuracy. You may want to refer to the previous class demo to get the code for this. How does your loss/accuracy curve compare to the one below, which uses the same model architecture with no regularization?
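As a reminder, a minimal sketch of the standard compile/fit/evaluate workflow follows; the optimizer, epoch count, batch size, and validation split are typical choices rather than values required by the exercise.
nn_model %>% compile(
  loss = "categorical_crossentropy",  # matches the one-hot labels from to_categorical()
  optimizer = optimizer_rmsprop(),
  metrics = c("accuracy")
)
history <- nn_model %>% fit(
  X_train, y_train,
  epochs = 30, batch_size = 128,
  validation_split = 0.2
)
plot(history)                          # loss/accuracy curves over epochs
nn_model %>% evaluate(X_test, y_test)  # test set loss and accuracy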