MNIST Classification on Google Colab
|View in Github||Download Notebook|
In this lesson we discuss in how to create a simple IPython Notebook to solve an image classification problem. MNIST contains a set of pictures
from __future__ import absolute_import from __future__ import division from __future__ import print_function import numpy as np from keras.models import Sequential from keras.layers import Dense, Activation, Dropout from keras.utils import to_categorical, plot_model from keras.datasets import mnist
Warm Up Exercise
First we load the data from the inbuilt mnist dataset from Keras Here we have to split the data set into training and testing data. The training data or testing data has two components. Training features and training labels. For instance every sample in the dataset has a corresponding label. In Mnist the training sample contains image data represented in terms of an array. The training labels are from 0-9.
Here we say x_train for training data features and y_train as the training labels. Same goes for testing data.
(x_train, y_train), (x_test, y_test) = mnist.load_data()
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz 11493376/11490434 [==============================] - 0s 0us/step
Identify Number of Classes
As this is a number classification problem. We need to know how many classes are there. So we’ll count the number of unique labels.
num_labels = len(np.unique(y_train))
Convert Labels To One-Hot Vector
Read more on one-hot vector.
y_train = to_categorical(y_train) y_test = to_categorical(y_test)
The training model is designed by considering the data as a vector. This is a model dependent modification. Here we assume the image is a squared shape image.
image_size = x_train.shape input_size = image_size * image_size
Resize and Normalize
The next step is to continue the reshaping to a fit into a vector and normalize the data. Image values are from 0 - 255, so an easy way to normalize is to divide by the maximum value.
x_train = np.reshape(x_train, [-1, input_size]) x_train = x_train.astype('float32') / 255 x_test = np.reshape(x_test, [-1, input_size]) x_test = x_test.astype('float32') / 255
Create a Keras Model
Keras is a neural network library. The summary function provides tabular summary on the model you created. And the plot_model function provides a grpah on the network you created.
# Create Model # network parameters batch_size = 4 hidden_units = 64 model = Sequential() model.add(Dense(hidden_units, input_dim=input_size)) model.add(Dense(num_labels)) model.add(Activation('softmax')) model.summary() plot_model(model, to_file='mlp-mnist.png', show_shapes=True)
Model: "sequential_1" _________________________________________________________________ Layer (type) Output Shape Param # ================================================================= dense_5 (Dense) (None, 512) 401920 _________________________________________________________________ dense_6 (Dense) (None, 10) 5130 _________________________________________________________________ activation_5 (Activation) (None, 10) 0 ================================================================= Total params: 407,050 Trainable params: 407,050 Non-trainable params: 0 _________________________________________________________________
Compile and Train
A keras model need to be compiled before it can be used to train the model. In the compile function, you can provide the optimization that you want to add, metrics you expect and the type of loss function you need to use.
Here we use adam optimizer, a famous optimizer used in neural networks.
The loss funtion we have used is the categorical_crossentropy.
Once the model is compiled, then the fit function is called upon passing the number of epochs, traing data and batch size.
The batch size determines the number of elements used per minibatch in optimizing the function.
Note: Change the number of epochs, batch size and see what happens.
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy']) model.fit(x_train, y_train, epochs=1, batch_size=batch_size)
469/469 [==============================] - 3s 7ms/step - loss: 0.3647 - accuracy: 0.8947 <tensorflow.python.keras.callbacks.History at 0x7fe88faf4c50>
Now we can test the trained model. Use the evaluate function by passing test data and batch size and the accuracy and the loss value can be retrieved.
MNIST_V1.0|Exercise: Try to observe the network behavior by changing the number of epochs, batch size and record the best accuracy that you can gain. Here you can record what happens when you change these values. Describe your observations in 50-100 words.
loss, acc = model.evaluate(x_test, y_test, batch_size=batch_size) print("\nTest accuracy: %.1f%%" % (100.0 * acc))
79/79 [==============================] - 0s 4ms/step - loss: 0.2984 - accuracy: 0.9148 Test accuracy: 91.5%
This programme can be defined as a hello world programme in deep learning. Objective of this exercise is not to teach you the depths of deep learning. But to teach you basic concepts that may need to design a simple network to solve a problem. Before running the whole code, read all the instructions before a code section.
Solve Exercise MNIST_V1.0.