# Classifying pictures of cats and dogs with Keras

The purpose of this notebook is to provide a quick demo of the ease of use of Keras.
In a few dozen lines, we implement a classifier for pictures of cats and dogs.

After several training epochs on 2x1024 pictures of cats and dogs, we obtain an accuracy of ~80% on the 2x416-picture validation set despite the small dataset size.

This notebook goes along with the blog post "Building powerful image classification models using very little data" written by François Chollet on blog.keras.io.

#### Overview

- Model definition, training and evaluation
- Data augmentation
- Using a pre-trained network with bottleneck features

## Data

### Folder structure

data/
    train/
        dogs/ ### 1024 pictures
            dog001.jpg
            dog002.jpg
            ...
        cats/ ### 1024 pictures
            cat001.jpg
            cat002.jpg
            ...
    validation/
        dogs/ ### 416 pictures
            dog001.jpg
            dog002.jpg
            ...
        cats/ ### 416 pictures
            cat001.jpg
            cat002.jpg
            ...


Note: for this example we only consider 2x1024 training images and 2x416 validation images among the 2x12500 available.

Note 2: this notebook requires the Pillow library to process images. You can install it with pip3 install Pillow

In [1]:
from keras.preprocessing.image import ImageDataGenerator

Using Theano backend.

In [2]:
# dimensions of our images.
img_width, img_height = 150, 150

train_data_dir = 'data/train'
validation_data_dir = 'data/validation'

In [3]:
# used to rescale the pixel values from [0, 255] to [0, 1] interval
datagen = ImageDataGenerator(rescale=1./255)

# automagically retrieve images and their classes for train and validation sets
train_generator = datagen.flow_from_directory(
        train_data_dir,
        target_size=(img_width, img_height),
        batch_size=32,
        class_mode='binary')

validation_generator = datagen.flow_from_directory(
        validation_data_dir,
        target_size=(img_width, img_height),
        batch_size=32,
        class_mode='binary')

Found 2048 images belonging to 2 classes.
Found 832 images belonging to 2 classes.
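
To sanity-check the generators, we can pull a single batch and inspect it; a quick sketch (with the Theano backend, images come out channel-first):

x_batch, y_batch = next(train_generator)
print(x_batch.shape)  # (32, 3, 150, 150): 32 RGB images in channel-first order
print(y_batch[:5])    # binary labels; classes are mapped alphabetically (cats -> 0, dogs -> 1)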


## Model

### Imports

In [4]:
from keras.models import Sequential
from keras.layers import Convolution2D, MaxPooling2D
from keras.layers import Activation, Dropout, Flatten, Dense


### Model architecture definition

In [5]:
model = Sequential()
model.add(Convolution2D(32, 3, 3, input_shape=(3, img_width, img_height)))
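
The convolution layer above is only the start of the network. Following the architecture described in the blog post, the model continues with two more convolution/pooling blocks and a small fully connected classifier; a sketch of the remaining layers, continuing the cell above:

model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))

# second convolution/pooling block
model.add(Convolution2D(32, 3, 3))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))

# third convolution/pooling block
model.add(Convolution2D(64, 3, 3))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))

# fully connected classifier with a single sigmoid output
model.add(Flatten())
model.add(Dense(64))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(1))
model.add(Activation('sigmoid'))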


In [6]:
model.compile(loss='binary_crossentropy',
              optimizer='rmsprop',
              metrics=['accuracy'])


### Training

In [7]:
nb_epoch = 1
nb_train_samples = 2048
nb_validation_samples = 832

In [8]:
model.fit_generator(
        train_generator,
        samples_per_epoch=nb_train_samples,
        nb_epoch=nb_epoch,
        validation_data=validation_generator,
        nb_val_samples=nb_validation_samples)

Epoch 1/1
2048/2048 [==============================] - 147s - loss: 0.7462 - acc: 0.5225 - val_loss: 0.6793 - val_acc: 0.5829

Out[8]:
<keras.callbacks.History at 0x10bdc30f0>
In [9]:
model.save_weights('models/1000-samples--1-epochs.h5')


In [10]:
model.load_weights('models/without-data-augmentation/1000-samples--32-epochs.h5')


### Evaluating on validation set

Computing loss and accuracy:

In [11]:
model.evaluate_generator(validation_generator, nb_validation_samples)

Out[11]:
[1.541881765310581, 0.72115384615384615]

Evolution of accuracy on training (blue) and validation (green) sets for 1 to 32 epochs:

After ~10 epochs the neural network reaches ~70% accuracy. We can witness overfitting: no progress is made on the validation set over the following epochs.

## Data augmentation

By applying random transformations to our training set, we artificially enlarge our dataset with new, unseen images.
This will hopefully reduce overfitting and allow our network to generalize better.

Example of data augmentation applied to a picture:

In [12]:
train_datagen_augmented = ImageDataGenerator(
        rescale=1./255,        # normalize pixel values to [0, 1]
        shear_range=0.2,       # randomly applies shearing transformation
        zoom_range=0.2,        # randomly zooms into the images
        horizontal_flip=True)  # randomly flips half of the images

# same code as before
train_generator_augmented = train_datagen_augmented.flow_from_directory(
        train_data_dir,
        target_size=(img_width, img_height),
        batch_size=32,
        class_mode='binary')

Found 2048 images belonging to 2 classes.
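
To visualize the effect, we can save a few augmented variants of a single training picture to disk; a minimal sketch (the sample image path and the preview/ output folder are assumptions):

import os
from keras.preprocessing.image import load_img, img_to_array

img = load_img('data/train/cats/cat001.jpg')  # hypothetical sample picture
x = img_to_array(img)                         # channel-first array under the Theano backend
x = x.reshape((1,) + x.shape)

if not os.path.exists('preview'):
    os.makedirs('preview')

# the flow() generator loops forever, so stop after 8 augmented images
i = 0
for batch in train_datagen_augmented.flow(x, batch_size=1, save_to_dir='preview',
                                          save_prefix='cat', save_format='jpeg'):
    i += 1
    if i >= 8:
        break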

In [13]:
model.fit_generator(
        train_generator_augmented,
        samples_per_epoch=nb_train_samples,
        nb_epoch=nb_epoch,
        validation_data=validation_generator,
        nb_val_samples=nb_validation_samples)

Epoch 1/1
2048/2048 [==============================] - 147s - loss: 0.6537 - acc: 0.7046 - val_loss: 0.6855 - val_acc: 0.6863

Out[13]:
<keras.callbacks.History at 0x10cfd1668>
In [14]:
model.save_weights('models/1000-samples-augmented--1-epochs.h5')

In [15]:
model.load_weights('models/with-data-augmentation/1000-samples-augmented--100-epochs.h5')


### Evaluating on validation set

Computing loss and accuracy:

In [16]:
model.evaluate_generator(validation_generator, nb_validation_samples)

Out[16]:
[0.71632749358048808, 0.82331730769230771]

Evolution of accuracy on training (blue) and validation (green) sets for 1 to 100 epochs:

Thanks to data augmentation, the accuracy on the validation set improved to ~82%.

## Using a pre-trained model

The process of training a convolutional neural network can be very time-consuming and requires a lot of data.

We can go beyond the previous models in terms of performance and efficiency by using a general-purpose, pre-trained image classifier.
We consider VGG16, a model trained on the ImageNet dataset, which contains millions of images classified in 1000 categories.

On top of it, we add a small multi-layer perceptron and train it on our dataset.

### VGG16 + small MLP

#### VGG16 model architecture definition

In [17]:
import os
import numpy as np
from keras.preprocessing.image import ImageDataGenerator
from keras.models import Sequential
from keras.layers import Convolution2D, MaxPooling2D, ZeroPadding2D
from keras.layers import Activation, Dropout, Flatten, Dense

In [18]:
model_vgg = Sequential()
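
Only the convolutional part of VGG16 is declared; its fully connected layers are dropped, since we replace them with our own classifier. A sketch of the standard 13-convolution VGG16 stack, in the order expected by the weights file:

# block 1
model_vgg.add(ZeroPadding2D((1, 1), input_shape=(3, img_width, img_height)))
model_vgg.add(Convolution2D(64, 3, 3, activation='relu'))
model_vgg.add(ZeroPadding2D((1, 1)))
model_vgg.add(Convolution2D(64, 3, 3, activation='relu'))
model_vgg.add(MaxPooling2D((2, 2), strides=(2, 2)))

# block 2
model_vgg.add(ZeroPadding2D((1, 1)))
model_vgg.add(Convolution2D(128, 3, 3, activation='relu'))
model_vgg.add(ZeroPadding2D((1, 1)))
model_vgg.add(Convolution2D(128, 3, 3, activation='relu'))
model_vgg.add(MaxPooling2D((2, 2), strides=(2, 2)))

# block 3
model_vgg.add(ZeroPadding2D((1, 1)))
model_vgg.add(Convolution2D(256, 3, 3, activation='relu'))
model_vgg.add(ZeroPadding2D((1, 1)))
model_vgg.add(Convolution2D(256, 3, 3, activation='relu'))
model_vgg.add(ZeroPadding2D((1, 1)))
model_vgg.add(Convolution2D(256, 3, 3, activation='relu'))
model_vgg.add(MaxPooling2D((2, 2), strides=(2, 2)))

# block 4
model_vgg.add(ZeroPadding2D((1, 1)))
model_vgg.add(Convolution2D(512, 3, 3, activation='relu'))
model_vgg.add(ZeroPadding2D((1, 1)))
model_vgg.add(Convolution2D(512, 3, 3, activation='relu'))
model_vgg.add(ZeroPadding2D((1, 1)))
model_vgg.add(Convolution2D(512, 3, 3, activation='relu'))
model_vgg.add(MaxPooling2D((2, 2), strides=(2, 2)))

# block 5
model_vgg.add(ZeroPadding2D((1, 1)))
model_vgg.add(Convolution2D(512, 3, 3, activation='relu'))
model_vgg.add(ZeroPadding2D((1, 1)))
model_vgg.add(Convolution2D(512, 3, 3, activation='relu'))
model_vgg.add(ZeroPadding2D((1, 1)))
model_vgg.add(Convolution2D(512, 3, 3, activation='relu'))
model_vgg.add(MaxPooling2D((2, 2), strides=(2, 2)))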



This part is a bit complicated because the structure of our model is not exactly the same as the one used when the weights were trained.
Otherwise, we could simply use the model.load_weights() method.

Note: the VGG16 weights file (~500MB) is not included in this repository. You can download it from here:

In [19]:
import h5py
f = h5py.File('models/VGG16/vgg16_weights.h5', 'r')

for k in range(f.attrs['nb_layers']):
    if k >= len(model_vgg.layers):
        # we don't look at the last (fully connected) layers in the savefile
        break
    g = f['layer_{}'.format(k)]
    weights = [g['param_{}'.format(p)] for p in range(g.attrs['nb_params'])]
    model_vgg.layers[k].set_weights(weights)
f.close()


### Using the VGG16 model to process samples

In [20]:
train_generator_bottleneck = datagen.flow_from_directory(
        train_data_dir,
        target_size=(img_width, img_height),
        batch_size=32,
        class_mode=None,
        shuffle=False)

validation_generator_bottleneck = datagen.flow_from_directory(
        validation_data_dir,
        target_size=(img_width, img_height),
        batch_size=32,
        class_mode=None,
        shuffle=False)

Found 2048 images belonging to 2 classes.
Found 832 images belonging to 2 classes.


This is a long process, so we save the output of the VGG16 model once and for all.

In [ ]:
bottleneck_features_train = model_vgg.predict_generator(train_generator_bottleneck, nb_train_samples)
np.save(open('models/bottleneck_features_train.npy', 'wb'), bottleneck_features_train)

In [ ]:
bottleneck_features_validation = model_vgg.predict_generator(validation_generator_bottleneck, nb_validation_samples)
np.save(open('models/bottleneck_features_validation.npy', 'wb'), bottleneck_features_validation)


In [21]:
# with shuffle=False, the generators yielded all the cats (class 0) first, then all the dogs (class 1)
train_data = np.load(open('models/bottleneck_features_train.npy', 'rb'))
train_labels = np.array([0] * (nb_train_samples // 2) + [1] * (nb_train_samples // 2))

validation_data = np.load(open('models/bottleneck_features_validation.npy', 'rb'))
validation_labels = np.array([0] * (nb_validation_samples // 2) + [1] * (nb_validation_samples // 2))


We then define and train the custom fully connected neural network:

In [22]:
model_top = Sequential()
model_top.add(Flatten(input_shape=train_data.shape[1:]))  # bottleneck features as input
model_top.add(Dense(256, activation='relu'))
model_top.add(Dropout(0.5))
model_top.add(Dense(1, activation='sigmoid'))
model_top.compile(optimizer='rmsprop', loss='binary_crossentropy', metrics=['accuracy'])

In [23]:
nb_epoch = 10

In [24]:
model_top.fit(train_data, train_labels,
              nb_epoch=nb_epoch, batch_size=32,
              validation_data=(validation_data, validation_labels))

Train on 2048 samples, validate on 832 samples
Epoch 1/10
2048/2048 [==============================] - 2s - loss: 1.2591 - acc: 0.7407 - val_loss: 0.7070 - val_acc: 0.7200
Epoch 2/10
2048/2048 [==============================] - 2s - loss: 0.4085 - acc: 0.8281 - val_loss: 0.2701 - val_acc: 0.8870
Epoch 3/10
2048/2048 [==============================] - 2s - loss: 0.3320 - acc: 0.8540 - val_loss: 0.2590 - val_acc: 0.8918
Epoch 4/10
2048/2048 [==============================] - 2s - loss: 0.2903 - acc: 0.8794 - val_loss: 0.2676 - val_acc: 0.8930
Epoch 5/10
2048/2048 [==============================] - 2s - loss: 0.2557 - acc: 0.9058 - val_loss: 0.2950 - val_acc: 0.8822
Epoch 6/10
2048/2048 [==============================] - 3s - loss: 0.2198 - acc: 0.9136 - val_loss: 0.2382 - val_acc: 0.8990
Epoch 7/10
2048/2048 [==============================] - 4s - loss: 0.2074 - acc: 0.9219 - val_loss: 0.3355 - val_acc: 0.8750
Epoch 8/10
2048/2048 [==============================] - 4s - loss: 0.1661 - acc: 0.9297 - val_loss: 0.3402 - val_acc: 0.8858
Epoch 9/10
2048/2048 [==============================] - 4s - loss: 0.1388 - acc: 0.9487 - val_loss: 0.4845 - val_acc: 0.8594
Epoch 10/10
2048/2048 [==============================] - 4s - loss: 0.1412 - acc: 0.9492 - val_loss: 0.2980 - val_acc: 0.9050

Out[24]:
<keras.callbacks.History at 0x10c785a58>

The training process of this small neural network is very fast: ~2s per epoch.

In [25]:
model_top.save_weights('models/1000-samples-bottleneck--10-epochs.h5')


### Bottleneck model evaluation

In [26]:
model_top.load_weights('models/with-bottleneck/1000-samples--100-epochs.h5')


Loss and accuracy:

In [27]:
model_top.evaluate(validation_data, validation_labels)

800/832 [===========================>..] - ETA: 0s
Out[27]:
[0.92134428626069653, 0.90384615384615385]

Evolution of accuracy on training (blue) and validation (green) sets for 1 to 32 epochs:

We reached 90% accuracy on the validation set after ~1 minute of training (~20 epochs), using only 8% of the samples originally available in the Kaggle competition!
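
As a final check, the two models can be chained to classify a single new picture; a minimal sketch (the sample path is an assumption):

from keras.preprocessing.image import load_img, img_to_array

# preprocess one picture the same way as the validation generator
img = load_img('data/validation/cats/cat001.jpg', target_size=(img_width, img_height))
x = img_to_array(img) / 255.
x = x.reshape((1,) + x.shape)

# VGG16 bottleneck features, then the small classifier on top
features = model_vgg.predict(x)
probability = model_top.predict(features)[0][0]
print('dog' if probability > 0.5 else 'cat')  # sigmoid output: cats -> 0, dogs -> 1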