Cats vs Dogs Classification

Training VGG16 with Transfer Learning

About the Competition

This Kaggle competition consists of classifying images as either cats or dogs. In this post, we solve the problem with a Convolutional Neural Network (CNN), using the VGG16 architecture pre-trained on ImageNet and applying transfer learning.

...
Kaggle training dataset.

Coding

We start by importing the libraries for the analysis, the image pre-processing utilities, and the Keras modules for training the neural network.

                    
                        import numpy as np
                        import pandas as pd
                        import matplotlib.pyplot as plt
                        import random
                        import os

                        from keras.preprocessing.image import ImageDataGenerator, load_img
                        from keras.utils import to_categorical
                        from sklearn.model_selection import train_test_split
                        from keras.applications import VGG16
                        from keras.models import Model
                        from keras.layers import Dropout, Flatten, Dense, Conv2D, MaxPooling2D
                        from keras import backend as K
                        from keras import optimizers
                        from skimage.transform import resize
                        from sklearn.preprocessing import StandardScaler
                    
                

Prepare the data for the training process. First, we import the training dataset. Each filename already contains the label, so we split it off and build a dataframe that stores the filename and its category. We assign 0 to cats and 1 to dogs.

                    
                        filenames = os.listdir("/content/drive/My Drive/train_catsdogs")

                        categories = []

                        for f_name in filenames:
                            category = f_name.split('.')[0]
                            if category == 'dog':
                                categories.append(1)
                            else:
                                categories.append(0)

                        df = pd.DataFrame({'filename': filenames, 'category': categories})
                    
                
...
Random image from training dataset.
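
The image above can be reproduced with a short snippet; this is a minimal sketch, assuming the same Drive folder used when listing the filenames.

                    
                        # Pick a random training file and display it.
                        sample = random.choice(filenames)
                        image = load_img("/content/drive/My Drive/train_catsdogs/" + sample)
                        plt.imshow(image)
                        plt.axis('off')
                        plt.show()
                    
                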

Before setting up the training and validation datasets, we assign the values for the image size, number of epochs, and batch size.

                    
                        image_size = 224
                        input_shape = (image_size, image_size, 3)
                        epochs = 20 
                        batch_size = 40
                    
                

With this information, we prepare the categories in the dataframe by replacing the numeric labels 0 and 1 with the strings 'cat' and 'dog', so the labels are in the format the data generator expects. Next, we split the data into training and validation sets. The validation set is used to evaluate the model during training and corresponds to 10% of the dataset.

                    
                        df["category"] = df["category"].replace({0:'cat',1:'dog'})
                        train_df, validate_df = train_test_split(df, test_size=0.10) 
                        train_df = train_df.reset_index()
                        total_train = train_df.shape[0]
                    
                

For the training data, we use data augmentation. This "creates" more data from the dataset we already have by applying random rotations, shears, zooms, horizontal flips, and shifts to the training images.

                    
                        train_datagen = ImageDataGenerator(rotation_range=15, rescale=1./255, 
                                    shear_range=0.2, zoom_range=0.2, 
                                    horizontal_flip = True, width_shift_range=0.1, 
                                    height_shift_range=0.1)

                        train_generator = train_datagen.flow_from_dataframe(train_df, 
                                    "/content/drive/My Drive/train_catsdogs", x_col='filename', 
                                    y_col='category', target_size=(image_size, 
                                    image_size), class_mode='binary', batch_size=batch_size)
                    
                
...
Data augmentation visualization.
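
The figure above can be generated directly from the training generator; a minimal sketch, assuming the train_generator defined above:

                    
                        # Take one augmented batch and plot the first nine images.
                        example_batch, _ = next(train_generator)
                        plt.figure(figsize=(12, 12))
                        for i in range(9):
                            plt.subplot(3, 3, i + 1)
                            plt.imshow(example_batch[i])
                            plt.axis('off')
                        plt.tight_layout()
                        plt.show()
                    
                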

For the validation set we only rescale the images; no augmentation is applied, since validation should measure performance on unmodified data.

                    
                        validate_df = validate_df.reset_index()
                        total_validate = validate_df.shape[0]

                        validation_datagen = ImageDataGenerator(rescale=1./255)

                        validation_generator = validation_datagen.flow_from_dataframe(
                                validate_df, 
                                "/content/drive/My Drive/train_catsdogs", 
                                x_col='filename', y_col='category',
                                class_mode='binary',
                                target_size=(image_size, image_size),
                                batch_size=batch_size)
                    
                

Develop the neural network model. Now that we have our training and validation datasets, we build the model by importing the VGG16 architecture with the weights learned on ImageNet, excluding the fully connected top layers.

                    
                        model = VGG16(input_shape=input_shape, weights='imagenet', include_top=False)
                    
                

In this model, we use transfer learning. That is, we freeze all the VGG16 layers so their weights won't change during training; only the new layers added on top will be trained.

                    
                        for layer in model.layers:
                            layer.trainable = False
                    
                

The new layers on top of the VGG16 base are arranged as follows:

                    
                        x = model.output

                        x = Conv2D(1, (1,1), activation='relu')(x)
                        x = Flatten()(x)
                        x = Dense(512, activation='relu')(x)
                        x = Dropout(0.5)(x)
                        x = Dense(1, activation='sigmoid')(x)

                        model = Model(model.input, x)
                    
                
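
Since the VGG16 base is frozen, only the new layers should contribute trainable parameters; a quick way to confirm this is to print the model summary.

                    
                        # The summary reports trainable vs. non-trainable parameters;
                        # with the base frozen, only the new head should be trainable.
                        model.summary()
                    
                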

We compile the model, specifying the loss function, optimizer, and accuracy metric.

                    
                        model.compile(loss='binary_crossentropy', optimizer='sgd', metrics=['accuracy'])
                    
                
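
The string 'sgd' uses Keras' default SGD settings. If a specific learning rate or momentum is preferred, the optimizers module imported earlier can be passed instead; a minimal sketch, where the values are only illustrative:

                    
                        # Equivalent compile call with an explicit SGD optimizer;
                        # the learning rate and momentum here are only illustrative.
                        model.compile(loss='binary_crossentropy',
                                optimizer=optimizers.SGD(lr=0.01, momentum=0.9),
                                metrics=['accuracy'])
                    
                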

We train the model using the training and validation datasets.

                    
                        history = model.fit_generator(train_generator, epochs=epochs, 
                                validation_data=validation_generator, 
                                validation_steps=total_validate//batch_size, 
                                steps_per_epoch=total_train//batch_size)
                    
                
...
Display training.
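
The curves above can be reproduced from the history object returned by fit_generator; a minimal sketch (in older Keras versions the history keys are 'acc' and 'val_acc' instead of 'accuracy' and 'val_accuracy'):

                    
                        # Plot accuracy and loss per epoch for training and validation.
                        fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 4))
                        ax1.plot(history.history['accuracy'], label='train accuracy')
                        ax1.plot(history.history['val_accuracy'], label='validation accuracy')
                        ax1.set_xlabel('epoch')
                        ax1.legend()
                        ax2.plot(history.history['loss'], label='train loss')
                        ax2.plot(history.history['val_loss'], label='validation loss')
                        ax2.set_xlabel('epoch')
                        ax2.legend()
                        plt.show()
                    
                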

The best accuracy obtained was...

                    
                        Accuracy = 0.875000  ;  loss = 0.379309 
                    
                

Test the model. A way to verify that the training was effective is to display the predictions for the test set. As with the training and validation sets, we build a dataframe with the image filenames, but here there is no class column; the class is what will be predicted. These images were not seen during training or validation.

                    
                        test_filenames = os.listdir("/content/drive/My Drive/test2_catsdogs")
                        test_df = pd.DataFrame({
                                'filename': test_filenames
                                })
                        nb_samples = test_df.shape[0]
                    
                

Here we set up the test generator, which rescales the images and resizes them to the image size expected by the model before they are read by the prediction function.

                    
                        test_gen = ImageDataGenerator(rescale=1./255)
                        test_generator = test_gen.flow_from_dataframe(
                                    test_df, 
                                    "/content/drive/My Drive/test2_catsdogs", 
                                    x_col='filename', y_col=None,
                                    class_mode=None,
                                    batch_size=batch_size,
                                    target_size=(image_size, image_size),
                                    shuffle=False)   
                    
                

We call the prediction function and assign category 0 or 1 by thresholding the predicted probability at 0.5.

                    
                        predict = model.predict_generator(test_generator, steps=np.ceil(nb_samples/batch_size))
                        threshold = 0.5

                        test_df['category'] = np.where(predict > threshold, 1,0)
                    
                
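
This 0/1 convention should match the class indices the training generator assigned to the string labels; since flow_from_dataframe orders classes alphabetically, 'cat' maps to 0 and 'dog' to 1. A quick sanity check:

                    
                        # Expected output: {'cat': 0, 'dog': 1}
                        print(train_generator.class_indices)
                    
                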

Prediction Visualization

Here we show some images with their respective predictions, using the test set provided by the competition.

                    
                        sample_test = test_df.sample(n=9).reset_index()
                        sample_test.head()
                        plt.figure(figsize=(12, 12))
                        for index, row in sample_test.iterrows():
                            filename = row['filename']
                            category = row['category']
                            img = load_img("/content/drive/My Drive/test2_catsdogs/"+filename, target_size=(256, 256))
                            plt.subplot(3, 3, index+1)
                            plt.imshow(img)
                            plt.xlabel(filename + '(' + "{}".format(category) + ')')
                        plt.tight_layout()
                    
                
...
Testing dataset.

Finally, we test the model on another dataset built from personal and internet images.

...
Testing dataset.
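
To run the model on a single external image, the same preprocessing as the generators (resize to 224x224 and rescale to [0, 1]) has to be applied by hand; a minimal sketch, where the file path is only a placeholder:

                    
                        # Predict one external image; 'my_image.jpg' is a placeholder path.
                        img = load_img("/content/drive/My Drive/my_image.jpg",
                                target_size=(image_size, image_size))
                        x = np.array(img) / 255.0              # same rescaling as the generators
                        x = np.expand_dims(x, axis=0)          # add the batch dimension
                        prob = model.predict(x)[0][0]
                        print('dog' if prob > threshold else 'cat', prob)
                    
                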