I’m working on an image classification problem with VGG16.
My dataset is balanced (150 images per class).
I have split the dataset into three sets: train, val, and test.
I'm testing the impact of data augmentation with the setup below, but whatever I do, data augmentation lowers my model's performance.
Here is my pipeline:
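# Imports (assumed, not shown in the original post; adjust to your TensorFlow/Keras version)
import numpy as np
import tensorflow as tf
from tensorflow.keras.applications import VGG16
from tensorflow.keras.callbacks import EarlyStopping, ModelCheckpoint
from tensorflow.keras.layers import (Dense, Dropout, GlobalAveragePooling2D,
                                     RandomRotation, RandomZoom, Rescaling, Resizing)
from tensorflow.keras.models import Sequential

tf_train = tf.data.Dataset.from_tensor_slices((np.array(images_np_train), y_train)).batch(10)
tf_test = tf.data.Dataset.from_tensor_slices((np.array(images_np_test), y_test)).batch(10)
tf_val = tf.data.Dataset.from_tensor_slices((np.array(images_np_val), y_val)).batch(10)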
def create_model_fct2():
    IMG_SIZE = 224
    resize_and_rescale = Sequential([
        Resizing(IMG_SIZE, IMG_SIZE, input_shape=(224, 224, 3)),
    ])
    # Data augmentation
    data_augmentation = Sequential([
        # RandomFlip("horizontal_and_vertical", input_shape=(224, 224, 3)),
        RandomRotation(0.2, input_shape=(224, 224, 3)),
        RandomZoom(0.1),
        Rescaling(1./255)
    ])
    model_base = VGG16(include_top=False, weights="imagenet", input_shape=(224, 224, 3))
    for layer in model_base.layers:
        layer.trainable = False
    # Define the new model
    model = Sequential([
        resize_and_rescale,
        data_augmentation,
        model_base,
        GlobalAveragePooling2D(),
        Dense(256, activation='relu'),
        Dropout(0.5),
        Dense(7, activation='softmax')
    ])
    # Compile the model
    model.build()
    model.compile(loss="categorical_crossentropy", optimizer="adam", metrics=["accuracy"])
    model.summary()  # summary() prints the table itself and returns None
    return model
model4 = create_model_fct2()
model4_save_path = "./model4_best_weights.keras"
checkpoint = ModelCheckpoint(model4_save_path, monitor="val_accuracy", verbose=1, save_best_only=True, mode="max")
es = EarlyStopping(monitor="val_accuracy", mode="max", verbose=1, patience=15)
callbacks_list = [checkpoint, es]
# No batch_size argument here: the datasets are already batched
history4 = model4.fit(tf_train,
                      validation_data=tf_val,
                      epochs=100, callbacks=callbacks_list, verbose=1)
loss, accuracy = model4.evaluate(tf_test, verbose=False)
print("Test Accuracy : {:.4f}".format(accuracy))
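A related point that may be worth ruling out is the input scaling. The pipeline above feeds VGG16 pixels scaled to [0, 1], whereas the ImageNet weights were trained on inputs prepared by `tf.keras.applications.vgg16.preprocess_input` (RGB to BGR, per-channel mean subtraction, no scaling). The sketch below only illustrates how different the two ranges are. Whether this plays any role in the accuracy drop is an assumption that would need testing, and the single white pixel is a hypothetical input:

```python
import numpy as np
import tensorflow as tf

# Hypothetical input: one 1x1 RGB image containing a white pixel.
pixels = np.full((1, 1, 1, 3), 255.0, dtype="float32")

# Scaling used in the model above: Rescaling(1./255) maps [0, 255] to [0, 1].
rescaled = np.asarray(tf.keras.layers.Rescaling(1.0 / 255)(pixels))

# Preprocessing that the VGG16 ImageNet weights were trained with.
# A copy is passed because preprocess_input can modify NumPy arrays in place.
vgg_input = np.asarray(
    tf.keras.applications.vgg16.preprocess_input(pixels.copy())
)

print("Rescaling(1/255):", rescaled.ravel())
print("preprocess_input:", vgg_input.ravel())
```

Comparing a run with `Rescaling(1./255)` against a run with `preprocess_input`, both with and without augmentation, would show whether the scaling matters here.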
Regardless of the number of epochs, the patience, etc., the model trained with data augmentation performs worse on my test set than the one trained without it.
I have two questions:
1 – Why does performance degrade? I can't explain it.
2 – Does evaluation skip the data augmentation layers? I would expect augmentation to apply only during training, not during validation or testing:
loss, accuracy = model4.evaluate(tf_test, verbose=False)
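One way to check question 2 directly is to call the augmentation layers on a batch with `training=False` (the mode `evaluate()` and `predict()` run in) and with `training=True` (the mode `fit()` uses on training batches). This is a minimal sketch, assuming a TensorFlow 2.x version that exposes the preprocessing layers under `tf.keras.layers`; the random `batch` is a hypothetical stand-in for real images:

```python
import numpy as np
import tensorflow as tf

# Hypothetical stand-in for one batch of images (random pixels in [0, 255]).
rng = np.random.default_rng(0)
batch = rng.uniform(0, 255, size=(2, 224, 224, 3)).astype("float32")

# Same random layers as in the model above, without the Rescaling step.
augment = tf.keras.Sequential([
    tf.keras.layers.RandomRotation(0.2),
    tf.keras.layers.RandomZoom(0.1),
])

# Inference mode, as used by evaluate() and predict().
out_inference = np.asarray(augment(batch, training=False))
# Training mode, as used by fit() on the training batches.
out_training = np.asarray(augment(batch, training=True))

inference_unchanged = np.allclose(out_inference, batch)
training_changed = not np.allclose(out_training, batch)
print("unchanged at inference:", inference_unchanged)
print("changed during training:", training_changed)
```

Note that `Rescaling` is not a random layer, so it runs in both modes even when it sits inside the augmentation block.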
Thanks!