How can I integrate the AUC-ROC metric into my multiclass classification problem?

I have a dataset of pixels where each pixel is classified as one of 5 classes, and I have 3 models that I trained to classify those pixels. I am now developing a Python script to analyze the performance of each model by computing several metrics. As you can see, I already have some metrics, but I would like to add AUC-ROC to the script because I think it is an appropriate metric when analyzing an imbalanced dataset in a multiclass classification problem.
MY PROBLEM: I don't know how to implement AUC-ROC while maintaining the structure of my DataFrame.

The structure looks like this:

(screenshot of the DataFrame: the code below relies on the columns `Classifier`, `GT_Class`, and `Pred_Class`)
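Since the screenshot may not load, here is a minimal sketch of what that structure presumably looks like, inferred from the column names used in the code below (the values themselves are invented):

```python
import pandas as pd

# Hypothetical rows mimicking the per-pixel structure: one row per pixel,
# tagged with the classifier that produced the prediction.
data = pd.DataFrame({
    'Classifier': ['KNN', 'KNN', 'RF', 'RF', 'XGBoost', 'XGBoost'],
    'GT_Class':   [0, 3, 1, 1, 4, 2],   # ground-truth label (0-4)
    'Pred_Class': [0, 2, 1, 3, 4, 2],   # label predicted by that model
})
print(data)
```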

The code:

import pandas as pd
from sklearn.metrics import balanced_accuracy_score, precision_score, f1_score
from imblearn.metrics import sensitivity_score  # sensitivity_score comes from imbalanced-learn

data = pd.read_csv(data_path)

# Create a list of classifier names
classifiers = ['KNN', 'RF', 'XGBoost']

# Define a list of class labels
class_labels = [0, 1, 2, 3, 4]  # Modify this based on your actual class labels

# Create an empty list to store class-level metrics
class_metrics = []

# Iterate over each classifier
for classifier in classifiers:
    # Filter the data DataFrame for the specific classifier
    classifier_data = data[data['Classifier'] == classifier]

    # Iterate over each class
    for class_label in class_labels:
        # Filter data for the specific class
        class_data = classifier_data[classifier_data['GT_Class'] == class_label]

        # True ground truth values for the class
        y_true = class_data['GT_Class']
        # Predicted values by the classifier for the class
        y_pred = class_data['Pred_Class']

        # Calculate classification metrics for the class
        accuracy = balanced_accuracy_score(y_true, y_pred)
        precision = precision_score(y_true, y_pred, average="weighted")
        recall_sensitivity = sensitivity_score(y_true, y_pred, average="weighted")
        f1 = f1_score(y_true, y_pred, average="weighted")
        

        # Append the metrics to the list
        class_metrics.extend([
            [classifier, class_label, 'Accuracy', accuracy],
            [classifier, class_label, 'Precision', precision],
            [classifier, class_label, 'Recall', recall_sensitivity],
            [classifier, class_label, 'F1-Score', f1]
        ])

# Create a DataFrame from the class-level metrics
class_metrics_df = pd.DataFrame(class_metrics, columns=['Classifier', 'Class', 'Metric Name', 'Metric Value'])

print(class_metrics_df)
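For reference, one way I could imagine adding AUC-ROC rows in the same `[Classifier, Class, Metric Name, Metric Value]` layout is a one-vs-rest binarization of both the ground truth and the predictions. This is only a sketch (the helper name `ovr_auc_rows` is mine), and with hard labels the ROC curve has a single operating point, so storing per-class predicted probabilities would give a more meaningful AUC:

```python
import pandas as pd
from sklearn.metrics import roc_auc_score

def ovr_auc_rows(df, classifiers, class_labels):
    """One-vs-rest AUC-ROC per classifier and class, from hard labels."""
    rows = []
    for clf in classifiers:
        clf_data = df[df['Classifier'] == clf]
        for c in class_labels:
            # Binarize BOTH columns: pixel belongs to class c (1) or not (0).
            y_true_bin = (clf_data['GT_Class'] == c).astype(int)
            y_pred_bin = (clf_data['Pred_Class'] == c).astype(int)
            auc = roc_auc_score(y_true_bin, y_pred_bin)
            rows.append([clf, c, 'AUC-ROC', auc])
    return rows

# Usage inside the script above, before building class_metrics_df:
# class_metrics.extend(ovr_auc_rows(data, classifiers, class_labels))
```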

When I tried to analyze only the AUC-ROC metric, I used the code below, and I am not sure that it worked.

import pandas as pd
from sklearn.metrics import roc_auc_score

data = pd.read_csv(data_path)

# Create a list of classifier names
classifiers = ['KNN', 'RF', 'XGBoost']

# Define a list of class labels
class_labels = [0, 1, 2, 3, 4]  # Modify this based on your actual class labels

# Iterate over each classifier
for classifier in classifiers:
    print(f"Classifier: {classifier}")
    # Filter the data DataFrame for the specific classifier
    classifier_data = data[data['Classifier'] == classifier]

    # Iterate over each class
    for class_label in class_labels:
        # Filter data for the specific class
        class_data = classifier_data.copy()  # Make a copy to avoid modifying the original DataFrame
        class_data['GT_Class'] = (class_data['GT_Class'] == class_label).astype(int)

        # True ground truth values for the class
        y_true = class_data['GT_Class']
        # Predicted values by the classifier for the class
        y_pred = class_data['Pred_Class']

        # Calculate AUC-ROC for the class
        auc_roc = roc_auc_score(y_true, y_pred)
        print(f"Class {class_label} AUC-ROC: {auc_roc:.4f}")

The reason I say I am not sure it worked is that for class 0 with KNN I get Class 0 AUC-ROC: 0.0151, which seems far too low. The KNN confusion matrix looks like this:

[[9696   79   50   30   62]
 [  36 3044  466   47    7]
 [  50  427 2198  525    1]
 [  13   48  395 5942    1]
 [  19    7    0    0 1034]]
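Reconstructing per-pixel labels from this confusion matrix reproduces the suspicious number, which suggests `Pred_Class` still holds the raw labels 0-4 and `roc_auc_score` is treating them as ranking scores; binarizing the predictions as well gives a very different value. This reconstruction is a sketch, assuming the matrix rows are true classes and the columns predicted classes:

```python
import numpy as np
from sklearn.metrics import roc_auc_score

# KNN confusion matrix from the question (rows = true, cols = predicted).
cm = np.array([[9696,   79,   50,   30,   62],
               [  36, 3044,  466,   47,    7],
               [  50,  427, 2198,  525,    1],
               [  13,   48,  395, 5942,    1],
               [  19,    7,    0,    0, 1034]])

# Expand the matrix back into per-pixel (true, predicted) label pairs.
y_true = np.repeat(np.arange(5), cm.sum(axis=1))
y_pred = np.concatenate([np.repeat(np.arange(5), row) for row in cm])

# Reproduce the questionable number: y_true binarized for class 0, but the
# raw predicted labels (0-4) passed as "scores", as in the loop above.
auc_raw = roc_auc_score((y_true == 0).astype(int), y_pred)

# Binarize the predictions too, so "predicted class 0" is the positive score.
auc_bin = roc_auc_score((y_true == 0).astype(int), (y_pred == 0).astype(int))

print(f"labels as scores: {auc_raw:.4f}")  # reproduces the ~0.0151 result
print(f"binarized preds:  {auc_bin:.4f}")
```

Class-0 pixels mostly carry the lowest label value (0), so when the labels are ranked as scores the positives sort to the bottom and the AUC collapses toward 0, even though the confusion matrix itself shows class 0 is predicted well.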
