Computing classification accuracy after DL prediction with Python

狗头Rser 2021-10-08


Confusion matrix of the classification results

Ground truth	Predicted positive	Predicted negative
Positive	TP (true positive)	FN (false negative)
Negative	FP (false positive)	TN (true negative)
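As a minimal sketch of how the four cells of the confusion matrix can be counted for one pair of binary masks, assuming both are NumPy arrays of 0 (background) and 1 (positive) with the same shape. The helper name `confusion_counts` and the toy arrays are made up for illustration:

```python
import numpy as np

def confusion_counts(y_true, y_pred):
    """Return (TP, FN, FP, TN) for two 0/1 arrays of the same shape."""
    y_true = np.asarray(y_true).astype(bool)
    y_pred = np.asarray(y_pred).astype(bool)
    tp = int(np.sum(y_true & y_pred))    # positive, predicted positive
    fn = int(np.sum(y_true & ~y_pred))   # positive, predicted negative
    fp = int(np.sum(~y_true & y_pred))   # negative, predicted positive
    tn = int(np.sum(~y_true & ~y_pred))  # negative, predicted negative
    return tp, fn, fp, tn

# Toy labels, not taken from the post
y_true = np.array([1, 1, 1, 0, 0, 0, 0, 1])
y_pred = np.array([1, 0, 1, 0, 1, 0, 0, 1])

tp, fn, fp, tn = confusion_counts(y_true, y_pred)
print(tp, fn, fp, tn)
```

The four counts always add up to the number of pixels, which is a cheap sanity check on real masks.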

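Precision ($P$) is the share of predicted positives that are actually positive. In the watermelon example, it is the fraction of the melons predicted to be good that really are good. Recall ($R$) is the share of actual positives that are predicted as positive, that is, the fraction of the truly good melons that the prediction picks out. That is:

$P = \frac{TP}{TP + FP}$, $R = \frac{TP}{TP + FN}$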

This is a little convoluted, but ready-made functions usually exist, so you can simply call them.
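A small sketch of what calling a ready-made function can look like, assuming scikit-learn is installed and the labels are 0/1. The toy labels are invented, and the by-hand values are only there to check the library results against the definitions of precision and recall:

```python
import numpy as np
from sklearn import metrics

# Toy labels, not taken from the post
y_true = np.array([1, 1, 1, 0, 0, 0, 0, 1])
y_pred = np.array([1, 0, 1, 0, 1, 0, 0, 1])

# By hand: P = TP / (TP + FP), R = TP / (TP + FN)
tp = np.sum((y_true == 1) & (y_pred == 1))
fp = np.sum((y_true == 0) & (y_pred == 1))
fn = np.sum((y_true == 1) & (y_pred == 0))
p_manual = tp / (tp + fp)
r_manual = tp / (tp + fn)

# Ready-made: the ground truth is the first argument, the prediction the second
p_sklearn = metrics.precision_score(y_true, y_pred)
r_sklearn = metrics.recall_score(y_true, y_pred)
f1_sklearn = metrics.f1_score(y_true, y_pred)

print(p_manual, p_sklearn)
print(r_manual, r_sklearn)
print(f1_sklearn)
```

Swapping the two arguments exchanges precision and recall, so the argument order is worth double-checking.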

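When the predictions cover many buildings that are not contiguous and cannot be stitched into a single image, the task becomes a set of binary classification problems, and there are two ways to compute the metrics across them. The first averages the corresponding entries of the individual confusion matrices into one confusion matrix, giving $\overline{TP}$, $\overline{FP}$, $\overline{TN}$ and $\overline{FN}$, and then computes micro-P, micro-R and micro-F1 from these averages, that is:

$\text{micro-}P = \frac{\overline{TP}}{\overline{TP} + \overline{FP}}$, $\text{micro-}R = \frac{\overline{TP}}{\overline{TP} + \overline{FN}}$, $\text{micro-}F1 = \frac{2 \times \text{micro-}P \times \text{micro-}R}{\text{micro-}P + \text{micro-}R}$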
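A minimal sketch of this first approach (usually called micro-averaging), assuming each image pair has already been reduced to its TP, FP and FN counts. The counts below are invented for illustration:

```python
import numpy as np

# One row per image pair: (TP, FP, FN). Invented numbers.
counts = np.array([
    [30, 10, 20],
    [ 5,  5,  0],
    [50,  0, 50],
])

# Average the confusion-matrix entries first ...
tp_mean, fp_mean, fn_mean = counts.mean(axis=0)

# ... then compute the metrics once, from the averaged entries
micro_p = tp_mean / (tp_mean + fp_mean)
micro_r = tp_mean / (tp_mean + fn_mean)
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)

print(micro_p, micro_r, micro_f1)
```

Because the same divisor appears in every averaged entry, summing the counts instead of averaging them gives identical results. Images with many positive pixels therefore weigh more heavily in the outcome.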

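Alternatively, compute precision and recall on each confusion matrix first, written $(P_1, R_1), (P_2, R_2), \ldots, (P_n, R_n)$, and then average them to obtain the macro-recall (macro-R), macro-precision (macro-P) and macro-F1, that is:

$\text{macro-}P = \frac{1}{n}\sum_{i=1}^{n} P_i$, $\text{macro-}R = \frac{1}{n}\sum_{i=1}^{n} R_i$, $\text{macro-}F1 = \frac{2 \times \text{macro-}P \times \text{macro-}R}{\text{macro-}P + \text{macro-}R}$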
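A minimal sketch of this second approach (macro-averaging), again on invented per-image counts, and assuming macro-F1 is computed from macro-P and macro-R. For comparison it also averages the per-image F1 values, which is what a loop that sums one F1 per image ends up computing:

```python
import numpy as np

# One row per image pair: (TP, FP, FN). Invented numbers.
counts = np.array([
    [30, 10, 20],
    [ 5,  5,  0],
    [50,  0, 50],
])
tp, fp, fn = counts[:, 0], counts[:, 1], counts[:, 2]

# Compute the metrics per image first ...
p_each = tp / (tp + fp)
r_each = tp / (tp + fn)
f1_each = 2 * p_each * r_each / (p_each + r_each)

# ... then average them
macro_p = p_each.mean()
macro_r = r_each.mean()
macro_f1 = 2 * macro_p * macro_r / (macro_p + macro_r)
mean_f1 = f1_each.mean()  # average of per-image F1, a different quantity

print(macro_p, macro_r, macro_f1, mean_f1)
```

On these toy counts the two F1 variants differ noticeably, so it helps to state which one a reported number refers to.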

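I have two sets of files, the predictions and the so-called ground_truth, and want IOU, P, R, F1, accuracy and similar metrics. This is a binary classification problem and I only need the IOU of the building class, so I simply take $IOU=mIou$.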
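For a single foreground class, IOU can also be written in terms of the confusion-matrix counts as TP / (TP + FP + FN). The sketch below checks this on invented 0/1 arrays and, as a cross-check, compares it with `jaccard_score` from scikit-learn, which computes the same quantity for binary labels (assuming a reasonably recent scikit-learn is installed):

```python
import numpy as np
from sklearn.metrics import jaccard_score

# Toy labels, not taken from the post
y_true = np.array([1, 1, 1, 0, 0, 0, 0, 1])
y_pred = np.array([1, 0, 1, 0, 1, 0, 0, 1])

# Set view: intersection over union of the positive pixels
intersection = np.logical_and(y_true == 1, y_pred == 1).sum()
union = np.logical_or(y_true == 1, y_pred == 1).sum()
iou_sets = intersection / union

# Count view: TP / (TP + FP + FN)
tp = np.sum((y_true == 1) & (y_pred == 1))
fp = np.sum((y_true == 0) & (y_pred == 1))
fn = np.sum((y_true == 1) & (y_pred == 0))
iou_counts = tp / (tp + fp + fn)

# Library view: Jaccard index of the positive class
iou_sklearn = jaccard_score(y_true, y_pred)

print(iou_sets, iou_counts, iou_sklearn)
```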

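Both sets of files were cut with the same cropping code, so the masks cover the same positions, and after some earlier filtering the file names are identical too, so $P$, $R$, $F1$ and the other metrics can be computed directly.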
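Since the script pairs files purely by position after sorting, a quick check that both folders really contain the same names can catch mismatches early. A minimal sketch using only the standard library; the function name `check_same_names` is hypothetical, and the temporary folders and file names exist only to make the example self-contained:

```python
import os
import tempfile

def check_same_names(dir_a, dir_b):
    """Return the sorted names found in only one of the two folders."""
    names_a = set(os.listdir(dir_a))
    names_b = set(os.listdir(dir_b))
    return sorted(names_a - names_b), sorted(names_b - names_a)

# Build two throwaway folders; the second is missing one file
with tempfile.TemporaryDirectory() as pred_dir, tempfile.TemporaryDirectory() as ref_dir:
    for name in ("1.png", "2.png", "3.png"):
        open(os.path.join(pred_dir, name), "w").close()
    for name in ("1.png", "2.png"):
        open(os.path.join(ref_dir, name), "w").close()
    only_pred, only_ref = check_same_names(pred_dir, ref_dir)

print(only_pred, only_ref)
```

Two empty lists mean the folders match and index-based pairing is safe.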

from PIL import Image
import matplotlib.pyplot as plt
import numpy as np
import os
import glob
from sklearn import metrics
import csv



def caculate_IOU(predict_dir,predict_name,ground_truth_dir,ground_truth_name):
    """
    Compute the intersection over union (IOU) of one image pair.
    Arguments: directory and file name of the predicted image and of the ground-truth mask.
    """
    predict_image = Image.open(os.path.join(predict_dir,predict_name))
    true_mask_image = Image.open(os.path.join(ground_truth_dir,ground_truth_name))

    # Convert the PIL images to arrays
    predict_array = np.array(predict_image)
    true_mask_array = np.array(true_mask_image)

    predict_image.close()
    true_mask_image.close()

    # Convert to 0/1 binary images (buildings are 127 in the prediction, 255 in the mask)
    predict_array[predict_array == 127] = 1
    true_mask_array[true_mask_array == 255] = 1

    # Intersection and union of the building pixels
    intersection = np.logical_and(predict_array==1,true_mask_array==1)
    union_array = np.logical_or(predict_array==1,true_mask_array==1)

    # Note: the result is NaN when neither image contains a building pixel (empty union)
    IOU = np.sum(intersection) / np.sum(union_array)

    return IOU

def caculate_index(predict_dir,predict_name,ground_truth_dir,ground_truth_name):
    """
    Compute accuracy, precision, recall and F1 for one image pair.
    """
    predict_image = Image.open(os.path.join(predict_dir,predict_name))
    true_mask_image = Image.open(os.path.join(ground_truth_dir,ground_truth_name))

    predict_array = np.array(predict_image)
    true_mask_array = np.array(true_mask_image)

    predict_image.close()
    true_mask_image.close()

    # Convert to 0/1 binary images (buildings are 127 in the prediction, 255 in the mask)
    predict_array[predict_array == 127] = 1
    true_mask_array[true_mask_array == 255] = 1

    # Flatten to 1-D and convert to lists
    predict_list = predict_array.flatten().tolist()
    true_mask_list = true_mask_array.flatten().tolist()

    # Metrics: the ground truth is the first argument, the prediction the second
    accuracy = metrics.accuracy_score(true_mask_list,predict_list)
    precision = metrics.precision_score(true_mask_list,predict_list)
    recall = metrics.recall_score(true_mask_list, predict_list)
    F1 = metrics.f1_score(true_mask_list, predict_list)

    return [accuracy,precision,recall,F1]




if __name__ == '__main__':

    true_mask_dir = './ref_dir'
    predict_dir = './predict_dir'

    true_mask_files = os.listdir(true_mask_dir)
    predict_files = os.listdir(predict_dir)

    # Sort numerically; the names are expected to be an integer plus a 4-character extension
    true_mask_files.sort(key = lambda x:int(x[:-4]))
    predict_files.sort(key = lambda x:int(x[:-4]))

    # Initialize the running sums
    IOU,accuracy,precision,recall,F1 = 0,0,0,0,0

    for i in range(len(true_mask_files)):
        predict_name = predict_files[i]
        ground_truth_name = true_mask_files[i]

        # Accumulate the IOU and the other metrics
        IOU += caculate_IOU(predict_dir,predict_name,true_mask_dir,ground_truth_name)
        index_matrix = caculate_index(predict_dir,predict_name,true_mask_dir,ground_truth_name)
        accuracy += index_matrix[0]
        precision += index_matrix[1]
        recall += index_matrix[2]
        F1 += index_matrix[3]

    # Averaging the per-image values amounts to the macro metrics
    n = len(true_mask_files)
    print(IOU/n)
    print(accuracy/n,precision/n,recall/n,F1/n)


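I could not find an sklearn function for IOU, so I wrote it by hand; it is just the intersection divided by the union, which is simple enough. The rest uses metrics.accuracy_score, metrics.precision_score, metrics.recall_score and metrics.f1_score from sklearn, which only requires turning the image read by PIL into a list: flatten it with np.ndarray.flatten, convert it to a list, then accumulate and take the mean. The results can be printed, written to a csv with pandas, or even passed to a function that renders them as a LaTeX table. Whatever is quickest works, and the code can be modularized gradually.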
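One possible way to write the averaged metrics out, sketched here with the standard-library `csv` module that the script already imports rather than with pandas. The file name `metrics.csv` and the numbers are placeholders, not results from the post:

```python
import csv
import os
import tempfile

# Placeholder values; in the script these would be the averaged sums
results = {"IOU": 0.6, "accuracy": 0.75, "precision": 0.75, "recall": 0.75, "F1": 0.75}

# Written to the temp folder here so the example leaves the working directory untouched
out_path = os.path.join(tempfile.gettempdir(), "metrics.csv")
with open(out_path, "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=list(results))
    writer.writeheader()      # one header row with the metric names
    writer.writerow(results)  # one data row with the values

# Read the file back to confirm what was written
with open(out_path, newline="") as f:
    rows = list(csv.DictReader(f))
print(rows)
```

Appending one row per experiment to the same file would give a simple table to compare models.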

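Today I was discussing coding with a labmate. Honestly, my coding foundation is weaker than that of computer science students, and I still have to look up a lot of things. Even so, I don't think it is worth spending too much time on introductory Python courses such as 黑马 or 小甲鱼 on Bilibili. Write code driven by a concrete goal, take notes, and it becomes habit with use. Of course, if you have time at university to learn it systematically, that is great. But once you reach post-..., efficiency is king and results are the proof; there is not much time for exploring, and getting results quickly matters most.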

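I once put a video about cropping imagery in QGIS on Bilibili, purely as a place to store it. Someone asked me why I converted the map to raster and how I knew the imagery was offset, and then said, in a peculiar tone, that I know everything and am the best. I was baffled: I only uploaded the video as an archive out of boredom, so why the snide remark? I replied that raster was exactly what I wanted, that he could tell for himself whether the imagery was offset, and that there was no need to be snide. It all felt absurd. I have no obligation to answer your questions; your own attitude is the problem and you don't bother to explore for yourself, so go ahead and blame me for being too harsh.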

Listening to the wind from afar for a while to unwind... then off to sleep...


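This article is reposted from 狗头Rser. If you suspect infringement, please email contact@modb.pro to report it and provide the relevant evidence; once verified, 墨天轮 will remove the content immediately.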
