Performance metrics for classification models

sri hari
Published in Nerd For Tech
4 min read · May 20, 2021


The performance of a model must be evaluated to choose the best model for our data. It is also important to choose the right performance metric for each scenario. In this blog, I will explain

  • Confusion matrix
  • TPR, FPR, TNR, FNR
  • Type I and Type II errors
  • Sensitivity, specificity and miss rate
  • Precision and recall
  • Positive predictive value (PPV)
  • F-beta score

and when to use them. Looks like a long list? Don't worry, it's a short blog.

Confusion Matrix:

This matrix helps in representing classification results. This representation helps in analyzing the performance of the model.

confusion matrix

Follow the steps below to avoid confusion while building a confusion matrix:

  1. In the above image, “actual” is at the left and “predicted” is at the top. Some people swap these when creating a confusion matrix. If you are new to data science, you may get confused, so whenever you see a matrix representation, always note where “actual” and “predicted” are. Follow a single convention, as in the image above or vice versa, but stick to it until you are familiar with the confusion matrix.
  2. To locate TP, FP, FN and TN in your matrix, follow these two steps:
  • If the actual and predicted values are the same, it’s True; otherwise it’s False.
  • If the predicted header says positive (1 in the above image), it’s positive. If the predicted header says negative (0 in the above image), it’s negative.
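The steps above can be sketched with scikit-learn (assumed available) on a toy binary problem. Note that scikit-learn also puts actual labels on the rows and predicted labels on the columns, but orders the labels ascending, so with labels [0, 1] the layout is [[TN, FP], [FN, TP]]:

```python
# Minimal sketch: building a confusion matrix for toy binary labels.
from sklearn.metrics import confusion_matrix

y_actual    = [1, 0, 1, 1, 0, 0, 1, 0]  # ground-truth labels
y_predicted = [1, 0, 0, 1, 0, 1, 1, 0]  # hypothetical model output

# Rows are actual, columns are predicted; with labels=[0, 1]
# the layout is [[TN, FP], [FN, TP]].
cm = confusion_matrix(y_actual, y_predicted, labels=[0, 1])
tn, fp, fn, tp = cm.ravel()

print(cm)             # [[3 1]
                      #  [1 3]]
print(tp, fp, fn, tn)  # 3 1 1 3
```

Comparing each pair by hand (same value → True, predicted value decides positive/negative) gives the same four counts, which is a good sanity check when you are new to the layout.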

TNR, TPR, FNR, FPR:

TNR,TPR,FNR,FPR

TPR and TNR are similar: as the names indicate, we divide TP and TN by the corresponding actual positive and negative counts.

It is different for FPR and FNR: we divide FP by the actual negative count for FPR, and FN by the actual positive count for FNR.

These four rates are simply the rates at which TP, TN, FP and FN occur.
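The four rates follow directly from these definitions. A small sketch with hypothetical counts (the numbers are made up for illustration):

```python
# Hypothetical counts read off a confusion matrix.
tp, fp, fn, tn = 40, 10, 5, 45

tpr = tp / (tp + fn)  # True Positive Rate: TP over all actual positives
tnr = tn / (tn + fp)  # True Negative Rate: TN over all actual negatives
fpr = fp / (fp + tn)  # False Positive Rate: FP over all actual negatives
fnr = fn / (fn + tp)  # False Negative Rate: FN over all actual positives

print(round(tpr, 3), round(tnr, 3), round(fpr, 3), round(fnr, 3))
```

Notice that TPR + FNR = 1 and TNR + FPR = 1, since each pair splits the same actual class between correct and wrong predictions.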

Note:

  1. TPR is also known as sensitivity and recall.
  2. TNR is also known as specificity.
  3. FNR is also known as miss rate and Type II error.
  4. FPR corresponds to the Type I error.

Accuracy, precision and recall:

Note: We cannot use accuracy as the metric for every dataset. Accuracy is effective only for balanced datasets. If a dataset is imbalanced, our model may become biased, so use precision, recall and the F-beta score instead.

Sometimes, even when a dataset is balanced, we choose precision, recall and F-beta over accuracy depending on the domain we work in.

Accuracy, precision and recall

Accuracy:

Out of all the values, how many are correctly predicted by our model.

Precision:

Out of the total predicted positive results, how many are actually positive.

Also known as positive predictive value (PPV).
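Both definitions can be checked with scikit-learn (assumed available) on the same kind of toy labels as before:

```python
# Minimal sketch: accuracy vs. precision on toy binary labels.
from sklearn.metrics import accuracy_score, precision_score

y_actual    = [1, 0, 1, 1, 0, 0, 1, 0]  # ground truth
y_predicted = [1, 0, 0, 1, 0, 1, 1, 0]  # hypothetical model output

acc  = accuracy_score(y_actual, y_predicted)   # (TP + TN) / total = 6/8
prec = precision_score(y_actual, y_predicted)  # TP / (TP + FP)    = 3/4

print(acc, prec)  # 0.75 0.75
```

Here both happen to be 0.75, but on an imbalanced dataset they can diverge sharply, which is exactly why accuracy alone is not enough.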

Application of precision:

Spam detection — if a mail is not spam but is classified as spam, we are going to miss an important mail. We should reduce FP. When FP values impact our model, focus on increasing precision.

Recall:

Out of the total actual positive values, how many did we correctly predict as positive.
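A quick sketch of recall with scikit-learn (assumed available), using toy labels where the model misses half of the actual positives:

```python
# Minimal sketch: recall on toy binary labels.
from sklearn.metrics import recall_score

y_actual    = [1, 1, 1, 1, 0, 0]  # four actual positives
y_predicted = [1, 1, 0, 0, 0, 0]  # model catches only two of them

rec = recall_score(y_actual, y_predicted)  # TP / (TP + FN) = 2/4

print(rec)  # 0.5
```

Note the model here makes no false positives, so its precision is perfect while its recall is only 0.5 — the two metrics really do measure different failure modes.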

Application of Recall:

Cancer prediction — if a person doesn’t have cancer and our model says that he/she has cancer, there is little risk, because further tests will show that the person doesn’t have cancer. But if a person has cancer and our model says that he/she doesn’t, there is a great risk to that person’s life. We should reduce FN. When FN values impact our model, focus on increasing recall.

F-beta score:

In some cases, we need both FP and FN to be considered. In this situation, we use the F-beta score as our performance metric.

The F-beta score is the weighted harmonic mean of precision and recall: Fβ = (1 + β²) · precision · recall / (β² · precision + recall).

f Beta score

Still, there are scenarios where, even though we consider both FP and FN, we may need to give more importance to one and less to the other. This variety of situations can be handled by selecting an appropriate beta value. Here is a guide to selecting that beta value:

A default beta value is 1.0, which is the same as the F-measure. A smaller beta value, such as 0.5, gives more weight to precision and less to recall, whereas a larger beta value, such as 2.0, gives less weight to precision and more weight to recall in the calculation of the score.
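The effect of beta can be seen with scikit-learn's fbeta_score (assumed available) on toy labels chosen so that precision (0.75) is higher than recall (0.6):

```python
# Minimal sketch: how beta shifts the F-beta score between
# precision and recall.
from sklearn.metrics import fbeta_score

y_actual    = [1, 1, 1, 1, 1, 0, 0, 0]  # five actual positives
y_predicted = [1, 1, 1, 0, 0, 1, 0, 0]  # TP=3, FN=2, FP=1 -> P=0.75, R=0.6

f_half = fbeta_score(y_actual, y_predicted, beta=0.5)  # leans on precision
f_one  = fbeta_score(y_actual, y_predicted, beta=1.0)  # plain F1
f_two  = fbeta_score(y_actual, y_predicted, beta=2.0)  # leans on recall

print(round(f_half, 3), round(f_one, 3), round(f_two, 3))
```

Because precision is the stronger of the two here, the score falls as beta grows: beta = 0.5 gives the highest value and beta = 2.0 the lowest, exactly as the guide above predicts.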

Thank you :-)
