
Specificity and recall—which is which, again?

Specificity and recall are two important metrics in machine learning and statistics, particularly for classification problems. They are used to evaluate the performance of a classification model, such as a binary classifier.



I asked my friend Claude for a refresher, because he's read the internet recently.

  1. Specificity: Specificity measures the proportion of actual negatives that are correctly identified as such (e.g., the percentage of healthy people who are correctly identified as not having a condition). It is also known as the true negative rate. The formula for specificity is: $$\text{Specificity} = \frac{\text{True Negatives}}{\text{True Negatives + False Positives}}$$ In other words, specificity answers the question: "Out of all the actual negatives, how many did the model correctly identify?"
  2. Recall: Recall, also known as sensitivity or the true positive rate, measures the proportion of actual positives that are correctly identified as such (e.g., the percentage of sick people who are correctly identified as having the condition). The formula for recall is: $$\text{Recall} = \frac{\text{True Positives}}{\text{True Positives + False Negatives}}$$ Recall answers the question: "Out of all the actual positives, how many did the model correctly identify?"
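The two formulas above can be sketched directly from raw confusion-matrix counts. This is a minimal illustration, not a reference implementation; the function names and the toy data are my own invention (labels: 1 = has the condition, 0 = healthy).

```python
# Specificity and recall from raw confusion-matrix counts.
# Convention: 1 = positive (has the condition), 0 = negative (healthy).

def confusion_counts(y_true, y_pred):
    """Tally true/false positives and negatives for binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn

def specificity(tn, fp):
    return tn / (tn + fp)   # true negative rate

def recall(tp, fn):
    return tp / (tp + fn)   # true positive rate (sensitivity)

# Toy screening results: 4 actual positives, 6 actual negatives.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]

tp, tn, fp, fn = confusion_counts(y_true, y_pred)
print(specificity(tn, fp))  # 4 / (4 + 2) ≈ 0.667
print(recall(tp, fn))       # 3 / (3 + 1) = 0.75
```

Note that the two numbers answer different questions about the same predictions: the classifier caught 3 of the 4 sick people (recall 0.75) while correctly clearing 4 of the 6 healthy ones (specificity ≈ 0.667).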

In summary, while recall is focused on the model's ability to find all relevant instances (true positives), specificity is concerned with the model's ability to correctly exclude non-relevant instances (true negatives). In practical applications, the choice of emphasizing recall or specificity depends on the specific requirements and consequences of false positives and false negatives in the given context. For example, in medical testing, a high recall is often prioritized to ensure that as many true cases of a disease as possible are identified.

This has the taste of Type I and Type II errors from basic Stats. I always thought of that like a job search: the offer you have on the table might not be the best one that will come in over the transom. On the other hand, it might be the only one.
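The taste is real: a Type I error (rejecting a true null) corresponds to a false positive, so its rate is 1 − specificity; a Type II error corresponds to a false negative, so its rate is 1 − recall. A small sketch with made-up counts (not from any real study):

```python
# Type I error = false positive, so Type I rate = 1 - specificity.
# Type II error = false negative, so Type II rate = 1 - recall.
# The counts below are purely illustrative.
tn, fp = 4, 2   # actual negatives: correctly cleared vs. falsely flagged
tp, fn = 3, 1   # actual positives: correctly caught vs. missed

type_1_rate = fp / (fp + tn)   # = 1 - specificity
type_2_rate = fn / (fn + tp)   # = 1 - recall

print(type_1_rate)  # 2/6 ≈ 0.333
print(type_2_rate)  # 1/4 = 0.25
```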


Mini-memoir. Through happenstance, in 1971, I found myself pursuing a second master's after being kicked out of another field with the terminal-master's consolation prize: a combination of serious computational deficiencies and my contributions to having the campus shut down. But that's another tale.

This time, in my new program, my parlor trick of order-of-magnitude math quickly established me as the authority on all things trans-arithmetical. I ended up developing the method used to decide whether metrics of something recorded as "landscape quality" had improved or declined between 1950 and 1970 air photo interpretations. At the end of a tedious series of planimetric calculations with a little wheeled instrument over aerial photographs, and after translating those into scaleless figures of merit, was there a net improvement or decline? I secretly believed that, given the mishmash of stuff we were trying to overcompress, wherever there was a high potential for improvement in, say, "diversity" because more forest/open-land edge had been introduced, there might be an offsetting potential for deterioration in "runoff potential" going on at the same time.

There was nothing for it but to take up a standard text and try to plumb the mysteries of statistics. In an early example of an unfortunate tendency for motivated reasoning, I settled on a Student's t metric on the basis of computational feasibility. That is, what we could hope to accomplish on a high-end mechanical calculator. At the time, I suspected that I was in the position of an ant colony newly established in the Amazon Basin whose first worker had ventured out to cut its first leaf. Progress since has been gratifying considering what else I've been spending time on. There might even be a little patch big enough to build a hut. But I keep nibbling away, undeterred.
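For the curious, the before/after comparison described above can be set up as a paired Student's t on the per-site figures of merit; the statistic is simple enough to run on a mechanical calculator. This is a sketch under my own assumptions (the helper name and the 1950/1970 numbers are invented for illustration, not from the original study):

```python
import math
from statistics import mean, stdev

def paired_t(before, after):
    """Student's t for paired samples: t = mean(d) / (s_d / sqrt(n)),
    with n - 1 degrees of freedom, where d are the pairwise differences."""
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t, n - 1

# Invented figures of merit for six sites, 1950 vs. 1970.
q_1950 = [3.1, 2.8, 4.0, 3.5, 2.9, 3.3]
q_1970 = [3.4, 3.0, 3.9, 3.8, 3.2, 3.6]

t, df = paired_t(q_1950, q_1970)
print(round(t, 2), df)
```

Pairing matters here: each site is compared against itself across the two dates, so site-to-site variation drops out of the differences and only the change contributes to the statistic.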
