Every diagnostic test has its own set of characteristics that describe its accuracy in a given context. An online version of the tables and calculations below is available here.

Sensitivity and Specificity are two of the most basic characteristics of a diagnostic test. Both descriptions are independent of the prevalence of the disease in question.

The sensitivity describes the ability of a diagnostic test to correctly identify those who truly have the disease, without leaving anyone undiagnosed. Thus, a highly sensitive test has few false negatives and is effective at ruling conditions “out” (SnOut).

The specificity describes the ability of a diagnostic test to return a correctly negative result in the absence of disease, without mislabeling anyone. Thus, a highly specific test has few false positives and is effective at ruling conditions “in” (SpIn).

Both of these measures describe features of a given diagnostic test that are independent of the actual prevalence of the disease. However, since clinicians typically practice with explicit or implicit knowledge of disease prevalence, the positive and negative predictive values of tests (PPV & NPV) are often more useful than raw sensitivity and specificity.

Negative Predictive Value (NPV) puts Sensitivity into the context of known disease prevalence by reporting the “probability that an individual is not affected with the condition when a negative test result is observed” [NCI].

NPV = (true negatives) / (true negatives + false negatives)

Positive Predictive Value (PPV) puts Specificity into the context of known disease prevalence by reporting the “probability that an individual is affected with the condition when a positive test result is observed” [NCI].

PPV = (true positives) / (true positives + false positives)
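These two formulas, together with sensitivity and specificity, can be sketched as a small function. This is an illustrative sketch only; the function name and argument order are my own, not from any standard library:

```python
def predictive_values(tp, fp, fn, tn):
    """Sensitivity, specificity, PPV, and NPV from the four cells of a 2x2 table."""
    sensitivity = tp / (tp + fn)   # fraction of true disease correctly detected
    specificity = tn / (tn + fp)   # fraction of disease-free correctly negative
    ppv = tp / (tp + fp)           # probability of disease given a positive test
    npv = tn / (tn + fn)           # probability of no disease given a negative test
    return sensitivity, specificity, ppv, npv
```

For example, `predictive_values(24, 17, 6, 153)` reproduces the knee x-ray example worked through below.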

Taking a different slant, likelihood ratios express the relative odds that a positive test represents a correct clinical diagnosis or true positive (LR+) and that a negative test represents a true negative (LR-). Likelihood ratios greater than 5 or less than 0.2 are typically the most clinically relevant (citation??).

LR+ = (Test Sensitivity) / (1 - Test Specificity)

LR- = (1 - Test Sensitivity) / (Test Specificity)
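Both ratios can be computed directly from sensitivity and specificity. A minimal sketch (the function name is my own):

```python
def likelihood_ratios(sensitivity, specificity):
    """LR+ and LR- from sensitivity and specificity (both given as fractions)."""
    lr_pos = sensitivity / (1 - specificity)   # how much a positive result raises the odds of disease
    lr_neg = (1 - sensitivity) / specificity   # how much a negative result lowers the odds of disease
    return lr_pos, lr_neg
```

For a test with 80% sensitivity and 90% specificity, LR+ works out to 8 and LR- to about 0.22.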

To illustrate these concepts, let’s suppose, hypothetically, that over the past few months you have seen 200 patients in whom you ordered a standing 2-view knee x-ray. For the sake of discussion, you know that your radiograph has a sensitivity of 80% and a specificity of 90% for osteoarthritis (OA), and that the prevalence of OA is 15 in 100 (15%) in the population.

Among the 30 patients who eventually prove to have true OA, your initial standing knee x-ray was positive in only 24, leaving 6 with a false negative result. This – (24/(6+24)) – illustrates an 80% sensitivity.

Similarly, among the 170 patients who later prove not to have true OA, your initial radiograph was negative in most (153), leaving 17 with a confusing false positive result. This – (153/(17+153)) – illustrates a 90% specificity.

                   OA present   OA absent   Total
X-ray “positive”           24          17      41
X-ray “negative”            6         153     159
Total                      30         170     200

You can see that the prevalence of OA in this population is 15% (30/200). Remembering that these equations cannot be used with certainty unless the true prevalence of the disease is known, the negative predictive value of this x-ray for ruling out OA is 96% (153/(6+153)), and the positive predictive value for diagnosing OA is about 59% (24/(24+17)).
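The arithmetic of this example can be checked directly from the table. A sketch (the variable names are mine):

```python
# Cell counts from the knee x-ray table above
tp, fp, fn, tn = 24, 17, 6, 153
total = tp + fp + fn + tn            # 200 patients

prevalence  = (tp + fn) / total      # 30/200  = 0.15
sensitivity = tp / (tp + fn)         # 24/30   = 0.80
specificity = tn / (tn + fp)         # 153/170 = 0.90
npv = tn / (tn + fn)                 # 153/159 ≈ 0.96
ppv = tp / (tp + fp)                 # 24/41   ≈ 0.59
```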

Now what if we have a hypothetical test for Sjögren syndrome with the exact same sensitivity and specificity (0.8 and 0.9, respectively) but a much lower (hypothetical) prevalence of about 1 in 100?

                  Sjögren present   Sjögren absent   Total
Test “positive”                 8              100     108
Test “negative”                 2              900     902
Total                          10             1000    1010

In this circumstance (same test characteristics, lower prevalence), the negative and positive predictive values would be:

NPV: 99.78% (900/(2+900))

PPV: 7.41% (8/(8+100))
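The same predictive values can be obtained directly from prevalence via Bayes’ rule, without building the 2x2 table. A sketch (the function names are my own):

```python
def ppv_from_prevalence(sens, spec, prev):
    """P(disease | positive test) by Bayes' rule."""
    return sens * prev / (sens * prev + (1 - spec) * (1 - prev))

def npv_from_prevalence(sens, spec, prev):
    """P(no disease | negative test) by Bayes' rule."""
    return spec * (1 - prev) / (spec * (1 - prev) + (1 - sens) * prev)
```

With sensitivity 0.8 and specificity 0.9, moving the prevalence from 15% down to roughly 1% drops the PPV from about 59% to about 7%, matching the two tables above.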

Thus, for rare diseases, the discriminatory power of a diagnostic test with reasonable sensitivity and specificity for ruling out the disease (i.e., NPV) is quite robust. However, when diseases are rare, ruling in a diagnosis (i.e., PPV) based upon a single positive test is simply not possible. This theme has broad implications for most of the tests and conditions that we are discussing.


The concepts of SnOut and SpIn have been gently criticized by some authors (Pewsner et al., BMJ, 2004; Hegedus et al., J Man Ther, 2009). One of their criticisms is that, for instance, specificity and sensitivity have to be considered together in assessing the validity of a test.

This criticism is so deeply true that it even proves that SnOuts and SpIns do not exist except as mighty dragons in a fairy tale.

Indeed, a test with a specificity of 99% and a sensitivity of 1% does not allow one to rule a diagnosis in, and the same must be said for every test for which the sum of the sensitivity and the specificity equals 100%. Such tests contribute nothing at all to a diagnosis. Thus, even if their specificity is extremely high, they cannot rule a diagnosis in given a positive test result.

Posted by: michel g soete | August 01, 2010 at 10:24 AM

Yesterday I wrote a comment. Perhaps it should be preceded by the following:

'The article 'SpIn and SnOut' is very well and clearly written and represents by far the most widely spread and common opinion but....'

What I mean is perhaps best illustrated by an example:

           disease present   disease absent
positive                 5                5
negative                95               95

The test is highly specific (95%), but the pre-test probability was 50% and the post-test probability given a positive result is still 50%. Thus a high specificity is no guarantee that a test is a SpIn.
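The commenter’s point can be checked numerically: when sensitivity and specificity sum to 100%, the positive likelihood ratio is 1, and a positive result does not move the probability of disease at all. A sketch using the counts above:

```python
# Cell counts from the comment's 2x2 table above
tp, fp, fn, tn = 5, 5, 95, 95

sensitivity = tp / (tp + fn)                  # 5/100   = 0.05
specificity = tn / (tn + fp)                  # 95/100  = 0.95
pre_test  = (tp + fn) / (tp + fp + fn + tn)   # 100/200 = 0.50
post_test = tp / (tp + fp)                    # 5/10    = 0.50 (the PPV)
lr_pos = sensitivity / (1 - specificity)      # 0.05/0.05 = 1.0: uninformative
```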

I am not very happy with it, but it is a fact.

Posted by: michel g soete | August 02, 2010 at 03:34 AM