# Adjustment of a Prevalence Estimate for Misclassification

Normally, a prevalence (or incidence) estimate is simply a proportion:
those with the disease in the numerator, and those with and without the
disease in the denominator. However, if measurement error is
acknowledged, it must be accepted that the subjects in the numerator
are in fact a mixture of true positives and false positives with
respect to the disease variable.

Denote the number who truly have the disease as 'A' and the number who
truly do not as 'B'. Denote the number classified by an imperfect
measure as having the disease as 'a', and the number classified by that
measure as disease negative as 'b'.

Based on the above, we can say that:

a = TP + FP

or, writing Se for the sensitivity and Sp for the specificity of the
measure,

a = (Se)A + (1-Sp)B

similarly,

b = FN + TN = (1-Se)A + (Sp)B
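
As a rough illustration of these two equations, the Python sketch below computes the expected observed counts from assumed true counts. The function name `expected_observed_counts` and the example values (100 truly diseased, 900 truly disease free, Se of 0.8, Sp of 0.9) are hypothetical and chosen only for illustration.

```python
def expected_observed_counts(A, B, se, sp):
    """Return (a, b): expected counts classified positive and negative.

    A, B   -- numbers who truly have / truly do not have the disease
    se, sp -- sensitivity and specificity of the imperfect measure
    """
    a = se * A + (1 - sp) * B   # true positives + false positives
    b = (1 - se) * A + sp * B   # false negatives + true negatives
    return a, b


# Hypothetical example values, not taken from the source.
a, b = expected_observed_counts(A=100, B=900, se=0.8, sp=0.9)
print(a, b, a / (a + b))
```

With these values the true prevalence is 0.10, while the observed prevalence comes out at roughly 0.17, because the false positives from the large disease-free group outnumber the missed cases.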

These equations can be re-arranged to calculate an estimate of the true
prevalence:

A / (A+B)

from the observed prevalence:

a / (a+b)

and the sensitivity (Se) and specificity (Sp) of the measure. The
adjusted prevalence is the population value that would be expected to
produce the observed prevalence, given these misclassification rates.
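
The rearrangement can be made explicit. Adding the two equations shows that a + b = A + B, so dividing the equation for a by this total, and writing p for the observed prevalence a / (a+b) and P for the true prevalence A / (A+B), gives

p = (Se)P + (1-Sp)(1-P)

and therefore

P = (p + Sp - 1) / (Se + Sp - 1)

This form is often called the Rogan-Gladen estimator. A minimal Python sketch follows; the function name `adjusted_prevalence` and its input checks are assumptions made for illustration, not part of the source.

```python
def adjusted_prevalence(observed, se, sp):
    """Estimate true prevalence from an observed prevalence.

    observed -- observed prevalence a / (a + b), between 0 and 1
    se, sp   -- sensitivity and specificity of the imperfect measure
    """
    if not 0 <= observed <= 1:
        raise ValueError("observed prevalence must lie between 0 and 1")
    denominator = se + sp - 1
    if denominator <= 0:
        # Assumption: the measure must perform better than chance,
        # otherwise the rearranged equation is undefined or reverses sign.
        raise ValueError("Se + Sp must be greater than 1")
    # P = (p + Sp - 1) / (Se + Sp - 1); returned without clamping.
    return (observed + sp - 1) / denominator


# Hypothetical inputs: observed prevalence 0.17, Se 0.8, Sp 0.9.
print(adjusted_prevalence(0.17, se=0.8, sp=0.9))
```

With these inputs the adjusted estimate is approximately 0.10. The sketch returns the raw value, so it can fall below 0 or above 1 when the observed prevalence is not compatible with the assumed Se and Sp (for example, an observed prevalence lower than 1-Sp); such a result suggests the assumed error rates do not fit the data.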

For a more detailed discussion, and an example, see: Patten SB.
Integrating Data from Clinical and Administrative Databases in
Pharmacoepidemiological Research. Canadian Journal of Clinical
Pharmacology 1998; 5(2): 92-97.

## Explore the Impact of Misclassification Bias by Putting Some Values into this Calculator