P-values and bias
March 11, 2025
The accuracy of a measurement or statistical estimate has two components: precision and validity. For example, a pistol shooter aiming at the bull's-eye may have an unsteady hand when firing (random error, reducing precision) or a poorly set rear sight (systematic error, reducing validity).
P-values, confidence intervals and statistical significance are the measures of precision used in statistical inference; systematic error (a validity problem) is known as bias. The statistical precision of a hypothesis test or effect size estimate depends chiefly on the sample size.
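The dependence of precision on sample size can be sketched with a small simulation. This is a hypothetical illustration, not from the article: the function name `ci_half_width` and all numbers are invented, and the approximate 95% interval uses the usual normal-theory formula, mean ± 1.96 × standard error.

```python
import random
import statistics

def ci_half_width(sample, z=1.96):
    """Approximate 95% CI half-width for the sample mean: z * s / sqrt(n)."""
    se = statistics.stdev(sample) / len(sample) ** 0.5
    return z * se

random.seed(0)
population_mean, population_sd = 100.0, 15.0

# Larger samples give narrower intervals (better precision), but a biased
# measurement process would stay off-target no matter how large n grows.
for n in (25, 100, 400):
    sample = [random.gauss(population_mean, population_sd) for _ in range(n)]
    print(f"n={n:4d}  mean={statistics.mean(sample):6.1f}  "
          f"95% CI half-width={ci_half_width(sample):5.2f}")
```

The half-width shrinks roughly as 1/√n: quadrupling the sample size about halves the width of the confidence interval, which is the sense in which precision "is related to the sample size".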
The validity of p-values and confidence intervals depends on how the data were collected and measured. For example, observational screening studies are generally considered more susceptible to bias than randomised trials: participants in observational studies are self-selected rather than randomly assigned to screening, which produces healthy screenee bias when subjects with symptoms of the condition being screened decline to participate.
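Healthy screenee bias can be made concrete with a toy simulation. Everything here is an assumption for illustration (the participation probabilities, prevalence, and the helper names `simulate` and `rate` are invented, not from the article): symptomatic subjects decline screening more often, so the self-selected screened group looks healthier than the population it came from.

```python
import random

random.seed(0)

def simulate(n=100_000, p_disease=0.10):
    """Each subject has the disease with probability p_disease; diseased
    (symptomatic) subjects are assumed less likely to volunteer for screening."""
    screened, unscreened = [], []
    for _ in range(n):
        diseased = random.random() < p_disease
        p_participate = 0.2 if diseased else 0.6  # assumed participation rates
        (screened if random.random() < p_participate else unscreened).append(diseased)
    return screened, unscreened

def rate(group):
    return sum(group) / len(group)

screened, unscreened = simulate()
print(f"disease prevalence among screened:   {rate(screened):.3f}")
print(f"disease prevalence among unscreened: {rate(unscreened):.3f}")
```

No amount of extra data repairs this: the screened group's lower prevalence is a property of who chooses to participate, not of sampling noise, so larger samples only make the biased estimate more precisely wrong.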
An investigator can guard against foreseeable bias in the design of a trial, for example by randomising treatment allocation and masking the allocated treatments. These safeguards are not available in observational studies.
There are many forms of bias, but they fall into three broad categories: selection bias, information bias and confounding bias. The first concerns how study subjects are selected; the second concerns how information is collected from them (for example, recall bias); and the third concerns problems in the analysis, such as confounding by association, confounding by indication, effect modification and adjustment bias.
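Confounding, the third category, can also be sketched in code. This is a hypothetical example with invented numbers (the variable names and probabilities are assumptions, not from the article): age influences both the chance of receiving a treatment and the outcome, so the crude comparison suggests the treatment is harmful even though, by construction, it has no effect; stratifying on age removes the distortion.

```python
import random

random.seed(0)

def simulate(n=100_000):
    """Age -> treatment and age -> outcome; treatment itself has NO effect."""
    rows = []
    for _ in range(n):
        old = random.random() < 0.5
        treated = random.random() < (0.7 if old else 0.3)  # older subjects treated more
        sick = random.random() < (0.4 if old else 0.1)     # older subjects sicker
        rows.append((old, treated, sick))
    return rows

rows = simulate()

def risk(rows, treated):
    sub = [sick for old, t, sick in rows if t == treated]
    return sum(sub) / len(sub)

crude = risk(rows, True) - risk(rows, False)

def strat_risk_diff(rows, old):
    """Risk difference (treated minus untreated) within one age stratum."""
    sub = [(t, sick) for o, t, sick in rows if o == old]
    r1 = sum(s for t, s in sub if t) / sum(1 for t, s in sub if t)
    r0 = sum(s for t, s in sub if not t) / sum(1 for t, s in sub if not t)
    return r1 - r0

print(f"crude risk difference:  {crude:+.3f}")
print(f"within young stratum:   {strat_risk_diff(rows, False):+.3f}")
print(f"within old stratum:     {strat_risk_diff(rows, True):+.3f}")
```

The crude comparison is biased upward because treated subjects are disproportionately old; within each age stratum the risk difference is near zero, matching the true (null) effect.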
The inferential uncertainty of a research finding, as expressed by p-values or confidence intervals, does not include the uncertainty arising from the consequences of bias.