Confounding refers to a situation where the effect of a studied risk factor on a specific outcome is mixed with the effect of a factor not accounted for in the analysis. This other factor, known as a confounder, can distort (bias) the risk factor’s estimated effect and mislead the investigator. The problem can arise if a confounder i) affects the studied outcome and ii) is unequally distributed among subjects unexposed and exposed to the studied risk factor. There are basically two ways to adjust the risk estimate for the effect of the confounder. One way is to account for the confounder’s effect on the outcome by conditioning on the confounder, i.e. including the confounder as a covariate in the analysis. Another way is to account for the unequal distribution of the confounder between unexposed and exposed subjects by stratifying or matching the analysis on the propensity of being exposed to the risk factor. Given a subject’s probability of exposure, summarised in a specially calculated propensity score, exposure to the risk factor is treated as if it were random.
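As a rough illustration of the first approach, the following Python sketch simulates a single binary confounder and compares the crude risk difference with one obtained by conditioning on the confounder through stratification. The data-generating probabilities, the variable names and the helper `risk` are all hypothetical and chosen only to make the bias visible. Real analyses would typically use a regression model rather than this hand-rolled stratification.

```python
import random

rng = random.Random(1)  # fixed seed so the sketch is reproducible

# Hypothetical data-generating model; every probability below is an assumption.
rows = []
for _ in range(200_000):
    c = rng.random() < 0.5                           # binary confounder
    e = rng.random() < (0.8 if c else 0.2)           # exposure is more common when c is present
    y = rng.random() < (0.10 + 0.20 * c + 0.05 * e)  # simulated true risk difference of e: 0.05
    rows.append((c, e, y))


def risk(exposed, stratum=None):
    """Outcome risk among exposed or unexposed subjects, optionally within one confounder stratum."""
    ys = [y for c, e, y in rows if e == exposed and (stratum is None or c == stratum)]
    return sum(ys) / len(ys)


# Crude comparison: ignores the confounder entirely.
crude_rd = risk(True) - risk(False)

# Conditioning on the confounder: stratum-specific risk differences,
# averaged with each stratum's share of the sample as weight.
adjusted_rd = 0.0
for s in (False, True):
    weight = sum(1 for c, _, _ in rows if c == s) / len(rows)
    adjusted_rd += weight * (risk(True, s) - risk(False, s))

print(f"crude: {crude_rd:.3f}  adjusted: {adjusted_rd:.3f}  (simulated truth: 0.050)")
```

In this simulated setting the crude estimate is inflated because exposed subjects carry the confounder more often, while the stratified estimate should land close to the simulated effect.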

Both methods are problematic because confounders are often unknown, or not measured and registered in the investigator’s database, which makes the adjustment imperfect and leaves residual confounding. Both methods also require careful selection of adjustment factors to avoid adjustment bias.

Adjusting by including covariates in a multivariable statistical model must be based on assumptions regarding cause and effect. Adjusting for a factor unrelated to the studied outcome reduces the statistical precision, and adjusting for a factor influenced by both the risk factor and the outcome (a collider) can bias the risk factor’s estimated effect.
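The collider problem can be made concrete with a small simulation. In the sketch below, which uses purely hypothetical probabilities and names, the risk factor has no effect on the outcome, yet both influence a third variable. Conditioning on that variable, here by analysing each of its strata separately, produces an association that does not exist in the full sample.

```python
import random

rng = random.Random(2)  # fixed seed so the sketch is reproducible

# Hypothetical model: exposure e and outcome y are generated independently,
# so the simulated true effect of e on y is zero.
rows = []
for _ in range(100_000):
    e = rng.random() < 0.5
    y = rng.random() < 0.3
    s = rng.random() < (0.1 + 0.4 * e + 0.4 * y)  # collider: influenced by both e and y
    rows.append((e, y, s))


def risk_difference(subjects):
    """Risk of y among exposed minus risk of y among unexposed subjects."""
    exposed = [y for e, y, _ in subjects if e]
    unexposed = [y for e, y, _ in subjects if not e]
    return sum(exposed) / len(exposed) - sum(unexposed) / len(unexposed)


crude_rd = risk_difference(rows)                                    # no adjustment
rd_collider_present = risk_difference([r for r in rows if r[2]])      # conditioned on s = True
rd_collider_absent = risk_difference([r for r in rows if not r[2]])   # conditioned on s = False

print(f"crude: {crude_rd:.3f}")
print(f"within collider strata: {rd_collider_present:.3f} and {rd_collider_absent:.3f}")
```

Under these assumptions the unadjusted estimate stays near zero, whereas both stratum-specific estimates are clearly negative, which is the bias introduced by the adjustment itself.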

Adjusting by propensity score matching is subject to similar problems. Including non-confounders in the propensity score estimation, e.g. variables affected by the treatment but unrelated to the outcome or on the pathway between studied cause and effect, can result in overmatching that biases the risk estimate and reduces the statistical precision.
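The consequence of adjusting for a variable on the causal pathway can be sketched in the same way. The example below is a simplification: it conditions on the intermediate variable by plain stratification rather than through a propensity score, and all probabilities and names are assumptions made for illustration. The risk factor acts on the outcome only through the intermediate variable, so conditioning on it removes the very effect under study.

```python
import random

rng = random.Random(4)  # fixed seed so the sketch is reproducible

# Hypothetical model: e affects y only through the intermediate variable m.
rows = []
for _ in range(100_000):
    e = rng.random() < 0.5
    m = rng.random() < (0.2 + 0.6 * e)  # intermediate variable on the pathway from e to y
    y = rng.random() < (0.1 + 0.3 * m)  # simulated total effect of e on y: 0.6 * 0.3 = 0.18
    rows.append((e, m, y))


def risk_difference(subjects):
    """Risk of y among exposed minus risk of y among unexposed subjects."""
    exposed = [y for e, _, y in subjects if e]
    unexposed = [y for e, _, y in subjects if not e]
    return sum(exposed) / len(exposed) - sum(unexposed) / len(unexposed)


total_rd = risk_difference(rows)  # unadjusted estimate of the total effect

# Conditioning on m: stratum-specific estimates weighted by stratum size.
overadjusted_rd = 0.0
for level in (False, True):
    stratum = [r for r in rows if r[1] == level]
    overadjusted_rd += len(stratum) / len(rows) * risk_difference(stratum)

print(f"total effect: {total_rd:.3f}  after conditioning on m: {overadjusted_rd:.3f}")
```

In this simulated setting the adjusted estimate collapses towards zero even though the risk factor has a clear total effect on the outcome.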

Furthermore, the matching procedure itself may be problematic, as matching tends to reduce the available sample size and can introduce selection bias in the matched dataset.
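A minimal sketch of propensity score matching, and of the loss of sample size it entails, is given below. It assumes two hypothetical binary covariates, estimates the propensity score as the observed share of exposed subjects within each covariate pattern (in place of the logistic regression that would normally be used), and forms 1:1 matches without replacement among subjects with identical estimated scores. All names and probabilities are illustrative assumptions.

```python
import random
from collections import defaultdict

rng = random.Random(3)  # fixed seed so the sketch is reproducible

# Hypothetical model with two binary confounders a and b.
rows = []
for _ in range(100_000):
    a = rng.random() < 0.5
    b = rng.random() < 0.5
    e = rng.random() < (0.1 + 0.3 * a + 0.4 * b)                # exposure depends on a and b
    y = rng.random() < (0.10 + 0.15 * a + 0.15 * b + 0.05 * e)  # simulated true effect of e: 0.05
    rows.append((a, b, e, y))

# Step 1: estimate the propensity score as the share exposed per covariate pattern.
counts = defaultdict(lambda: [0, 0])  # pattern -> [subjects, exposed subjects]
for a, b, e, y in rows:
    counts[(a, b)][0] += 1
    counts[(a, b)][1] += e
propensity = {pattern: n_exp / n for pattern, (n, n_exp) in counts.items()}

# Step 2: group outcomes by estimated score; index 0 = unexposed, index 1 = exposed.
by_score = defaultdict(lambda: ([], []))
for a, b, e, y in rows:
    by_score[propensity[(a, b)]][e].append(y)

# Step 3: 1:1 matching without replacement; unmatched subjects are discarded.
matched_exposed, matched_unexposed = [], []
for unexposed, exposed in by_score.values():
    k = min(len(unexposed), len(exposed))
    matched_unexposed += unexposed[:k]
    matched_exposed += exposed[:k]


def mean(values):
    return sum(values) / len(values)


crude_rd = mean([y for a, b, e, y in rows if e]) - mean([y for a, b, e, y in rows if not e])
matched_rd = mean(matched_exposed) - mean(matched_unexposed)
matched_n = len(matched_exposed) + len(matched_unexposed)

print(f"crude: {crude_rd:.3f}  matched: {matched_rd:.3f}  (simulated truth: 0.050)")
print(f"subjects retained after matching: {matched_n} of {len(rows)}")
```

Because the simulated effect is the same in every subject, discarding unmatched subjects costs only precision here. With a heterogeneous effect, the matched sample would estimate the effect in a differently composed population, which is one way the selection problem mentioned above can arise.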