In small samples, standard maximum likelihood (ML) estimates of logit coefficients are biased away from zero, sometimes severely. This article introduces Firth's penalized maximum likelihood (PML) estimator as a solution to this issue. While prior research recommended PML mainly for cases of separation, the article demonstrates a broader benefit: PML substantially reduces both the bias and the variance of coefficient estimates in samples as small as 50 observations and in considerably larger ones.
What is Separation?
Separation occurs when a predictor (or a combination of predictors) perfectly predicts the outcome for some subset of the sample. In that case no finite maximum likelihood estimate exists, so standard software either fails to converge or reports implausibly large coefficients and standard errors. Political science datasets are especially prone to this problem because they are often small and rely heavily on binary predictors. The sketch below illustrates the issue on a toy dataset.
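To make this concrete, here is a minimal sketch in Python (plain numpy, with a made-up toy dataset, not from the article) of what separation does to the standard estimator: because the likelihood keeps increasing as the slope grows, each Newton-Raphson step pushes the estimate further from zero and the iterations never settle on a finite value.

```python
import numpy as np

# Hypothetical toy data: y = 1 exactly when x > 0, so x separates y perfectly.
x = np.array([-2.0, -1.0, -0.5, 0.5, 1.0, 2.0])
y = np.array([0.0, 0.0, 0.0, 1.0, 1.0, 1.0])
X = np.column_stack([np.ones_like(x), x])   # intercept + predictor

beta = np.zeros(2)
for it in range(10):                         # standard Newton-Raphson for logit ML
    p = 1.0 / (1.0 + np.exp(-X @ beta))      # fitted probabilities
    W = p * (1.0 - p)                        # observation weights
    step = np.linalg.solve(X.T @ (W[:, None] * X), X.T @ (y - p))
    beta += step
    print(f"iter {it}: slope = {beta[1]:.2f}")  # grows every iteration: no finite MLE
```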
How PML Helps:
PML adds a penalty, half the log determinant of the Fisher information matrix (equivalent to a Jeffreys prior), to the log-likelihood before maximizing it. The penalty shrinks coefficient estimates toward zero just enough to remove the leading term of the small-sample bias, and it distorts the usual interpretation of the coefficients less than heavier shrinkage techniques such as ridge regression. The result is estimates that are more stable and more accurate than standard ML across a range of sample sizes. A minimal implementation sketch follows.
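As a concrete illustration, here is a bare-bones Python sketch of Firth's estimator, taking Newton steps on the modified score from Firth (1993); the h * (1/2 - p) term below is the bias-reducing correction. This is an illustration only, not a substitute for a vetted implementation such as R's logistf package.

```python
import numpy as np

def firth_logit(X, y, max_iter=100, tol=1e-8):
    """Minimal Firth PML logit sketch: Newton steps on the modified score."""
    beta = np.zeros(X.shape[1])
    for _ in range(max_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        W = p * (1.0 - p)
        XtWX = X.T @ (W[:, None] * X)        # Fisher information I(beta)
        XtWX_inv = np.linalg.inv(XtWX)
        # Diagonal of the "hat" matrix: h_i = w_i * x_i' I(beta)^{-1} x_i
        h = W * np.einsum("ij,jk,ik->i", X, XtWX_inv, X)
        # Modified score: h * (1/2 - p) is Firth's bias-reducing correction
        U = X.T @ (y - p + h * (0.5 - p))
        step = XtWX_inv @ U
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta
```

Run on the separated toy data above, firth_logit converges to finite estimates where standard ML diverges, which is why earlier work recommended it for separation in particular.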
Why It Matters for Political Science:
Because the penalty reduces bias and variance simultaneously, PML lets researchers avoid the bias-variance tradeoff typically associated with estimation choices. The article pairs Monte Carlo simulations with a re-analysis of George and Epstein's (1992) classic American Political Science Review article, showing that the choice between ML and PML can meaningfully change the substantive conclusions drawn from logit models. A stripped-down version of the simulation comparison appears below.
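The article's simulations are more extensive, but the core idea can be sketched as follows (hypothetical true coefficients of my choosing; n = 50 as in the article's small-sample setting; reuses the firth_logit function from the sketch above): simulate many small-sample datasets, fit both estimators, and compare the average slope estimate to the truth.

```python
import numpy as np

rng = np.random.default_rng(0)
true_beta = np.array([0.5, 1.0])   # hypothetical "true" intercept and slope
n, n_sims = 50, 1000               # small samples, many simulated datasets

def ml_logit(X, y, max_iter=25):
    """Plain Newton-Raphson ML logit, for comparison."""
    beta = np.zeros(X.shape[1])
    for _ in range(max_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        W = p * (1.0 - p)
        beta += np.linalg.solve(X.T @ (W[:, None] * X), X.T @ (y - p))
    return beta

ml_slopes, pml_slopes = [], []
for _ in range(n_sims):
    X = np.column_stack([np.ones(n), rng.normal(size=n)])
    y = rng.binomial(1, 1.0 / (1.0 + np.exp(-X @ true_beta)))
    try:
        ml_slopes.append(ml_logit(X, y)[1])   # can fail or blow up under separation
    except np.linalg.LinAlgError:
        pass                                  # separated sample: no finite MLE
    pml_slopes.append(firth_logit(X, y)[1])   # always finite

print("true slope:", true_beta[1])
print("ML mean:   ", np.mean(ml_slopes))   # typically overshoots (bias away from 0)
print("PML mean:  ", np.mean(pml_slopes)) # typically closer to the truth
```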
Conclusion:
Researchers fitting logit models to small samples should consider the PML estimator. It delivers substantial improvements even in medium-sized datasets, making it a valuable tool for political science research.