How Overlapping News Sources Can Fix Underreported Event Data
Insights from the Field
event data
misclassification
maximum likelihood
media
SCAD
Pol. An.
11 R files
3 Datasets
1 Text
Dataverse
"Two Wrongs Make a Right" was authored by Scott Cook, Betsabe Blas, Raymond Carroll and Samiran Sinha. It was published by Cambridge University Press in Political Analysis in 2017.

📰 The problem with media-based event data

Media-based event data—records compiled from reporting by news outlets—are widely used in political science but often miss events (strikes, protests, conflict). Underreporting by primary and secondary sources produces incomplete data that can bias estimates and undermine inference.

āš™ļø A new correction that uses multiple sources

A novel maximum likelihood estimator is proposed to correct misclassification when event data come from multiple, overlapping news sources (for example, Agence France-Presse and Reuters). The estimator leverages the overlap across sources rather than treating missing reports as an irrecoverable flaw.

šŸ” How the estimator is specified

  • The general formulation allows separate sets of predictors for:
      • the true-event model (what causes an event to occur), and
      • each source’s misclassification model (what predicts whether a given source fails to report an event).
  • This structure permits simultaneous testing of theories about both the causes of an event and the mechanisms of reporting failure.
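
To make this structure concrete, here is a minimal sketch of a two-source log-likelihood. It is an illustration under stated assumptions, not necessarily the authors' exact specification: it assumes sources never report events that did not occur (underreporting only), that the two sources miss events independently once the true event is given, and that every model uses a logistic link. All function and parameter names (`cell_probabilities`, `log_likelihood`, `beta`, `gamma1`, `gamma2`) are hypothetical.

```python
import math


def logistic(v):
    """Map a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-v))


def linear(coefs, covariates):
    """Dot product of coefficients and covariates."""
    return sum(c * x for c, x in zip(coefs, covariates))


def cell_probabilities(p_event, p_miss1, p_miss2):
    """Probability of each report pattern (y1, y2).

    Assumes no false positives and conditionally independent misses.
    """
    return {
        (1, 1): p_event * (1 - p_miss1) * (1 - p_miss2),
        (1, 0): p_event * (1 - p_miss1) * p_miss2,
        (0, 1): p_event * p_miss1 * (1 - p_miss2),
        # No report: either no event, or an event that both sources missed.
        (0, 0): (1 - p_event) + p_event * p_miss1 * p_miss2,
    }


def log_likelihood(rows, beta, gamma1, gamma2):
    """Sum of log cell probabilities over observations.

    Each row is (x, z1, z2, y1, y2): x holds the true-event predictors,
    z1 and z2 hold each source's misclassification predictors, and
    y1, y2 are the 0/1 reports from the two sources.
    """
    total = 0.0
    for x, z1, z2, y1, y2 in rows:
        probs = cell_probabilities(
            logistic(linear(beta, x)),      # P(event occurs)
            logistic(linear(gamma1, z1)),   # P(source 1 misses | event)
            logistic(linear(gamma2, z2)),   # P(source 2 misses | event)
        )
        total += math.log(probs[(y1, y2)])
    return total
```

Maximizing this function over `beta`, `gamma1`, and `gamma2` with any general-purpose optimizer would yield estimates for the event model and both reporting models at once, which is what allows the two kinds of theory to be tested together.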

📈 Evidence from simulations

  • Simulations show the estimator regularly outperforms common strategies that ignore misclassification, ignore the special features of the data-generating process, or both.
  • Performance gains are consistent across settings where sources overlap to varying degrees and where reporting depends on covariates.
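
The intuition behind these results can be sketched with a toy simulation. This is not the paper's simulation design: it assumes constant event and reporting probabilities (no covariates), no false positives, and independent reporting across sources, and all names and parameter values are hypothetical. In this covariate-free case, matching the observed report patterns to the model gives a closed-form estimate of the event rate, which is compared with the naive rate that counts an event whenever at least one source reports it.

```python
import random


def simulate_counts(n, p_event, p_report1, p_report2, seed=0):
    """Simulate n observations and tally the four report patterns."""
    rng = random.Random(seed)
    counts = {(1, 1): 0, (1, 0): 0, (0, 1): 0, (0, 0): 0}
    for _ in range(n):
        event = rng.random() < p_event
        # A source can only report an event that actually happened.
        y1 = int(event and rng.random() < p_report1)
        y2 = int(event and rng.random() < p_report2)
        counts[(y1, y2)] += 1
    return counts


def naive_rate(counts):
    """Share of observations where at least one source reports an event."""
    n = sum(counts.values())
    return (n - counts[(0, 0)]) / n


def overlap_rate(counts):
    """Event rate implied by the overlap between the two sources."""
    n = sum(counts.values())
    n1 = counts[(1, 1)] + counts[(1, 0)]  # events reported by source 1
    n2 = counts[(1, 1)] + counts[(0, 1)]  # events reported by source 2
    return (n1 * n2) / (counts[(1, 1)] * n)


counts = simulate_counts(100_000, p_event=0.3, p_report1=0.6, p_report2=0.5)
print("naive estimate:  ", round(naive_rate(counts), 3))
print("overlap estimate:", round(overlap_rate(counts), 3))
```

With these settings the naive rate should land near 0.24, because 20 percent of true events are missed by both sources, while the overlap-based estimate should land near the true value of 0.3.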

📊 Empirical illustration

  • The method is illustrated with a repression model using the Social Conflict in Africa Database (SCAD), demonstrating practical gains in inference when multiple news sources are available.

✅ Why it matters

  • When multiple news outlets report on the same domain, the overlap is a source of information, not merely redundancy. Correcting for source-specific underreporting using the proposed estimator reduces bias and improves the ability to test substantive political theories about events and media reporting.