State-level polls are costly to conduct with existing survey methods, so state-level polling data remain sparse. This article addresses that gap by combining 1,200 polls from the 2012 US presidential election with over 100 million political tweets.
Data & Methods
We model these polls using Twitter text through a novel regularized linear feature-selection approach designed for high-dimensional data such as social media streams. The analysis identifies the specific textual features that proved predictive of poll outcomes.
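As a rough illustration of regularized linear feature selection on high-dimensional text, the sketch below fits an L1-penalized (lasso) regression by coordinate descent. The choice of lasso, the penalty value, the synthetic "word frequency" matrix, and every function and variable name are assumptions made for illustration; the article's actual estimator and features may differ.

```python
import random


def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; return 0.0 inside the band."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(X, y, penalty, n_iters=200):
    """Minimise (1 / 2n) * ||y - Xw||^2 + penalty * ||w||_1 (no intercept)."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    residual = list(y)  # equals y - Xw while w is all zeros
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iters):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # a word that never varies carries no signal
            # Covariance of word j with the residual that excludes word j.
            rho = sum(
                X[i][j] * (residual[i] + X[i][j] * w[j]) for i in range(n)
            ) / n
            new_w = soft_threshold(rho, penalty) / col_sq[j]
            delta = new_w - w[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                w[j] = new_w
    return w


# Synthetic stand-in data (hypothetical, not from the article):
# 60 state-day observations, 30 standardised word frequencies,
# and a "poll margin" driven by only two of the words plus noise.
rng = random.Random(0)
n_obs, n_words = 60, 30
X = [[rng.gauss(0.0, 1.0) for _ in range(n_words)] for _ in range(n_obs)]
y = [2.0 * row[3] - 1.5 * row[7] + rng.gauss(0.0, 0.1) for row in X]

weights = lasso_coordinate_descent(X, y, penalty=0.2)
selected = [j for j, w in enumerate(weights) if w != 0.0]
print("selected word indices:", selected)
```

The L1 penalty sets most coefficients exactly to zero, so the words that keep nonzero weights can be read as the selected predictive features. In practice the penalty would typically be chosen by cross-validation rather than fixed by hand.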
Key Findings
- Twitter-based measures closely tracked existing opinion polls when properly modeled.
- These methods could be extended to estimate opinion in unpolled states and potentially to refine polling at sub-state or even near-real-time resolution.
Why It Matters
This work reveals the specific topics and events driving opinion shifts during campaigns, offering new insights into partisan attention differences and information processing patterns.
It provides a more accessible methodology for generating timely poll approximations using readily available social media data, a valuable tool for understanding political dynamics.