Aligning Votes and Speeches Reveals Shared Ideology and Distinct Issue Signals

Insights from the Field

scaling

latent factor

resampling

text analysis

Senate

Scaling Data from Multiple Sources was authored by Ted Enamorado, Gabriel Lopez-Moctezuma and Marc Ratkovic. It was published by Cambridge University Press in Political Analysis in 2021.

🔧 What the Method Does

Introduces a method for scaling two datasets from different sources by estimating a latent factor common to both and idiosyncratic factors unique to each source. The approach also lets the scaled locations depend on covariates and enables efficient inference via resampling.

📐 How the Model Handles Data and Inference

Models a shared latent factor that captures the subspace common to both datasets.
Simultaneously estimates idiosyncratic latent factors for each dataset to capture source-specific variation.
Permits scaled locations to be modeled as functions of covariates to increase flexibility and interpretability.
Uses an efficient implementation that supports inference through resampling techniques.
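
To make the shared-versus-idiosyncratic idea concrete, here is a minimal Python sketch on synthetic data. It is not the authors' estimator: it swaps in a simple principal-angles decomposition of the two datasets' leading row subspaces, omits covariates, and uses a plain bootstrap over columns (items) as a stand-in for the paper's resampling scheme. The function names (`simulate`, `shared_and_specific`, `bootstrap_shared`) and all parameter values are hypothetical choices made for illustration.

```python
import numpy as np

def simulate(n=100, p1=60, p2=80, noise=0.1, seed=0):
    """Toy data: one factor z shared by both sources, plus one factor unique to each."""
    rng = np.random.default_rng(seed)
    z, u1, u2 = rng.normal(size=(3, n))
    X1 = (np.outer(z, rng.normal(size=p1)) + np.outer(u1, rng.normal(size=p1))
          + noise * rng.normal(size=(n, p1)))
    X2 = (np.outer(z, rng.normal(size=p2)) + np.outer(u2, rng.normal(size=p2))
          + noise * rng.normal(size=(n, p2)))
    return X1, X2, z, u1, u2

def top_left_singular_vectors(X, k):
    """Column-center X and return its k leading left singular vectors (n x k)."""
    Xc = X - X.mean(axis=0, keepdims=True)
    U, _, _ = np.linalg.svd(Xc, full_matrices=False)
    return U[:, :k]

def shared_and_specific(X1, X2, k=2):
    """Assumed stand-in method: split two k-dimensional row subspaces into a
    shared direction and one source-specific direction per dataset."""
    U1 = top_left_singular_vectors(X1, k)
    U2 = top_left_singular_vectors(X2, k)
    # Singular values of U1'U2 are the cosines of the principal angles.
    A, cosines, Bt = np.linalg.svd(U1.T @ U2)
    # The best-aligned pair of directions estimates the shared factor.
    shared = U1 @ A[:, 0] + U2 @ Bt[0, :]
    shared /= np.linalg.norm(shared)
    # The least-aligned directions capture source-specific variation.
    specific1 = U1 @ A[:, -1]
    specific2 = U2 @ Bt[-1, :]
    return shared, specific1, specific2, cosines

def bootstrap_shared(X1, X2, n_boot=200, seed=0):
    """Percentile intervals for the shared scores, resampling columns (items)."""
    rng = np.random.default_rng(seed)
    point, *_ = shared_and_specific(X1, X2)
    draws = np.empty((n_boot, X1.shape[0]))
    for b in range(n_boot):
        cols1 = rng.integers(0, X1.shape[1], size=X1.shape[1])
        cols2 = rng.integers(0, X2.shape[1], size=X2.shape[1])
        est, *_ = shared_and_specific(X1[:, cols1], X2[:, cols2])
        # Factor signs are arbitrary, so align each draw with the point estimate.
        draws[b] = est * np.sign(est @ point)
    lower, upper = np.percentile(draws, [2.5, 97.5], axis=0)
    return point, lower, upper

X1, X2, z, u1, u2 = simulate()
shared, specific1, specific2, cosines = shared_and_specific(X1, X2)
print("abs. correlation of shared estimate with true z:",
      abs(np.corrcoef(shared, z)[0, 1]))
point, lower, upper = bootstrap_shared(X1, X2, n_boot=50)
print("mean width of 95% intervals:", float(np.mean(upper - lower)))
```

In this toy setup, a leading cosine near 1 signals a direction present in both sources, while the remaining directions are treated as source-specific. Resampling columns rather than rows keeps every unit (for example, every senator) in each bootstrap draw, so intervals can be read off per unit.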

🧪 Evidence from Simulations

A simulation study demonstrates that the proposed method outperforms existing alternatives in two respects:

Better recovery of the variation common to both datasets.
Improved identification of latent factors that are specific to each dataset.

🏛️ Applied Example: Votes and Speeches in the 112th U.S. Senate

Applied the method to roll-call voting and speech data from the 112th U.S. Senate.
Recovered a shared subspace that aligns with a standard ideological dimension running from liberals to conservatives.
Identified the words most strongly associated with each senator's position in that shared subspace.
Estimated a word-specific subspace that spans topics from national security to budget concerns.
Estimated a vote-specific subspace that places Tea Party senators at one extreme and senior committee leaders at the other.
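
As an illustration of how words might be tied to positions in a shared subspace, the sketch below ranks terms by the correlation between their usage and a one-dimensional score. This is an assumed, simplified procedure, not the paper's; the function `rank_words_by_association`, the placeholder vocabulary, and the synthetic counts are all hypothetical and do not come from the Senate data.

```python
import numpy as np

def rank_words_by_association(scores, word_matrix, vocab, top=2):
    """Return the words most negatively and most positively correlated
    with a one-dimensional score (rows = speakers, columns = words)."""
    s = (scores - scores.mean()) / scores.std()
    W = word_matrix - word_matrix.mean(axis=0)
    sd = W.std(axis=0)
    sd[sd == 0] = 1.0  # constant columns end up with correlation 0
    corr = (W / sd).T @ s / len(s)  # Pearson correlation for each word
    order = np.argsort(corr)
    negative_end = [(vocab[i], float(corr[i])) for i in order[:top]]
    positive_end = [(vocab[i], float(corr[i])) for i in order[::-1][:top]]
    return negative_end, positive_end

# Synthetic example: word rates depend log-linearly on the score.
rng = np.random.default_rng(1)
scores = rng.normal(size=200)
vocab = ["term_a", "term_b", "term_c", "term_d", "term_e"]  # placeholder words
loadings = np.array([1.0, -1.0, 0.0, 0.2, -0.2])            # assumed values
rates = np.exp(1.0 + 0.8 * np.outer(scores, loadings))
counts = rng.poisson(rates)

negative_end, positive_end = rank_words_by_association(scores, counts, vocab)
print("negative end:", negative_end)
print("positive end:", positive_end)
```

Correlation is only one possible association measure; a regression of word use on the score, or the model's own loadings, would serve the same purpose.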

⚖️ Why It Matters

Provides a practical, flexible way to combine data sources such as text and votes, uncovering both shared political dimensions and source-specific signals, with resampling-based inference suited to applied political science work.