What the Method Does
Introduces a method for scaling two datasets from different sources by estimating a latent factor common to both and idiosyncratic factors unique to each source. The approach also lets the scaled locations depend on covariates and enables efficient inference via resampling.
How the Model Handles Data and Inference
- Models a shared latent factor that captures the subspace common to both datasets.
- Simultaneously estimates idiosyncratic latent factors for each dataset to capture source-specific variation.
- Permits scaled locations to be modeled as functions of covariates to increase flexibility and interpretability.
- Uses an efficient implementation that supports inference through resampling techniques.
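The structure described in the bullets above can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the paper's estimator: it swaps in a simple stand-in (PCA on each dataset, then principal angles between the two score spaces, a CCA-like step) to separate one shared factor from one source-specific factor per dataset, and it uses a row bootstrap as a generic example of resampling-based inference. All names (`simulate`, `top_scores`, `split_factors`, `bootstrap_alignment`), the dimensions, and the noise level are hypothetical choices made for illustration.

```python
import numpy as np


def simulate(n=200, p1=30, p2=40, noise=0.5, seed=0):
    """Two data blocks driven by one shared factor plus one factor unique to each block."""
    rng = np.random.default_rng(seed)
    shared, own1, own2 = rng.standard_normal((3, n))
    X1 = (np.outer(shared, rng.standard_normal(p1))
          + np.outer(own1, rng.standard_normal(p1))
          + noise * rng.standard_normal((n, p1)))
    X2 = (np.outer(shared, rng.standard_normal(p2))
          + np.outer(own2, rng.standard_normal(p2))
          + noise * rng.standard_normal((n, p2)))
    return X1, X2, shared


def top_scores(X, k):
    """Orthonormal basis (n x k) for the leading k principal-component scores of X."""
    Xc = X - X.mean(axis=0)
    U, _, _ = np.linalg.svd(Xc, full_matrices=False)
    return U[:, :k]


def split_factors(X1, X2, k=2):
    """Stand-in estimator (assumption): principal angles between the two score spaces.

    The best-aligned pair of directions is treated as the shared factor;
    the least-aligned direction in each block is treated as source-specific.
    """
    U1, U2 = top_scores(X1, k), top_scores(X2, k)
    A, cosines, Bt = np.linalg.svd(U1.T @ U2)
    shared_hat = U1 @ A[:, 0] + U2 @ Bt[0]
    shared_hat /= np.linalg.norm(shared_hat)
    own1_hat = U1 @ A[:, -1]
    own2_hat = U2 @ Bt[-1]
    return shared_hat, own1_hat, own2_hat, cosines[0]


def bootstrap_alignment(X1, X2, k=2, draws=200, seed=1):
    """Percentile interval for the leading alignment, resampling units (rows) with replacement."""
    rng = np.random.default_rng(seed)
    n = X1.shape[0]
    stats = []
    for _ in range(draws):
        idx = rng.integers(0, n, size=n)  # the same resampled units are used in both blocks
        stats.append(split_factors(X1[idx], X2[idx], k)[3])
    return np.percentile(stats, [2.5, 97.5])


X1, X2, shared = simulate()
shared_hat, own1_hat, own2_hat, alignment = split_factors(X1, X2)
# The sign of a latent factor is arbitrary, so compare absolute correlations.
recovery = abs(np.corrcoef(shared_hat, shared)[0, 1])
lo, hi = bootstrap_alignment(X1, X2)
print(f"shared-factor recovery: {recovery:.3f}")
print(f"95% bootstrap interval for alignment: [{lo:.3f}, {hi:.3f}]")
```

Resampling rows jointly keeps each unit's two data sources paired, which is what lets the interval reflect uncertainty in the shared dimension. The sketch omits covariates; letting the scaled locations depend on covariates would require a model for the factor scores that is not specified here.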
Evidence from Simulations
A simulation study demonstrates that the proposed method outperforms existing alternatives in two respects:
- Better recovery of the variation common to both datasets.
- Improved identification of latent factors that are specific to each dataset.
Applied Example: Votes and Speeches in the 112th U.S. Senate
- Applied the method to roll-call voting and speech data from the 112th U.S. Senate.
- Recovered a shared subspace that aligns with a standard ideological dimension running from liberals to conservatives.
- Identified the words most strongly associated with each senator's position in that shared subspace.
- Estimated a word-specific subspace that spans topics from national security to budget concerns.
- Estimated a vote-specific subspace that places Tea Party senators at one extreme and senior committee leaders at the other.
Why It Matters
Provides a practical and flexible way to combine different data sources (e.g., text and votes) to uncover both shared political dimensions and source-specific signals, with resampling-based inference suited to applied political science work.