A long-standing challenge in political science is building reliable measures of political sophistication from the language of political communication. Existing measures are not well suited to this task.
The authors introduce a new approach: they collect crowdsourced comparisons of text snippets and fit a statistical model that incorporates part-of-speech counts and word-rarity measures derived from the Google Books Ngrams dataset.
Methodology & Features:
* Draws on thousands of crowdsourced comparisons of text snippets
* Incorporates features that earlier measures left out, such as part-of-speech counts and rare words identified through dynamic term frequencies from Google Books
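To make the word-rarity feature concrete, here is a minimal sketch of how a rare-word share could be computed for a snippet. The frequency table, the threshold, and the function name `rare_word_share` are all hypothetical stand-ins chosen for illustration; they are not the authors' values, and a real analysis would look frequencies up in the Google Books Ngrams data (possibly for the period in which the text was written, if that is what "dynamic" refers to).

```python
import re

# Hypothetical relative frequencies standing in for a reference table
# built from a large corpus. These numbers are made up for illustration.
REFERENCE_FREQ = {
    "the": 5e-2, "and": 2.8e-2, "we": 4e-3, "must": 6e-4,
    "people": 9e-4, "work": 7e-4, "together": 2e-4,
    "fiscal": 8e-6, "retrenchment": 1e-7,
    "necessitates": 3e-7, "prudence": 9e-7,
}

def rare_word_share(text, freq=REFERENCE_FREQ, threshold=1e-5):
    """Return the share of tokens whose reference frequency is below threshold.

    Words missing from the table are treated as frequency 0, i.e. rare.
    The threshold is an assumed cutoff, not a published one.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    rare = sum(1 for t in tokens if freq.get(t, 0.0) < threshold)
    return rare / len(tokens)

plain = "We must work together and the people must work"
dense = "Fiscal retrenchment necessitates prudence"

print(rare_word_share(plain))
print(rare_word_share(dense))
```

A share like this can then sit alongside part-of-speech counts as one input to the statistical model.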
This technique offers several advantages:
* Provides a measure specifically suited for political communication
* Is easy to apply and scale, and supports probabilistic comparisons across texts
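The probabilistic comparison mentioned above can be illustrated with a pairwise-comparison model. The sketch below assumes a Bradley-Terry-style logistic model fitted to feature differences; this summary does not name the authors' model, so treat the model choice, the two features (noun share and rare-word share), and the toy judgments as assumptions made for illustration only.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_pairwise(comparisons, n_features, lr=0.1, epochs=2000):
    """Fit coefficients by gradient ascent on the pairwise log-likelihood.

    comparisons: list of (features_a, features_b, a_judged_easier),
    where a_judged_easier is 1 if snippet A was judged easier, else 0.
    """
    beta = [0.0] * n_features
    for _ in range(epochs):
        grad = [0.0] * n_features
        for xa, xb, a_won in comparisons:
            diff = [a - b for a, b in zip(xa, xb)]
            p = sigmoid(sum(w * d for w, d in zip(beta, diff)))
            for j, d in enumerate(diff):
                grad[j] += (a_won - p) * d
        beta = [w + lr * g / len(comparisons) for w, g in zip(beta, grad)]
    return beta

def prob_easier(beta, xa, xb):
    """Modelled probability that snippet A is easier than snippet B."""
    return sigmoid(sum(w * (a - b) for w, a, b in zip(beta, xa, xb)))

# Toy crowdsourced judgments (invented data).
# Hypothetical features per snippet: [noun share, rare-word share].
comparisons = [
    ([0.20, 0.05], [0.30, 0.15], 1),
    ([0.25, 0.02], [0.28, 0.12], 1),
    ([0.35, 0.20], [0.22, 0.04], 0),
    ([0.30, 0.10], [0.24, 0.03], 0),
    ([0.22, 0.08], [0.26, 0.06], 1),
    ([0.27, 0.05], [0.21, 0.09], 1),
]

beta = fit_pairwise(comparisons, n_features=2)
p = prob_easier(beta, [0.20, 0.02], [0.30, 0.20])
print(beta, p)
```

Because the output is a probability rather than a raw score, two texts can be compared directly ("A is easier than B with probability p"), and the fitted coefficients can be applied to new texts without collecting further judgments.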
Demonstration:
The authors reanalyze the State of the Union corpus with the new measure and show that it leads to different conclusions about sophistication levels.
What's Next?
The measure lets researchers model and compare text complexity as a function of covariates of interest.