Open-ended survey responses are rarely collected in political science research and, when they are, typically require human coding. This article introduces the Structural Topic Model (STM), a machine learning method that automatically analyzes text while incorporating document-level information such as author gender or political affiliation.
Key Innovation: STM draws on recent advances in topic modeling and adds auxiliary information about each document to improve interpretation.
This approach offers survey researchers and experimentalists a powerful alternative to human coding.
Analysis Advantages:
* Makes open-ended responses easier to interpret, revealing themes missed through manual coding alone
* Estimates treatment effects from text data, complementing traditional analysis methods
We demonstrate these features by analyzing survey data and experimentally collected political text.
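To make the idea of a treatment effect on text concrete, here is a minimal sketch in Python. It is not the STM estimator itself, which incorporates document-level information inside the model. It assumes that per-document topic proportions have already been estimated and simply compares the mean prevalence of one topic between treated and control responses. The function name `topic_prevalence_effect`, the matrix `theta`, and all values are hypothetical and serve only as illustration.

```python
from statistics import mean


def topic_prevalence_effect(theta, treated, topic):
    """Difference in mean proportion of one topic: treated minus control.

    theta   -- list of rows, one per document, each a list of topic proportions
    treated -- list of booleans, True if the document came from the treatment group
    topic   -- index of the topic of interest
    """
    treated_vals = [row[topic] for row, t in zip(theta, treated) if t]
    control_vals = [row[topic] for row, t in zip(theta, treated) if not t]
    if not treated_vals or not control_vals:
        raise ValueError("need at least one treated and one control document")
    return mean(treated_vals) - mean(control_vals)


# Hypothetical topic proportions for four responses over two topics.
# Each row sums to 1; these numbers are made up for illustration.
theta = [
    [0.75, 0.25],  # treated
    [0.50, 0.50],  # treated
    [0.25, 0.75],  # control
    [0.50, 0.50],  # control
]
treated = [True, True, False, False]

effect = topic_prevalence_effect(theta, treated, topic=0)
print(f"Estimated effect on topic 0 prevalence: {effect:.2f}")
```

A plain difference in means ignores the uncertainty in the estimated topic proportions, so it should be read only as a simplified picture of what a treatment effect on topic prevalence measures.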