Can Speech Recognition Replace Human Transcripts in Political Text Analysis?
Insights from the Field
speech recognition
text-as-data
bag-of-words
WERSIM
R
Methodology
Pol. An.
12 R files
5 archives
1 PDF file
Dataverse
"Testing the Validity of Automatic Speech Recognition for Political Text Analysis" was authored by Sven-Oliver Proksch, Christopher Wratil, and Jens Wäckerle and published by Cambridge University Press in Political Analysis in 2019.

🔍 What This Paper Does

Examines the validity of automatic speech recognition (ASR) for quantitative political text analysis, focusing on how ASR transcripts perform with standard bag-of-words methods when human transcription is unavailable or prohibitively expensive.
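
For readers new to the bag-of-words setting, the sketch below shows the kind of representation these methods operate on: a document-feature matrix of word counts. The quanteda package and the two example sentences are assumptions chosen for illustration, not details taken from the paper.

```r
# Minimal bag-of-words sketch: short texts become a document-feature matrix
# of word counts. quanteda is assumed here as a common text-as-data package;
# the paper's own replication code may use different tooling.
library(quanteda)

texts <- c(speech1 = "we must invest in schools and hospitals",
           speech2 = "cut taxes and let schools and hospitals compete")

dfm_mat <- texts |>
  tokens() |>
  dfm()

print(dfm_mat)  # rows = documents, columns = counts of each word
```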

🧾 Where This Matters

  • Political speech sources that often lack routine transcripts: parliamentary speeches, party conferences, television interviews and talk shows, and other recorded political events
  • Contexts where on-demand human transcription is cost-prohibitive for research projects

🧪 How Validity Was Tested

  • Introduces a novel word error rate simulation (WERSIM) procedure to probe how transcription errors affect downstream bag-of-words analyses
  • Implements WERSIM in R and uses it to simulate varying levels of ASR error (the general logic is sketched after this list)
  • Applies quantitative text-analysis workflows to ASR-generated transcripts to evaluate robustness to transcription noise
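
For context, word error rate (WER) is the standard accuracy measure for transcripts: the number of substituted, deleted, and inserted words divided by the number of words in the reference transcript. The sketch below illustrates the general logic of such a simulation in base R; it is a simplified stand-in rather than the authors' WERSIM code, and the transcript, vocabulary, and quantity of interest are invented for the example.

```r
# A minimal base-R sketch of a WERSIM-style robustness check (illustrative
# only, not the authors' implementation): inject word errors into a reference
# transcript at a target word error rate and re-compute a simple
# bag-of-words quantity of interest.
set.seed(42)

perturb_transcript <- function(words, wer, vocabulary) {
  n_err <- round(wer * length(words))
  if (n_err == 0) return(words)
  idx <- sample(seq_along(words), n_err)
  # Treat roughly half of the simulated errors as deletions, the rest as
  # substitutions with random vocabulary words.
  del <- idx[seq_len(floor(n_err / 2))]
  sub <- setdiff(idx, del)
  words[sub] <- sample(vocabulary, length(sub), replace = TRUE)
  if (length(del) > 0) words <- words[-del]
  words
}

transcript <- strsplit(
  "the minister promised new funding for schools and hospitals", " ")[[1]]
vocab <- c("budget", "minister", "schools", "reform", "taxes", "the", "and")

# Quantity of interest: relative frequency of "schools" as WER increases.
sapply(c(0, 0.1, 0.2, 0.4), function(w) {
  mean(replicate(500, mean(perturb_transcript(transcript, w, vocab) == "schools")))
})
```

If the quantity of interest stays stable as simulated WER rises, results from ASR transcripts are less likely to be artifacts of transcription error; large swings signal that conclusions are fragile.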

📈 Key Findings

  • Demonstrates that ASR-generated transcripts can be analyzed with bag-of-words models to address open questions in political science
  • Illustrates this with two substantive applications covering different kinds of political speech
  • Shows that systematic robustness checks (via WERSIM) are essential for interpreting results from ASR-derived text

⚠️ Limitations and Practical Challenges

  • Accuracy of ASR varies by context, speaker, audio quality, and language, which affects downstream inferences
  • ASR does not eliminate the need for validation: researchers must assess error sensitivity for their specific research designs
  • Practical hurdles include model choice, preprocessing decisions, and integration with existing text-as-data pipelines

🔧 Tools Provided

  • An R implementation of WERSIM and a workflow demonstrating how to combine ASR transcripts with bag-of-words text-analysis methods (a hypothetical sketch of such a workflow follows below)
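
Below is a hypothetical sketch of what such a workflow can look like: read ASR transcript files, build a document-feature matrix with quanteda, and count dictionary terms. The directory name, file layout, and dictionary are placeholders rather than details from the replication materials.

```r
# Hypothetical workflow sketch: ASR transcripts -> document-feature matrix ->
# dictionary counts. Paths and the dictionary are placeholders, not taken
# from the paper's replication files.
library(quanteda)

asr_files <- list.files("asr_transcripts", pattern = "\\.txt$", full.names = TRUE)
texts <- vapply(asr_files,
                function(f) paste(readLines(f, warn = FALSE), collapse = " "),
                character(1))

corp    <- corpus(texts, docnames = basename(asr_files))
dfm_asr <- corp |>
  tokens(remove_punct = TRUE) |>
  tokens_tolower() |>
  dfm()

econ_dict <- dictionary(list(economy = c("budget", "tax*", "spend*", "deficit")))
dfm_lookup(dfm_asr, econ_dict)  # economy-related word counts per transcript
```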

Why it matters: Expands the set of usable political speech sources for text-as-data research while offering a practical framework to evaluate and report the risks posed by transcription error.

Find on Google Scholar
Find on JSTOR
Find on CUP
Political Analysis