🔎 What Was Investigated
Matching for causal inference is well understood for low-dimensional data, but standard approaches break down for text documents. High dimensionality makes exact matching infeasible, propensity scores can pair documents that are not comparable, and match quality is difficult to assess. The study frames text matching as two design choices, the text representation and the distance metric, and asks how these choices affect both the quantity and the quality of matches.
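The two-choice framing can be sketched in a few lines of Python. This is a minimal illustration, not the study's procedure: the bag-of-words representation, cosine distance, nearest-neighbor matching with replacement, the caliper value, and the function names `bag_of_words`, `cosine_distance`, and `match` are all assumptions chosen for brevity, and the four short documents are invented.

```python
import math
from collections import Counter

def bag_of_words(text):
    """Representation choice (assumed): raw term counts over lowercase tokens."""
    return Counter(text.lower().split())

def cosine_distance(a, b):
    """Distance choice (assumed): 1 minus cosine similarity of two count vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0  # treat an empty document as maximally distant
    return 1.0 - dot / (norm_a * norm_b)

def match(treated, control, represent, distance, caliper):
    """Pair each treated document with its nearest control document
    (with replacement), keeping the pair only if it lies within the caliper."""
    control_vecs = [represent(doc) for doc in control]
    pairs = []
    for i, doc in enumerate(treated):
        vec = represent(doc)
        dists = [distance(vec, c) for c in control_vecs]
        j = min(range(len(dists)), key=dists.__getitem__)
        if dists[j] <= caliper:
            pairs.append((i, j, dists[j]))
    return pairs

# Invented toy documents, for illustration only.
treated = ["senate passes budget bill", "storm hits coastal town"]
control = ["senate votes on budget bill", "local team wins championship game"]

pairs = match(treated, control, bag_of_words, cosine_distance, caliper=0.5)
loose_pairs = match(treated, control, bag_of_words, cosine_distance, caliper=1.0)
print(len(pairs), len(loose_pairs))
```

Because `represent` and `distance` are passed in as arguments, either choice can be swapped independently. Loosening the caliper admits the second treated document even though it shares no terms with any control document, which is the quantity-versus-quality trade-off the study examines.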
🧪 How Methods Were Compared
Text-matching procedures were compared in a systematic multifactor evaluation experiment with human subjects. Key features of the evaluation:
- Over 100 unique text-matching methods were evaluated, along with 5 comparison methods drawn from the literature.
- Human judgments were collected to assess the subjective quality of individual matched pairs.
- A predictive model was developed to estimate match quality for document pairs as a function of the various distance scores derived from the tested methods.
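One way to picture the predictive model in the last item is as a regression of human ratings on the distance scores for each document pair. The summary does not state the model's functional form, so the ordinary least-squares fit below is an assumption, as are the helper names `fit_ols` and `predict`, the two-distance setup, and the rating scale. The numbers are invented and constructed to lie exactly on a plane so that the fit is easy to check; they are not results from the study.

```python
def fit_ols(X, y):
    """Least-squares fit of y on an intercept plus the columns of X,
    solved via the normal equations with Gauss-Jordan elimination.
    Assumes the features are not collinear (nonzero pivots)."""
    rows = [[1.0] + list(r) for r in X]
    k = len(rows[0])
    # Augmented matrix [A^T A | A^T y]
    aug = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * t for r, t in zip(rows, y))]
        for i in range(k)
    ]
    for c in range(k):
        # Partial pivoting for numerical stability
        p = max(range(c, k), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        pivot = aug[c][c]
        aug[c] = [v / pivot for v in aug[c]]
        for r in range(k):
            if r != c:
                factor = aug[r][c]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[c])]
    return [aug[i][k] for i in range(k)]

def predict(coef, scores):
    """Predicted match quality for one pair, given its distance scores."""
    return coef[0] + sum(c * s for c, s in zip(coef[1:], scores))

# Invented data: each row holds two hypothetical distance scores for a pair;
# ratings follow 10 - 4*d1 - 5*d2 exactly, for illustration only.
distances = [(0.1, 0.2), (0.5, 0.4), (0.9, 0.8), (0.3, 0.9), (0.7, 0.1)]
ratings = [8.6, 6.0, 2.4, 4.3, 6.7]

coef = fit_ols(distances, ratings)
new_pair_quality = predict(coef, (0.2, 0.3))
print(coef, new_pair_quality)
```

Once fitted on rated pairs, a model of this kind can score pairs produced by a new matching procedure without additional human ratings, which is the use described under Key Findings.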
📈 Key Findings
- Certain combinations of text representation and distance metric produced matches with higher subjective match quality than current state-of-the-art techniques.
- The predictive model successfully mimics human judgment and can predict match quality from distance scores, enabling approximate or unsupervised evaluation of new procedures.
- Both the number of matches identified and their subjective quality vary substantially with the choice of representation and distance metric.
🧩 Demonstrations of Use
Two applications illustrate practical benefits of the identified best method:
- Media bias: Text matching was used to control for topic selection when comparing news articles from thirteen news sources, clarifying comparisons across outlets.
- Observational causal inference: Conditioning on text data in a study of a medical intervention produced more precise causal estimates.
🔚 Why This Matters
The work provides a practical framework for matching documents by separating the choice of representation from the choice of distance metric. It identifies methods that improve subjective match quality and offers a predictive tool that approximates human match judgments, making text-based causal inference more reliable and easier to evaluate without extensive manual labeling.