MIDAS: Deep Learning That Fixes Missing Data Fast
Insights from the Field
multiple imputation
denoising autoencoder
missing data
surveys
deep learning
Methodology
Pol. An.
12 R files
1255 datasets
12 other files
12 PDF files
1 archives
1 text files
2 LaTeX files
Dataverse
"The MIDAS Touch: Accurate and Scalable Missing-Data Imputation With Deep Learning" was written by Ranjit Lall and Thomas Robinson and published in Political Analysis (Cambridge University Press) in 2022.

🔍 What This Paper Introduces

Multiple imputation is a widely used, principled approach for handling missing values but often breaks down on very large or complex datasets. MIDAS (Multiple Imputation with Denoising Autoencoders) offers an accurate, fast, and scalable alternative by adapting a class of unsupervised neural networks—denoising autoencoders—to the imputation task.

🧠 How MIDAS Works

MIDAS repurposes denoising autoencoders by treating missing entries as an additional form of input corruption. During training, the network receives the corrupted data and learns to reconstruct the originally observed values, so reconstruction error is measured only on the observed portion of the data. Once training has minimized that error, multiple imputations are drawn from the trained model.

📋 Key Features and Procedure

  • Reformulates multiple imputation using denoising autoencoders.
  • Treats missing values as corrupted data during training and draws multiple imputations from the reconstruction distribution.
  • Optimizes a loss that focuses on reconstructing the originally observed values, ensuring imputations align with observed structure.
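
The procedure above can be illustrated with a minimal NumPy sketch. This is not the authors' software: it assumes a single hidden layer, plain gradient descent, and zero-filling of missing cells, and the names `train_dae`, `draw_imputations`, `hidden`, and `corrupt_rate` are hypothetical. It also assumes that variation across imputations comes from re-applying random input corruption at draw time, a simplified stand-in for how the trained model produces multiple draws.

```python
import numpy as np


def forward(X0, keep, params):
    """One pass through the autoencoder on corrupted inputs."""
    W1, b1, W2, b2 = params
    Xc = X0 * keep                     # corrupted input (dropped cells become 0)
    H = np.tanh(Xc @ W1 + b1)          # hidden representation
    return Xc, H, H @ W2 + b2          # reconstruction of every cell


def train_dae(X, hidden=8, corrupt_rate=0.5, epochs=300, lr=0.05, seed=0):
    """Fit the autoencoder; X is a 2-D float array with NaN for missing cells."""
    rng = np.random.default_rng(seed)
    observed = ~np.isnan(X)            # True where a value was recorded
    X0 = np.where(observed, X, 0.0)    # missing cells enter the network as zeros
    d = X0.shape[1]
    params = [rng.normal(0, 0.1, (d, hidden)), np.zeros(hidden),
              rng.normal(0, 0.1, (hidden, d)), np.zeros(d)]
    for _ in range(epochs):
        # Extra corruption: randomly zero out a share of the inputs each pass
        keep = rng.random(X0.shape) > corrupt_rate
        Xc, H, Xhat = forward(X0, keep, params)
        # Gradient of the mean squared error over observed cells only
        G = 2.0 * (Xhat - X0) * observed / observed.sum()
        dZ = (G @ params[2].T) * (1.0 - H ** 2)   # backprop through tanh
        grads = [Xc.T @ dZ, dZ.sum(axis=0), H.T @ G, G.sum(axis=0)]
        for p, g in zip(params, grads):
            p -= lr * g                # in-place gradient descent step
    return params


def draw_imputations(X, params, m=5, corrupt_rate=0.5, seed=1):
    """Return m completed copies of X; observed cells are never altered."""
    rng = np.random.default_rng(seed)
    observed = ~np.isnan(X)
    X0 = np.where(observed, X, 0.0)
    completed = []
    for _ in range(m):
        keep = rng.random(X0.shape) > corrupt_rate
        _, _, Xhat = forward(X0, keep, params)
        completed.append(np.where(observed, X, Xhat))
    return completed


if __name__ == "__main__":
    rng = np.random.default_rng(42)
    z = rng.normal(size=(200, 1))
    X = z + 0.3 * rng.normal(size=(200, 4))   # four correlated columns
    X[rng.random(X.shape) < 0.2] = np.nan     # remove about 20% of cells
    params = train_dae(X)
    completed = draw_imputations(X, params, m=5)
    print(len(completed), completed[0].shape)
```

Multiplying the error by the `observed` mask is what keeps the loss focused on originally observed values: missing cells contribute nothing to the gradient, yet the network still produces a reconstruction for them that can be used as an imputation.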

📈 Tests on Simulated and Real Social Science Data

Systematic evaluations include both simulations and empirical social science datasets. An applied example uses a large-scale electoral survey to demonstrate performance in a real-world setting.

  • Findings show MIDAS delivers strong accuracy across a range of missingness patterns and data complexities.
  • MIDAS demonstrates computational efficiency and scalability relative to common multiple imputation approaches.

⚙️ Practical Takeaways and Tools

  • MIDAS provides a practical route to multiply impute large, high-dimensional datasets that challenge traditional methods.
  • Open-source software is provided to implement MIDAS in applied settings, enabling replication and adoption.
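
Multiply imputing a dataset yields several completed copies, and an analysis is usually run on each copy and then pooled. The summary above does not cover that step, so the following is only a sketch of the standard pooling formulas (Rubin's rules) for a single scalar estimate. The function name `pool_rubin` and its arguments are hypothetical and not part of any MIDAS software.

```python
import numpy as np


def pool_rubin(estimates, variances):
    """Pool M estimates and their squared standard errors with Rubin's rules."""
    q = np.asarray(estimates, dtype=float)
    u = np.asarray(variances, dtype=float)
    m = q.size
    q_bar = q.mean()                   # pooled point estimate
    within = u.mean()                  # average within-imputation variance
    between = q.var(ddof=1)            # variance of estimates across imputations
    total = within + (1.0 + 1.0 / m) * between
    return q_bar, np.sqrt(total)       # pooled estimate and its standard error


if __name__ == "__main__":
    # Hypothetical coefficient estimates from three completed datasets
    est, se = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
    print(est, se)
```

The between-imputation term is what carries uncertainty about the missing values into the final standard error, which is why a single imputed dataset tends to understate it.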

🔎 Why It Matters

MIDAS bridges principled multiple imputation and modern deep learning, offering political scientists and social researchers a scalable tool to handle missing data in large surveys and complex datasets without sacrificing accuracy.
