Data Augmentation and Disaggregation

Data Science

Neal Fultz

Research Data Scientist / Statistician at UCLA SSC

Machine learning models can be very powerful, but many data sets are released only in aggregated form, which precludes fitting such models to them directly. Various heuristics can bridge the gap, but they are typically domain-specific. The data augmentation (DA) algorithm, a classic tool from Bayesian computation, applies more generally. We will present a brief review of DA and show how to apply it to disaggregation problems. We will also discuss a case study on disaggregating daily pricing data, along with an R package that provides a reference implementation.
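
To make the idea concrete, here is a minimal sketch of DA applied to a toy disaggregation problem. It is not the talk's case study or the R package; the model, the function name `da_disaggregate`, and all parameter names are assumptions chosen for illustration. The assumed setup: each reporting period contains a known number of days of each type (say weekdays and weekend days), each day of type j contributes Poisson counts with unknown rate `lam[j]`, and only the period total is observed. The sampler alternates between imputing the unobserved per-type counts given the current rates and drawing new rates given the imputed counts.

```python
import numpy as np


def da_disaggregate(totals, exposure, n_iter=1000, burn_in=200,
                    prior_shape=1.0, prior_rate=1.0, seed=0):
    """Toy data augmentation sampler for aggregated Poisson counts.

    Assumed model (illustrative only): period i contains exposure[i, j]
    days of type j, each such day contributes Poisson(lam[j]) counts, and
    only the period total totals[i] is observed. Rates get independent
    Gamma(prior_shape, prior_rate) priors.
    """
    rng = np.random.default_rng(seed)
    totals = np.asarray(totals, dtype=int)
    exposure = np.asarray(exposure, dtype=float)

    lam = np.ones(exposure.shape[1])   # arbitrary starting rates
    lam_sum = np.zeros_like(lam)       # running sums for posterior means
    z_sum = np.zeros_like(exposure)
    kept = 0

    for it in range(n_iter):
        # Imputation step: given lam, independent Poisson counts conditioned
        # on their sum are multinomial with probabilities proportional to
        # their means, exposure[i, j] * lam[j].
        weights = exposure * lam
        probs = weights / weights.sum(axis=1, keepdims=True)
        z = np.array([rng.multinomial(t, p) for t, p in zip(totals, probs)])

        # Posterior step: with the counts completed, the Gamma prior is
        # conjugate, so each rate is drawn from its Gamma full conditional.
        lam = rng.gamma(prior_shape + z.sum(axis=0),
                        1.0 / (prior_rate + exposure.sum(axis=0)))

        if it >= burn_in:
            lam_sum += lam
            z_sum += z
            kept += 1

    # Posterior mean rates and posterior mean disaggregated counts.
    return lam_sum / kept, z_sum / kept
```

Calling `da_disaggregate(totals, exposure)` with a vector of period totals and a periods-by-types matrix of day counts returns estimated rates and, for each period, an estimated split of the total across day types. Each imputed split sums exactly to its observed total, which is the property that makes this approach attractive for disaggregation. The rates are only identifiable here because the mix of day types varies between periods, and every period must contain at least one day.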