Many modern causal questions ask how treatments affect complex outcomes that are measured using wearable devices and sensors. Current analysis approaches require summarizing these data into scalar statistics (e.g., the mean), but such summaries can be misleading: very different distributions can share the same mean, variance, and other statistics. Researchers can avoid this loss of information by representing the data as distributions instead. We develop an interpretable method for distributional data analysis that supports trustworthy and robust decision-making: Analyzing Distributional Data via Matching After Learning to Stretch (ADD MALTS). We (i) provide analytical guarantees of the correctness of our estimation strategy, (ii) demonstrate via simulation that ADD MALTS outperforms other distributional data analysis methods at estimating treatment effects, and (iii) illustrate ADD MALTS' ability to verify whether treatment and control units within a subpopulation are similar enough for treatment effects to be estimated credibly. We demonstrate ADD MALTS' utility by studying the effectiveness of continuous glucose monitors in mitigating diabetes risks.
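To make the point about scalar summaries concrete, the following minimal sketch builds two synthetic outcome distributions with the same mean and variance and compares them through their quantile functions. It is an illustration only, not the ADD MALTS estimator: the helper names (`quantile_grid`, `standardize`, `wasserstein1`), the mixture parameters, and the choice of the 1-Wasserstein distance are all assumptions made for this example.

```python
from statistics import NormalDist, fmean, pstdev


def quantile_grid(dist, n):
    """Return n evenly spaced quantiles of dist (a deterministic 'sample')."""
    return [dist.inv_cdf((i + 0.5) / n) for i in range(n)]


def standardize(xs):
    """Shift and scale the values so they have mean 0 and variance 1."""
    m, s = fmean(xs), pstdev(xs)
    return [(x - m) / s for x in xs]


def wasserstein1(xs, ys):
    """1-Wasserstein distance between two equal-size empirical distributions:
    the average absolute gap between their sorted values (matched quantiles)."""
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes equal sample sizes")
    gaps = [abs(a - b) for a, b in zip(sorted(xs), sorted(ys))]
    return fmean(gaps)


n = 2000
half = n // 2

# Hypothetical unit A: readings spread around a single typical value.
unimodal = standardize(quantile_grid(NormalDist(0.0, 1.0), n))

# Hypothetical unit B: readings split between a low and a high regime.
bimodal = standardize(
    quantile_grid(NormalDist(-0.9, 0.3), half)
    + quantile_grid(NormalDist(0.9, 0.3), half)
)

# The scalar summaries agree (up to floating-point error) ...
mean_gap = abs(fmean(unimodal) - fmean(bimodal))
sd_gap = abs(pstdev(unimodal) - pstdev(bimodal))

# ... but the distributions themselves are clearly different.
distance = wasserstein1(unimodal, bimodal)

print(f"difference in means:   {mean_gap:.2e}")
print(f"difference in std dev: {sd_gap:.2e}")
print(f"1-Wasserstein distance: {distance:.3f}")
```

An analysis that compared only the mean or the standard deviation would treat these two units as identical, whereas a comparison of the full quantile functions separates them.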