To generate accurate and reliable predictions, modern AI systems need to combine data from multiple modalities, such as text, images, audio, spreadsheets, and time series. Multi-modal data introduces new opportunities and challenges for disentangling uncertainty: it is commonly assumed in the machine learning community that epistemic uncertainty can be reduced by collecting more data, while aleatoric uncertainty is irreducible. However, this assumption is challenged in modern AI systems when information is obtained from different modalities. This paper introduces a novel data acquisition framework in which uncertainty disentanglement leads to actionable decisions, enabling sampling along two axes: sample size and data modality. The main hypothesis is that aleatoric uncertainty decreases as the number of modalities increases, while epistemic uncertainty decreases as more observations are collected. We provide proof-of-concept implementations on two multi-modal datasets to showcase our data acquisition framework, which combines ideas from active learning, active feature acquisition, and uncertainty quantification.
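The acquisition logic described above can be illustrated with a minimal sketch. It assumes an ensemble-based decomposition, in which total predictive entropy splits into aleatoric uncertainty (the mean entropy of the ensemble members) and epistemic uncertainty (the mutual information between predictions and model parameters); the function names and the threshold rule are illustrative, not the paper's actual implementation.

```python
import numpy as np

def disentangle_uncertainty(probs):
    """Decompose predictive uncertainty for one input from an ensemble.

    probs: array of shape (n_members, n_classes), each row one ensemble
    member's predictive distribution. Returns (total, aleatoric,
    epistemic) in nats, where epistemic = total - aleatoric.
    """
    eps = 1e-12  # numerical guard for log(0)
    mean_p = probs.mean(axis=0)
    # Total uncertainty: entropy of the averaged prediction.
    total = -np.sum(mean_p * np.log(mean_p + eps))
    # Aleatoric uncertainty: average entropy of individual members.
    aleatoric = -np.mean(np.sum(probs * np.log(probs + eps), axis=1))
    # Epistemic uncertainty: disagreement between members (mutual information).
    epistemic = total - aleatoric
    return total, aleatoric, epistemic

def acquisition_decision(probs):
    """Hypothetical decision rule: sample along the axis whose
    uncertainty component dominates."""
    _, aleatoric, epistemic = disentangle_uncertainty(probs)
    if epistemic >= aleatoric:
        return "collect more observations"   # reduce epistemic uncertainty
    return "acquire another modality"        # reduce aleatoric uncertainty

# Two members that disagree sharply: epistemic uncertainty dominates,
# so the rule recommends collecting more observations.
disagreeing = np.array([[0.9, 0.1], [0.1, 0.9]])
print(acquisition_decision(disagreeing))
```

Members that agree on a noisy prediction (e.g. both outputting `[0.8, 0.2]`) would instead trigger the modality branch, since their disagreement term vanishes while the per-member entropy stays high.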