Real-world applications in which multiple sensors observe an event are expected to make continuously available predictions, even when information is intermittently missing. We explore ensemble learning and sensor fusion methods that exploit the redundancy and shared information among four camera views, applied to the task of hand activity classification for autonomous driving. In particular, we show that a late-fusion approach between parallel convolutional neural networks can outperform even the best-placed single-camera model. To enable this approach, we propose a scheme for handling missing information, and we then compare this late-fusion approach with alternative methods such as weighted majority voting and model combination schemes.
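The abstract does not spell out the missing-information scheme or the fusion architecture, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes each camera's network emits a vector of class probabilities, that a missing view is represented as `None`, and that per-view weights are supplied by the caller (for example, from validation accuracy). The function names `fuse_probabilities` and `weighted_majority_vote` are hypothetical. The first function uses a simple weighted average as a stand-in for late fusion, and the second shows weighted majority voting over the views that are present.

```python
import numpy as np


def fuse_probabilities(view_probs, weights):
    """Weighted average of per-view class probabilities.

    Missing views (None) are skipped and the remaining weights are
    renormalized. This is an assumed stand-in for late fusion, not the
    scheme proposed in the paper.
    """
    available = [(p, w) for p, w in zip(view_probs, weights) if p is not None]
    if not available:
        raise ValueError("no camera view available")
    probs = np.array([p for p, _ in available], dtype=float)
    w = np.array([w for _, w in available], dtype=float)
    w = w / w.sum()  # renormalize over the views that are present
    return w @ probs


def weighted_majority_vote(view_probs, weights, num_classes):
    """Each available view votes for its top class with its weight."""
    votes = np.zeros(num_classes)
    seen = 0
    for p, w in zip(view_probs, weights):
        if p is None:
            continue  # a missing view casts no vote
        votes[int(np.argmax(p))] += w
        seen += 1
    if seen == 0:
        raise ValueError("no camera view available")
    return int(np.argmax(votes))


# Four camera views, three hypothetical activity classes; view 2 is missing.
views = [[0.7, 0.2, 0.1], None, [0.2, 0.5, 0.3], [0.6, 0.3, 0.1]]
weights = [1.0, 1.0, 1.0, 1.0]

fused = fuse_probabilities(views, weights)
fused_class = int(np.argmax(fused))
voted_class = weighted_majority_vote(views, weights, num_classes=3)
```

Both functions still return a prediction as long as at least one view is available, which is the property the abstract calls continuously available prediction. A learned late-fusion layer would replace the fixed weighted average used here.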