With the dramatic advances in deep learning, machine learning research in both basic and applied settings now aims to improve the interpretability of model predictions as well as prediction performance. Although deep learning models achieve much higher prediction performance than traditional machine learning models, the process by which they arrive at a specific prediction remains difficult to interpret or explain. This is known as the black-box problem of machine learning models, and it is recognized as particularly important across a wide range of fields: manufacturing, commerce, robotics, and other industries where such technology has become commonplace, as well as medicine, where mistakes are not tolerated. This bulletin is based on a summary of the author's dissertation. The research summarized there focuses on the attention mechanism, which has drawn considerable interest in recent years, and discusses its potential from two angles: basic research, in terms of improving prediction performance and interpretability, and applied research, in terms of evaluating it in real-world applications using large datasets beyond the laboratory environment. The dissertation concludes with a summary of the implications of these findings for subsequent research and the future prospects of the field.