Recent advances in deep learning models and techniques have led to significant strides in performance across diverse tasks and modalities. However, while the overall capabilities of models show promising growth, our understanding of their internal reasoning processes remains limited, particularly concerning systematic inconsistencies, i.e., recurring patterns of logical or inferential flaws. These inconsistencies may manifest as contradictory outputs, failures to generalize across similar tasks, or erroneous conclusions in specific contexts. Even detecting and measuring such reasoning discrepancies is challenging, as they may arise from opaque internal procedures, biases and imbalances in the training data, or the inherent complexity of the task. Without effective methods to detect, measure, and mitigate these errors, there is a risk of deploying models that are biased, exploitable, or logically unreliable. This thesis addresses these issues by developing novel methods for deep learning models that reason over knowledge graphs, natural language, and images. It contributes two techniques for detecting and quantifying predictive inconsistencies originating from opaque internal procedures in natural language and image processing models. To mitigate inconsistencies stemming from biases in training data, the thesis presents a data-efficient sampling method that improves fairness and performance, as well as a synthetic dataset generation approach for low-resource scenarios. Finally, the thesis offers two techniques for optimizing models on complex reasoning tasks; these methods enhance performance while allowing for more faithful and interpretable exploration and exploitation during inference. Collectively, this thesis provides a comprehensive framework for improving the robustness, fairness, and interpretability of deep learning models across diverse tasks and modalities.