Artificial Neural Networks (ANNs), including fully-connected networks and transformers, are highly flexible and powerful function approximators, widely applied in fields such as computer vision and natural language processing. However, their inability to inherently respect causal structures can limit their robustness, making them vulnerable to covariate shift and difficult to interpret and explain. This poses significant challenges for their reliability in real-world applications. In this paper, we introduce Causal Transformers (CaTs), a general model class designed to operate under predefined causal constraints, as specified by a Directed Acyclic Graph (DAG). CaTs retain the powerful function approximation abilities of traditional neural networks while adhering to the underlying structural constraints, improving robustness, reliability, and interpretability at inference time. This approach opens new avenues for deploying neural networks in more demanding, real-world scenarios where robustness and explainability are critical.
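One plausible way to make a transformer respect a DAG is to mask its attention so that each variable's representation can only depend on that variable's parents in the graph. The sketch below is purely illustrative and is not the paper's actual CaT architecture; the function name, the weight matrices `Wq`/`Wk`/`Wv`, and the parent-plus-self masking convention are all assumptions introduced for this example.

```python
import numpy as np

def dag_masked_attention(x, adj, Wq, Wk, Wv):
    """Illustrative single-head attention constrained by a DAG.

    Token i may attend only to its parents (adj[i, j] = 1 means
    j is a parent of i) and to itself, so information cannot flow
    from non-ancestors into a variable's representation.

    x:   (n, d) array, one embedding per variable/node
    adj: (n, n) DAG adjacency matrix
    """
    n, d = x.shape
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    scores = q @ k.T / np.sqrt(d)
    mask = adj.astype(bool) | np.eye(n, dtype=bool)  # parents + self
    scores = np.where(mask, scores, -np.inf)         # forbid other edges
    # softmax over the allowed positions only
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ v
```

With a chain DAG A → B → C, perturbing C leaves B's output unchanged, since C is not among B's parents; this is the kind of structural constraint the abstract refers to, though the full model would stack such layers and train them end to end.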