At the core of causal inference lies the challenge of determining reliable causal graphs from observational data alone. Because the well-known backdoor criterion depends on the graph, any errors in the graph can propagate downstream to effect inference. In this work, we first show that complete graph information is not necessary for causal effect inference; the topological order over graph variables (the causal order) alone suffices. Further, given a node pair, causal order is easier to elicit from domain experts than graph edges are, since determining whether an edge exists can depend heavily on other variables. Interestingly, we find that the same principle holds for Large Language Models (LLMs) such as GPT-3.5-turbo and GPT-4, motivating an automated method to obtain causal order (and hence causal effect) with LLMs acting as virtual domain experts. To this end, we employ different prompting strategies and contextual cues to propose a robust technique for obtaining causal order from LLMs. Acknowledging LLMs' limitations, we also study techniques for integrating LLMs with established causal discovery algorithms, including constraint-based and score-based methods, to enhance their performance. Extensive experiments demonstrate that our approach significantly improves causal ordering accuracy compared to discovery algorithms, highlighting the potential of LLMs to enhance causal inference across diverse fields.
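The claim that a causal order alone suffices for effect inference can be illustrated with a minimal sketch. It assumes there are no unobserved confounders, in which case the variables that precede the treatment in a valid topological order include all of its parents and none of its descendants, and so form a backdoor adjustment set. The function name `adjustment_set_from_order` and the toy variables (`age`, `smoking`, `tar`, `cancer`) are hypothetical and chosen only for illustration; this is not necessarily the procedure used in the paper.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def adjustment_set_from_order(causal_order, treatment, outcome):
    """Return the variables preceding `treatment` in a causal order.

    Assumption (not from the source): no unobserved confounders.
    Returns None when `outcome` precedes `treatment`, since the
    treatment then cannot be a cause of the outcome.
    """
    if treatment not in causal_order or outcome not in causal_order:
        raise ValueError("treatment and outcome must appear in the causal order")
    t_idx = causal_order.index(treatment)
    o_idx = causal_order.index(outcome)
    if o_idx < t_idx:
        return None
    # Everything before the treatment: contains all of its parents,
    # and cannot contain any of its descendants.
    return set(causal_order[:t_idx])


# Hypothetical toy DAG, written as node -> set of parents.
toy_parents = {
    "age": set(),
    "smoking": {"age"},
    "tar": {"smoking"},
    "cancer": {"tar", "age"},
}

# Any valid topological order of the toy DAG will do; the edges themselves
# are never passed to adjustment_set_from_order.
order = list(TopologicalSorter(toy_parents).static_order())
backdoor_set = adjustment_set_from_order(order, "smoking", "cancer")
print(backdoor_set)
```

In this toy example the function sees only the order, never the edges. Because `age` must come before `smoking` in every valid order, while `tar` and `cancer` must come after it, the resulting set does not depend on which valid order is supplied.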