Graphs are commonly used to represent and visualize causal relations. For a small number of variables, this approach provides a succinct and clear view of the scenario at hand. As the number of variables under study increases, the graphical approach may become impractical and the clarity of the representation is lost. Clustering of variables is a natural way to reduce the size of the causal diagram, but if implemented arbitrarily it may erroneously change the essential properties of the causal relations. We define a specific type of cluster, called a transit cluster, that is guaranteed to preserve the identifiability properties of causal effects under certain conditions. We provide a sound and complete algorithm for finding all transit clusters in a given graph and demonstrate how clustering can simplify the identification of causal effects. We also study the inverse problem, where one starts with a clustered graph and looks for extended graphs in which the identifiability properties of causal effects remain unchanged. We show that this kind of structural robustness is closely related to transit clusters.