Textual event records, such as alarm logs, have become an increasingly common data source in engineering and manufacturing systems. Beyond identifying correlations or recurring patterns, engineers often want to understand which types of events causally trigger or influence other events during system operation. Textual event descriptions may contain semantic clues about such causal relationships, and recent large language models (LLMs) provide a promising tool for extracting these signals. However, relying solely on LLM-encoded textual information is insufficient for accurate causal discovery: semantic patterns do not directly reveal causal mechanisms, and they may confuse causation with correlation or with frequent sequential patterns. To address these challenges, we propose \textbf{LMT}, a Bayesian causal discovery framework for engineering event data that jointly leverages textual descriptions and timestamps. Specifically, LMT first uses LLMs to extract semantic causal signals from event descriptions and constructs a prior distribution over causal graphs among event types or event clusters. It then incorporates temporal evidence through a Poisson-process-based likelihood, so that the LLM-informed prior is refined by timestamp-based statistical evidence. By integrating textual and temporal information, LMT produces a causal graph that is both interpretable and data-supported. Simulation studies show that the proposed framework is effective across different settings and is especially advantageous in small-sample alarm-event scenarios.
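To make the prior-times-likelihood idea concrete, the following is a minimal sketch for a single candidate edge A → B. It is not the LMT model itself, whose exact prior and likelihood are not specified in this abstract. The sketch assumes that an LLM-derived prior probability for the edge is already available (the value passed in is hypothetical), that B follows a piecewise-constant Poisson process whose rate may differ inside a fixed window after each A event, and that the unknown rates have conjugate Gamma(a, b) priors so the marginal likelihood is available in closed form. All function names, the window length, and the toy timestamps are illustrative assumptions.

```python
import math


def log_marginal(n, exposure, a=1.0, b=1.0):
    """Log marginal likelihood of n events over `exposure` time units for a
    homogeneous Poisson process with a Gamma(shape=a, rate=b) prior on its rate."""
    return (a * math.log(b) - math.lgamma(a)
            + math.lgamma(a + n) - (a + n) * math.log(b + exposure))


def split_exposure(cause_times, effect_times, horizon, window):
    """Split [0, horizon] into time lying within `window` after a cause event
    ("exposed") and the remainder, and count effect events in each part."""
    merged = []  # union of the intervals [t, t + window], clipped to the horizon
    for t in sorted(cause_times):
        start, stop = t, min(t + window, horizon)
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], stop)
        else:
            merged.append([start, stop])
    t_exposed = sum(stop - start for start, stop in merged)
    n_exposed = sum(any(start < e <= stop for start, stop in merged)
                    for e in effect_times)
    return n_exposed, t_exposed, len(effect_times) - n_exposed, horizon - t_exposed


def posterior_edge_prob(prior, cause_times, effect_times, horizon, window):
    """Posterior probability of the edge cause -> effect, for a prior in (0, 1)."""
    n1, t1, n0, t0 = split_exposure(cause_times, effect_times, horizon, window)
    # Edge model: separate rates for exposed and unexposed time.
    log_edge = log_marginal(n1, t1) + log_marginal(n0, t0)
    # No-edge model: one common rate over the whole horizon.
    log_null = log_marginal(n1 + n0, t1 + t0)
    log_odds = math.log(prior / (1.0 - prior)) + log_edge - log_null
    return 1.0 / (1.0 + math.exp(-log_odds))


# Toy timestamps (illustrative only, not real alarm data).
alarm_a = [10, 30, 50, 70, 90]
b_follows_a = [11, 31, 51, 71, 91]   # each B occurs shortly after an A
b_unrelated = [5, 25, 45, 65, 85]    # B never falls in a window after A

p_follow = posterior_edge_prob(0.2, alarm_a, b_follows_a, horizon=100, window=2)
p_unrelated = posterior_edge_prob(0.5, alarm_a, b_unrelated, horizon=100, window=2)
print(p_follow, p_unrelated)
```

In this toy setting the temporal evidence raises a skeptical prior for the first pair and lowers a neutral prior for the second, which is the kind of refinement described above. Note that this simplified edge model responds to any change in B's rate after A, whether an increase or a decrease, and that it scores one pair in isolation rather than a full graph.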