Directional relational event data, such as email data, often contain both unicast messages (i.e., messages from one sender to one receiver) and multicast messages (i.e., messages from one sender to multiple receivers). In the Enron email data, which is the focus of this paper, 31% of the messages are multicast. Multicast messages contain important information about the roles of actors in the network, which is needed for a better understanding of social interaction dynamics. In this paper, a multiplicative latent factor model is proposed to analyze such relational data. For a given message, all potential receivers are placed on a suitability scale, and those actors whose suitability score exceeds a threshold value are included in the receiver set. Unobserved heterogeneity in social interaction behavior is captured using a multiplicative latent factor structure with latent variables for actors (which differ for actors as senders and as receivers) and latent variables for individual messages. A Bayesian computational algorithm based on Gibbs sampling is proposed for model fitting. Model assessment is done using posterior predictive checks. Based on our analyses of the Enron email data, an mc-amen model with a two-dimensional latent variable can accurately capture the empirical distribution of the cardinality of the receiver set and the composition of the receiver sets for commonly observed messages. Moreover, the results show that actors have comparable (but not identical) roles as senders and as receivers in the network.
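The thresholding mechanism described above can be illustrated with a minimal sketch. It is not the mc-amen model itself: the function names, the standard normal noise, the threshold of zero, and in particular the way the message-level latent variable `w` enters the multiplicative term are all assumptions made for illustration, since the abstract does not specify them. The sketch also does not enforce a non-empty receiver set.

```python
import random


def suitability_scores(sender, u, v, w, rng):
    """Latent suitability score of every potential receiver for one message.

    u[i] : latent factors of actor i as a sender   (hypothetical layout)
    v[j] : latent factors of actor j as a receiver (hypothetical layout)
    w    : latent factors of the message           (assumed to rescale each dimension)
    """
    scores = {}
    for j, v_j in enumerate(v):
        if j == sender:
            continue  # a sender is not a potential receiver of its own message
        # Multiplicative latent term: sender factor x message factor x receiver factor.
        latent = sum(a * m * b for a, m, b in zip(u[sender], w, v_j))
        # Assumed standard normal noise on the suitability scale.
        scores[j] = latent + rng.gauss(0.0, 1.0)
    return scores


def receiver_set(scores, threshold=0.0):
    """Actors whose suitability score exceeds the threshold."""
    return sorted(j for j, s in scores.items() if s > threshold)


rng = random.Random(1)
n_actors, dim = 5, 2  # two-dimensional latent variables, as in the abstract
u = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_actors)]
v = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_actors)]
w = [rng.gauss(0.0, 1.0) for _ in range(dim)]

scores = suitability_scores(0, u, v, w, rng)
receivers = receiver_set(scores)
print(receivers)  # one receiver means unicast, several mean multicast
```

Keeping separate factor lists `u` and `v` mirrors the statement that an actor's latent variables differ between the sender and receiver roles.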