Directional relational event data, such as email data, often contain both unicast messages (i.e., messages from one sender to one receiver) and multicast messages (i.e., messages from one sender to multiple receivers). In the Enron email data, which are the focus of this paper, 31% of the messages are multicast. Multicast messages contain important information about the roles of actors in the network, which is needed to better understand social interaction dynamics. In this paper, a multiplicative latent factor model is proposed to analyze such relational data. For a given message, all potential receivers are placed on a suitability scale, and those actors whose suitability score exceeds a threshold value are included in the receiver set. Unobserved heterogeneity in social interaction behavior is captured using a multiplicative latent factor structure with latent variables for actors (which differ for actors as senders and as receivers) and latent variables for individual messages. A Bayesian computational algorithm, which relies on Gibbs sampling, is proposed for model fitting. Model assessment is done using posterior predictive checks. Based on our analyses of the Enron email data, an mc-amen model with a two-dimensional latent variable can accurately capture the empirical distribution of the cardinality of the receiver set and the composition of the receiver sets for commonly observed messages. Moreover, the results show that actors have comparable (but not identical) roles as senders and as receivers in the network.
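To make the thresholding idea concrete, the following is a minimal sketch of how a receiver set could be generated from latent suitability scores. It is an illustration under stated assumptions, not the mc-amen model as specified in the paper. The function name `simulate_receiver_set`, the additive combination of sender and message factors, the standard normal noise, and the default threshold of zero are all hypothetical choices made for this example.

```python
import numpy as np

def simulate_receiver_set(sender, U, V, w, threshold=0.0, rng=None):
    """Return the indices of actors whose suitability exceeds the threshold.

    sender    : index of the sending actor
    U         : (n, k) latent sender factors, one row per actor
    V         : (n, k) latent receiver factors, one row per actor
    w         : (k,) latent factor of this particular message
    threshold : cut-off on the suitability scale (assumed value)
    """
    rng = np.random.default_rng() if rng is None else rng
    n = V.shape[0]
    # Assumed multiplicative form: the receiver factors are multiplied with
    # the sum of the sender factor and the message factor, plus normal noise.
    score = V @ (U[sender] + w) + rng.standard_normal(n)
    # A sender is not a candidate receiver of its own message.
    score[sender] = -np.inf
    return np.flatnonzero(score > threshold)

# Example: 6 actors, 2-dimensional latent variables, actor 0 sends a message.
rng = np.random.default_rng(1)
U = rng.standard_normal((6, 2))
V = rng.standard_normal((6, 2))
w = rng.standard_normal(2)
receivers = simulate_receiver_set(0, U, V, w, rng=rng)
print(receivers)  # one receiver: unicast; several receivers: multicast
```

Note that this sketch does not force the receiver set to be non-empty, so a draw with no receivers is possible here.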