Previous models that use distributed representations to learn the semantic vectors of items and their groups, such as words, sentences, nodes, and graphs, have assumed that each item corresponds to a single vector whose dimensions correspond to hidden contexts in the target. Multiple senses of an item are represented either by assigning a vector to each domain in which the item may appear or by reflecting the context in the sense of the item. However, an item may have multiple distinct senses that change or evolve dynamically as contexts shift or novel contexts emerge, even within one domain, much as a living entity evolves with environmental shifts. Setting the scope of item disambiguation for sensemaking, the author presents a method in which a word or item in the data holds multiple semantic vectors that evolve via interaction with others, similar to a cell holding chromosomes that cross over with each other. Two preliminary results were obtained: (1) the role of a word that evolves to acquire the largest or lower-middle variance of semantic vectors tends to be explainable by the author of the text; (2) the epicenters of earthquakes that acquire larger variance via crossover, corresponding to interaction with diverse areas of the Earth's crust, are likely to correspond to the epicenters of forthcoming large earthquakes.
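The idea of an item holding several semantic vectors that cross over with those of other items can be illustrated with a minimal sketch. The abstract does not specify the crossover operator, how interacting items are selected, or how variance is measured, so everything below is an assumption made for illustration: one-point crossover, a random choice of which vector each item contributes, and total variance across an item's vectors as the spread measure. The names `crossover`, `evolve`, and `spread` are hypothetical and not taken from the paper.

```python
import numpy as np


def crossover(a, b, cut):
    """One-point crossover (assumed operator): swap the tails of a and b after index `cut`."""
    child_a = np.concatenate([a[:cut], b[cut:]])
    child_b = np.concatenate([b[:cut], a[cut:]])
    return child_a, child_b


def evolve(items, pairs, rng):
    """Cross one randomly chosen vector from each item in every interacting pair.

    items: dict mapping an item name to an array of shape (K, D),
           i.e. K semantic vectors of dimension D (D >= 2).
    pairs: iterable of (name_i, name_j) for distinct items that interact,
           for example two words that co-occur (assumed interaction rule).
    """
    for i, j in pairs:
        vi = rng.integers(items[i].shape[0])      # which vector of item i takes part
        vj = rng.integers(items[j].shape[0])      # which vector of item j takes part
        cut = rng.integers(1, items[i].shape[1])  # cut point in 1..D-1
        items[i][vi], items[j][vj] = crossover(items[i][vi], items[j][vj], cut)
    return items


def spread(vectors):
    """Total variance across an item's vectors (assumed variance measure)."""
    return float(vectors.var(axis=0).sum())


# Hypothetical usage: two items, each starting with three identical vectors.
rng = np.random.default_rng(0)
items = {"x": np.zeros((3, 4)), "y": np.ones((3, 4))}
evolve(items, [("x", "y")], rng)
print({name: spread(vecs) for name, vecs in items.items()})
```

In this toy setup each item starts with zero spread, and a single crossover with a different item makes the spread positive. That matches the abstract's description only loosely: items that interact with more diverse partners would accumulate larger variance among their vectors.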