Scholars often explore literature outside their home community of study, but this exploration is frequently hampered by field-specific jargon. Past computational work has often supported this translation work by removing jargon through simplification and summarization; here, we explore a different approach that preserves jargon as a useful bridge to new conceptual spaces. Specifically, we cast different scholarly domains as different language-using communities, and investigate how techniques from unsupervised cross-lingual alignment of word embeddings can be adapted to surface conceptual alignments between domain-specific word embedding spaces. We developed a prototype cross-domain search engine that uses aligned domain-specific embeddings to support conceptual exploration, and tested this prototype in two case studies. We discuss qualitative insights into the promises and pitfalls of this approach to translation work, and suggest design insights for future interfaces that provide computational support for cross-domain information seeking.
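To make the idea of aligning two domain-specific embedding spaces concrete, the following is a minimal sketch and not the prototype's actual implementation. It swaps in a simpler, supervised step (orthogonal Procrustes over anchor words assumed to be shared by both domains) for the unsupervised alignment the abstract refers to; Procrustes is commonly used as a refinement step in such pipelines. The vocabulary, the embeddings, and the helper names `procrustes_align` and `nearest_terms` are all hypothetical, and the data is synthetic.

```python
import numpy as np


def procrustes_align(src, tgt):
    """Return the orthogonal matrix W that minimizes ||src @ W - tgt||_F."""
    u, _, vt = np.linalg.svd(src.T @ tgt)
    return u @ vt


def nearest_terms(query_vec, tgt_matrix, tgt_vocab, k=3):
    """Rank target-domain terms by cosine similarity to a mapped query vector."""
    q = query_vec / np.linalg.norm(query_vec)
    t = tgt_matrix / np.linalg.norm(tgt_matrix, axis=1, keepdims=True)
    scores = t @ q
    top = np.argsort(-scores)[:k]
    return [(tgt_vocab[i], float(scores[i])) for i in top]


# Synthetic stand-in for two domain embedding spaces (assumption: in practice
# each would be trained separately on a domain-specific corpus).
rng = np.random.default_rng(0)
dim, n_words = 5, 20
vocab = [f"term_{i}" for i in range(n_words)]  # hypothetical vocabulary
tgt_emb = rng.normal(size=(n_words, dim))

# Build the "source domain" as a rotated copy of the target space, so the
# correct alignment is known and the sketch can be checked.
true_rotation, _ = np.linalg.qr(rng.normal(size=(dim, dim)))
src_emb = tgt_emb @ true_rotation.T

# Assumption: the first 15 terms are anchors whose meaning is shared across
# domains; the remaining 5 are held out.
anchor_ids = np.arange(15)
W = procrustes_align(src_emb[anchor_ids], tgt_emb[anchor_ids])

# Map a held-out source-domain term into the target space and look up its
# closest target-domain terms.
held_out = 17
matches = nearest_terms(src_emb[held_out] @ W, tgt_emb, vocab)
print(matches)
```

In this toy setup the source space is an exact rotation of the target space, so the learned map recovers that rotation. Real domain-specific embeddings would align only approximately, and the imperfect matches are where qualitative inspection of the neighbors becomes relevant.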