Crowdsourcing has been used to collect data at scale in numerous fields. Triplet similarity comparison is a type of crowdsourcing task in which crowd workers are asked ``among three given objects, which two are more similar?'', a question that is relatively easy for humans to answer. However, the comparison can sometimes be based on multiple views, i.e., different independent attributes such as color and shape, and each view may lead to a different result for the same three objects. Although an algorithm for producing multiview embeddings was proposed in prior work, it has at least two limitations: (1) it cannot independently predict multiview embeddings for a new sample, and (2) it does not account for the fact that different people may prefer different views. In this study, we propose an end-to-end inductive deep learning framework to solve the multiview representation learning problem. We collected two datasets from a crowdsourcing platform to experimentally compare the performance of our proposed approach with that of conventional baseline methods. The results show that our proposed method can obtain multiview embeddings of any object, in which each view corresponds to an independent attribute of the object.
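As a toy illustration of why the same three objects can receive different triplet answers under different views, the following sketch uses hand-picked two-dimensional embeddings for a color view and a shape view. The object names, the embedding values, and the helper `most_similar_pair` are all hypothetical; they are not learned by, or taken from, the proposed framework.

```python
from itertools import combinations
import math

# Hypothetical per-view embeddings (illustrative values only).
# In the color view, the two red objects lie close together;
# in the shape view, the two circles lie close together.
EMBEDDINGS = {
    "color": {
        "red_circle": (1.0, 0.0),
        "red_square": (0.9, 0.1),
        "blue_circle": (0.0, 1.0),
    },
    "shape": {
        "red_circle": (1.0, 0.0),
        "red_square": (0.0, 1.0),
        "blue_circle": (0.9, 0.1),
    },
}


def most_similar_pair(view, triplet):
    """Return the two objects in `triplet` with the smallest Euclidean distance in `view`."""
    emb = EMBEDDINGS[view]
    return min(
        combinations(triplet, 2),
        key=lambda pair: math.dist(emb[pair[0]], emb[pair[1]]),
    )


triplet = ("red_circle", "red_square", "blue_circle")
color_answer = most_similar_pair("color", triplet)
shape_answer = most_similar_pair("shape", triplet)

print("color view:", color_answer)
print("shape view:", shape_answer)
```

With these assumed values, the color view pairs the two red objects while the shape view pairs the two circles, so a worker's answer to the triplet question depends on which view they attend to.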