TextANIMAR: Text-based 3D Animal Fine-Grained Retrieval

Trung-Nghia Le, Tam V. Nguyen, Minh-Quan Le, Trong-Thuan Nguyen, Viet-Tham Huynh, Trong-Le Do, Khanh-Duy Le, Mai-Khiem Tran, Nhat Hoang-Xuan, Thang-Long Nguyen-Ho, Vinh-Tiep Nguyen, Tuong-Nghiem Diep, Khanh-Duy Ho, Xuan-Hieu Nguyen, Thien-Phuc Tran, Tuan-Anh Yang, Kim-Phat Tran, Nhu-Vinh Hoang, Minh-Quang Nguyen, E-Ro Nguyen, Minh-Khoi Nguyen-Nhat, Tuan-An To, Trung-Truc Huynh-Le, Nham-Tan Nguyen, Hoang-Chau Luong, Truong Hoai Phong, Nhat-Quynh Le-Pham, Huu-Phuc Pham, Trong-Vu Hoang, Quang-Binh Nguyen, Hai-Dang Nguyen, Akihiro Sugimoto, Minh-Triet Tran

From arXiv. Accepted to Computers and Graphics (3DOR, Journal Track)

3D object retrieval is an important yet challenging task that has drawn increasing attention in recent years. While existing approaches have made progress, they are often limited to restricted settings such as image and sketch queries, which are unfriendly ways for common users to interact. To overcome these limitations, this paper presents a novel SHREC challenge track focusing on text-based fine-grained retrieval of 3D animal models. Unlike previous SHREC challenge tracks, the proposed task is considerably more challenging, requiring participants to develop innovative approaches to text-based retrieval. Despite the increased difficulty, we believe this task can drive useful applications in practice and facilitate more intuitive interactions with 3D objects. Five groups participated in our competition, submitting a total of 114 runs. While the results obtained are satisfactory, the challenges presented by this task are far from fully solved. We therefore provide insights into potential areas for future research and improvement. We believe this track can help push the boundaries of 3D object retrieval and facilitate more user-friendly interactions via vision-language technologies. https://aichallenge.hcmus.edu.vn/textanimar
