ZeroSearch: Local Image Search from Text with Zero Shot Learning

Organizing and finding images in a user's directory has become increasingly difficult as the number of images captured on personal devices grows rapidly. This paper presents a solution that uses zero-shot learning to build image queries from user-provided text descriptions alone. Its primary contribution is an algorithm that uses pre-trained models to extract features from images, uses OWL to check for the presence of bounding boxes, and sorts images by cosine similarity score. The output is a list of images in descending order of similarity, which helps users locate specific images more efficiently. Experiments were conducted on a custom dataset that simulates a user's image directory, and the models were evaluated on accuracy, inference time, and size. The VGG model achieved the highest accuracy, while the ResNet50 and InceptionV3 models had the lowest inference time and size. The proposed algorithm provides an effective and efficient way to organize and find images in a user's local directory, and its performance and flexibility make it suitable for various applications, including personal image organization and search engines. Code and dataset for zero-search are available at: https://github.com/NainaniJatinZ/zero-search
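The ranking step described in the abstract (score each image by cosine similarity, then list images in descending order) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `cosine_similarity` and `rank_images`, the three-dimensional feature vectors, and the file names are all hypothetical, and the abstract does not state exactly which vectors are compared. In practice the vectors would come from a pre-trained model such as those evaluated in the paper.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as having no similarity
    return dot / (norm_a * norm_b)

def rank_images(query_vec, image_vecs):
    """Return (name, score) pairs sorted by descending similarity."""
    scored = [
        (name, cosine_similarity(query_vec, vec))
        for name, vec in image_vecs.items()
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)

# Hypothetical feature vectors standing in for model outputs.
features = {
    "beach.jpg": [0.9, 0.1, 0.0],
    "dog.jpg": [0.1, 0.9, 0.2],
    "city.jpg": [0.0, 0.2, 0.9],
}
query = [0.2, 1.0, 0.1]  # hypothetical query embedding

ranking = rank_images(query, features)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Because cosine similarity depends only on the angle between vectors, images whose features point in nearly the same direction as the query rank first regardless of vector magnitude.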

