We present PAPERCLIP (Proposal Abstracts Provide an Effective Representation for Contrastive Language-Image Pre-training), a method which associates astronomical observations imaged by telescopes with natural language using a neural network model. The model is fine-tuned from a pre-trained Contrastive Language-Image Pre-training (CLIP) model using successful observing proposal abstracts and corresponding downstream observations, with the abstracts optionally summarized via guided generation using large language models (LLMs). Using observations from the Hubble Space Telescope (HST) as an example, we show that the fine-tuned model embodies a meaningful joint representation between observations and natural language through tests targeting image retrieval (i.e., finding the most relevant observations using natural language queries) and description retrieval (i.e., querying for astrophysical object classes and use cases most relevant to a given observation). Our study demonstrates the potential for using generalist foundation models rather than task-specific models for interacting with astronomical data by leveraging text as an interface.
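As a rough illustration of the two ingredients mentioned above, the following sketch shows a CLIP-style symmetric contrastive objective over paired image and text embeddings, and retrieval by cosine similarity in the shared embedding space. It is a minimal sketch and not the authors' implementation: the function names (`clip_contrastive_loss`, `retrieve_top_k`), the toy three-dimensional vectors, and the fixed temperature of 0.07 are assumptions made for illustration only. In practice the embeddings would come from the fine-tuned image and text encoders.

```python
import math

def normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]

def cosine_matrix(image_embs, text_embs):
    """Cosine similarity between every image embedding and every text embedding."""
    imgs = [normalize(v) for v in image_embs]
    txts = [normalize(v) for v in text_embs]
    return [[sum(a * b for a, b in zip(i, t)) for t in txts] for i in imgs]

def _diagonal_cross_entropy(logits):
    """Mean of -log softmax(row)[i] over rows i, i.e. the matched pair is the target."""
    total = 0.0
    for i, row in enumerate(logits):
        m = max(row)  # subtract the max for numerical stability
        log_z = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_z - row[i]
    return total / len(logits)

def clip_contrastive_loss(image_embs, text_embs, temperature=0.07):
    """Symmetric contrastive loss: pair i of images and texts is the positive."""
    sims = cosine_matrix(image_embs, text_embs)
    logits = [[s / temperature for s in row] for row in sims]
    logits_t = [list(col) for col in zip(*logits)]  # text-to-image direction
    return 0.5 * (_diagonal_cross_entropy(logits) + _diagonal_cross_entropy(logits_t))

def retrieve_top_k(query_emb, candidate_embs, k=3):
    """Return (index, cosine similarity) for the k candidates closest to the query."""
    q = normalize(query_emb)
    scores = [sum(a * b for a, b in zip(q, normalize(c))) for c in candidate_embs]
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    return [(i, scores[i]) for i in order]

# Toy stand-ins for encoder outputs (hypothetical values, not real embeddings).
toy_image_embs = [[1.0, 0.1, 0.0], [0.0, 1.0, 0.2], [0.1, 0.0, 1.0]]
toy_text_embs = [[0.9, 0.0, 0.1], [0.1, 0.8, 0.0], [0.0, 0.2, 0.9]]

if __name__ == "__main__":
    print("loss on matched pairs:", clip_contrastive_loss(toy_image_embs, toy_text_embs))
    # Image retrieval: use the second text embedding as the natural language query.
    print("top images for query 1:", retrieve_top_k(toy_text_embs[1], toy_image_embs, k=2))
```

Description retrieval follows the same pattern with the roles swapped: an observation's image embedding serves as the query, and the candidates are text embeddings of object classes or use cases.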