Radio astronomy is an indispensable discipline for observing distant celestial objects. Measurements of wave signals from radio telescopes, called visibilities, must be transformed into images for astronomical observation. The resulting dirty images blend information from real sources with artifacts, so astronomers usually perform reconstruction before imaging to obtain cleaner images. Existing methods consider only a single modality of the sparse visibility data, leaving residual artifacts in the images and insufficiently modeling correlations. To enhance the extraction of visibility information and emphasize output quality in the image domain, we propose VVTRec, a multimodal radio interferometric data reconstruction method with visibility-guided visual and textual modality enrichment. In VVTRec, sparse visibilities are transformed into image-form and text-form features that enrich spatial and semantic information, improving the structural integrity and accuracy of the reconstructed images. We further leverage Vision-Language Models (VLMs) to achieve additional training-free performance gains: VVTRec enables sparse visibility, a foreign modality unseen by VLMs, to accurately extract pre-trained knowledge as a supplement. Our experiments demonstrate that VVTRec effectively enhances imaging results by exploiting multimodal information without introducing excessive computational overhead.