CodeCytos: AI-assisted spatial molecular imaging analysis via code-augmented agent action space

Conventional tissue image analysis software provides foundational capabilities for cellular analysis, including segmentation, basic morphological feature extraction, and spatial organization analysis. However, these tools often require manual intervention and are not well integrated with code-driven automation, limiting efficiency and scalability for complex spatial tissue studies. In addition, they offer limited flexibility for custom analyses, as they typically support only a fixed set of pre-implemented spatial cellular features. To address these limitations, we propose CodeCytos, a coding-based reasoning agent framework that enables dynamic, programmable interaction with spatial molecular imaging data to improve automation and customization. CodeCytos is designed to streamline the exploration of custom spatial cellular features and adapt to diverse research needs. We demonstrate its utility through case studies on four expert-curated datasets from distinct tissue types: frontal cortex, non-small-cell lung cancer, pancreas, and tonsil. We evaluate CodeCytos under a realistic minimal prompt setting, where bioscientists pose simple questions without task-specific instructions or contextual information about spatial cellular analysis, and benchmark multiple LLM backbones with strong coding capabilities. We further show that incorporating tailored, domain-agnostic few-shot in-context coding-reasoning examples (randomly sampled demonstrations outside the spatial analysis domain) can substantially improve performance without requiring costly, expert-crafted in-domain demonstrations. Overall, CodeCytos outperforms baseline approaches, highlighting the potential of code-action agents to assist with custom feature exploration in spatial molecular imaging and to accelerate biomarker discovery.
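
To make the idea of a code-augmented action space concrete, the following is a minimal sketch and not the CodeCytos implementation. It assumes the agent's action is a string of Python source that is executed against cell-level data, and that the printed output is returned as the observation. The names `stub_model`, `run_code_action`, and `CELLS`, the toy coordinates, and the chosen feature (mean distance from each tumor cell to its nearest immune cell) are all hypothetical and used only for illustration; a real system would call an LLM backbone and run the generated code in a sandbox.

```python
import contextlib
import io
import math

# Hypothetical toy data: one (x, y, cell_type) tuple per segmented cell.
CELLS = [
    (0.0, 0.0, "tumor"),
    (3.0, 4.0, "immune"),
    (10.0, 0.0, "tumor"),
    (10.0, 1.0, "immune"),
]


def stub_model(question: str) -> str:
    """Stand-in for an LLM backbone: returns Python source as the 'action'.

    A real agent would generate this code from the question; here it is
    hard-coded so the example runs without any model access.
    """
    return (
        "dists = []\n"
        "for x, y, t in cells:\n"
        "    if t == 'tumor':\n"
        "        dists.append(min(math.hypot(x - u, y - v)\n"
        "                         for u, v, s in cells if s == 'immune'))\n"
        "result = sum(dists) / len(dists)\n"
        "print(f'mean tumor-to-immune distance: {result:.2f}')\n"
    )


def run_code_action(code: str, cells):
    """Execute generated code in a small namespace and capture what it prints."""
    namespace = {"cells": cells, "math": math}
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(code, namespace)  # illustration only: no sandboxing is applied
    # The captured text plays the role of the observation fed back to the agent.
    return namespace.get("result"), buffer.getvalue()


question = "How close are tumor cells to immune cells?"
result, observation = run_code_action(stub_model(question), CELLS)
```

Because the feature is computed by generated code rather than selected from a fixed menu, a different question would only require different source text from the model; the execution step stays the same.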
