Post-Launch Capability Expansion of Vision-Language Models via Prompting for On-Orbit Spacecraft Inspection

Perception models for spaceborne inspection are typically deployed before launch, after which updating model weights or expanding a fixed label set becomes operationally impractical. Supervised models can be integrated pre-flight, but adding new semantic capabilities in orbit requires retraining and re-uploading parameters. We investigate whether prompt-driven vision--language models can enable post-launch semantic expansion, allowing new spacecraft components to be specified via natural-language prompts without modifying onboard weights. We evaluate zero-shot instance segmentation of spacecraft components under a strictly frozen, single-pass inference protocol on a test set of $129$ images of previously unseen satellites. With fixed global thresholds and no post-processing, SAM3 achieves $0.385$ mAP@$0.5$ and $0.267$ mAP@$0.5{:}0.95$. Performance is strongly scale-dependent: large structural elements such as spacecraft bodies ($0.639$ AP@$0.5$) and solar arrays ($0.598$ AP@$0.5$) are localized reliably, while relatively small appendages such as antennas ($0.221$ AP@$0.5$) and thrusters ($0.081$ AP@$0.5$) remain difficult. Prompt formulation also influences performance: structured prompts that incorporate spatial and geometric descriptors yield an improvement of up to $82\%$ over short category-name prompts. The model operates within the memory and compute envelope of contemporary embedded GPUs. These results suggest that prompt-driven grounding can provide a practical mechanism for post-launch semantic extension to dominant spacecraft structures, while highlighting the limitations of zero-shot localization for fine-scale components under orbital domain shift.
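For readers unfamiliar with the reported metrics, the following is a minimal sketch of how AP at a single IoU threshold and its average over the thresholds $0.5{:}0.95$ can be computed for one class. It is an illustration under simplifying assumptions, not the evaluation code behind the numbers above: masks are represented as sets of pixel coordinates, all predictions and ground truths are assumed to come from a single image, and AP uses all-point interpolation rather than the sampled interpolation of standard benchmark tooling. The names `mask_iou`, `average_precision`, `ap_over_thresholds`, and `rect_mask` are hypothetical.

```python
from typing import List, Set, Tuple

Mask = Set[Tuple[int, int]]  # assumption: a mask is a set of (row, col) pixels


def rect_mask(r0: int, c0: int, r1: int, c1: int) -> Mask:
    """Hypothetical helper: a filled rectangle covering rows r0..r1-1, cols c0..c1-1."""
    return {(r, c) for r in range(r0, r1) for c in range(c0, c1)}


def mask_iou(a: Mask, b: Mask) -> float:
    """Intersection over union of two pixel sets."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def average_precision(preds: List[Tuple[float, Mask]], gts: List[Mask],
                      iou_thr: float) -> float:
    """AP for one class at one IoU threshold (single image, all-point interpolation)."""
    if not gts:
        return 0.0
    matched = set()
    hits = []
    # Visit predictions from most to least confident; each ground truth
    # can be matched by at most one prediction.
    for _score, mask in sorted(preds, key=lambda p: p[0], reverse=True):
        best_iou, best_j = 0.0, -1
        for j, gt in enumerate(gts):
            if j in matched:
                continue
            iou = mask_iou(mask, gt)
            if iou > best_iou:
                best_iou, best_j = iou, j
        if best_j >= 0 and best_iou >= iou_thr:
            matched.add(best_j)
            hits.append(True)
        else:
            hits.append(False)
    # Running precision and recall along the ranked list.
    tp = 0
    precisions, recalls = [], []
    for k, hit in enumerate(hits, start=1):
        tp += int(hit)
        precisions.append(tp / k)
        recalls.append(tp / len(gts))
    # Make precision monotonically non-increasing (precision envelope).
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    # Area under the precision-recall envelope.
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


def ap_over_thresholds(preds: List[Tuple[float, Mask]], gts: List[Mask]) -> float:
    """Mean AP over IoU thresholds 0.50, 0.55, ..., 0.95."""
    thresholds = [(50 + 5 * i) / 100 for i in range(10)]
    return sum(average_precision(preds, gts, t) for t in thresholds) / len(thresholds)


if __name__ == "__main__":
    # Toy data: one exact prediction and one that covers 60% of its target.
    gts = [rect_mask(0, 0, 10, 10), rect_mask(20, 20, 30, 30)]
    preds = [(0.9, rect_mask(0, 0, 10, 10)), (0.8, rect_mask(20, 20, 30, 26))]
    print("AP@0.5:", average_precision(preds, gts, 0.5))
    print("AP@0.5:0.95:", ap_over_thresholds(preds, gts))
```

In the toy data, the loosely fitting second prediction counts as correct only at the lower IoU thresholds, which is why the threshold-averaged score falls below AP@$0.5$. The same effect explains why mAP@$0.5{:}0.95$ is always at most mAP@$0.5$ when AP does not increase with stricter thresholds.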
