The overarching research direction of this work is the development of a "Responsible Intelligence" framework designed to reconcile the generative power of Large Language Models (LLMs) with the stringent requirements of real-world deployment. As these models become a transformative force in artificial intelligence, there is an urgent need to move beyond general-purpose architectures toward systems that are contextually aware, safe by design, and respectful of global cultural nuances. This research pursues three interconnected threads: domain adaptation to ensure technical precision, ethical rigor to mitigate adversarial vulnerabilities, and cultural and multilingual alignment to promote global inclusivity. The methodological trajectory progresses from classical supervised adaptation for task-specific demands, through decoding-time alignment for safety, to human feedback and preference modeling for sociolinguistic acuity.
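To make the second thread concrete, decoding-time alignment can be illustrated with a minimal toy sketch: a safety constraint applied at each generation step by masking disallowed tokens before sampling. Everything here (the toy vocabulary, the blocklist, the logits) is an illustrative assumption, not the specific method developed in this work.

```python
import math

# Toy decoding-time safety filter: mask blocked tokens' logits at each
# generation step, then renormalize. VOCAB and BLOCKED are hypothetical.
VOCAB = ["hello", "world", "attack", "help"]
BLOCKED = {"attack"}  # tokens a safety policy disallows


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def safe_decode_step(logits):
    """Set blocked tokens' logits to -inf so they receive zero probability."""
    masked = [
        l if VOCAB[i] not in BLOCKED else float("-inf")
        for i, l in enumerate(logits)
    ]
    return softmax(masked)


# One decoding step over toy logits: the blocked token gets probability 0,
# and the remaining mass is redistributed across allowed tokens.
probs = safe_decode_step([1.0, 0.5, 3.0, 0.2])
blocked_idx = VOCAB.index("attack")
```

The same pattern generalizes to real decoders, where the mask would come from a learned safety classifier rather than a static blocklist.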