Large Language Models (LLMs) are increasingly embedded in child-facing contexts such as education, companionship, and creative tools, but their deployment raises safety, privacy, developmental, and security risks. We conduct a systematic literature review of child-LLM interaction risks and organize the findings into a structured map that separates (i) parent-reported concerns, (ii) empirically documented harms, and (iii) gaps between perceived and observed risk. Moving beyond descriptive listing, we compare how different evidence streams (surveys, incident reports, youth interaction logs, and governance guidance) operationalize "harm," where they conflict, and what mitigations they imply. Based on this synthesis, we propose a protection framework that couples child-specific content safety and developmental sensitivity with security-grade controls against adversarial misuse, including prompt injection and multimodal jailbreak pathways. The framework specifies measurable evaluation targets (e.g., harmful-content avoidance, age-calibrated readability, bias parity checks, prompt-injection robustness, and monitoring transparency) to support developers, educators, and policymakers in assessing and improving child-safe LLM deployments.