Public debate on the alleged decline of language skills among younger generations often focuses on university students, the most highly educated segment of the population. Rather than addressing the ill-posed question of linguistic decline, this paper examines how university students currently use formal written Italian and whether systematic patterns of competence and heterogeneity can be identified. The analysis is based on data from the UniversITA project, which collected formal texts written by a large, nationally representative sample of Italian university students. Texts were annotated for linguistically motivated features covering orthography, lexicon, syntax, morphosyntax, coherence, register, and sentence structure, yielding low-frequency multivariate count data. To analyse these data, we propose a novel model-based clustering approach built on a Poisson factor mixture model that accounts for dependence among linguistic features and for unobserved population heterogeneity. The results identify two correlated dimensions of writing competence, interpretable as communicative competence and linguistic-grammatical competence. When educational and sociodemographic information is incorporated, distinct student profiles emerge that are associated with field of study and educational background. These findings provide quantitative evidence on contemporary student writing and offer insights relevant to language education and higher education policy.
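To make the model class named in the abstract more concrete, the following is a minimal generative sketch of one common way to write a Poisson factor mixture model. It is an illustration under stated assumptions, not the paper's specification: we assume a log link, feature-specific intercepts, a loading matrix shared across clusters, and Gaussian latent factors whose mean and covariance depend on the cluster. The function name `simulate_poisson_factor_mixture` and all parameter values are hypothetical.

```python
import numpy as np

def simulate_poisson_factor_mixture(n, weights, means, covs, intercepts, loadings, rng):
    """Simulate count vectors from an assumed Poisson factor mixture form.

    Assumed (illustrative) structure, not taken from the paper:
      cluster label   k_i ~ Categorical(weights)
      latent factors  y_i | k_i = g ~ Normal(means[g], covs[g])
      counts          x_ij | y_i ~ Poisson(exp(intercepts[j] + loadings[j] @ y_i))
    """
    weights = np.asarray(weights, dtype=float)
    means = np.asarray(means, dtype=float)          # shape (G, q)
    chols = np.linalg.cholesky(np.asarray(covs))    # shape (G, q, q)
    q = means.shape[1]

    # Draw a cluster label for each simulated text.
    labels = rng.choice(len(weights), size=n, p=weights)

    # Draw cluster-specific Gaussian factors: mean + L @ z, with L L^T = cov.
    z = rng.standard_normal((n, q))
    factors = means[labels] + np.einsum("nij,nj->ni", chols[labels], z)

    # Shared loadings induce dependence among the count features.
    log_rates = intercepts + factors @ loadings.T   # shape (n, p)
    counts = rng.poisson(np.exp(log_rates))
    return counts, labels, factors

# Hypothetical setting: 2 clusters, 2 correlated factors, 7 annotated features.
rng = np.random.default_rng(0)
weights = [0.6, 0.4]
means = [[-0.5, -0.5], [0.8, 0.8]]
covs = [[[1.0, 0.4], [0.4, 1.0]], [[1.0, 0.4], [0.4, 1.0]]]
intercepts = np.full(7, -1.0)                       # low baseline rates
loadings = np.array([[0.5, 0.0], [0.4, 0.0], [0.3, 0.1], [0.0, 0.5],
                     [0.0, 0.4], [0.1, 0.3], [0.2, 0.2]])

counts, labels, factors = simulate_poisson_factor_mixture(
    500, weights, means, covs, intercepts, loadings, rng
)
```

In this sketch the negative intercepts keep the Poisson rates small, mimicking low-frequency counts, while the off-diagonal covariance entries make the two latent dimensions correlated. Fitting such a model to real data would require an estimation procedure that is not shown here.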