Keynote Speech: Barbara Wing-Yee SIU

Corpus-Informed Studies: Methodologies and Applications (CISMA)

Reconsidering Originality in AI-assisted L2 Engineering Reports: A Multidimensional NLP-based Analysis

The examination of AI-generated plagiarism has become a critical concern in second language (L2) writing, particularly for safeguarding academic integrity and originality now that large language models (LLMs) are widely accessible to L2 learners. In response to this growing challenge, many higher education institutions (HEIs) have introduced new policies regulating the use of AI in student coursework. The present study addresses this concern by employing Biber’s (1988) multidimensional (MD) framework to systematically examine the functional and situational characteristics of students’ writing tasks. To this end, we compiled a large corpus, EngiReport, consisting of 250 final-year engineering reports submitted by Chinese L2 students from four departments at a university in Hong Kong S.A.R. The results show that Dimension 3 emerges as the strongest predictor of GenAI rates in the regression analysis and also ranks first in the classification model, with a coefficient of 0.7623. In addition, dimensional inclusion analysis reveals that the classification model achieves its highest performance with the combination of Dimension 3, Dimension 6, Dimension 1, and Dimension 2, yielding a Cohen’s Kappa of 0.0666 and an accuracy of 0.3867. A parallel analysis of lexico-grammatical features indicates that model performance peaks with the inclusion of 44 features, led by the type–token ratio (TTR). These findings have important implications for L2 writing instruction and institutional policies in higher education, and offer novel insights into AI–student collaboration in engineering contexts.

Keywords: NLP; L2 writing; Assessment; Multidimensional analysis; AI plagiarism; AI-assisted writing
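
For readers unfamiliar with two of the measures named in the abstract, the type–token ratio (TTR) and Cohen's Kappa, the following is a minimal illustrative sketch of how each is conventionally computed. It is not the study's pipeline: the regex tokenizer, the function names, and the toy labels are assumptions made for illustration, and the abstract does not specify the tokenization or the label classes actually used.

```python
import math
import re
from collections import Counter


def type_token_ratio(text: str) -> float:
    """Return unique word types divided by total word tokens.

    Assumption: a simple lowercase regex tokenizer; the study's own
    tokenization is not described in the abstract.
    """
    tokens = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)


def cohens_kappa(y_true, y_pred) -> float:
    """Return Cohen's kappa: agreement corrected for chance agreement."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("label sequences must be non-empty and equal in length")
    n = len(y_true)
    # Observed agreement: share of items where both labelings match.
    observed = sum(t == p for t, p in zip(y_true, y_pred)) / n
    # Expected agreement: chance overlap implied by each side's label frequencies.
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    expected = sum(true_counts[k] * pred_counts[k] for k in true_counts) / (n * n)
    if math.isclose(expected, 1.0):
        return float("nan")  # kappa is undefined when only one label occurs
    return (observed - expected) / (1.0 - expected)


if __name__ == "__main__":
    # Hypothetical toy data, not taken from EngiReport.
    print(type_token_ratio("the cat sat on the mat"))
    print(cohens_kappa(["low", "low", "high", "high"],
                       ["low", "high", "high", "high"]))
```

Kappa ranges up to 1 for perfect agreement, and values near 0 indicate agreement close to what chance alone would produce, which is why it is usually reported alongside raw accuracy.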