Description
Abstract: With the increasing digitization of schools, student texts are now available in digital formats by default, which makes it easier to collect and annotate corpora of student texts. As a result, new digital methodological approaches for studying writing development in school are emerging (Crossley, 2020; Moxley et al., 2017). In this paper presentation, I report on a recently completed PhD study in which I examined linguistic trajectories of writing development in student texts from grades 5 to 9 using digital and automated tools from the field of Natural Language Processing (NLP) (Eisenstein, 2019). The research question was twofold. First, I aimed to examine how texts written by students at different grade levels differed from each other from a linguistic perspective. Second, I wanted to study the effect of extra-linguistic variables, such as student gender, writing attitude and social background, on syntactic and lexical features in the texts.
The study is based on a corpus of narrative (n=228) and descriptive (n=236) student texts from grade levels 5-9. To answer the first research question, I designed a mixed methods study in which I combined quantitative analyses of five NLP variables with in-depth qualitative lexicogrammatical analyses of 15 student texts. The five NLP variables were text length, sentence length, mean dependency distance, word length and lexical variation (OVIX). To answer the second research question, I collected data on, among other things, gender, writing attitude, social background, and reading and writing habits through a survey administered to the participating students. I then used multiple regression modelling to estimate the effect of the extra-linguistic variables on the five NLP variables in the texts. Beta coefficients were included so that effects could be compared across variables.
The results of the study indicate that a linear conception of writing development across grade levels may not be valid and that variance within grade levels seems to increase over time. The results also show that student gender and social background are highly relevant extra-linguistic factors to include when accounting for linguistic differences in student writing. In the presentation, I will present the results of the analyses in more detail and elaborate on the background and methodological considerations of the study.
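To make three of the NLP variables named in the abstract concrete, the following is a minimal Python sketch, not the study's actual pipeline. It assumes the commonly cited definition of OVIX, ln(N) / ln(2 - ln(V) / ln(N)) with N tokens and V types, and a common definition of mean dependency distance as the average absolute distance between each non-root word and its head. The tokenizer and all function names are hypothetical, and the head positions are supplied by hand rather than produced by a parser.

```python
import math
import re


def tokenize(text):
    # Naive lowercase word tokenizer (an assumption; the abstract does not
    # specify how the study tokenized its texts).
    return re.findall(r"\w+", text.lower())


def ovix(tokens):
    # Word variation index: ln(N) / ln(2 - ln(V) / ln(N)),
    # where N is the number of tokens and V the number of types.
    n = len(tokens)
    v = len(set(tokens))
    if n < 2 or v == n:
        # The denominator becomes ln(1) = 0 when every token is unique.
        raise ValueError("OVIX is undefined for this token sequence")
    return math.log(n) / math.log(2 - math.log(v) / math.log(n))


def mean_word_length(tokens):
    # Average number of characters per token.
    return sum(len(t) for t in tokens) / len(tokens)


def mean_dependency_distance(heads):
    # heads[i] is the 1-based position of the head of word i + 1;
    # 0 marks the root, which is excluded from the average.
    distances = [abs((i + 1) - h) for i, h in enumerate(heads) if h != 0]
    if not distances:
        raise ValueError("no dependency relations to average over")
    return sum(distances) / len(distances)


if __name__ == "__main__":
    tokens = tokenize("The dog saw the cat and the cat saw the dog.")
    print("tokens:", len(tokens))
    print("OVIX:", ovix(tokens))
    print("mean word length:", mean_word_length(tokens))
    # Hand-annotated heads for "The dog barked": The -> dog, dog -> barked, barked = root.
    print("MDD:", mean_dependency_distance([2, 3, 0]))
```

With a fixed number of tokens, a text with more distinct word types receives a higher OVIX value, which is why the measure is used as an indicator of lexical variation that is less sensitive to text length than a plain type-token ratio.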
Keywords: writing development, sociology of writing, corpus linguistics, natural language processing
References
Crossley, S. A. (2020). Linguistic features in writing quality and development: An overview. Journal of Writing Research, 11(3), 415–443. https://doi.org/10.17239/jowr-2020.11.03.01
Eisenstein, J. (2019). Introduction to natural language processing. The MIT Press.
Moxley, J., Elliot, N., Eubanks, D., Vezzu, M., Elliot, S., & Allen, W. (2017). Writing analytics: Conceptualization of a multidisciplinary field. Journal of Writing Analytics, 1, v–xvii. https://doi.org/10.37514/jwa-j.2017.1.1.02
Period | 26 Oct 2023 |
---|---|
Event title | NNFF9: Nordisk Netværk for Førstesprogsdidaktisk Forskning: Møder og mangfoldighed |
Event type | Conference |
Conference number | 9 |
Location | Helsinki, Finland |
Degree of recognition | International |