AAAL 2023: Invited Colloquium
Convened by Sandra C. Deshors & Charlene Polio
Expanding theoretical and methodological approaches to data-driven learning: Moving forward with research and pedagogy
Data-driven learning (DDL) has gained much popularity among researchers and teachers as an approach to second language instruction, and much research on DDL has focused on its practicality in second-language classroom contexts. For instance, scholars have assessed its pedagogical benefits and described what useful DDL instruments, tools, and activities look like (Boulton & Vyatkina, 2021). Recently, scholars such as Flowerdew (2015), O’Keeffe (2020), and Boulton and Vyatkina (2021) have started to shift their attention to the “research-practice gap” (Chambers, 2009) in DDL and the importance of exploring, more deeply than ever before, the inter-relation between DDL and second language acquisition (SLA) theories. In this respect, O’Keeffe (2020) points to the lack of theoretical underpinnings in DDL, raising the question of “how and where DDL fits within current SLA models and debates.” In the same spirit, Boulton and Vyatkina (2021) recognize the need “to push further and expect theories to drive the continued development of DDL, moving forward.”
This colloquium provides a meeting ground for scholars seeking to bring theory and practice in DDL closer together and to discuss issues related to expanding theoretical and methodological approaches to DDL. The colloquium begins with a summary of past approaches to DDL and a call for more theoretically driven research. This is followed by a summary of methodological approaches to researching DDL and suggestions of ways that specific methods can be used to better study the processes and outcomes of DDL pedagogy. Two empirical studies are then presented as illustrations of expanding the theoretical and methodological boundaries. The first uses sociocultural theory to structure a DDL study of collocations. The second demonstrates how a mixed methods design can provide information on the challenges and affordances of DDL pedagogy.
Abstract: Research over the years has shown that DDL carries pedagogical merit, especially for learners from intermediate level and upwards (Boulton, 2021). However, very little thought has gone into why engaging with repeated patterns of language on a screen might be beneficial from the perspective of second language acquisition. In this paper, it is argued that DDL is a method that needs a theoretical underpinning so as to provide a rationale for the importance of repeated exposure to and engagement with language patterns. The usage-based model (Ellis, 2012) will be put forward as a suitable basis for theoretically understanding DDL. Within this model, it is held that, in first and second language acquisition, learners attend to frequently used form-meaning pairings that they experience (Pérez-Paredes et al., 2020). In a process of meaning-mapping, these patterns and their meanings become ‘entrenched as grammatical knowledge in the speaker’s mind’ (Ellis & Ferreira-Junior, 2009). It is argued that the usage-based model can offer a principled means of curating data from corpora to aid and possibly accelerate L2 acquisition. Through a usage-based perspective, important issues can be raised in relation to DDL in its current form and how it might be enhanced. This perspective can also provide insight into why DDL has so far rarely been applied with learners below intermediate level (Boulton & Cobb, 2017). From a usage-based viewpoint, it will be argued that for DDL to work at lower levels, there is a need for 1) greater teacher mediation through principled curation; and 2) a richer multi-modal interface and context for learning (O’Keeffe, 2021a, 2021b).
Abstract: DDL popularly involves the explicit use of corpus data, whether hands-on or via prepared materials, for learners and teachers of a foreign or second language. Since the basic concepts were first introduced in the 1980s, hundreds of academic publications have appeared. Drawing and expanding on the results of our recent study (Boulton & Vyatkina, 2021), we present a methodological scoping review of DDL research up to and including 2021, with the focus on the 156 research articles in journals listed by the latest Clarivate Web of Science ranking for Linguistics (192 journals in 2020). The overall picture displays a wide variety of data collection and analysis instruments and methods, with about 15% of the studies being purely qualitative and just over 50% employing at least some statistical methods. We found some encouraging recent trends such as expanding DDL applications to new learner populations and learning environments (e.g., Asian and Middle Eastern regions, younger learners, lower proficiency levels). On the other hand, we noticed remarkably little change in other methodological aspects, with most studies targeting university students, English for general purposes classes, relatively small groups, short DDL interventions using concordancers, and lexico-grammar as the learning target. We conclude by inviting DDL researchers to diversify their research methodology, design multi-institutional studies, integrate contemporary multifactorial data analysis methods, improve the rigor of their methodological reporting, and, while doing the above, open DDL up to second language acquisition theories and research methods, which would undoubtedly move both fields forward.
Abstract: Collocation, the characteristic co-occurrence of patterns of words (e.g., heavy rains, pleasantly surprised), has been recognized as a necessary yet challenging component of second language (L2) lexical competence. Verb-noun collocation has been shown to be particularly problematic for L2 learners (Nesselhauf, 2003; Yoon, 2016). We draw on Vygotskian Sociocultural Theory in designing a data-driven language learning system to assist ESL learners in identifying and correcting problematic verb-noun collocations (e.g., *meet a problem) specific to each individual learner. The system uses natural language processing techniques to extract verb-noun collocations and compares learners’ work against multiword units from the Corpus of Contemporary American English (Davies, 2009). Following the graduated approach (Aljaafreh & Lantolf, 1994), we provide students with corrective feedback in step-wise fashion, from more implicit to more explicit: (1) learners are asked to self-identify potentially problematic verb-noun collocations; (2) the system highlights problematic ones and asks learners to self-correct; (3) the system provides learners with candidate verbs and asks them to self-correct; (4) the system presents learners with authentic language samples in Key-Word-In-Context (KWIC) format for further analyses and verification. Fifteen college-level English language learners received graduated feedback on their summary-and-response and argumentative essays. Their iterative revisions are documented through screen video recordings and their thought processes are collected through stimulated recall interviews. Microgenetic analyses suggest that students generally benefit from the “graduated” approach of corrective feedback on verb-noun collocation use.
The presentation ends with a discussion of pedagogical implications, ways to improve accuracy in identifying problematic collocations and selecting appropriate candidate verbs, and how theoretical constructs from Sociocultural Theory (e.g., mediation, ZPD) may provide useful guidance on the development of data-driven language learning systems.
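The four-step graduated feedback described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' system: the tiny ATTESTED set is a hypothetical stand-in for corpus-derived multiword units, the function names (candidate_verbs, kwic, graduated_feedback) and message wording are assumptions made for illustration, and collocation extraction from learner text is assumed to have happened already.

```python
from typing import List, Optional, Set, Tuple

Pair = Tuple[str, str]  # (verb lemma, noun lemma)

# Hypothetical stand-in for verb-noun units attested in a reference corpus.
ATTESTED: Set[Pair] = {
    ("face", "problem"),
    ("encounter", "problem"),
    ("solve", "problem"),
    ("make", "decision"),
}


def candidate_verbs(noun: str, attested: Set[Pair] = ATTESTED) -> List[str]:
    """Return attested verbs for a noun, sorted alphabetically."""
    return sorted(verb for verb, n in attested if n == noun)


def kwic(tokens: List[str], keyword: str, width: int = 3) -> List[str]:
    """Build simple Key-Word-In-Context lines for each hit of keyword."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok.lower() == keyword.lower():
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append(f"{left} [{tok}] {right}".strip())
    return lines


def graduated_feedback(pair: Pair, level: int, corpus_tokens: List[str],
                       attested: Set[Pair] = ATTESTED) -> Optional[str]:
    """Return feedback for one learner collocation, from implicit (1) to explicit (4).

    Returns None when the pair is attested, i.e. no feedback is needed.
    """
    verb, noun = pair
    if pair in attested:
        return None
    if level == 1:  # learner self-identifies
        return "Reread your draft and mark any verb-noun combination that may be problematic."
    if level == 2:  # system highlights, learner self-corrects
        return f"Check the highlighted combination '{verb} {noun}' and try to correct it."
    verbs = candidate_verbs(noun, attested)
    if level == 3:  # system offers candidate verbs
        return f"Verbs that often occur with '{noun}': {', '.join(verbs)}. Try to correct '{verb} {noun}'."
    if level == 4:  # system shows concordance lines for the candidates
        return "\n".join(line for v in verbs for line in kwic(corpus_tokens, v))
    raise ValueError("level must be between 1 and 4")


if __name__ == "__main__":
    sample = "Engineers often face a problem they cannot solve alone".split()
    for level in range(1, 5):
        print(level, graduated_feedback(("meet", "problem"), level, sample))
```

In this sketch the feedback level is passed in explicitly; in a tutoring session it would presumably be raised only after the learner fails to self-correct at the previous, more implicit level.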
Data-driven learning for secondary students: Students and teachers’ perceptions and use of corpora for improving science report writing
Peter Crosthwaite, University of Queensland, firstname.lastname@example.org
Abstract: Secondary school students must quickly develop knowledge of the language features required for disciplinary literacy standards for STEM. Many female students feel frustrated when encountering STEM language, particularly those for whom English is an additional language (Jones & Seilhamer, 2020), while science subject teachers often lack the technological, pedagogical, and content knowledge (TPACK, Koehler & Mishra, 2009) to develop students’ abilities to discover the language of science through technology. This presentation explores how DDL pedagogy can support students’ reporting of science experiments through written research reports, focusing particularly on how DDL can help develop knowledge of the passive voice. Participants included 60 Year 9-10 girls and four secondary teachers who implemented two DDL interventions, one focusing on English as an additional language, the other focusing on science report writing. In this mixed-methods study, pre/post-tests on passive constructions were conducted, alongside the collection of written reports retelling an observed science experiment, on which written corrective feedback was provided for corpus-assisted revisions. During a 10-week term, students completed guided homework and in-class group DDL activities led by their teachers using freely available online corpus applications. Individual interviews with students and teachers were conducted, and follow-up questionnaires were collected 10 weeks after the treatment. The data suggest that while DDL did not result in increased receptive knowledge of the passive, the DDL treatment resulted in increased production of accurate passive voice constructions in pre/post-test conditions. Analysis of students’ written production revealed little evidence of the use of corpora for the passive but did reveal numerous revisions to highlighted issues with academic phraseology.
Findings shed light on how DDL is perceived by learners when their subject content teachers are responsible for carrying out the intervention and provide an honest appraisal of the affordances and limitations of DDL when carried out by subject-content teachers.