ILTA / AAAL Colloquium
Convened by James Purpura & Heidi Liu Banerjee
Exploring the cross-linguistic insights of using scenario-based assessment across four typologically different languages
For many years now, there has been a growing body of research devoted to the design, validation, and use of educational or psychological assessments that are available in other languages through translation or adaptation. Such tests, if equivalent across languages, are sorely needed in contexts where learners who are not proficient in the language of the test need access to its content (Stansfield & Bowles 2006). Such tests are also needed in contexts where intelligence or personality tests are administered to learners (Ercikan & Lyons-Thomas 2013). Finally, equivalent tests in multiple languages are needed in cross-cultural, comparative studies of educational achievement such as the Program for International Student Assessment (PISA), administered globally to assess “mastery of processes, understanding of concepts, and the ability to function in various situations within the domains of reading literacy, mathematical literacy, and scientific literacy” (Dossey et al. 2006, p. iv).
In Applied Linguistics, few studies have attempted to investigate the parity of assessments across different languages. One noted exception is the European Survey on Language Competences (European Commission 2012), whose goal was to provide comparative data on foreign language competencies across fourteen European countries, so that progress in efforts to improve language learning could be examined. Students took two of three translated tests covering reading, listening, or writing ability. The test content measured functional proficiency within each skill in terms of the CEFR. While this study was informative, it limited competency to isolated skills assessed outside a coherent situated context.
The current symposium examines the validity of using parallel scenarios as a basis for measuring situated L2 proficiency and learning across four typologically different languages. The same construct and specifications were used for test development. The rubrics were similar, with minor language-specific variations, and the validation procedures were mostly parallel. Insights and challenges will be discussed.
Presentation 1: The affordances of using scenario-based assessment in cross-linguistic assessment contexts (20 min)
Presenter: James E. Purpura
Presentation 2: Designing cross-linguistic scenario-based assessments with parallel scenario narratives (20 min)
Presenter: Heidi Liu Banerjee
Presentation 3: Examining ESL learners’ situated language proficiency and topical learning through a scenario-based assessment (10 min)
Presenters: Daniel Eskin (he/him), Jorge Beltran (he/him), Soo Hyoung Joo (she/her), James E. Purpura, Heidi Liu Banerjee
Presentation 4: Examining Korean learners’ situated language proficiency and topical learning through a scenario-based assessment (10 min)
Presenters: Soo Hyoung Joo, Ji-Young Jung (she/her), Yuna Seong (she/her), Joowon Suh (she/her)
Presentation 5: Examining Persian learners’ situated language proficiency and topical learning through a scenario-based assessment (10 min)
Presenters: Payman Vafaee (he/him), Mahshad Davoodifard (she/her), Nahal Akbari-Saneh
Presentation 6: Examining Italian learners’ situated language proficiency and topical learning through a scenario-based assessment (10 min)
Presenters: Sabrina Machetti (she/her), Giulia Peri (she/her), Paola Masillo (she/her)
Presentation 7: Comparing learners’ situated language proficiency, topical learning, and perceptions in cross-linguistic scenario-based assessments (20 min)
Presenters: Nahal Akbari-Saneh, Sabrina Machetti, James E. Purpura, Joowon Suh
Discussant (10 min): Antony John Kunnan
Q&A (10 min)
The affordances of using scenario-based assessment in cross-linguistic assessment contexts
In mainstream educational contexts, several testing programs have endeavored to measure disciplinary content in multiple languages. This has often been done by using rigorous translation or adaptation protocols. In language assessment, however, few studies have designed tests specifically to measure language competencies cross-linguistically. The noted exception is the European Survey on Language Competences (European Commission 2012), which measured reading, listening, and writing competencies defined in terms of functional statements linked to the Common European Framework of Reference (Council of Europe 2020).
Nonetheless, the world has become more complex, and the competencies needed to function effectively as global citizens have changed significantly. The construct of situated L2 proficiency has evolved and broadened considerably to address these changes. Language assessments, however, have often failed to approximate the kinds of complex tasks that require examinees to demonstrate the language-driven competencies needed in an increasingly interconnected, diverse, and globalized world (Gordon Commission 2013; National Research Council 2011; Partnership for 21st Century Skills 2009).
To reflect the need to measure broadened constructs of language-driven competencies, and to examine how such competencies are displayed in cross-linguistic contexts, a scenario-based assessment (SBA) approach was adopted. Guided by a learning-oriented assessment framework (Purpura & Turner 2014; Turner & Purpura 2016), the SBA approach presents examinees with a carefully sequenced set of naturally occurring scenes in which examinees carry out actions and interact with simulated characters until the overarching scenario goal (i.e., the target competency) is brought to resolution. Because the tasks within a scenario are designed to simulate habits of mind, SBA provides insights into how learners of typologically different languages utilize their situated language competencies to fulfill the scenario goal. This paper lays out the theoretical framework for exploring the cross-linguistic insights of using SBA across four typologically different languages.
Designing cross-linguistic scenario-based assessments with parallel scenario narratives
In the field of applied linguistics, cross-linguistic assessment has frequently been used in vocabulary or skill-based assessment contexts (Kempe & MacWhinney 1996). The limited scope of these assessments stemmed from the need to control variables so that the assessed linguistic components across different languages could be properly compared. However, this narrow scope restricts the kinds of insights that can be gained, especially when comparing learners’ situated language proficiency, or their ability to use the target language effectively in real-life situations.
One major affordance of scenario-based assessment (SBA) is that it allows the sequence of the tasks within a scenario to reflect the learners’ habits of mind (Purpura & Banerjee 2021) in problem solving. By specifying relevant tasks within a language use situation, parallel cross-linguistic assessments can be designed in which learners perform similar real-life tasks, demonstrating similar competencies in different languages.
In the current project, four parallel forms of cross-linguistic SBAs were developed by taking a competency-based, learning-oriented approach. Specifically, learners of English, Korean, Persian, and Italian were asked to demonstrate the competencies of building and sharing knowledge (O’Reilly et al. 2015) through collaborative decision-making. To maximize the comparability across the four SBAs, all assessment tasks were targeted at the CEFR B1 level, and the assessment design was guided by the learning-oriented assessment framework (Purpura & Turner 2014; Turner & Purpura 2016). The English SBA served as the prototype for the SBAs in the other languages. While all four SBAs had the same scenario narrative, the content and simulated characters differed to reflect the cultures and differing language learning situations. Both translation and adaptation methods (Ercikan & Lyons-Thomas 2013) were used in the development process.
In this presentation, the design process of the four cross-linguistic SBAs will first be described. Then, challenges and lessons learned will be discussed.
Examining ESL learners’ situated language proficiency and topical learning through a scenario-based assessment
As L2 education moves towards the development of 21st century skills, there is a need to utilize L2 assessments that can provide information about learners’ situated proficiency, particularly when it comes to decisions like course placement (Banerjee 2019; Purpura & Banerjee 2021). Given the potential of using scenario-based assessment (SBA) to measure situated proficiency, an SBA was developed and piloted to examine its viability for placement purposes in an adult ESL program at a large American university. The pilot included 55 participants ranging from beginner to advanced L2 English proficiency levels, and it sought to understand not only how well a scenario-based assessment of situated L2 proficiency provided psychometrically sound and meaningful score interpretations, but also how it supported learning across the assessment.
The SBA revolves around a collaborative problem-solving task in which examinees, along with their simulated group members, have to learn about two possible overseas destinations for a class trip, decide which to go to, and present a persuasive, evidence-based argument in support of the group’s decision. To do this, examinees respond to a series of carefully sequenced independent and integrated skills tasks designed to reflect the habits of mind used to accomplish the scenario goal (i.e., making a spoken pitch). As examinees are expected to build topical knowledge during the assessment, their pre- and post-scenario topical knowledge is measured.
Test functionality and score meaningfulness were examined through a series of statistical procedures, including classical test theory, multi-facet Rasch measurement, and multivariate G-theory. Results showed that the assessment provided reasonably good estimates of situated L2 proficiency, applicable for placement purposes. The analyses also showed that learning indeed transpired during the assessment, as evidenced by the observed gains on the pre- and post-scenario topical knowledge measures. Implications of the use of SBA to measure situated L2 proficiency will be discussed.
Examining Korean learners’ situated language proficiency and topical learning through a scenario-based assessment
Korean language proficiency assessments (e.g., Test of Proficiency in Korean [TOPIK], Korean Language Proficiency Test [KLPT], ACTFL Oral Proficiency Interview [OPI] and Writing Proficiency Test [WPT]) have been mostly organized around decontextualized tasks designed to measure independent skills and have depended heavily on selected-response tasks (e.g., multiple choice). As a result, they have overlooked the need to measure how foreign language learners of Korean can effectively use Korean to function in more complex language use contexts, that is, those where they need to display real-life competencies.
To address this gap, a scenario-based Korean as a foreign language (KFL) assessment was designed to measure the “situated foreign language Korean proficiency” of 51 students from a KFL program at a US university. The participants included both heritage and non-heritage learners, and their proficiency levels ranged from low-intermediate to advanced.
The SBA was organized around a study abroad program context in South Korea, where examinees needed to learn about, decide on, and argue for one of two destinations for a class trip (i.e., Jeonju and Ulleung-do). While the scenario narrative was parallel to the English SBA in its design, the context of the scenario was adapted to a study abroad context for foreign language learners, and the rubrics for the writing and speaking tasks were adapted to address Korean-specific language characteristics. Examinees were expected to learn about the two destinations during the assessment.
As with the English SBA, test scores were analyzed using classical test theory, generalizability theory, and Rasch analysis. The results indicated that the test functioned well psychometrically. Also, the gains seen in the pre- and post-scenario topical knowledge measures, along with the results of a post-test survey, provided convincing evidence that the test served as a valuable “educational experience in and of itself” (Bennett 2010, p. 1).
Examining Persian learners’ situated language proficiency and topical learning through a scenario-based assessment
With an increasing need for standardized assessment of Persian proficiency, tests such as the Persian Computerized Assessment of Proficiency (CAP) and the first standardized test of Persian proficiency (SAMFA) have been developed in alignment with frameworks such as ACTFL or CEFR. These tests, however, measure knowledge of grammar and vocabulary, as well as ability in different skills, independently of each other and typically out of context. To overcome this weakness, we created a scenario-based assessment (SBA) to measure “situated” proficiency in Persian. The SBA was designed to guide the test takers through a set of thematically related tasks in which they had to obtain information about two destinations in Iran, Kerman University and Gilan University, for a study abroad program hosted by one of these universities. The goal of the SBA was for the students to pitch their idea to a committee, convincing it which destination to select.
While the general guidelines of the English SBA test design and scoring were followed, appropriate adjustments were made to adapt to the cultural as well as linguistic features of Persian. The SBA was initially administered to a group of 12 Persian learners at the Persian Flagship Program at the University of Maryland. The test takers were heritage as well as non-heritage language learners, and their Persian proficiency ranged from low-intermediate to advanced. Due to the limited sample size, classical test theory was used to analyze the Persian SBA test scores. The results indicated that the test functioned well psychometrically and served as a valuable educational tool, significantly increasing the students’ topical knowledge.
Examining Italian learners’ situated language proficiency and topical learning through a scenario-based assessment
Italian language proficiency assessments (e.g., CILS, CELI, Cert-IT, PLIDA) have traditionally revolved around independent skills tasks designed to measure the traditional skills (reception, production, interaction). These tests have also been heavily influenced by the Common European Framework of Reference (CoE 2001). The most important international language certification tests are aligned with the CEFR, including those in Italy (CoE 2009; Martyniuk 2010; Barni & Machetti 2019).
While existing tests have attempted to measure language use, there is still the need to measure how second language learners of Italian can effectively use their situated L2 proficiency to function in more complex language use contexts such as those where examinees need to display real-life competencies in complex tasks involving learning and instruction.
To address this gap, an L2 Italian scenario-based assessment was designed to measure the “situated Italian language proficiency” of hundreds of students enrolled in B1-B2 Italian language courses at the University for Foreigners of Siena. Participants included prospective students at the same university as well as students applying to enroll in degree programs at other Italian universities.
The SBA was developed around a group problem-solving task in which examinees, together with their simulated group members, learn about two possible destinations in Italy (i.e., Sicily and Abruzzo) for a class trip and have to decide where to go.
While the scenario narrative was parallel to the English SBA in design, it was adapted to the Italian context for L2 learners, and the rubrics for the writing and speaking tasks were adapted to reflect Italian language characteristics. The test design was oriented toward promoting a viable educational experience in its own right (Bennett 2010), as it focused not only on linguistic features but also on the development of topical knowledge (Purpura 2016).
Comparing learners’ situated language proficiency, topical learning, and perceptions in cross-linguistic scenario-based assessments
To compare how learners of four typologically different languages (i.e., English, Korean, Persian, and Italian) may demonstrate their situated language proficiency similarly or differently, four language-specific scenario-based assessments (SBAs) with the same scenario goal were developed. Specifically, examinees were asked to learn about two destinations and make a spoken pitch to a simulated committee for a given destination to be selected for a class trip. Adjustments were made to address the differences in cultures and learning contexts of each target language. A series of comparisons across the four SBAs was made to determine to what extent the four SBAs led to similarly meaningful situated language proficiency estimates, and how examinees perceived their testing experience.
Results from statistical analyses showed the four SBAs provided reasonably good situated language proficiency estimates. A comparison of examinees’ pre- and post-scenario topical knowledge scores also showed that learning transpired throughout the assessment independently of the language being assessed.
Nonetheless, there were clear differences in terms of examinees’ perceptions of the testing experience. A post-test survey consisting of 111 Likert rating scale items and 9 open-ended response questions was delivered to all four groups of examinees. The survey was developed and organized according to the dimensions in the learning-oriented assessment framework (Purpura & Turner 2014; Turner & Purpura 2016). Overall, examinees who took the English SBA reported more positive perceptions of their testing experience than those who took the Korean and Persian SBAs. These differences were most evident when examinees reported their affective dispositions (e.g., engagement, enjoyment) and their perceptions of their proficiency and learning in relation to the test. These differences will be discussed in this presentation, along with possible explanations for the observed trends.