CORPUS USED AS A DATA-DRIVEN LEARNING TOOL IN L2 ACADEMIC WRITING: EVIDENCE FROM TURKISH CONTEXTS

Abstract: This study investigated the effectiveness of using corpus as a data-driven learning (DDL) tool to enhance the academic writing skills of Turkish EFL learners. The study also explored learners’ views of the potential use of corpus in L2 academic writing. To achieve these objectives, a mixed-method sequential explanatory design was employed, involving freshman student teachers enrolled in the Department of English Language Teaching at a state university in Turkey. The participants completed four argumentative essay writing tasks. Two tasks employed conventional techniques for error correction, while the other two utilized corpus as a reference tool for error correction. The latter two tasks were complemented by corpus training for the participants. The results indicated that using corpus as a DDL tool had a significant impact on the academic writing skills of Turkish EFL learners, with notable improvements observed in both grammar and vocabulary use. Participants also expressed positive feedback on the use of corpus as a DDL tool in enhancing their L2 academic writing.

Academic writing in a second language (L2) is commonly regarded as a difficult skill to master, particularly for non-native English speakers who learn English in a foreign language learning context. Scholars have noted that L2 academic writing poses various sociocultural and linguistic difficulties for learners, involving issues such as word choice, grammatical accuracy, sentence structure, and discourse organization, among others (Cangır, 2023), leading to an extensive investigation of effective ways to improve L2 learners' academic writing. To address this issue, corpora and concordancers have gained widespread recognition as essential resources for effective L2 writing, with corpora playing a crucial role in providing authentic and diverse language samples for learners to analyze and learn from (Boulton & Vyatkina, 2021; Römer, 2011). Despite the potential benefits of corpus approaches for L2 academic writing, scholars have raised concerns about a gap between advances in corpus studies and the practice of English language teaching, arguing that corpus applications have not yet been fully utilized in the language classroom (Chambers, 2019; Chen et al., 2019; Hirata & Thompson, 2022; Pérez-Paredes, 2022).
Data-driven learning (DDL), a term coined by Johns (1991), aims to directly engage language learners with corpus data in order to eliminate intermediaries as much as possible (Johns, 1991). By doing so, learners are empowered to study the target language and develop their own profiles of language use and meaning (Boulton & Cobb, 2017; Boulton & Pérez-Paredes, 2014). The multiple affordances of language corpora for DDL have emerged in various fields, including first and second language acquisition, skill development, translation, and materials development (Leńko-Szymańska & Boulton, 2015). The central idea of DDL is the hands-on use of authentic corpus data by advanced foreign or L2 language learners for self-directed language learning (Boulton, 2011). Regarding L2 acquisition, DDL creates an environment for students to become active language learners by enabling them to query corpora, understand concordance and collocation information, and notice gaps in their linguistic knowledge in the L2 repertoire (Godwin-Jones, 2017; O'Keeffe, 2021; Schmidt, 2023). This approach presents an alternative to teacher-led, rule-based approaches to language pedagogy (Crosthwaite, 2017).
In other words, DDL has the potential to enhance the academic writing skills of L2 learners by promoting independent learning through corpus-based discovery (Flowerdew, 2015). This approach encourages self-directed, inductive learning where students take responsibility for their own learning (Aston, 2015; Godwin-Jones, 2017). O'Keeffe (2021) notes that accessing a large amount of authentic corpus data can facilitate noticing (Schmidt, 1990, 2001), a process in which learners raise their awareness of so-called fossilized errors and explore their interlanguage features to improve their writing skills.
Numerous studies have documented the use of in-class corpus-driven DDL to improve academic writing. To achieve effective direct use of corpus in L2 writing, several studies have utilized corpus as a reference tool to correct learners' errors and enhance their writing skills. Kennedy and Miceli (2016) identified key characteristics of using a native corpus to improve learners' writing skills and revealed that working on corpus data was useful in promoting learners' development of a lexis-oriented view of the target language rather than a grammar-oriented view. Focusing on improving learners' use of linking adverbials in their essays by consulting a native corpus as a reference tool, Larsen-Walker (2017) found that learners improved their semantically appropriate production of linking adverbials in their writing. Similarly, Cotos (2014) conducted an experimental study comparing a control group that used L1 reference corpora only with an experimental group that used both L1 corpora and an L2 learner corpus derived from their students' personal data. This study provided support for utilizing learner corpus data to assist in identifying adverbial errors and ultimately in enhancing writing skills. Rather than solely relying on corpus as a reference source, Crosthwaite (2017) examined postgraduate language learners' error correction in their L2 writing through corpus consultation in combination with teacher feedback. The study revealed a positive effect of corpus consultation as a DDL tool, with learners correcting lexical and grammatical errors such as collocations, word choice, and word form. In a more recent study, Sun and Hu (2020) conducted research aimed at improving learners' use of hedging in academic writing. Their findings showed that consulting corpus data had a positive impact on the appropriate use of hedges and the quality of learners' academic writing. Each of these quantitative studies also includes qualitative data, indicating that both teachers and learners positively received the use of corpora for in-class DDL.
Several studies have demonstrated the positive impact of data-driven learning (DDL) on enhancing the language skills of L2 learners in Turkey. These studies provide evidence that DDL holds promise in improving various aspects of foreign language learning, including collocation teaching (Özbay & Özer, 2017), enhanced lexical awareness (Aşık et al., 2016), vocabulary acquisition (Soruç & Tekin, 2017), instruction of academic lexical bundles (Lay & Yavuz, 2020), and grammar learning (Özer & Özbay, 2022). Collectively, these studies not only emphasize the potential benefits of DDL in foreign language instruction compared to traditional methods but also highlight the gaps in the existing empirical literature. The studies point towards a promising direction of research that aims to bridge the divide between advancements in corpus studies and the practice of English language teaching. They call for further research to successfully integrate corpus applications and the DDL approach into mainstream education in Turkey.
Drawing on a limited number of research studies conducted on DDL approaches to foreign language learning, particularly in the context of L2 academic writing in Turkey, this study investigates the impact of data-driven learning (DDL) on improving Turkish EFL learners' academic writing. More than a decade ago, Römer (2006) stated that "a lot still remains to be done before [...] we can say that corpora have actually arrived in language pedagogy" (p. 129), and scholars today express concerns about the gap between theory and practice, suggesting that corpus applications have yet to become mainstream in second/foreign language teaching and learning (Pérez-Paredes, 2022). To address this issue, this study aims to examine the effectiveness of corpus use as a DDL tool for improving L2 academic writing, including content, organization, grammar, vocabulary, and mechanics. Additionally, the study seeks to explore Turkish EFL learners' views on the potential of corpus use in L2 academic writing. The following research questions are addressed in this study:
1. Does the use of corpus as a DDL tool have an impact on enhancing Turkish EFL learners' L2 academic writing as compared to conventional techniques in terms of content, organization, grammar, vocabulary, and mechanics?
2. Which types of error are corrected by Turkish EFL learners when using corpus as a DDL tool?
3. What are Turkish EFL learners' views on the potential use of corpus as a DDL tool in L2 academic writing?

METHOD

Design
This study employed a mixed-method sequential explanatory design to scrutinize Turkish EFL learners' corpus use in their L2 academic writing. A mixed-method design integrates both quantitative and qualitative data at a particular stage of the research to gain a better understanding of the phenomenon being studied (Creswell & Creswell, 2017). The mixed-method sequential explanatory design consists of two distinct phases: a quantitative procedure that provides a general understanding of the studied phenomenon, followed by a qualitative procedure that explains or elaborates on the statistical results for a deeper understanding (Creswell & Creswell, 2017).
Based on this design, the study examined the effect of corpus use on Turkish EFL learners' academic writing skills by applying a pre-experimental design to compare corpus use with conventional techniques and conducting content analysis to explore which traits of L2 academic writing were improved by using corpus as a DDL tool. Lastly, a written opinion survey was conducted to assess learners' opinions about corpus use.

Participants
This study selected participants through convenience sampling, which involves selecting individuals who are readily available and meet specific criteria (Dörnyei, 2007). Sixty-one freshman student teachers studying at the Department of English Language Teaching at a state university in Turkey participated in the study. Out of the 61 participants, 49 were female and 12 were male. The ages of the participants ranged from 18 to 20, and their English proficiency levels were B2. As part of the study, all participants received training in the use of the Corpus of Contemporary American English (COCA) as a data-driven learning tool.

Data Collection Tools
The study utilized COCA as the DDL tool. It was selected for its availability and coverage of over 560 million words of authentic texts, which are divided into sub-sections. The concordance lines from COCA were also used for presentation and practice purposes during the training. To familiarize learners with corpus use, corpus-based activities were developed.
A 4-point weighted rubric was developed to categorize learners' errors and grade their writings before and after the intervention, with a focus on five dimensions: content, organization, vocabulary, grammar, and mechanics. Weighted rubrics consider every criterion, but they assign more weight to specific criteria based on the teaching focus, the key components of the standard, and the timing of assessment (Burke, 2009).
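As a rough illustration of how a weighted rubric of this kind turns five 4-point ratings into one score, the sketch below multiplies each rating by a weight and sums the results. The weight values and the function name `weighted_score` are hypothetical, since the paper does not report the actual weights; only the five dimensions and the 4-point scale come from the text.

```python
# Hypothetical weights for illustration only; the study does not report its weights.
WEIGHTS = {
    "content": 1.0,
    "organization": 1.0,
    "vocabulary": 1.5,
    "grammar": 1.5,
    "mechanics": 1.0,
}


def weighted_score(ratings, weights=WEIGHTS):
    """Return the weighted total of 1-4 ratings, one rating per rubric dimension."""
    if set(ratings) != set(weights):
        raise ValueError("ratings must cover exactly the rubric dimensions")
    for dimension, rating in ratings.items():
        if not 1 <= rating <= 4:
            raise ValueError(f"rating for {dimension} must be between 1 and 4")
    # Each dimension contributes its rating scaled by its weight.
    return sum(weights[d] * ratings[d] for d in weights)


if __name__ == "__main__":
    essay = {"content": 3, "organization": 2, "vocabulary": 4, "grammar": 3, "mechanics": 1}
    print(weighted_score(essay))
```

With weights like these, a one-point gain in vocabulary or grammar moves the total more than a one-point gain in mechanics, which is the sense in which a weighted rubric reflects the teaching focus.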
At the end of the study, a written opinion survey was developed to explore participants' views on the potential of corpus use as a DDL tool in writing. It was administered to all 61 participants, with a response rate of 100%. The survey consisted of seven open-ended questions to explore the role of the target corpus tool in the writing process. Opinion surveys are commonly conducted to determine the opinions of the target population on a specific situation or subject (Balcı, 2007).

Experiment
In this experiment, learners were trained to write argumentative essays and then assigned four writing tasks. The first two tasks were conventional: learners wrote an argumentative essay on a given topic and then revised and edited their own errors and their peers' errors using traditional techniques such as consulting dictionaries and grammar books.
After completing the conventional writing tasks, learners received two weeks of corpus training. In the first part, they learned about corpora and how to search for the frequency of words, vocabulary items, collocations, prefixes, suffixes, synonyms of words, grammar structures, and syntactic and semantic patterns, as well as how to search across sub-sections of the target corpora. They also practiced corpus searches and participated in some corpus-based activities that were prepared for the current study's purposes. Finally, they were assigned two error correction activities in which they used COCA as a DDL tool to correct errors.
Following the corpus training, learners were assigned two additional corpus-based writing tasks. In the third and fourth writing tasks, they wrote an argumentative essay and then revised both their own writing and their peers' writing through COCA. Once they had completed their writing tasks, a written opinion survey was administered to participants to obtain their views on the potential use of corpus in L2 academic writing.

Data Analysis
In order to assess the effectiveness of corpus use as a data-driven learning (DDL) tool for enhancing Turkish EFL learners' L2 academic writing, a weighted rubric was used to score the first and final drafts of their writings. To ensure the reliability of the scores, two raters independently scored the writing, and the final score was determined as the average of the two raters' scores. The conventional-based and corpus-based written samples were compared using inferential statistics to reveal the efficacy of corpus-based applications. A normality test was initially conducted using SPSS (version 22.0), and since the sample size was greater than 50 (N > 50), the Kolmogorov-Smirnov test was used. The test was statistically significant, with a p-value less than .005, indicating that the normality assumption was not met. Therefore, non-parametric tests were used (Cokluk et al., 2010). Specifically, the Wilcoxon signed-rank test was employed to compare conventional-based writings with corpus-based writings.
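The scoring and comparison steps described above can be sketched in plain Python. This is a minimal illustration, not the SPSS procedure the authors ran: the function names are hypothetical, the Z statistic uses the large-sample normal approximation with a tie correction and no continuity correction (an assumption about how the reported values were obtained), and the example scores are invented.

```python
import math


def average_raters(rater_a, rater_b):
    """Average two raters' scores essay by essay to obtain the final scores."""
    return [(a + b) / 2 for a, b in zip(rater_a, rater_b)]


def wilcoxon_signed_rank(conventional, corpus):
    """Return (w_minus, w_plus, z) for paired scores.

    Differences are corpus minus conventional. z is the normal
    approximation computed from the smaller rank sum, so it is never positive.
    """
    # Pairs with a zero difference carry no sign and are dropped.
    diffs = [c - v for v, c in zip(conventional, corpus) if c != v]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")

    # Rank absolute differences; tied values share their average rank.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        t = j - i + 1
        tie_term += (t ** 3 - t) / 48  # variance correction for this tie group
        i = j + 1

    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    mean = n * (n + 1) / 4
    variance = n * (n + 1) * (2 * n + 1) / 24 - tie_term
    z = (min(w_plus, w_minus) - mean) / math.sqrt(variance)
    return w_minus, w_plus, z


if __name__ == "__main__":
    # Invented scores for five learners, for illustration only.
    conventional = average_raters([10, 12, 11, 14, 9], [11, 12, 12, 13, 10])
    corpus = average_raters([13, 14, 12, 16, 12], [12, 15, 13, 15, 11])
    print(wilcoxon_signed_rank(conventional, corpus))
```

When most learners score higher on the corpus-based task, the negative rank sum is the smaller one, which is consistent with a Z statistic reported as "based on negative ranks".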
To identify the error corrections made by the learners, the researchers collected a copy of their first drafts and asked them to correct their errors using either conventional techniques or a corpus used as a DDL tool before creating the second drafts. The first draft was analyzed for errors, and the researchers checked whether they were corrected in the second draft to calculate the learners' corrections. Qualitative content analysis was employed to categorize the learners' errors and corrections, which helped determine the types of errors the learners made, the types of errors they corrected, and the errors they did not correct.
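The draft-to-draft bookkeeping described above amounts to counting, per error category, how many first-draft errors were fixed in the second draft. The sketch below shows one way to tally this; the data layout (a list of category and corrected-flag pairs) and the function name are assumptions made for illustration, not the authors' coding scheme.

```python
from collections import defaultdict


def correction_rates(errors):
    """Summarize corrections per category.

    errors: iterable of (category, corrected) pairs, where corrected is True
    if the first-draft error no longer appears in the second draft.
    Returns {category: (corrected_count, total_count, rate)}.
    """
    totals = defaultdict(int)
    fixed = defaultdict(int)
    for category, corrected in errors:
        totals[category] += 1
        if corrected:
            fixed[category] += 1
    return {c: (fixed[c], totals[c], fixed[c] / totals[c]) for c in totals}


if __name__ == "__main__":
    # Invented annotations for one essay, for illustration only.
    sample = [("grammar", True), ("grammar", False), ("vocabulary", True), ("mechanics", False)]
    for category, summary in correction_rates(sample).items():
        print(category, summary)
```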
To investigate Turkish EFL learners' perceptions of the potential of corpus use as a DDL tool, qualitative content analysis was applied to the data obtained from written opinion surveys. Qualitative content analysis was conducted to reveal the meaning behind the data by seeking coding units and patterns (Hoffman et al., 2012). The researcher coded and categorized the learners' responses based on relevance. Approximately 15% of the data were analyzed by another expert to ensure the reliability of error categorization. Miles and Huberman's (1994) formula was used to compute the percentage of reliability of error categorization, which indicated 90% agreement between the two coders.
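Miles and Huberman's (1994) reliability formula is commonly stated as the number of agreements divided by the sum of agreements and disagreements. A minimal sketch of that calculation follows; the function name and the sample category labels are hypothetical, and the example values are invented rather than taken from the study's data.

```python
def percent_agreement(coder_a, coder_b):
    """Return inter-coder agreement as a percentage.

    reliability = agreements / (agreements + disagreements) * 100
    """
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("both coders must label the same non-empty set of items")
    agreements = sum(a == b for a, b in zip(coder_a, coder_b))
    disagreements = len(coder_a) - agreements
    return 100 * agreements / (agreements + disagreements)


if __name__ == "__main__":
    # Invented labels for ten errors, for illustration only.
    first_coder = ["grammar", "vocabulary", "mechanics", "grammar", "vocabulary",
                   "grammar", "vocabulary", "mechanics", "grammar", "vocabulary"]
    second_coder = ["grammar", "vocabulary", "mechanics", "grammar", "vocabulary",
                    "grammar", "vocabulary", "mechanics", "grammar", "grammar"]
    print(percent_agreement(first_coder, second_coder))
```

This measure is simple percent agreement and does not adjust for agreement expected by chance.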

Findings
The present study aimed to investigate the effectiveness of using corpora as a DDL tool to enhance the academic writing skills of Turkish EFL learners. First, the study sought to address the research question of whether there was a significant difference between the use of corpus-based applications and conventional techniques in improving L2 academic writing. The study compared learners' writing scores on the tasks that used conventional techniques with their scores on the tasks that used corpora as a DDL tool. The Wilcoxon signed-rank test was then employed to analyze the writing scores, and the findings are presented in Table 1. The Wilcoxon signed-rank test was conducted to compare the conventional-based and corpus-based written samples in self-correction tasks. The written samples were evaluated in terms of content, organization, vocabulary, grammar, mechanics, and total score. The Z statistics are based on negative ranks, and higher scores indicate better writing performance. The findings showed a significant improvement in the traits of Content (Z = -3.547; p < .05), Organization (Z = -2.748; p < .05), Vocabulary (Z = -4.323; p < .05), Grammar (Z = -2.611; p < .05), and Mechanics (Z = -2.711; p < .05). Additionally, the total score of the corpus-based written samples was significantly higher (Z = -4.966; p < .05), highlighting the effectiveness of using corpus as a DDL tool to enhance L2 academic writing.
Table 2 presents the results of the Wilcoxon signed-rank test that compared the use of conventional techniques and corpus in peer-correction tasks across various aspects of L2 academic writing, including content, organization, vocabulary, grammar, mechanics, and total score. The negative ranks represent a decrease in scores while positive ranks indicate an increase in scores. The findings in Table 2 showed a significant improvement in Vocabulary (Z = -4.487; p < .05) and Grammar (Z = -4.288; p < .05), demonstrating that corpus-based applications had a positive effect on those two aspects of academic writing. However, no significant difference was found in Content (Z = -1.225; p > .05), Organization (Z = -1.091; p > .05), and Mechanics (Z = -1.291; p > .05). Nonetheless, the total score showed a significant difference between the conventional-based and corpus-based written samples (Z = -4.000; p < .05), indicating that corpus use as a DDL tool was effective in enhancing Turkish EFL learners' L2 academic writing.
In addition to evaluating the effectiveness of using corpus as a DDL tool, this study also analyzed Turkish EFL learners' corrections through corpus consultation in their writing. The content analysis aimed to identify the types of errors corrected by learners using corpus. To achieve this, the first and final drafts of the participants' conventional-based and corpus-based writing tasks were categorized based on errors, and the corrections made by learners were recorded. The findings of the content analysis are presented in Table 3. Table 3 displays the frequency of error correction by Turkish EFL learners in the conventional tasks (W1, W2) and the corpus-based tasks (W3, W4). The results indicated that learners made more corrections overall when using corpus-based tasks compared to conventional techniques. Specifically, learners' corrections through corpus consultation increased in all error categories, with the largest increases observed in vocabulary and grammar errors.
In terms of vocabulary errors, learners' correction rates increased for all types, with the most prominent increase seen in verb errors when using corpus compared to conventional techniques. For grammar errors, learners' correction rates increased for preposition, tense, agreement, and sentence formation errors when using corpus, although they faced difficulties in correcting article errors. The most common errors in the category of mechanics were related to spelling and punctuation.
Finally, this study explored Turkish EFL learners' views on the potential of corpus use as a DDL tool in L2 academic writing. To explore their opinions, a written opinion survey was employed, and the findings of the content analysis are presented in Table 4. Table 4 provides an overview of the Turkish EFL learners' views and opinions of the potential of corpus use as a DDL tool in L2 academic writing, addressing various aspects of corpus use. Participants reported using conventional techniques such as dictionaries and online resources to correct their own and their peers' errors. Regarding corpus use, they stated that they mainly consulted the target corpus tool to correct vocabulary and grammar errors, and almost all found corpus to be an effective and useful tool in improving their writing skills. Only a few participants mentioned that the corpus was ineffective as a reference tool due to the complex interface of COCA. They noted that COCA was not user-friendly and that searching for errors was a time-consuming process.
The participants also reported numerous benefits of corpus use as a DDL tool. They mentioned that exposure to authentic language, achieved by observing real-world language examples, allowed them to grasp the nuances of natural language use and incorporate more authentic expressions into their writing. This exposure also led to an increased awareness of their interlanguage features. Through the analysis of language patterns, participants gained insights into their own language use, which enabled them to identify areas for improvement and refine their writing style. Moreover, the DDL approach rendered the writing process more manageable for participants. It facilitated the identification and correction of errors while also enabling them to obtain more precise feedback. Furthermore, participants noted the introduction of academic language as a valuable outcome of corpus use. They encountered academic vocabulary and discourse structures, equipping them with essential tools for expressing ideas in scholarly contexts. Lastly, a particularly intriguing observation was the new perspective gained on the writing process, which implied that participants experienced a transformative shift in their approach to academic writing. This shift could involve understanding writing as a dynamic process, appreciating the role of context, and adopting a more analytical stance towards their own work. Perhaps owing to these benefits, most participants expressed willingness to utilize the corpus. The primary reason cited was its effectiveness and usefulness in error correction.

Discussion
This study has unveiled the positive impact of utilizing a corpus as a DDL tool in enhancing the academic writing of Turkish EFL learners compared to traditional methods. The Wilcoxon signed-rank test showed a significant improvement in the learners' L2 academic writing when using corpus as a DDL tool. The content analysis also demonstrated similar results, revealing an increase in learners' correction of their errors, particularly vocabulary and grammar errors, when using corpus to enhance their academic writing skills.
Consistent findings have been reported in the existing literature regarding the impact of utilizing corpora as DDL tools to enhance language learners' writing skills. Luo (2016) and Feng (2014) observed that consulting corpora improved the writing accuracy and fluency of EFL/ESL learners. Similarly, Cotos (2014) highlighted notable progress in the writing skills of learners exposed to corpus-based data. Despite the observed increase in error correction, the content analysis of the corpus-based writings in Cotos' (2014) study revealed that the participants made no corrections to almost half of their errors. This aligns with the findings of the current study suggesting that many errors in corpus-based writing remain uncorrected. This result could imply that learners may require input processing strategies (VanPatten, 2004) even when utilizing inductive noticing strategies to analyze concordance output.
This study revealed that using corpus as a DDL tool improved grammar and vocabulary compared to conventional techniques. These findings are consistent with the existing literature, which shows that corpus use can have a positive effect on language learners' grammar and vocabulary. Feng (2014) reported an increase in learners' vocabulary use through corpus consultation in writing processes. Similarly, Crosthwaite (2017) found that learners mostly consulted corpus to correct vocabulary errors. Özer and Özbay (2022) also showed that corpus consultation as a DDL tool was effective in improving learners' grammar use.
This study showed that corpus use did not have an impact on improving learners' content, organization, and mechanics. The existing literature, on the other hand, provided evidence that corpora can be an effective reference tool for improving content and organization (Birhan et al., 2021), and punctuation marks (Celik & Ekatmis, 2013). The conflicting results could stem from differences in the study designs, participant backgrounds, or the specific instructional methods employed in each study. It is also possible that the impact of corpus use on different aspects of writing, such as content, organization, and mechanics, varies depending on the proficiency level of the learners, the complexity of the writing task, or the amount of guidance provided during the corpus consultation process (Boulton, 2009; Chambers & O'Sullivan, 2004; Ma & Maie, 2021).
This study found that the use of corpora enhanced learners' usage of prepositions, tense, agreement, and sentence formation. However, learners encountered difficulties in correcting their article errors while using the corpus as a DDL tool during their writing processes. These results are in line with previous research in the literature. For instance, Crosthwaite (2017) reported similar findings in his study on DDL-mediated error correction across learners' written samples, demonstrating that learners successfully corrected grammar errors, including morphosyntactic errors such as tense, number, and agreement. Larsen-Walker (2017) investigated the effect of DDL on students' ability to use linking adverbials correctly in their persuasive essays and found that corpus use as a DDL tool improved learners' use of linking adverbials in their writing.
The present study did not uncover any improvement in learners' writing with respect to the usage of articles, passives, quantifiers, and demonstrative determiners. In contrast, investigating the impact of corpus consultation as a DDL tool on language learners' grammar errors, Cowan et al. (2014) found that corpus use increased learners' awareness of grammatical errors in those four areas. This discrepancy might be due to the fact that the learners in the current study did not receive targeted instruction on these specific error types and were instead focused on correcting a range of language features during the error correction process.
The analysis of the participants' perspectives on using corpora as a DDL tool in their writing processes revealed favorable views toward corpora. Participants viewed corpora as practical and effective instruments for discerning their interlanguage features through comparisons between their native language and authentic data. Corpora can serve as an external feedback mechanism, aiding learners in recognizing distinctions between their language and the target language. They achieve this through the application of inductive learning mechanisms to deduce linguistic rules, the identification of patterns in the target language, and the comparison between their language and the target language to pinpoint disparities and inconsistencies (O'Sullivan, 2007). As a result, the utilization of corpora as a DDL tool could heighten their awareness of their interlanguage features, empowering them to identify and rectify linguistic issues (Flowerdew, 2015; Gilquin & Granger, 2010).
Nevertheless, a few participants reported challenges and found using the corpus to be time-consuming. These findings align with Aşık et al.'s (2016) study, which also noted that Turkish EFL learners encountered difficulties and time constraints when consulting corpora. The reported challenges in DDL may be attributed to potential confusion when interpreting concordance lines. Keywords are frequently presented within incomplete sentences, making it challenging to grasp the contexts of these lines (Hirata & Thompson, 2022). Additionally, EFL learners with limited exposure to the target language typically have more familiarity with traditional in-class activities or well-structured teacher-led tasks (Aşık et al., 2016; Hirata & Thompson, 2022). Although this study incorporated more learner-centered tasks aimed at promoting learner agency and self-regulation, as discussed by O'Keeffe (2021), some learners still found these tasks time-consuming and reported limited benefits from corpus use.

CONCLUSIONS
This study investigated the effectiveness of using a corpus as a tool for improving the L2 academic writing of Turkish EFL learners by comparing students' conventional-based and corpus-based written samples. This study also examined participants' views on the potential of using corpus as a tool for L2 academic writing. The study used the Wilcoxon signed-rank test as a non-parametric test to determine the efficacy of corpus use, and content analysis was employed to categorize learners' errors and corrections, indicating in which traits of L2 academic writing participants used corpus to correct their errors. Another content analysis was employed to examine their views on using a corpus as a DDL tool. The findings revealed that the use of corpus had a significant positive impact on the L2 academic writing of Turkish EFL learners, especially in terms of improving their vocabulary and grammar usage. Furthermore, the participants reported positive views on the usefulness of corpus use as a DDL tool in improving their L2 academic writing.
This study suggests certain pedagogical implications for error correction and for grammar and vocabulary teaching in light of the findings. First, this study provides evidence that EFL learners can effectively consult corpora as a DDL tool to correct their errors and improve their writing skills. Therefore, it is recommended that corpus use and DDL activities be integrated into language learners' writing processes to enhance their language learning outcomes. Additionally, input processing and enhancement strategies are suggested when working with concordancers to improve EFL learners' correction rates. Second, this study provides evidence that integrating corpora improves learners' understanding of vocabulary and grammar within context. Therefore, the utilization of corpora and engagement in DDL activities could be recommended to facilitate vocabulary and grammar learning. Furthermore, this study reinforces the notion that integration of corpora and corpus-based activities can effectively promote learners' awareness of authentic language and of their own interlanguage features. Thus, it is recommended that material developers utilize corpus input to present authentic and meaningful data to language learners.
This study also suggests avenues for further research. First, this study included only an experimental group. Future studies could include a control group to mitigate the learning effect. Also, this study was carried out in a university setting. Further studies should be conducted to focus on lower-level language learners and to define language awareness. Further, corpus use was found to be ineffective by a few language learners in this study. Therefore, further research could investigate the effects of individual differences among language learners, such as learning styles and learning strategies, to give more insight into the efficacy of corpus use in language teaching and learning. Lastly, this study, like most previous studies, focused on utilizing corpus as a reference tool in the language learning process. Further research can focus on teaching learners how to use corpora so that they gain the maximum benefit from corpora in language learning and teaching.

NOTES
This paper is part of the first author's MA thesis under the supervision of the second author.

Table 1. Findings of the Wilcoxon Signed-Rank Test in Self-correction Tasks
*Based on negative ranks

Table 2. Findings of the Wilcoxon Signed-Rank Test in Peer-correction Tasks
*Based on negative ranks