The Electronic Journal for English as a Second Language

Integrating ChatGPT into Teacher Feedback: Practical Insights for L2 Writing Instruction

* * * On the Internet * * *

November 2025 — Volume 29, Number 3

https://doi.org/10.55593/ej.29115int

Jonna Marie Lim
De La Salle University-Manila
<jonna.lim@dlsu.edu.ph>

Christian Go
University of the Philippines-Diliman
<cggo@up.edu.ph>

Abstract

This paper explores the integration of ChatGPT into the L2 writing classroom as a tool for enhancing teacher feedback on student essays. Using a reflexive case study methodology, we examine how generative AI (GenAI) augments teacher feedback in areas such as thesis clarity, idea development, and grammatical accuracy. By combining ChatGPT’s rapid feedback with teachers’ contextual insights, we propose a blended feedback model that leverages both AI capabilities and teachers’ expertise for more personalized feedback. Results show that the teacher effectively utilizes ChatGPT by integrating AI-generated insights with her own to refine feedback, correct inaccuracies in ChatGPT’s feedback, and engage students in meaningful dialogues based on them. These strategies highlight the blended feedback model’s potential to provide comprehensive and personalized feedback, which could deepen student engagement and enhance the quality of their writing. While ChatGPT proves beneficial for formative feedback, its effectiveness is greatly enhanced when combined with the expert knowledge of teachers, ensuring that the feedback remains relevant and tailored to individual student needs. Thus, we recommend further exploration of blended feedback models and additional strategies for utilizing GenAI to enhance the quality of teacher feedback, particularly in formative assessments in L2 writing classes.

Keywords: GenAI, ChatGPT, teacher feedback, writing instruction

The integration of generative artificial intelligence (GenAI) in writing classrooms has sparked interest among scholars in education and applied linguistics. A significant strand of research on the subject revolves around concerns with GenAI’s capacity to both enhance and undermine the integrity of academic writing, the value of revision and editing in the writing process, and the potential for it to impede students’ critical thinking skills (Sullivan et al., 2023; Walter, 2024). These concerns stem from the broader apprehension that relying heavily on AI could lead to an erosion of foundational writing skills and hinder students’ ability to engage critically with their own work (Draxler et al., 2023). At the same time, a growing body of research is investigating how GenAI is reshaping the pedagogical landscape—not just for students, but also for the roles and responsibilities of educators themselves (Crompton et al., 2024; Hockly, 2023). Scholars highlight that the integration of AI in teaching might be used to automate repetitive tasks like grammar checking or aid teachers in generating writing prompts, ultimately saving time and allowing educators to focus more on higher-level instructional tasks (Escalante et al., 2023).

Of particular interest to this paper is the role of GenAI as a tool for providing personalized feedback on L2 (second language) student writing. While AI-generated feedback systems like OpenAI’s ChatGPT, Google’s Bard, and Microsoft’s Bing offer promising potential for reshaping feedback practices, they cannot replicate the nuanced understanding teachers bring to their students’ individual needs and contexts. Still, their capacity for offering scalable critiques provides an external perspective that could be especially beneficial for L2 learners, who often require frequent, detailed feedback to address common writing errors and internalize writing conventions (Mollick & Mollick, 2023). This aligns with Escalante et al. (2023), who note that AI-generated feedback is useful not only for its time efficiency but also for offering focused, individualized comments that can be especially valuable in large classes where personalized attention is often limited. GenAI tools can identify recurring issues in student writing, such as lexical errors, structural problems, logical flow, and even style or tone—areas where L2 learners frequently struggle.

While much of the current research has focused on the advantages of AI in providing generalized, scalable feedback, there is a gap in the literature regarding how teachers in L2 contexts can effectively incorporate AI-generated insights into their own feedback practices (Barrot, 2023; Mi et al., 2025). Existing studies have largely treated AI feedback as a standalone tool, but less attention has been paid to how teachers might blend AI-generated feedback with their own expertise to create a more personalized and nuanced feedback experience for students. This paper seeks to fill that gap by examining how ChatGPT, as a generative AI tool, can be integrated into the L2 writing classroom to enhance and supplement teacher feedback. Specifically, we propose a blended feedback model that combines the strengths of AI-generated insights with the contextual understanding and pedagogical expertise of teachers, offering a more comprehensive approach to feedback. This model aims to refine traditional feedback practices by using AI to highlight areas for improvement while still ensuring that feedback remains relevant, personalized, and aligned with the specific needs of each student. By focusing on how teachers can effectively adapt AI-generated insights to complement their own feedback, this paper contributes to the growing body of research on the pedagogical applications of GenAI in L2 education, offering practical strategies for educators seeking to enhance their feedback processes.

The following section outlines current scholarship on teacher feedback and automated forms of evaluation. Afterward, we present the methodology and findings based on one of the authors’ self-reflective practice vis-à-vis GenAI-use in providing feedback on students’ writing.

Teacher feedback, automation and their interface

Teacher feedback has been viewed as a crucial component in second-language writing research. Scholars note the importance of teacher feedback in strengthening students’ language skills development (e.g., grammar) and their willingness to revise their work (Hyland & Hyland, 2006). Previous research on teacher feedback involved providing descriptions of the ways in which feedback is conveyed (Chaudron, 1986) and insights on designing feedback processes (Olsen & Hunnes, 2024). These studies note the importance of effective feedback strategies, such as error correction, multiple feedback modalities, and timely and personalized feedback. This immediacy is particularly valuable, as feedback is most effective when delivered promptly, ideally soon after the completion of a writing task (Rahimi et al., 2024). However, teacher feedback has also been subject to critique as a potentially problematic practice. Several studies investigating instructors’ written responses to L2 student writing reveal teachers’ tendencies to prioritize technical concerns or surface features, such as grammar, over rhetorical ones (Furneaux et al., 2007; Lee, 2019; Stern & Solomon, 2006). Feedback that focuses disproportionately on error correction may lead L2 students to perceive writing as a purely technical skill rather than a communicative or expressive one (Ferris, 2006; Nguyen, 2024). Additionally, feedback that emphasizes deficits or adheres to monolingual norms can discourage linguistic diversity, thereby limiting students’ ability to engage with their multilingual identities in their writing (Canagarajah, 2013). As such, there is a need for feedback practices that are inclusive, context-sensitive, and aligned with fostering students’ broader communicative competence.

Additionally, studies on feedback for L2 students show that both the amount and type of feedback affect the quality of their final drafts (Dawson et al., 2019; Hyland & Hyland, 2019; Matsumura et al., 2002). Ellis (2009) distinguishes between direct feedback, which provides explicit corrections, and indirect feedback, which encourages students to self-edit. While direct feedback benefits lower-proficiency learners, indirect feedback fosters deeper cognitive engagement by prompting students to reflect on their linguistic choices (Bitchener & Ferris, 2012; Ferris, 2006; Guénette, 2007). Put another way, the efficacy of feedback strategies varies depending on the specific writing situation and learner needs. Furthermore, feedback timing is crucial—studies indicate that immediate feedback enhances learning outcomes, as students can integrate corrections more effectively (Rahimi et al., 2024). Bock et al. (2024) add that providing effective feedback requires teachers to draw on a variety of knowledge: understanding how writing works (content knowledge), what good writing looks like in a specific context (pedagogical knowledge), identifying areas for improvement (diagnostic knowledge), and explaining those areas to the student in a way that promotes improvement (instructional knowledge). Moreover, teacher-student interaction influences feedback effectiveness. Storch and Wigglesworth (2010) suggest that dialogic feedback, such as teacher-student conferences, enhances students’ ability to internalize suggestions and apply them in subsequent writing tasks. Thus, the effectiveness of feedback is closely tied to the teacher’s ability to synthesize content, pedagogical, diagnostic, and instructional knowledge while also considering individual learner characteristics and the broader instructional context (Lee, 2019; Yu & Hu, 2017).

Complementary to scholarly inquiries on the means of providing feedback to students are explorations of the most efficient and pedagogically sound ways to deliver it. The use of Automated Writing Evaluation (AWE) tools for formative feedback on student writing has engendered an ongoing conversation about their utility. Proponents highlight the potential for AWE tools to streamline the grading process, freeing up teacher time to address higher-level writing skills like organization and argument development (Wilson & Roscoe, 2020). Moreover, AWE tools offer consistency in feedback delivery, reducing potential bias in teacher evaluations and ensuring all students receive structured guidance (Dikli, 2006; Stevenson & Phakiti, 2014). Some studies also suggest that AWE-generated feedback can enhance student engagement by providing immediate responses that facilitate iterative revisions (Ranalli, 2018). That said, while AWE systems provide constant, personalized feedback, this feedback often focuses on surface-level comments rather than propositional content, thus restricting students’ creativity, hindering the development of ideas, and encouraging the adoption of shortcuts (Link et al., 2022; Y. Wang et al., 2013). AWE tools also struggle to effectively assess rhetorical elements such as audience awareness, argument structure, and coherence, which are central to academic writing (Grimes & Warschauer, 2010; Roscoe et al., 2017). Link et al. (2022) highlight several limitations in current discussions on AWE: (1) a reliance on theoretical rather than substantial empirical evidence, (2) inconclusive findings on AWE program effectiveness, and (3) a focus on short-term impacts, neglecting potential long-term effects on student writing skills. Thus, while AWE systems hold promise as supplementary tools, scholars emphasize the need for hybrid approaches that integrate automated feedback with human guidance to foster deeper writing development (Liu & Yu, 2022; Stevenson & Phakiti, 2014).

ChatGPT feedback on writing

The introduction of chatbots like ChatGPT has significantly influenced Automated Writing Evaluation (AWE) and feedback mechanisms in writing instruction (Barrot, 2023; Gayed et al., 2022). As research on ChatGPT’s role in writing classrooms continues to grow, studies have examined its affordances in areas such as automated essay evaluation, AI-generated feedback, writing efficacy, and its practical applications in enhancing writing skills (Kohnke et al., 2023; Mizumoto & Eguchi, 2023; Teng, 2024a). While these studies highlight ChatGPT’s utility in improving student writing, ongoing research continues to explore its specific strengths and limitations as a feedback tool. In a study examining the impact of ChatGPT as a formative feedback tool on the academic writing skills of undergraduate ESL students, Mahapatra (2024) employed a mixed-methods intervention approach. The study involved 35 participants in an experimental group and 37 in a control group, with qualitative data collected through focus group discussions. The findings indicated that ChatGPT significantly enhanced students’ writing organization, grammatical accuracy, and overall engagement in the writing process. Furthermore, students reported increased collaboration and focus while using ChatGPT, which helped them adhere to topic sentences and improve the coherence of their writing (see also Su et al., 2023; Teng, 2024b). L. Wang et al. (2024) assessed the feedback accuracy of ChatGPT in evaluating undergraduate students’ argumentation skills and found that it demonstrated a precision rate of 91.8% and a recall rate of 63.2%. The accuracy of ChatGPT’s feedback was significantly influenced by the length of the arguments, with longer arguments yielding lower precision and recall rates, and by the presence of discourse markers, which served as indicators of argument structure. Further, they noted that although ChatGPT could identify the structural components of arguments, its qualitative assessments—such as evaluating the adequacy of evidence—were less robust. Meanwhile, M. Wang & Guo (2023) found that ChatGPT provided more directive feedback (i.e., specifying what changes need to be implemented), offered more positive reinforcement, and gave more summaries when assessing the organization of writing. These affordances may offer a distinct advantage in that students can re-examine their work from a holistic perspective (global level) and grasp organizational shortcomings within their writing. Taken together, these studies suggest that ChatGPT can enhance students’ writing by providing structured, rubric-based feedback and facilitating engagement, though its effectiveness varies depending on writing task complexity and the type of feedback required.

However, research also highlights ChatGPT’s limitations compared to human evaluators. Steiss et al. (2024), for instance, compared ChatGPT’s feedback to that of human evaluators and found that human evaluators generally provided higher-quality feedback in terms of accuracy, prioritization of essential features, clarity of directions, and supportive tone. That said, ChatGPT performed better than human evaluators in criteria-based feedback, as it consistently referenced evaluation rubrics. Such findings underscore the potential for ChatGPT to serve as a preliminary feedback tool, helping students address fundamental issues before seeking human evaluation, while also highlighting the need for AI literacy and strategic implementation in writing instruction. In this sense, ChatGPT’s role as a useful preliminary feedback tool is contingent on human factors (e.g., oversight, engagement).

Research on ChatGPT as a feedback tool also highlights perceptions about its accessibility and effectiveness in enhancing student writing, while also underscoring concerns about its limitations in providing personalized and context-sensitive feedback. In their systematic review of ChatGPT’s use in education, Mai et al. (2024) analyzed studies from various countries (including the United States, Canada, the United Kingdom, Germany, China, Saudi Arabia, and Vietnam) and noted that educators in these contexts generally perceive ChatGPT as a useful tool for accomplishing teaching tasks, including providing feedback. For students, while ChatGPT enhances feedback accessibility—particularly in writing tasks, as evidenced by studies in Hong Kong and Turkey—its responses tend to be generic, often necessitating human oversight to foster deeper critical engagement (Mai et al., 2024).

The effectiveness of ChatGPT as a feedback tool has been further explored in classroom-based research. Building on perceptions of AI’s role as a feedback tool, Teng (2024b) examines how ChatGPT influences EFL learners’ writing development in a classroom setting by focusing on its impact on motivation, self-efficacy, and collaborative writing. Specifically, Teng (2024b) investigated the impact of ChatGPT on the writing skills of 45 EFL learners in Macau, utilizing a mixed-methods approach that included quantitative data from questionnaires and qualitative insights from interviews after a writing course. Findings revealed significant positive effects of AI assistance on students’ writing motivation, self-efficacy, engagement, and collaborative writing tendencies. While participants acknowledged the benefits of ChatGPT in providing timely and specific feedback, some expressed concerns regarding the lack of personal touch and the machine’s limitations in understanding nuanced writing issues, such as tone, context, and writing challenges unique to individual learners.

Expanding on these findings, Han and Li (2024) examined ChatGPT-assisted teacher feedback in a Chinese university EFL course, proposing an “AI + Teacher” model where instructors refined ChatGPT’s indirect corrective and rhetorical feedback before sharing it with students. Their study of 102 learners found that while AI-generated feedback improved efficiency, teachers still played a crucial role in adapting the feedback for clarity and relevance.

Building upon the previously discussed scholarly explorations of GenAI’s potential within writing instruction, this paper advances these conversations by examining the ways in which ChatGPT, as a tool for AWE, may be utilized in conjunction with the teacher’s feedback. Furthermore, according to L. Yang and Li (2024), most studies on ChatGPT and language learning have primarily examined the contexts of EFL college learners. As such, this study aims to expand the investigation into how AI-generated feedback might be tailored to meet the needs of L2 writers and support educators in providing more targeted guidance and instruction.

With these insights, this paper explores how ChatGPT’s capabilities can be strategically utilized to optimize the effectiveness and efficiency of feedback delivery in L2 writing classes. To achieve this, it focuses on ChatGPT’s utility as a tool for refining the quality of feedback on student writing. In this inquiry, we seek to answer the following research question: In what ways can teachers integrate and adapt ChatGPT’s feedback to enhance the quality of their own feedback on L2 student writing? We suggest that the incorporation of ChatGPT’s analytical and editing capabilities with the more nuanced, contextual expert knowledge of teachers can lead to a blended model of feedback. This approach leverages AI-generated insights as a springboard for deeper engagement with students’ work, fostering more in-depth, personalized teacher feedback and potentially leading to improved writing outcomes for L2 students.

Methodology

Research design

The present study employs a reflexive case study approach to analyze qualitative data derived from the lead author’s annotations and reflections on ChatGPT’s feedback on 22 problem-solution essay drafts written by her first-year college students in a communication course. The analysis focuses on evaluating the effectiveness of ChatGPT’s feedback, with the lead author (hereafter referred to as the teacher) making critical decisions on whether to modify, retain, or discard the AI-generated feedback before it is finalized and forwarded to the students. By integrating self-reflection, a fundamental aspect of effective teaching practice (Yin, 2014), this study aims to deepen understanding of how emerging technologies like ChatGPT can complement and enhance teacher feedback on student writing.

The participant

This study examines the feedback annotations and reflections of a female university instructor with extensive experience in teaching writing: 18 years at the secondary level and 6 years at the tertiary level. She holds a specialized certification in assessment, an MA in English Language Teaching, and a PhD in Applied Linguistics. She currently works at a university in Manila, where she teaches Purposive Communication, a General Education course required for first-year college students in the Philippines, which also serves as the data collection site for this study. These experiences greatly shaped her decisions on the utility and effectiveness of ChatGPT’s feedback in creating higher-quality, more personalized feedback on her students’ essays. To mitigate potential bias given her dual role as participant and researcher, peer debriefing was conducted with the co-researcher, who provided an external review to ensure that the data inferences were accurate and within the bounds of the available data.

The context

The primary data for this study were the feedback annotations made by the teacher on 22 problem-solution essays written by first-year college students in their Purposive Communication class. Purposive Communication is a mandatory General Education course for all Filipino college students in which they develop their abilities to communicate ethically, effectively, and professionally for diverse audiences and purposes in various modes. One of the major requirements of the course is the problem-solution essay, in which students engage critically with a localized issue linked to a Sustainable Development Goal (SDG).

Data collection procedures

In integrating ChatGPT into the feedback process, the following steps were taken. The teacher informed the students that ChatGPT would be used to create more personalized feedback on their essays, and she explained the process for generating the final feedback for revisions. She then read each essay and uploaded the draft to ChatGPT 3.5, using a detailed prompt instructing it to give feedback on the following areas: the quality of the thesis statement; the clarity of the problem, solution, and call to action; the development of ideas throughout the paragraph and essay; the academic quality of the language; transitional phrases; the use of sources and evidence; and grammatical accuracy. The prompt was adapted from Escalante et al. (2023) to align with the grading criteria for the course’s problem-solution essay.
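
To make the workflow concrete, the prompt-assembly step might look like the following sketch. The feedback areas are taken directly from the list above, but the function name, prompt wording, and structure are illustrative assumptions; the study’s actual prompt (adapted from Escalante et al., 2023) is not reproduced here.

```python
# Hypothetical sketch of assembling a rubric-aligned feedback prompt.
# The area list mirrors the feedback areas described above; the function
# name and prompt wording are assumptions, not the study's actual prompt.

FEEDBACK_AREAS = [
    "quality of the thesis statement",
    "clarity of the problem, solution, and call to action",
    "development of ideas throughout the paragraph and essay",
    "academic quality of the language",
    "transitional phrases",
    "use of sources and evidence",
    "grammatical accuracy",
]

def build_feedback_prompt(essay_text: str) -> str:
    """Combine an essay draft with instructions covering each feedback area."""
    numbered = "\n".join(
        f"{i}. {area}" for i, area in enumerate(FEEDBACK_AREAS, start=1)
    )
    return (
        "You are assisting a writing teacher. Give formative feedback on the "
        "problem-solution essay below, organized under these areas:\n"
        f"{numbered}\n\nEssay draft:\n{essay_text}"
    )
```

The resulting string would then be submitted to ChatGPT together with each draft; crucially, in the model described here, the teacher reviews the generated output before any of it reaches students.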

Shown in Figure 1 is a sample of ChatGPT’s generated feedback, categorized according to the specific feedback areas mentioned earlier.

Figure 1. Sample ChatGPT-generated Feedback on an Essay Draft

The teacher then reviewed ChatGPT’s generated feedback, annotating in the margins whether she would retain, modify, or remove each comment. She also added her initial insights on the quality of the feedback generated by ChatGPT, as illustrated in Figure 2. In the analysis, text enclosed in brackets [ ] and italicized represents the teacher’s feedback. Each comment is tagged with a code such as “A57G15”, where “A57” specifies the class section and “G15” identifies the group number within that section.
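
For illustration, the tag scheme just described can be expressed as a small parser. This is a hypothetical sketch inferred solely from the “A57G15” example; the field names are assumptions, and tags in other formats would require a more permissive pattern.

```python
import re

# Hypothetical parser for feedback tags such as "A57G15": a section part
# (letter plus digits) followed by a group part ("G" plus digits). This
# format is inferred from the example above, not specified by the study.
TAG_PATTERN = re.compile(r"^(?P<section>[A-Z]+\d+)(?P<group>G\d+)$")

def parse_feedback_tag(tag: str) -> dict:
    """Split a feedback tag into its class-section and group components."""
    match = TAG_PATTERN.match(tag)
    if match is None:
        raise ValueError(f"Unrecognized feedback tag: {tag!r}")
    return {"section": match.group("section"), "group": match.group("group")}
```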

Figure 2. Sample Teacher Annotation on ChatGPT-generated Feedback

The final feedback given to the students was compiled in a separate document. This document included the feedback added or modified by the teacher, with any incorrect or unnecessary ChatGPT comments removed.

Data analysis

The teacher’s annotations and reflections underwent reflexive thematic analysis (Braun & Clarke, 2013) to uncover the underlying strategies; these teacher strategies emerged as the four key themes of the data. The analysis began with data familiarization, where all marginal comments by the teacher on the ChatGPT-generated feedback were consolidated into a single document. These comments reflected the teacher’s decisions to adopt, remove, or modify the AI feedback and included her initial thoughts and insights during the review process.

Next, initial codes were generated where the coding focused on the strategies the teacher employed in response to the AI-generated feedback. These initial codes were then analyzed to identify patterns that formed the themes, in this case, the teacher’s strategies for using or building upon ChatGPT’s feedback. These themes were ultimately finalized and defined as teacher strategies. The reflexive thematic analysis proved useful in identifying how the teacher effectively integrated her expertise with AI-generated comments to achieve the desired learning outcomes of the essays. It highlighted her extensive experience in providing feedback and her nuanced understanding of her students’ actual needs.

Since the data came from the lead author’s personal annotations and reflections, traditional consent from external participants was not necessary. However, the analysis ensured integrity in the handling and presentation of the reflective data by acknowledging the intrinsic subjectivity of reflexive thematic analysis. The teacher drew on her background in assessment and her advanced degrees in English Language Teaching and Applied Linguistics to enrich the analysis.

Findings

In the sample, four key strategies emerged from the teacher’s annotations and reflections on ChatGPT’s feedback. These strategies relate to the dynamic between ChatGPT-generated comments and teacher expertise, specifically how the teacher negotiates the generated output: (1) retaining ChatGPT’s feedback and enhancing it with additional comments; (2) omitting ChatGPT’s inaccurate feedback; (3) creating more nuanced feedback; and (4) engaging the students in a dialogue using ChatGPT’s feedback. This section discusses each of these key strategies in detail, highlighting both the strengths and limitations of ChatGPT’s feedback and underscoring the critical role of the teacher in refining and contextualizing AI-generated feedback to better meet students’ learning needs.

Retaining and enhancing ChatGPT-generated feedback strategy

ChatGPT was valuable in identifying specific areas for student improvement, which served as a starting point for the teacher to expand on and offer more targeted guidance. By analyzing student work, ChatGPT could pinpoint weaknesses in the quality of thesis statements, the development of ideas, and the use of transitions. This targeted feedback allowed the teacher to focus her efforts on these crucial aspects of writing, ensuring that her interventions directly addressed student needs.

In the first type of feedback, ChatGPT comments on the quality of the student’s thesis statement. It pinpoints the strengths by acknowledging the focus of the thesis’s central dilemma: balancing marine conservation with livelihoods (Sample 1 below). It also constructively criticizes a weakness by suggesting the inclusion of a proposed solution or recommendation, and the sample it provides demonstrates how incorporating a potential solution, such as stricter enforcement alongside community engagement initiatives, can enhance the thesis. The feedback thus offers instructions on how to make the thesis more argumentative and outline the paper’s intended direction. While the feedback does not explicitly state that including a solution is mandatory for all thesis statements, its suggestion and sample illustrate how following the advice can significantly improve the focus and argumentative nature of the thesis.

Sample 1

Feedback on the quality of the thesis statement: The thesis statement provided in the student’s writing does address the writing prompt by highlighting the dilemma between preserving marine biodiversity and supporting the livelihoods of people. However, it could be improved by explicitly mentioning the proposed solution or recommendations for practical action. For example, an improved thesis statement could be:

“While addressing the challenge of balancing marine conservation with supporting local livelihoods poses a dilemma, implementing stricter enforcement of marine biodiversity laws alongside community engagement initiatives can effectively preserve marine biodiversity while also sustaining the livelihoods of coastal communities.” (A57DG8)

As can be seen in Sample 2 below, the ChatGPT-generated and teacher feedback complement each other in focusing on the development of ideas in the essay. The ChatGPT feedback highlights the need for further elaboration on a specific solution, the Ideal Flow Network (IFN): its workings, feasibility, and potential challenges. This technical focus strengthens the argument’s foundation. The teacher’s clarification further enhances this comment by emphasizing the integration and flow of ideas. Rather than having the benefits listed, the teacher advises integrating them seamlessly into the paragraphs, thereby making ChatGPT’s earlier recommendations more concrete. Thus, while both sets of feedback emphasize the need for a deeper and more nuanced discussion of the IFN, the teacher’s feedback specifically focuses on enhancing the coherence and flow of the essay by improving how the benefits are presented.

Sample 2.

Feedback on the development of ideas throughout the paragraph and essay: Your essay effectively develops the problem of traffic congestion in Manila, providing ample evidence and statistics to support the argument. However, the discussion on the proposed solution using the Ideal Flow Network could be further elaborated. Provide more details on how this model works, its feasibility, and potential challenges in its implementation. Additionally, consider addressing potential counterarguments to strengthen your argument. [Can you re-write the part where you presented the benefits of the IFN. Instead of enumerating them, integrate them in your paragraphs. Better yet, use these input to address the recommendations stated in the previous paragraph.] (A57DG7)

In Sample 3, meanwhile, both sets of feedback concern the use of transitional phrases but approach it from different angles. ChatGPT first provides positive reinforcement, commending the appropriate use of transitions for guiding the reader. At the same time, it recommends creating variety by replacing repetitive phrases like “Moreover” with alternatives like “In addition” or “Consequently.” The teacher, on the other hand, enhances this feedback with personalized commentary, such as advice on integrating bulleted points from an outline into a cohesive essay, and raises a concern about exceeding the word limit. In this sense, the teacher’s feedback points to specific structural issues in the student’s paper, offers strategies for improving the overall organization and coherence of the essay, and provides contextualized feedback that fits the specifics of the assignment.

Sample 3. ChatGPT Feedback with the Teacher’s Personalized Commentary on Transitions

Feedback on the transitional phrases: Your essay uses transitional phrases appropriately to guide the reader through the various sections and ideas. However, you could improve the flow by incorporating more diverse transitional phrases. For example, instead of repeatedly using phrases like “Moreover” or “Furthermore,” you could vary your transitions with alternatives like “In addition” or “Consequently” to add variety and sophistication to your writing. [When you remove the headings and integrate the bullet points in paragraphs, make sure to use effective transitions to link your ideas. Btw, did you not go beyond the 2500 word count? Your draft looks like an outline. You revise it in a way that all your key ideas are presented in one continuous essay.] (A58G7)

These examples highlight how ChatGPT’s comments provided a foundation for improvement by focusing on essential elements that contribute to effective essays.

Omitting ChatGPT’s inaccurate feedback strategy

Although ChatGPT-generated feedback provides a useful starting point for personalized feedback, it also has limitations that necessitate teacher intervention and correction. These constraints stem from the nature of AI language models, which can struggle to replicate the nuanced judgment of an instructor. The examples below are instances where the generated feedback was inaccurate or misleading, prompting the researchers to flag and omit it. This problematic feedback includes the following issues: deviations from the essay’s central topic, provision of irrelevant advice, inclusion of factually incorrect information, and generation of unhelpful suggestions that do little to improve the student’s writing.

Sample 4. Sample of ChatGPT’s Incorrect Feedback on Comma Splice

Sentence: While sometimes described as the “poor man’s economy” (Indon, 2002), this label does little justice to its multifaceted nature.
Error type: Comma splice
Description: The sentence contains a comma splice, where two independent clauses are joined by a comma alone.
Suggestion: Replace the comma with a semicolon, period, or conjunction.

(A58G1)

Sample 5. Sample of ChatGPT’s Incorrect Feedback on Tense Consistency

Sentence: “Each antibiotic introduction is followed, relatively quickly, by documented resistance to that antibiotic.”
Error type: Tense inconsistency
Description: The tense changes from present tense (“is followed”) to past tense (“documented”).
Suggestion: Use consistent verb tense throughout the sentence.

(A57DG3)

The tables in Samples 4 and 5 present ChatGPT-generated feedback on students’ grammar and illustrate incorrect appraisals. On close examination of the sentence in Sample 4, the feedback generated by ChatGPT incorrectly identifies the use of the comma as a “comma splice.” The purported error seems to stem from a misrecognition of the sentence’s grammatical structure; the comma is in fact used correctly to separate an introductory dependent clause from the main independent clause. As for Sample 5, ChatGPT’s analysis of the sentence “Each antibiotic introduction is followed, relatively quickly, by documented resistance to that antibiotic” reflects an error in evaluation. While the tool flags a potential tense inconsistency, it misidentifies “documented” as a verb rather than an adjective, and the sentence correctly uses the present tense “is followed” to describe a general process. Changing the “tense” of an adjective, as ChatGPT proposes, would thus introduce a grammatical error.

Although both ChatGPT and the teacher comment on the clarity of the essay’s solution, the contrast in their feedback exposes the limitations of ChatGPT-generated comments (Sample 6).

Sample 6. Sample Contrasting Feedback from ChatGPT and the Teacher: Clarity of Problem, Solution, and Call to Action

The essay effectively establishes the problem of educational inequality in the Philippines, highlighting issues such as inadequate funding, outdated textbooks, poor infrastructure, and cultural barriers. The proposed solution, which involves collaboration between different stakeholders to develop online educational content, is clearly articulated. However, the call to action could be more explicit in urging readers to support or advocate for this collaborative initiative to address educational disparities.

[For the consequences of underfunding, impact, and call to action, integrate them all in paragraphs. Avoid enumerations. You have very little explanation about your proposed solution. Give us more details on how your proposal will maximize tech in providing equitable access to educational resources? Go back to your thesis statement… Were you able to expound on this? As a reader, I find the problem and proposed solution clear and feasible. The urgency and relevance of the problem are well-established, and the proposed solution offers a practical approach to addressing the educational challenges faced by marginalized communities in the Philippines.] (A57DG6)

ChatGPT recognizes the mention of online educational content in the student’s essay, but its response lacks depth in analyzing the proposed solution’s details and its effectiveness in promoting equal access through technology. The teacher’s correction reveals an issue with ChatGPT’s feedback: the student essay lacks the very discussion of the proposed solution that ChatGPT praised. This problem suggests a possible instance of AI hallucination, in which responses are generated to fill perceived gaps, regardless of relevance or accuracy (Walter, 2024).

In contrast, the teacher provides specific revision points. She emphasizes the need for a cohesive flow between consequences, impact, and the call to action, and identifies the lack of detail in the proposed solution and its connection to maximizing technology for equal access. Importantly, the teacher acknowledges the overall clarity of the problem and solution but questions whether the essay sufficiently elaborates on the thesis in relation to the proposed solution. This targeted feedback offers a clear path for improvement, unlike the irrelevant analysis generated by ChatGPT. The examples discussed reveal a potential blind spot in generative AI: its struggle with complex ideas. AI’s limitations in this area thus need to be balanced by the teacher’s expertise.

Building on ChatGPT’s feedback to create a nuanced, personalized feedback strategy

The limitations of GenAI in handling complex ideas highlight the need for a more nuanced, personalized approach to feedback. Though GenAI, specifically ChatGPT, has been lauded for its ability to understand textual context and provide moderately accurate feedback, it continues to struggle to understand and analyze connections among pieces of information (L. Wang et al., 2024) and to recognize the unique needs of students. As a result, it tends to provide feedback that is generic, broad, and often vague. This was evident in ChatGPT’s generated comments on the essays.

Sample 7. ChatGPT’s Generic Feedback with the Teacher’s Personalized Commentary

Feedback on the transitional phrases: Transitional phrases are important for maintaining coherence and flow in academic writing. While your essay does use transitional phrases, they could be improved to better connect ideas and paragraphs. For example, instead of starting a paragraph with “Moreover,” you could use “Furthermore” or “Additionally” for variation and clarity… [Can you also provide a clearer transition from your discussion of the problem of overcrowding to the existing solutions? Tighten your discussion of the existing solutions to address jail congestion. I suggest that the discussion be captured in one paragraph and make sure to have one clear topic sentence that will give your readers an idea what has been done so far to address the concern.] (A58G15)

In Sample 7, ChatGPT consistently suggests that students use transitional phrases “to better connect ideas and paragraphs” to improve coherence and flow. However, the specific suggestions that follow all focus on the “variety” of transitional phrases, recommending replacements for existing transitions (e.g., instead of “Moreover,” use “Furthermore”). ChatGPT neither identifies the specific areas of the essay where coherence and flow break down nor provides specific transition suggestions. To create more nuanced feedback, the teacher adopts ChatGPT’s comments and adds personalized feedback on specific parts of the essay. This includes improving the transition from the problem to the solution, tightening the solution section to enhance the logical and smooth flow of ideas, and creating a topic sentence for the new paragraph.

These seemingly generic comments are also evident in ChatGPT’s feedback on the development of ideas throughout the essay. It provides broad suggestions, such as adding more elaboration and evidence, without specifying which parts of the essay need improvement (A57G15, A58G12). Additionally, ChatGPT repeatedly claims that the essays adhere to APA 7th edition format, even though they do not (A57G7, A58G4). These examples illustrate some of ChatGPT’s limitations in engaging in deep structural and content analysis, which restrict its ability to provide feedback tailored to specific essays.

Despite the observed limitations, ChatGPT consistently highlights the essay’s strengths, a type of feedback often overlooked by teachers (Rathel et al., 2013). In many instances, ChatGPT’s positive appraisals are accurate and relevant to the essay. This is then complemented by the teacher’s more specific comments, which reinforce the affirmative feedback, as shown below.

Sample 8. Sample of Combined Positive Appraisals from ChatGPT and the Teacher

Feedback on the clarity of problem, solution and call to action: Your introductory paragraphs effectively introduce the problem of the transport system in the Philippines and outline the proposed solution through the PUVMP. [I really like that as early as your 3rd paragraph, you provided definitions of the key terms in your essay. It helps me understand better your ideas. Kudos!] (A57G4)

Sample 8 illustrates how ChatGPT accurately identifies the essay’s strength in clearly presenting the problem and the solution involving the local transport system. This validation of students’ effort and ability is reinforced by the teacher’s personalized feedback, which cites a specific section of the essay, “as early as your 3rd paragraph,” to back the positive appraisal. Also, the teacher adds motivational phrases like “Kudos” to explicitly congratulate the students, which adds to the “personal touch.” This feedback customization transforms the overall feedback into objective compliments—a specific and factual acknowledgment of the student’s abilities—qualities of feedback that are believed to impact student learning and motivation directly (Cavanaugh, 2013; Truax, 2018). This blend of AI and teacher feedback creates a more comprehensive and supportive feedback environment, fostering positive relationships that could enable student learning to thrive.

Engaging students in a dialogue using ChatGPT’s feedback strategy

Nuanced, personalized feedback resulting from the blend of ChatGPT-generated and teacher feedback not only clarifies the subsequent steps for students in their essay writing but also potentially engages students in the feedback process through dialogue. While this analysis excludes post-comment teacher-student exchanges, the blended feedback initiates dialogue by posing clarifying questions, often corrective, and acknowledging ChatGPT’s suggestions while letting students decide whether to follow the feedback.

Sample 9. Teacher Feedback Using Clarificatory Questions to Address Content Accuracy

Feedback on the clarity of problem, solution, and call to action: The essay effectively establishes the problem of disparities in public and local healthcare in rural areas of the Philippines. It provides clear details about the challenges faced by rural communities, such as geographic barriers, lack of medical resources, and shortages of healthcare professionals. The proposed solution, the “Standardized Healthcare Standard for Rural Medical Institutions Act,” is also well-articulated, outlining specific measures to improve healthcare accessibility and quality in rural areas. Additionally, the call to action for the government to mandate the proposed act is clear and relevant. [Question though. The majority of your essay talked about “telemedicine.” Is telemed subsumed in your “standardized healthcare act?” How I understood your essay is that under the Standardized Healthcare Act, telemed and ALAGA Ka projects will be promoted. Am I right? If I am correct, revise your essay in a way that your clearly frame both projects as the key programs/features of the Act and you have to discuss each one in almost the same depth. Most of your discussion was on telemed but only a few details about the ALAGA ka project.] (A57G1)

Sample 10. Teacher Feedback Using Clarificatory Questions to Strengthen Idea Development

Feedback on the development of ideas throughout the paragraph and essay: Overall, your essay effectively develops the main idea of addressing water contamination in indigenous communities. You provide detailed explanations, examples, and evidence to support your arguments. However, there are some areas where further elaboration or evidence could strengthen your points, such as providing more specific examples of successful collaborative efforts between governments and indigenous communities in addressing water contamination. [Check idea consistency e.g. 2nd para: …because of their sustainable practices and the difficulty of accessing territories…” should these two really go together? Please review whole essay. When I first read it, I was somehow confused what you really wanted to tell me. I know that your issue is specifically on clean water access but the majority of your essay tackled the general issues related to IP communities.] (A58G11)

In Samples 9 and 10, the teacher retains ChatGPT’s positive comments on the essays, such as “effectively establishes the problem” and “effectively develops the main idea,” and adds her own comments in the form of questions to clarify certain aspects of the writing. In Sample 9, although ChatGPT provides only positive comments, the teacher raises questions about how “telemed” is addressed in the essay, noting that the majority of the essay focuses on telemedicine alone and seems to disregard other projects under the Standardized Healthcare Act. By asking questions like “Is telemed subsumed in your standardized healthcare act?” and “Am I right?” to verify her understanding, the teacher takes a clarificatory stance, prompting the students to explain further. Similarly, in Sample 10, the teacher asks whether the two key constructs of “sustainable practices” and “difficulty of accessing territories” are compatible ideas, suggesting that they might be unrelated based on how the essay is developed. Instead of directly telling the students what to correct in their writing, the teacher chooses to raise questions that clarify their thinking. This approach reflects the teacher’s strategy of using feedback to initiate a dialogue, aimed at building positive relationships with students and facilitating meaningful exchanges that lead to learning (Nicol, 2010).

While these clarificatory questions prompt students to explain their reasoning, they are typically paired with a related suggestion that addresses the query. For instance, the question “Am I right?” is immediately followed by the corrective feedback “If I am correct, revise your essay…”, and the question “should these two really go together?” is followed by the suggestion “Please review whole essay.” This strategy raises questions about whether the clarificatory questions truly foster dialogue or remain monologic.

Additionally, the teacher uses ChatGPT’s comments to initiate dialogue by acknowledging the AI-generated feedback while allowing students to assess its value. Instead of correcting specific aspects of the essay, the teacher encourages students to reflect on ChatGPT’s suggestions, giving them the autonomy to decide whether to adopt the feedback, as seen in Samples 11 and 12.

Sample 11. Teacher Feedback Leveraging ChatGPT’s Response to Initiate Dialogue on the Thesis Statement

Feedback on the quality of the thesis statement: The thesis statement in your essay does address the problem of low reading and comprehension skills among students in the Philippine education system. However, it could be strengthened by explicitly mentioning the proposed solution or recommendations for action. Try to make your thesis statement more focused and clear in terms of what actions or solutions you will discuss in your essay.

Here’s an example of an improved thesis statement: “While addressing the issue of low reading and comprehension skills among students in the Philippine education system, this essay proposes a comprehensive solution involving the development of standardized reading materials and the implementation of nationwide pre-assessment and post-assessment tests, facilitated through collaborative efforts between Local Government Units (LGUs) and the Department of Education.” [What do you think of ChatGPT’s suggestion on your thesis statement? For me, your thesis statement will stand on its own… my problem though is that you did not elaborate on it in your essay.] (A58G6)

Sample 12. Teacher Feedback Using ChatGPT’s Suggestions to Encourage Student’s Critical Thinking

Feedback on the quality of the thesis statement: The thesis statement in your essay should explicitly mention the problem and proposed solution, aligning with the writing prompt. Your thesis statement does touch upon the problem of traffic congestion in Metro Manila and suggests a solution involving the restructuring of roadways for public space creation. However, it could be more specific and focused on the Sustainable Development Goals (SDGs), which are central to the writing prompt.

For example, an improved thesis statement could be: “While traffic congestion in Metro Manila poses significant challenges to the city’s livability and economic efficiency, the implementation of sustainable transportation solutions, such as promoting public transportation and enhancing pedestrian infrastructure, can alleviate congestion, reduce emissions, and contribute to achieving SDG 11 (Sustainable Cities and Communities).”

This statement explicitly addresses the problem of traffic congestion, proposes a solution involving sustainable transportation, and links it to the broader context of the Sustainable Development Goals. [This is a suggestion of AI. What do you think? I leave it up to you to decide if you wish to consider it.] (A58G13)

In both examples, ChatGPT suggests ways to enhance the focus and clarity of the thesis statements by providing revised versions. Building upon ChatGPT’s comments, the teacher poses questions such as “What do you think of ChatGPT’s suggestion on your thesis statement?” and “This is a suggestion of AI. What do you think?” to enrich the feedback. This time, the purpose of the teacher’s questions is not corrective but to empower students to determine the best course of action for improving their work. This approach is supported by the teacher’s follow-up statements: “For me, your thesis statement will stand on its own” and “I leave it up to you to decide if you wish to consider it.” These statements affirm that the teacher recognizes the value of ChatGPT’s feedback while also acknowledging the adequacy of the students’ current thesis statements. By posing questions that allow students to choose which feedback works best for them, the blending of ChatGPT’s and the teacher’s feedback actively promotes critical thinking. Students critically reflect on the suggestions, initiating meaningful discussions with the teacher and among themselves.

Discussion

In this paper, we demonstrated how ChatGPT functions as a valuable tool that can complement a teacher’s feedback in an ESL context. We identified four key strategies that exemplify a blended feedback model combining GenAI and the teacher: the teacher retains and enhances ChatGPT’s feedback, discards its inaccurate responses, builds on it to provide more nuanced evaluations, and leverages it to engage students in meaningful dialogue. These strategies provide practical examples of integrating AI into the feedback process.

Our findings confirm that ChatGPT consistently and accurately addresses essay topics, adheres to rubric criteria, and excels at providing global-level feedback on the overall structure, coherence, and clarity of students’ writing, aligning with the work of Steiss et al. (2024), Wang and Guo (2023), and Zou et al. (2025). The teacher in the study consistently used ChatGPT’s feedback as a foundation, enhancing it with more nuanced and targeted guidance, particularly to address ESL student-writers’ difficulties in effectively connecting their ideas.

Additionally, ChatGPT’s analytical efficiency enables consistent feedback. Pack et al. (2025) have articulated the critical role of timely, quality feedback in writing improvement, while also acknowledging the challenges related to teacher workload and the considerable time required to provide feedback. Interestingly, the present study shows that ChatGPT is not only capable of efficiently delivering consistent feedback but also provides positive appraisals—affirmations crucial for fostering a supportive learning environment in L2 writing classrooms (Truax, 2018). Our study thus demonstrates GenAI’s capability to enhance both the evaluative and motivational aspects of teaching writing (Abduljawad, 2024; Song & Song, 2023).

While AI-generated comments are often relevant, they can also include inaccurate, overly general, or contextually inappropriate feedback, which may not adequately address the specific needs of ESL students. These results are consistent with published studies highlighting significant limitations of GenAI (Algaraady & Mahyoob, 2023; Lingard, 2023; Ulla et al., 2023). Moreover, the findings of the study emphasize GenAI’s limitations in tackling complex ideas, specifically in understanding and analyzing connections among pieces of information. Although the rubric guides ChatGPT in identifying what to look for in essays, the ways ESL students link their ideas and present their reasoning vary considerably. As GenAI is constrained in providing customized feedback for these diverse thought processes, it often resorts to generic, broad, and vague feedback.

The findings of this study acknowledge both the strengths and limitations of ChatGPT as a feedback-generating tool, underscoring the invaluable context-specific expertise that teachers bring to the table. The analysis reveals a clear, dynamic interaction between GenAI and the teacher, characterized by a reciprocal relationship in which each compensates for the other’s limitations. For instance, when ChatGPT provides inaccurate or generic feedback, the teacher addresses these shortcomings by correcting the errors or offering specific suggestions for revision. This underscores the critical role of teacher oversight in correcting potential inaccuracies and tailoring feedback to meet individual student needs (Han & Li, 2024; Naz & Robertson, 2024). Another example of this reciprocal relationship occurs when the teacher uses ChatGPT’s feedback as a starting point for dialogue with the student, asking whether they agree with the suggestions and how they might further improve specific aspects of their writing, ultimately leaving the decision to accept the feedback to the student.

Note that the present study is not focused on streamlining the grading process or saving time for teachers but rather on how teachers can utilize AI tools like ChatGPT to ensure quality and consistency in feedback delivery. By reducing potential bias in teacher evaluations and providing structured, consistent, and prompt guidance, AI-assisted feedback helps create a more equitable learning environment.

Furthermore, we argue that feedback in L2 writing instruction derives its pedagogical value from its formative nature—encouraging students to actively engage with and internalize the feedback they receive. In this regard, the teacher’s engagement with AI-generated feedback remains essential, not merely as a corrective mechanism but as an integral part of the writing process that scaffolds students’ long-term development as writers. Our study highlights the need for high-quality feedback that facilitates genuine learning, ensuring that students receive guidance that is timely, meaningful, and actionable. By leveraging the efficiency of ChatGPT alongside the expertise of teachers, this blended feedback model brings us closer to an L2 writing classroom where feedback not only corrects errors but also nurtures deeper engagement with writing, fostering student growth in a way that is both structured and personalized.

Implications for integrating AI into teacher feedback

The potential of using ChatGPT for providing formative feedback on student writing offers significant value in writing instruction. ChatGPT, with well-crafted prompts, can effectively communicate what quality writing looks like and how students can achieve it, guiding them toward better performance (Barrot, 2023). Its ability to generate detailed and analytical feedback, which many teachers find challenging to provide consistently (Graham, 2019), can greatly assist teachers in delivering high-quality formative feedback (Mahapatra, 2024). However, maintaining teacher oversight is crucial to ensure the feedback’s quality and relevance. The discussion highlights the need for teachers to address potential inaccuracies in AI feedback and refocus students’ attention (Grimes & Warschauer, 2010). By integrating ChatGPT’s capabilities with teachers’ contextual and nuanced knowledge, a blended feedback model can be created. In this model, AI-generated insights serve as a starting point for deeper engagement with students’ writing, leading to more personalized and impactful feedback.

Such use of ChatGPT as a formative tool may be extended to self- and peer assessment. The approach used in this study to curate ChatGPT’s feedback can be replicated in self- and peer-feedback sessions. Students can be encouraged to use ChatGPT to provide initial feedback on their own or their peers’ writing and to engage critically with the AI-generated feedback. This initial feedback can prompt students to reflect on their work, discuss suggestions with their peers, and engage in deeper analysis of their own writing. Such interaction encourages students to think critically about their choices and make informed revisions, thereby improving their writing skills and self-awareness as writers. At the same time, AI-generated feedback may serve as a mediator in instances where students have doubts about the effectiveness of their peers’ feedback (S. Yang, 2011).

Another implication of the study is that it offers a concrete illustration of how GenAI can be used responsibly and effectively in assessment to foster creativity and critical thinking while addressing ethical concerns such as dependency and plagiarism. These practical classroom writing practices can guide teachers in developing clear guidelines and guardrails for integrating GenAI into writing classes. We therefore recommend further classroom research to explore additional strategies for developing a blended feedback model that combines the strengths of generative AI and teacher expertise. By continuously refining the integration process, writing teachers can discover new ways to improve the quality of teacher feedback and enhance student learning outcomes, ensuring that the blended model remains dynamic and responsive to the evolving needs of both teachers and students.

About the Authors

Jonna Marie Lim is an assistant professor in the Department of English and Applied Linguistics at De La Salle University-Manila. She earned her Ph.D. in Applied Linguistics from the same university. Her research focuses on assessment, critical pedagogy, curriculum design, and teacher feedback. ORCID ID: 0000-0002-9524-9321

Christian Go is an assistant professor in the Department of English and Comparative Literature at the University of the Philippines-Diliman. He received his Ph.D. in English Language and Linguistics from the National University of Singapore. His research interests fall under critical pedagogy, sociolinguistics, and discourse analysis. ORCID ID: 0000-0002-9962-4547

To Cite this Article

Lim, J. M., & Go, C. (2025). Integrating ChatGPT into teacher feedback: Practical insights for L2 writing instruction. Teaching English as a Second Language Electronic Journal (TESL-EJ), 29(3). https://doi.org/10.55593/ej.29115int

References

Abduljawad, S. (2024). Investigating the impact of ChatGPT as an AI tool on ESL writing: Prospects and challenges in Saudi Arabian higher education. International Journal of Computer-Assisted Language Learning and Teaching, 14(1), 1–19. https://doi.org/10.4018/IJCALLT.367276

Algaraady, J., & Mahyoob, M. (2023). ChatGPT’s capabilities in spotting and analyzing writing errors experienced by EFL learners. Arab World English Journal, (9), 3–17. https://doi.org/10.24093/awej/call9.1

Barrot, J. S. (2023). Using ChatGPT for second language writing: Pitfalls and potentials. Assessing Writing, 57, 100745. https://doi.org/10.1016/j.asw.2023.100745

Bitchener, J., & Ferris, D. (2012). Written corrective feedback in second language acquisition and writing. Routledge.

Bock, T., Thomm, E., Bauer, J., & Gold, B. (2024). Fostering student teachers’ research-based knowledge of effective feedback. European Journal of Teacher Education, 47(2), 389–407. https://doi.org/10.1080/02619768.2024.2338841

Braun, V., & Clarke, V. (2013). Successful qualitative research: A practical guide for beginners. Sage Publications.

Canagarajah, S. (2013). Translingual practice: Global Englishes and cosmopolitan relations. Routledge.

Cavanaugh, B. (2013). Performance feedback and teachers’ use of praise and opportunities to respond: A review of the literature. Education and Treatment of Children, 36(1), 111-136. https://www.proquest.com/scholarly-journals/performance-feedback-teachers-use-praise/docview/1312445557/se-2

Chaudron, C. (1986). The role of error correction in second language teaching. University of Hawaii Working Papers in English as a Second Language, 5(2), 43–81. https://scholarspace.manoa.hawaii.edu/server/api/core/bitstreams/36751f3b-74dc-4c0b-bed0-0b83c047080d/content

Crompton, H., Edmett, A., Ichaporia, N., & Burke, D. (2024). AI and English language teaching: Affordances and challenges. British Journal of Educational Technology, 55, 2503–2529. https://doi.org/10.1111/bjet.13460

Dawson, P., Henderson, M., Mahoney, P., Phillips, M., Ryan, T., Boud, D., & Molloy, E. (2019). What makes for effective feedback: Staff and student perspectives. Assessment & Evaluation in Higher Education, 44(1), 25-36. https://doi.org/10.1080/02602938.2018.1467877

Dikli, S. (2006). An overview of automated scoring of essays. The Journal of Technology, Learning, and Assessment, 5(1), 4-35. https://ejournals.bc.edu/index.php/jtla/article/view/1640/1489

Draxler, F., Buschek, D., Tavast, M., Hämäläinen, P., Schmidt, A., Kulshrestha, J., & Welsch, R. (2023). Gender, age, and technology education influence the adoption and appropriation of LLMs. Cornell University Library. https://arxiv.org/abs/2310.06556

Ellis, R. (2009). A typology of written corrective feedback types. ELT Journal, 63(2), 97–107. https://doi.org/10.1093/elt/ccn023

Escalante, J., Pack, A., & Barrett, A. (2023). AI-generated feedback on writing: Insights into efficacy and ENL student preference. International Journal of Educational Technology in Higher Education, 20(57). https://doi.org/10.1186/s41239-023-00425-2

Ferris, D. (2006). Does error feedback help student writers? New evidence on the short- and long-term effects of written error correction. In K. Hyland & F. Hyland (Eds.), Feedback in Second Language Writing: Contexts and Issues (pp. 81–104). Cambridge University Press.

Furneaux, C., Paran, A., & Fairfax, B. (2007). Teacher stance as reflected in feedback on student writing: An empirical study of secondary school teachers in five countries. International Review of Applied Linguistics in Language Teaching, 45, 69–94. https://doi.org/10.1515/IRAL.2007.003

Gayed, J. M., Carlon, M. K. J., Oriola, A. M., & Cross, J. S. (2022). Exploring an AI-based writing assistant’s impact on English language learners. Computers and Education: Artificial Intelligence, 3, 100055. https://doi.org/10.1016/j.caeai.2022.100055

Graham, S. (2019). Changing how writing is taught. Review of Research in Education, 43(1), 277–303. https://doi.org/10.3102/0091732x18821125

Grimes, D., & Warschauer, M. (2010). Utility in a fallible tool: A multi-site case study of automated writing evaluation. The Journal of Technology, Learning and Assessment, 8(6). http://www.jtla.org

Guénette, D. (2007). Is feedback pedagogically correct?: Research design issues in studies of feedback on writing. Journal of Second Language Writing, 16(1), 40–53. https://doi.org/10.1016/j.jslw.2007.01.001

Han, J., & Li, M. (2024). Exploring ChatGPT-supported teacher feedback in the EFL context. System, 126, 103502. https://doi.org/10.1016/j.system.2024.103502

Hockly, N. (2023). Artificial intelligence in English language teaching: The good, the bad and the ugly. RELC Journal, 54(2), 445–451. https://doi.org/10.1177/00336882231168504

Hyland, K., & Hyland, F. (2006). Feedback on second language students’ writing. Language Teaching, 39(2), 83-101. https://doi.org/10.1017/S0261444806003399

Hyland, K., & Hyland, F. (Eds.). (2019). Feedback in second language writing: Contexts and issues. Cambridge University Press.

Kohnke, L., Moorhouse, B. L., & Zou, D. (2023). ChatGPT for language learning and teaching. RELC Journal, 54(2), 537–550. https://doi.org/10.1177/00336882231162868

Lee, I. (2019). Teacher written corrective feedback: Less is more. Language Teaching, 52(4), 524-536. https://doi.org/10.1017/S0261444819000247

Lingard, L. (2023). Writing with ChatGPT: An illustration of its capacity, limitations & implications for academic writers. Perspectives on Medical Education, 12(1), 261–270. https://doi.org/10.5334/pme.1072

Link, S., Mehrzad, M., & Rahimi, M. (2022). Impact of automated writing evaluation on teacher feedback, student revision, and writing improvement. Computer Assisted Language Learning, 35(4), 605–634. https://doi.org/10.1080/09588221.2020.1743323

Liu, S., & Yu, G. (2022). L2 learners’ engagement with automated feedback: An eye-tracking study. Language Learning & Technology, 26(2), 78–105. https://doi.org/10125/73480

Mahapatra, S. (2024). Impact of ChatGPT on ESL students’ academic writing skills: a mixed methods intervention study. Smart Learning Environments, 11(1). https://doi.org/10.1186/s40561-024-00295-9

Mai, D. T. T., Da, C. V., & Hanh, N. V. (2024). The use of ChatGPT in teaching and learning: A systematic review through SWOT analysis approach. Frontiers in Education, 9, 1328769. https://doi.org/10.3389/feduc.2024.1328769

Matsumura, L. C., Garnier, H., Pascal, J., & Valdés, R. (2002). Measuring instructional quality in accountability systems: Classroom assignments and student achievement. Educational Assessment, 8(3), 207–229. https://doi.org/10.1207/S15326977EA0803_01

Mi, Y., Rong, M., & Chen, X. (2025). Exploring the affordances and challenges of GenAI feedback in L2 writing instruction: A comparative analysis with peer feedback. ECNU Review of Education, 20965311241310883. https://doi.org/10.1177/20965311241310883

Mizumoto, A., & Eguchi, M. (2023). Exploring the potential of using an AI language model for automated essay scoring. Research Methods in Applied Linguistics, 2(2), 100050. https://doi.org/10.1016/j.rmal.2023.100050

Mollick, E. R., & Mollick, L. (2023). Using AI to implement effective teaching strategies in classrooms: Five strategies, including prompts. The Wharton School Research Paper.  https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4391243

Naz, I., & Robertson, R. (2024). Exploring the feasibility and efficacy of ChatGPT3 for personalized feedback in teaching. Electronic Journal of E-Learning, 22(2), 98–111. https://doi.org/10.34190/ejel.22.2.3345

Nguyen, C. D. (2024). Scaffolding student engagement with written corrective feedback: Transforming feedback sessions into learning affordances. Language Teaching Research, 28(5), 1918-1939. https://doi.org/10.1177/13621688211040904

Nicol, D. (2010). From monologue to dialogue: Improving written feedback processes in mass higher education. Assessment & Evaluation in Higher Education, 35(5), 501–517. https://doi.org/10.1080/02602931003786559

Olsen, T., & Hunnes, J. (2024). Improving students’ learning—the role of formative feedback: Experiences from a crash course for business students in academic writing. Assessment & Evaluation in Higher Education, 49(2), 129–141. https://doi.org/10.1080/02602938.2023.2187744

Pack, A., Hartshorn, J., Escalante, J., & Gillette, N. (2025). How well can GenAI (GPT-4) provide written corrective feedback on English-language learners’ writing? International Journal of English for Academic Purposes, 5(1), 7–26. https://doi.org/10.3828/ijeap.2025.2

Rahimi, M., Fathi, J., & Zou, D. (2024). Exploring the impact of automated written corrective feedback on the academic writing skills of EFL learners: An activity theory perspective. Education and Information Technologies, 30, 2691–2735. https://doi.org/10.1007/s10639-024-12896-5

Ranalli, J. (2018). Automated written corrective feedback: How well can students make use of it? Computer Assisted Language Learning, 31(7), 653–674. https://doi.org/10.1080/09588221.2018.1428994

Rathel, J. M., Drasgow, E., Brown, W. H., & Marshall, K. J. (2013). Increasing induction-level teachers’ positive-to-negative communication ratio and use of behavior-specific praise through e-mailed performance feedback and its effect on students’ task engagement. Journal of Positive Behavior Interventions, 16(4), 219–233. https://doi.org/10.1177/1098300713492856

Roscoe, R. D., Wilson, J., Johnson, A. C., & Mayra, C. R. (2017). Presentation, expectations, and experience: Sources of student perceptions of automated writing evaluation. Computers in Human Behavior, 70, 207–221. https://doi.org/10.1016/j.chb.2016.12.076

Song, C., & Song, Y. (2023). Enhancing academic writing skills and motivation: assessing the efficacy of ChatGPT in AI-assisted language learning for EFL students. Frontiers in Psychology, 14. https://doi.org/10.3389/fpsyg.2023.1260843

Steiss, T., Tate, T., Graham, S., Cruz, J., Hebert, M., Wang, J., Moon, Y., Tseng, W., Warschauer, M., & Olson, C. B. (2024). Comparing the quality of human and ChatGPT feedback of students’ writing. Learning and Instruction, 91, 101894. https://doi.org/10.1016/j.learninstruc.2024.101894

Stern, L. A., & Solomon, A. (2006). Effective faculty feedback: The road less traveled. Assessing Writing, 11(1), 22-41. https://doi.org/10.1016/j.asw.2005.12.001

Stevenson, M., & Phakiti, A. (2014). The effects of computer-generated feedback on the quality of writing. Assessing Writing, 19, 51–65. https://doi.org/10.1016/j.asw.2013.11.007

Storch, N., & Wigglesworth, G. (2010). Learners’ processing, uptake, and retention of corrective feedback on writing: Case studies. Studies in Second Language Acquisition, 32(2), 303–334. https://doi.org/10.1017/S0272263109990532

Su, Y., Lin, Y., & Lai, C. (2023). Collaborating with ChatGPT in argumentative writing classrooms. Assessing Writing, 57, 100752. https://doi.org/10.1016/j.asw.2023.100752

Sullivan, M., Kelly, A., & McLaughlan, P. (2023). ChatGPT in higher education: Considerations for academic integrity and student learning. Journal of Applied Learning and Teaching, 6(1), 31–40. https://doi.org/10.37074/jalt.2023.6.1.17

Teng, M. F. (2024a). “ChatGPT is the companion, not enemies”: EFL learners’ perceptions and experiences in using ChatGPT for feedback in writing. Computers and Education: Artificial Intelligence, 7, 100270. https://doi.org/10.1016/j.caeai.2024.100270

Teng, M. F. (2024b). A systematic review of ChatGPT for English as a Foreign Language writing: Opportunities, challenges, and recommendations. International Journal of TESOL Studies, 6(3), 36-57. https://doi.org/10.58304/ijts.20240304

Truax, M. L. (2018). The impact of teacher language and growth mindset feedback on writing motivation. Literacy Research and Instruction, 57(2), 135–157. https://doi.org/10.1080/19388071.2017.1340529

Ulla, M. B., Perales, W. F., & Busbus, S. O. (2023). To generate or stop generating response: Exploring EFL teachers’ perspectives on ChatGPT in English language teaching in Thailand. Learning: Research and Practice, 9(2), 168–182. https://doi.org/10.1080/23735082.2023.2257252

Walter, Y. (2024). Embracing the future of artificial intelligence in the classroom: The relevance of AI literacy, prompt engineering, and critical thinking in modern education. International Journal of Educational Technology in Higher Education, 21(1), 15. https://doi.org/10.1186/s41239-024-00448-3

Wang, L., Chen, X., Wang, C., Xu, L., Shadiev, R., & Li, Y. (2024). ChatGPT’s capabilities in providing feedback on undergraduate students’ argumentation: A case study. Thinking Skills and Creativity, 101440. https://doi.org/10.1016/j.tsc.2023.101440

Wang, M., & Guo, W. (2023). The potential impact of ChatGPT on education: Using history as a rearview mirror. ECNU Review of Education, 20965311231189826. https://doi.org/10.1177/20965311231189826

Wang, Y. J., Shang, H. F., & Briody, P. (2013). Exploring the impact of using automated writing evaluation in English as a foreign language university students’ writing. Computer Assisted Language Learning, 26(3), 234–257. https://doi.org/10.1080/09588221.2012.655300

Wilson, J., & Roscoe, R. D. (2020). Automated writing evaluation and feedback: Multiple metrics of efficacy. Journal of Educational Computing Research, 58(1), 87-125. https://doi.org/10.1177/0735633119830764

Yang, L., & Li, R. (2024). ChatGPT for L2 learning: Current status and implications. System, 124, 103351. https://doi.org/10.1016/j.system.2024.103351

Yang, S. H. (2011). Exploring the effectiveness of using peer evaluation and teacher feedback in college students’ writing. Asia-Pacific Education Researcher, 20(1), 144–150. https://ejournals.ph/article.php?id=4096

Yin, R. (2014). Case study research: Design and methods. SAGE.

Yu, S., & Hu, G. (2017). Understanding university students’ peer feedback practices in EFL writing: Insights from a case study. Assessing Writing, 33, 25–35. https://doi.org/10.1016/j.asw.2017.03.004

Zou, S., Guo, K., Wang, J., & Liu, Y. (2025). Investigating students’ uptake of teacher- and ChatGPT-generated feedback in EFL writing: a comparison study. Computer Assisted Language Learning. https://doi.org/10.1080/09588221.2024.2447279

Copyright of articles rests with the authors. Please cite TESL-EJ appropriately.
Editor’s Note: The HTML version contains no page numbers. Please use the PDF version of this article for citations.

© 1994–2026 TESL-EJ, ISSN 1072-4303