February 2026 – Volume 29, Number 4
https://doi.org/10.55593/ej.29116m1
| Chatbot/App | Developer | Model reviewed |
| Grok | xAI, founded by Elon Musk | Grok 1 & 2 |
| Copilot | Microsoft, built on OpenAI’s Codex and other models | OpenAI Codex |
| ChatGPT | OpenAI | GPT-3.5 |
Twenty years ago, if we had seen someone looking at their phone and talking to it, laughing with it, and learning with it, most of us probably would have laughed. Nowadays, however, owing to the advancement of technology, specifically the advent of Artificial Intelligence (AI) and its spread across various disciplines, we no longer hold the same mocking attitude. Teaching English as a Foreign or Second Language (TEFL/TESL) is a discipline that has been profoundly affected by AI-powered apps and chatbots over the last five years. Within Computer-Assisted Language Learning (CALL), AI has the potential to do wonders in the field of additional language learning, enhancing both quality and efficiency in all areas of learning (Rukiati et al., 2023; Rebolledo Font de la Vall & González Araya, 2023). One of the skills that could be boosted via AI is speaking.
One of the age-old problems that teachers observe among language learners trying to improve their speaking skills is the lack of an interlocutor to practice with. Thanks to AI and its provision of personalized language content and human-like features such as voice calls, language learners no longer need to grapple with having no one to practice speaking with (Fathi et al., 2024; Nguyen & Pham, 2024). In that light, the present review focuses on three of the most frequently downloaded chatbots on Google Play or the App Store: Grok, ChatGPT, and Copilot, and their potential impact on language learners’ speaking skills. These apps were selected due to their high ratings on Google Play, their number of downloads, and some of their unique characteristics, namely “Camera Vision” and the “Transcription feature,” which set them apart from counterparts such as DeepSeek and Monica AI. It is important to note that these apps are not specifically designed for language learning. Rather, they serve practically any discipline because they give users a wide range of opportunities to achieve their goals, and additional language learning is one area that could benefit from such chatbots.
In this comparative review, the merits and demerits of these three AI-powered apps are examined to identify where they converge and diverge in providing opportunities for the enhancement of speaking skills. Each of the apps is analyzed and appraised separately below.
Grok
This app is available on Google Play (rated 4.8) and the App Store (rated 4.9). When it is downloaded, the user is asked to enter their email address in order for the app to fully launch (without signing in, the app places time limitations on voice calls and restricts other options such as uploading documents). The app has both free and premium tiers (the latter called Super Grok). The specific option that can be used to improve language learners’ speaking is the voice call option. This option provides users with thirteen different AI tones, including assistant (straightforward, helpful, and witty responses without extremes), therapist (probing psychologist role), storyteller (immersive narrator; crafts dynamic tales with flair), meditation (soothing guide; leads breathwork or mindfulness with rhythmic, calming tones), Grok Doc (witty health advisor; offers medical insights with humor, not a substitute for real care), and romantic (for intimate, affectionate role-play), as shown in Figure 1. This feature allows students to tailor their call to the exact goal they have in mind and the mood they are in, so that they can speak and listen enthusiastically. The second notable feature of Grok’s voice call is the presence of emotion in its tone. It can readily laugh with the user and even cry if requested, and it may be counted on for emotional support, which is a crucial aspect of additional language learning theory and practice (MacIntyre & Gregersen, 2012; Richards, 2020; Rogers, 1951). The third advantage of Grok is the “Grok Vision” feature, which learners can use by activating their camera and allowing the AI to see what they see in real time. This permits the user to seek the AI’s help in describing situations that cannot be easily described in words, and it provides a sense of companionship or affinity through observing things through the same lens.
As with every platform, Grok has its own deficiencies. To begin with, the voice call option is not equally effective across all levels of proficiency owing to the lack of speed adjustment, which makes it difficult for beginner and perhaps intermediate language learners to fully exploit. Secondly, Grok’s answers have no limitations on, or sensitivity to, explicit language, which can be problematic for certain audiences, particularly young learners (i.e., they receive an answer to whatever question they ask, even if it involves inappropriate content) and people with religious orientations. It should be noted, however, that the “Grok Vision” feature does limit its descriptions of explicit content by smoothly steering the conversation in a direction that does not require strong language. For instance, suppose a user takes a photo of a detailed anatomical diagram from a medical textbook, clearly showing the external genitalia, and asks Grok Vision to “Describe this image.” To avoid describing these explicit body parts directly, Grok redirects the specific description request toward a more general, clinical, or functional context.

Figure 1. Grok, tone adjustment, and Grok Vision features
Copilot
This app (rated 4.7 on Google Play and 4.8 on the App Store) can be accessed and launched the same way as Grok and also has free and premium tiers that learners can use at will. Copilot is equipped with a voice call option, which has only one fixed tone that is used throughout the conversation. There is, of course, a feature to change the voice of the AI, but that is a far cry from adjusting the tone. Notably, Copilot is able to detect strong words and content; rather than responding to them, it encourages the user to switch topics. Additionally, it offers a Speed Adjustment feature (see Figure 2) that partially redresses the balance for lower-proficiency language learners. These two features may widen Copilot’s audience because families with younger learners or religious attitudes can safely trust the app and be assured that their young learners are in good company. Camera Vision (see Figure 3) is another feature, allowing learners to activate their camera and receive assistance based on what the AI sees. Copilot also displays emotional responsiveness during conversations, though it may occasionally resist certain emotional cues. The absence of multiple tones during interaction remains the main limitation of the app.

Figure 2. Copilot, Speed Adjustment feature
Figure 3. Copilot, Camera Vision feature
ChatGPT
Perhaps the first AI chatbot to usher in a new era of technological advancement, ChatGPT has far more users than Grok and Copilot on Google Play: 29 million users versus 1.3 million and 1.7 million for Grok and Copilot, respectively. The app (rated 4.7 on Google Play and 4.9 on the App Store) has practically the same interface as the other two apps and is launched and activated similarly. It offers the voice call option and the ability to engage emotionally in conversations, as well as a “transcription feature” available during a conversation, which gives users a chance to comprehend and analyze the language more effectively (see Figure 4). This feature could be treated as a way to narrow the gap between different levels of proficiency, because even intermediate language learners can take part in conversations even though their listening abilities may not be fully developed. Notwithstanding this practical feature, ChatGPT offers neither a camera vision feature nor different tones while talking to the user, which counts as a disadvantage as far as boosting speaking skills is concerned. It should be mentioned that ChatGPT is also sensitive to offensive and explicit language during a conversation.

Figure 4. ChatGPT, transcription feature
Although each tool offers features that support speaking practice, the tools differ in several respects. As noted earlier, Grok provides the largest number of tone options and integrates an emotionally expressive interaction style and camera-assisted “Grok Vision.” In comparison, Copilot offers Speed Adjustment and sensitivity to explicit language, as well as its own Camera Vision. ChatGPT, on the other hand, distinguishes itself through its transcription feature, which helps learners analyze spoken language more effectively. While these strengths are evident, each tool also reveals critical gaps that limit its pedagogical potential. For instance, Grok’s lack of speed adjustment may undermine accessibility for lower-proficiency learners despite its emotional richness. Moreover, its lack of sensitivity to explicit language in text messaging could pose a threat to users if it is misused. Copilot’s fixed tone and occasional emotional resistance, for their part, may restrict the depth of communicative engagement it claims to support. Finally, ChatGPT’s absence of vision and tonal flexibility creates a mismatch between its conversational power and its capacity to simulate real-world, multimodal communication. These distinctions demonstrate that each application has different strengths that help develop speaking skills, yet none provides a fully comprehensive environment for speaking development on its own. For ease of comparison, the information above is summarized in Table 1 below.
Table 1. Summary of the merits and demerits of each chatbot/app
| Apps | Voice call | Different tones | Emotional engagement | Camera vision | Sensitivity to explicit language | Transcription feature |
| Grok | ✔ | ✔ | ✔ | ✔ | ✖ | ✖ |
| Copilot | ✔ | ✖ | ✔ | ✔ | ✔ | ✖ |
| ChatGPT | ✔ | ✖ | ✔ | ✖ | ✔ | ✔ |
Conclusion
In conclusion, it is not easy to assert that one app is superior to another (irrespective of their download rates), given that each has its own merits and demerits as demonstrated above; rather, we would suggest that all three apps be used in conjunction with each other and as supplements. We believe that if these apps incorporated a feature to adjust the language (e.g., vocabulary, grammar, and rate of delivery) to different proficiency levels, they would be embraced more than ever. That said, we should be cognizant of the possible threats of overdependence on AI. This review must not be construed as presenting AI as a replacement for face-to-face communication in real life; if interpreted so, it could result in students becoming detached from real life and attached to the virtual space (Klimova & Hua Chen, 2024). Language learners should be regularly reminded that AI is to be used as an assistant in tandem with real-life practice for maximal efficiency and language achievement.
About the Authors
Masoud Taghipour is a Ph.D. candidate in TEFL at Allameh Tabataba’i University, Tehran, Iran. His research focuses on English language teaching, with a particular emphasis on the emotional dynamics of teachers and students, language skills development, CALL, and the impact of language learning strategies on educational outcomes. <masoudtaghipour78@gmail.com> ORCID ID: 0009-0009-1931-0256
Goudarz Alibakhshi is an Associate Professor at Allameh Tabataba’i University, specializing in Applied Linguistics and English Language Education. With a strong background in language teaching methodology, research methodology, and language assessment, he has published extensively in national and international journals. His research interests include materials development, language assessment, positive psychology, and the integration of artificial intelligence into language education. He is also actively involved in supervising graduate theses and has contributed significantly to curriculum development and educational reform projects in Iran. ORCID ID: 0000-0003-0916-8662
To Cite this Review
Taghipour, M., & Alibakhshi, G. (2026). Talking tech: A review of Grok, Copilot, and ChatGPT head-to-head in enhancing second language speaking. Teaching English as a Second Language Electronic Journal (TESL-EJ), 29(4). https://doi.org/10.55593/ej.29116m1
References
Rukiati, E., Wicaksono, J. A., Taufan, G. T., & Suharsono, D. D. (2023). AI on Learning English: Application, Benefit, and Threat. Journal of Language, Communication and Tourism, 1(2), 32–40. https://doi.org/10.25047/jlct.v1i2.3967
Fathi, J., Rahimi, M., & Derakhshan, A. (2024). Improving EFL learners’ speaking skills and willingness to communicate via artificial intelligence-mediated interactions. System, 121, Article 103254. https://doi.org/10.1016/j.system.2024.103254
Klimova, B., & Hua Chen, J. (2024). The Impact of AI on Enhancing Students’ Intercultural Communication Competence at the University Level: A Review Study. Language Teaching Research Quarterly, 43, 102–120. https://doi.org/10.32038/ltrq.2024.43.06
MacIntyre, P. D., & Gregersen, T. (2012). Emotions that facilitate language learning: The positive-broadening power of the imagination. Studies in Second Language Learning and Teaching, 2(2), 193–213. https://doi.org/10.14746/ssllt.2012.2.2.4
Nguyen, N. H. V., & Pham, V. P. H. (2024). AI Chatbots for Language Practices. International Journal of AI in Language Education, 1(1), 56–67. https://doi.org/10.54855/ijaile.24115
Rebolledo Font de la Vall, R., & González Araya, F. (2023). Exploring the Benefits and Challenges of AI-Language Learning Tools. International Journal of Social Sciences and Humanities Invention, 10(01), 7569–7576. https://doi.org/10.18535/ijsshi/v10i01.02
Richards, J. C. (2020). Exploring Emotions in Language Teaching. RELC Journal, 53(1), 003368822092753. https://doi.org/10.1177/0033688220927531
Rogers, C. (1951). Client-centered therapy: Its current practice, implications, and theory. Houghton Mifflin.
© Copyright rests with authors. Please cite TESL-EJ appropriately. Editor’s Note: The HTML version contains no page numbers. Please use the PDF version of this article for citations.

