The Electronic Journal for English as a Second Language

Study Intonation: A mobile-assisted pronunciation training application

November 2021 – Volume 25, Number 3

Title Study Intonation
Website http://studyintonation.org/
Type of Product Mobile-assisted pronunciation training application
Operating System Android
Hardware Requirements An Android device, internet connection
Price Free and open source

Technological devices such as smartphones, tablets, wearables, and laptops now make it possible for learners to study anywhere and at any time, providing what educational specialists call ‘ubiquitous’ learning (Yang, 2006). Among the approaches built on these devices, mobile-assisted language learning (MALL) is a rapidly growing field, especially when it comes to mobile-assisted pronunciation training (MAPT). In fact, MALL, and by extension MAPT, is expected to have an enormous impact on second language learning in the coming years, as mobile devices are projected to keep growing in accessibility, affordability, and connectivity, accelerating their integration into mainstream education (Crompton, Olszewski, & Bielefeldt, 2016).

MAPT has also been found to be effective for intonation learning due to its real-time, audio-visual capabilities and its ability to provide individualized, instantaneous feedback in a way that human instructors cannot. Research reporting positive MAPT outcomes includes teaching ESL to Arabic-speaking migrants in Sweden (Bradley et al., 2017), perceptual training for Mandarin tones (Kang et al., 2018), and educating prospective English language teachers on dialect awareness (Bozoglan & Gok, 2017). Nevertheless, MAPT is a relatively new field and comes with many limitations; most notably, it has yet to bridge the divide between pedagogical practice and technology (Kaiser, 2018). As a result, existing MAPT applications are either too technical or cumbersome to use or do not adhere to sound pronunciation pedagogy. However, linguists’ and developers’ understanding of MAPT continues to evolve as they build novel applications to meet the needs of L2 learners.

This review focuses on a MAPT application, Study Intonation, and explores the possibilities this app brings to L2 intonation learning. Developed in 2017, Study Intonation is an Android-based mobile application that provides users with an intuitive and user-friendly environment for practicing the intonation of a language through pitch visualization. A courseware development kit (CDK) has also been created to help instructors teach L2 intonation alongside the application, but this review focuses primarily on the application from the student’s perspective. To ensure effective pronunciation pedagogy, the goals of Study Intonation are to: (1) address technological restrictions by visualizing pitch in a way that is helpful to learners, (2) take into account learners’ different learning styles, and (3) incorporate sound CAPT principles as outlined by Chun (2013):

  • to provide the learners with visualization of their intonation patterns and specific contrastive feedback;
  • to provide the learners with authentic extensive speech and cultural context to contribute to their perceptual skills;
  • to facilitate, record, and analyze interactions between speakers;
  • to be operated both as a teaching and a research tool.

Description

Users must first download the APK file from Study Intonation’s website. On the sign-in page, users can choose to proceed with or without creating an account; however, they can only access content from an instructor if they are logged into an account. Once in the app, users are taken to a homepage where all their courses are listed on course cards. A search bar at the top allows users to quickly search through their available courses. Without logging into an account, the application comes pre-loaded with only one demo course, “Course Study Intonation.” Course cards contain a brief description of the course along with a “Go to course” button (see Figure 1). Tapping the “Go to course” button takes users to the lessons page.

Figure 1. Study Intonation’s Home Page

On the lessons page, every lesson within that particular course is listed on lesson cards. For example, the pre-loaded course comes with four lessons: everyday discourse, invitations (make-accept-decline), show your emotions, and speaking in an academic context. Each lesson card contains a brief description of the lesson, basic instructions on what to pay attention to, and an estimated completion time (see Figure 2).

Figure 2. Lesson Cards Containing Descriptions, Instructions, and Estimated Completion Time

Once users tap on a lesson card, they are taken to the training page, where they can begin training with the sentences in that set. A friendly message pops up at the top of every page as supplementary guidance while learners work with a sentence. In keeping with a minimalist UI approach that prioritizes content organization and a “clean” look to aid usability, the only other elements on the page are a grid chart, the text of the sentence, and three buttons (a play button, a microphone button, and a broom button) at the bottom of the page. The play button plays the native speaker’s recording, the microphone button starts and stops the user’s microphone, and the broom button resets the chart by deleting all existing drawings.

Figure 3. Real Time Pitch Visualization for the Native Speaker Sample

The main feature of Study Intonation is its pitch visualization, with the learner’s and the model’s pitch contours overlaid as contrastive feedback. When a user taps the play button, the app plays the model speaker’s audio while drawing the model speaker’s pitch contour in real time (see Figure 3). Each word in the sentence also lights up in pink (karaoke-style) as the model speaker says it. The learner can replay the audio and pitch visualization as many times as they need.

Once the user is ready to practice, they can tap the microphone button to start a recording. The user then says the sentence out loud and taps the microphone button again to stop the recording. While the user’s pitch is not drawn in real time, the visualization does appear the moment the user hits the stop button, overlaid on the model speaker’s pitch as contrastive feedback. The learner can practice as many times as they need, and each new recording appears in a new color for visual distinction (see Figure 4a).

Figure 4a. Learner’s Pitch Visualization in Different Colors

Additionally, an intonation accuracy score pops up briefly after each recording. This score is calculated using a pitch estimation algorithm with Pearson coefficients (r) and Mean Squared Errors (MSE) as metrics to evaluate how close a learner’s pitch direction and pitch height are to the model’s (Lezhenin et al., 2017; see Figure 4b). These scores are reported as raw data. When users are ready to move on, they can swipe to the next sentence, and once they are done with the sentences, they can move on to the next lesson within the course.

Figure 4b. Raw Data of a Learner’s Accuracy Score (Pearson r and MSE)
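
The two metrics the app reports can be sketched in a few lines. The sketch below is an illustrative reconstruction, not Study Intonation’s actual code: the function name `intonation_score` is hypothetical, and the app’s pitch extraction and time-alignment steps are omitted, so the two contours are simply assumed to be equal-length lists of pitch values in Hz.

```python
from math import sqrt

def intonation_score(model_pitch, learner_pitch):
    """Compare two equal-length pitch contours.

    Returns Pearson r (how closely the learner's pitch *direction*
    tracks the model's) and MSE (how far the learner's pitch *height*
    sits from the model's) -- the two raw metrics the app reports.
    """
    n = len(model_pitch)
    mean_m = sum(model_pitch) / n
    mean_l = sum(learner_pitch) / n
    cov = sum((m - mean_m) * (l - mean_l)
              for m, l in zip(model_pitch, learner_pitch))
    var_m = sum((m - mean_m) ** 2 for m in model_pitch)
    var_l = sum((l - mean_l) ** 2 for l in learner_pitch)
    r = cov / sqrt(var_m * var_l)          # 1.0 = identical contour shape
    mse = sum((m - l) ** 2
              for m, l in zip(model_pitch, learner_pitch)) / n
    return r, mse

# A learner who falls where the model falls scores a high r even with a
# lower-pitched voice; the constant height offset shows up only in MSE.
model   = [220, 215, 205, 190, 170, 150]   # falling contour (Hz)
learner = [200, 196, 188, 175, 158, 140]   # similar fall, lower voice
r, mse = intonation_score(model, learner)
```

Because r is scale-invariant, this style of scoring does not penalize a learner for having a naturally higher or lower voice than the model, which is consistent with the app’s language-independent design.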

Evaluation

This section aims to evaluate the strengths and limitations of the application based on its intended goals as outlined in the introduction, categorized here as: (1) technological features, (2) learner fit, and (3) pedagogical features.

Technological features

Existing pitch detection algorithms come with many restrictions that limit their usefulness in a classroom setting. Common developer struggles are that pitch arrays are oversensitive and returned as a series of dots, resulting in unreadable visuals, and that it is exceptionally difficult to normalize input data for a speaker’s gender, timbre, and speaking rate without a significant trade-off in accuracy (Lezhenin et al., 2017). In this regard, one of the biggest strengths of Study Intonation is its pitch detection algorithm. The application not only handles silences and noise well but also does an exceptional job of generating learner-friendly pitch curves despite the variance in pitch and duration of a speaker’s voice. Furthermore, Study Intonation uses a pitch estimation algorithm with Pearson coefficients (r) and Mean Squared Errors (MSE) as metrics to evaluate how close a learner’s pitch direction and pitch height are to the model’s, opting to give users contrastive feedback rather than the prescriptive “correct-or-incorrect” approach common to other applications. This allows users to make their own interpretation of their performance, and it allows the algorithm to be used with multiple languages.
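
The published papers do not detail how the app turns a dotted, outlier-prone pitch track into a smooth curve. One common first step, shown here purely as an illustration and not as Study Intonation’s actual algorithm, is to drop unvoiced frames and apply a median filter over the voiced ones, which suppresses single-frame outliers such as octave jumps:

```python
def smooth_pitch(raw_track, window=5):
    """Turn a raw per-frame pitch track into a drawable curve.

    `raw_track` is a list of per-frame pitch estimates (Hz), with 0.0
    marking unvoiced/silent frames. Unvoiced frames are dropped and a
    median filter suppresses isolated outliers (e.g., octave jumps)
    that would otherwise render as stray dots. Illustrative only --
    not Study Intonation's published algorithm.
    """
    voiced = [(i, f) for i, f in enumerate(raw_track) if f > 0]
    half = window // 2
    curve = []
    for k, (frame, _) in enumerate(voiced):
        neighborhood = sorted(f for _, f in voiced[max(0, k - half):k + half + 1])
        curve.append((frame, neighborhood[len(neighborhood) // 2]))  # (frame, Hz)
    return curve

# One octave-jump outlier (600 Hz) and two silent frames:
raw = [0.0, 200.0, 201.0, 600.0, 202.0, 0.0, 203.0]
curve = smooth_pitch(raw)
```

After filtering, every point in `curve` stays near the true 200 Hz region, and the silent frames produce no points at all, so the drawn contour is continuous and readable.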

Pedagogical features

Another strength of this application lies in its real-time pitch drawing for the model speaker audio clips. This real-time drawing, along with the “karaoke-style” highlighting of the corresponding words, adheres to the noticing hypothesis (Schmidt, 1990), in which attention is called to the most important features of a target. Content words are also bolded to further emphasize what learners should pay attention to. However, one limitation of the app is that the user’s pitch is visualized as one long curve, without text segmentation and without audio playback, which makes it hard for the user to identify problem segments when trying to self-correct. With regard to the developers’ stated goal of creating a display that provides specific, contrastive feedback, Study Intonation’s display is contrastive but not specific. This can result in guesswork when trying to learn a new pitch pattern, which in turn can be detrimental to learning (Rogerson-Revell, 2021).

Another issue with Study Intonation is its use of raw data (Pearson r and MSE scores) as part of its contrastive feedback. This is unhelpful to learners for several reasons. First, most users are not statisticians, and these scores, unaccompanied by any explanation, will mean very little to the average learner. Second, even if a learner did understand the technical implications of the scores (e.g., a higher Pearson r means a closer approximation to the model speaker), the judgement of the performance is still left to the learner’s own interpretation. As L1 influence can affect the way learners perceive interlanguage productions (Flege, 1999), much of SLA research agrees that learners need feedback that does not rely on the student’s own perceptions. Finally, these scores do not tell students why they received them, nor do they offer a solution for correcting the presumed errors.

Learner Fit and Usability

Learner fit is typically defined as the suitability of an application to an individual learner’s needs and interests (Cotos & Huffman, 2013), and one of the developers’ main goals is to take into account the diversity in learning styles (Lezhenin et al., 2017). As such, audio, visual, and verbal components are presented as practice options when working with the intonation of a sentence. Students can listen to the model as many times as needed, watch real-time pitch drawings of the model sample, and use the training page to practice verbal output. In terms of usability, Study Intonation requires minimal onboarding or tutorials because it is fairly intuitive to navigate. The application uses a minimalist design and excludes waveforms, spectrograms, and other complicated graphical information (common in other pitch detection applications) so as not to overload or confuse the user.

Conclusion

Study Intonation is promising and consistent in its goal of serving as a MAPT tool for classrooms and learning purposes. Its strengths include its powerful pitch detection algorithm and its non-prescriptive pitch estimation scoring system, allowing it to be used by any learner and applied to any foreign intonation learning context. It is also free to download, open source to allow for content creation, and intuitive to use. Every application, however, comes with drawbacks. The major limitations of Study Intonation are that (1) it does not provide feedback specific enough for users to pinpoint problem segments and self-correct effectively, (2) its pedagogical model relies on a direct-comparison method that presupposes one “correct” pronunciation, and (3) its technical scores are neither meaningful to learners nor constructively corrective. As such, Study Intonation would be better used as a supplementary tool rather than a standalone intonation learning application. For example, the app would serve well as an introductory aid to intonation learning, to how content words interact with stress and pitch prominence, and to understanding the fall-rise movements of one’s voice. Overall, one suggestion for future iterations of this application is to report the numbers in a more learner-friendly format. For example, instead of reporting “Pearson r = 0.44,” the app could say: “Pitch direction falls towards the end. Consider a rising pitch instead.” This type of feedback would be more beneficial to the learner.
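
The suggested translation from raw metrics to learner-facing messages could be as simple as a thresholded lookup. The function name, thresholds, and wording below are hypothetical, chosen only to illustrate the idea; they are not features of Study Intonation:

```python
def describe_score(r, mse, mse_close=500.0):
    """Translate the app's raw metrics into plain-language feedback.

    `r` is the Pearson correlation between the learner's and the
    model's pitch contours; `mse` measures overall height distance.
    The thresholds are illustrative, not empirically calibrated.
    """
    if r >= 0.7:
        direction = "Your pitch rises and falls where the model's does."
    elif r <= -0.3:
        direction = ("Your pitch moves opposite to the model's: "
                     "try falling where the model falls and rising "
                     "where it rises.")
    else:
        direction = "Your pitch movement only partly matches the model's."
    height = ("Your overall pitch level is close to the model's."
              if mse < mse_close else
              "Your overall pitch level sits far from the model's.")
    return direction + " " + height
```

For instance, the “Pearson r = 0.44” case mentioned above would map to the middle branch, telling the learner that the contour only partly matches, which is actionable in a way the bare coefficient is not.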

References

Bozoglan, H., & Gok, D. (2017). Effect of mobile-assisted dialect awareness training on the dialect attitudes of prospective English language teachers. Journal of Multilingual and Multicultural Development, 38(9), 772-787. https://doi.org/10.1080/01434632.2016.1260572

Bradley, L., Lindström, N. B., & Hashemi, S. S. (2017). Integration and language learning of newly arrived migrants using mobile technology. Journal of Interactive Media in Education, 1(3), 1-9.

Chun, D. M. (2013). Computer-assisted pronunciation teaching. In C. Chapelle (Ed.), The encyclopedia of applied linguistics. Blackwell.

Cotos, E., & Huffman, S. (2013). Learner fit in scaling up automated writing evaluation. International Journal of Computer-Assisted Language Learning and Teaching (IJCALLT), 3(3), 77-98.

Crompton, H., Olszewski, B., & Bielefeldt, T. (2016). The mobile learning training needs of educators in technology enabled environments. Professional Development in Education, 42(3), 482-501. https://doi.org/10.1080/19415257.2014.1001033

Flege, J. E. (1999). Age of learning and second language speech. In D. Birdsong (Ed.), Second Language Acquisition and the Critical Period Hypothesis (pp. 101-131). Erlbaum.

Kaiser, D. (2018). Mobile-assisted pronunciation training: The iPhone pronunciation app project. IATEFL Pronunciation Special Interest Group Journal, 58, 38-52.

Kang, M., Ito, A., & Hatano, H. (2018). Development and evaluation of a Chinese tone perceptual training mobile app. In 2018 IEEE Conference on e-Learning, e-Management and e-Services (IC3e) (pp. 40-45). IEEE.

Lezhenin, Y., Lamtev, A., Dyachkov, V., Boitsova, E., Vylegzhanina, K., & Bogach, N. (2017). Study intonation: Mobile environment for prosody teaching. In 2017 3rd IEEE International Conference on Cybernetics (CYBCONF) (pp. 1-2). IEEE.

Rogerson-Revell, P. M. (2021). Computer-assisted pronunciation training (CAPT): Current issues and future directions. RELC Journal, 52(1), 189-205.

Schmidt, R. W. (1990). The role of consciousness in second language learning. Applied Linguistics, 11(2), 129-158.

Yang, S. J. H. (2006). Context aware ubiquitous learning environments for peer to peer collaborative learning. Educational Technology and Society, 9(1), 188-201.

About the Reviewer

April Tan is a second year PhD student co-majoring in Applied Linguistics and Technology and Human-Computer Interaction at Iowa State University. Her research interests include computer assisted pronunciation teaching, game-based learning, and UX/UI design. Currently, she is working on developing an intonation learning application that extracts a learner’s pitch contour in real-time. <apriltan@iastate.edu>

To cite this article

Tan, A. (2021). Study Intonation: A mobile-assisted pronunciation training application. Teaching English as a Second Language Electronic Journal (TESL-EJ), 25(3). https://tesl-ej.org/pdf/ej99/m3.pdf

© Copyright rests with authors. Please cite TESL-EJ appropriately.

Editor’s Note: The HTML version contains no page numbers. Please use the PDF version of this article for citations.

© 1994–2025 TESL-EJ, ISSN 1072-4303
Copyright of articles rests with the authors.