November 2021 – Volume 25, Number 3
Title | Study Intonation |
Website | http://studyintonation.org/ |
Type of Product | Mobile-assisted pronunciation training application |
Operating System | Android |
Hardware Requirements | An Android device, internet connection |
Price | Free and open source |
Technological devices such as smartphones, tablets, wearables, and laptops now make it possible for learners to study anywhere and at any time, which provides what educational specialists call ‘ubiquitous’ learning (Yang, 2006). Within this landscape, mobile-assisted language learning (MALL) is a rapidly growing field, especially when it comes to mobile-assisted pronunciation training (MAPT). In fact, MALL, and by extension MAPT, is expected to have an enormous impact on second language learning in the coming years because mobile devices are projected to keep growing in accessibility, affordability, and connectivity, and thus to become further integrated into mainstream education (Crompton, Olszewski, & Bielefeldt, 2016).
MAPT has also been found to be effective for intonation learning due to its real-time, audio-visual capabilities and its ability to provide individualized, instantaneous feedback in a way that human instructors cannot. Research reporting positive MAPT outcomes includes studies on teaching ESL to Arabic-speaking migrants in Sweden (Bradley et al., 2017), on perceptual training for Mandarin tones (Kang et al., 2018), and on educating prospective English language teachers in dialect awareness (Bozoglan & Gok, 2017). Nevertheless, MAPT is a relatively new field and comes with many limitations, chief among them that it has yet to bridge the divide between pedagogical practices and technology (Kaiser, 2018). As a result, existing MAPT applications are either too technical or cumbersome to use, or they do not adhere to sound pronunciation pedagogy. However, linguists’ and developers’ understanding of MAPT continues to evolve as they develop novel applications to meet the needs of L2 learners.
This review focuses on a MAPT application, Study Intonation, and explores the possibilities this app brings to L2 intonation learning. Developed in 2017, Study Intonation is an Android-based mobile application that provides users with an intuitive and user-friendly environment to practice the intonation of a language through pitch visualization. A companion courseware development kit (CDK) has also been created to aid instructors in L2 intonation teaching, but this review will primarily focus on the application from the student’s perspective. To ensure effective pronunciation pedagogy, Study Intonation has three goals: (1) to address technological restrictions by visualizing pitch in a way that is helpful to learners, (2) to take into account the different learning styles of learners, and (3) to incorporate sound computer-assisted pronunciation training (CAPT) principles as outlined by Chun (2013):
- to provide the learners with visualization of their intonation patterns and specific contrastive feedback;
- to provide the learners with authentic extensive speech and cultural context to contribute to their perceptual skills;
- to facilitate, record, and analyze interactions between speakers;
- to be operated both as a teaching and a research tool.
Description
Users must first download the APK file from Study Intonation’s website. On the sign-in page, users can choose to proceed with or without creating an account. However, users can only access content from an instructor if they are logged into an account. Once in the app, users are taken to a homepage where they can see all their courses listed on course cards. A search bar at the top allows users to quickly search through their available courses. For users who are not logged into an account, the application comes pre-loaded with only one demo course, “Course Study Intonation.” Course cards contain a brief description of the course along with a “Go to course” button (see Figure 1). Tapping on the “Go to course” button will take users to the lessons page.
On the lessons page, every lesson within that particular course is listed on lesson cards. As an example, the pre-loaded course comes with four lessons: everyday discourse, invitations (make-accept-decline), show your emotions, and speaking in an academic context. Each lesson card contains a brief description of that lesson, basic instructions on what to pay attention to, and an estimated completion time (see Figure 2).
Once users tap on a lesson card, they are taken to the training page where they can begin training with the sentences in that set. A personable message pops up at the top of every page to serve as supplementary guidance to learners as they work with a sentence. Taking a minimalist UI approach that prioritizes content organization and a “clean” look to aid in usability, the only other elements on the page are a grid chart, the text for that sentence, and three buttons (a play button, a microphone button, and a broom button) on the bottom of the page. The play button plays the native speaker’s recording, the microphone button starts and stops the user’s microphone, and the broom icon resets the chart by deleting all existing drawings on the chart.
The main feature of Study Intonation is its use of pitch visualizations, with the learner’s and the model’s pitch contours displayed together as contrastive feedback. When a user taps on the play button, the app plays the model speaker’s audio while drawing out the model speaker’s pitch contour in real-time (see Figure 3). Each word of the sentence also lights up in pink (karaoke-style) as the model speaker says it. The learner can replay the audio and pitch visualization as many times as they need.
Once the user is ready to practice, they can tap on the microphone button to start their recording. The user then says the sentence out loud and taps on the microphone button again to stop the recording. While the pitch drawing for the user is not in real-time, the pitch visualization does appear the moment the user hits the stop button. It also appears on top of the model speaker’s pitch as contrastive feedback. The learner can practice as many times as they need, and each new recording will appear in a new color as visual distinction (see Figure 4a).
An intonation accuracy score also pops up briefly after each recording. This score is calculated using a pitch estimation algorithm, with the Pearson correlation coefficient (r) and the mean squared error (MSE) as metrics to evaluate how close a learner’s pitch direction and pitch height are to the model’s (Lezhenin et al., 2017; see Figure 4b). These scores are reported as raw data. When a user is ready to move on, they can swipe to the next sentence, and once they are done with the sentences, they can move on to the next lesson within the course.
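To make the two metrics concrete, the following is a minimal Python sketch of how a Pearson r and an MSE could be computed from two pitch contours. It is not the app’s implementation: the function names, the linear resampling of both contours to a common length, and the example values are all assumptions made for illustration.

```python
import math


def resample(contour, n=100):
    """Linearly resample a pitch contour (Hz) to n evenly spaced points.

    Hypothetical alignment step: it lets recordings of different lengths
    be compared point by point. The app may align contours differently.
    """
    if len(contour) < 2 or n < 2:
        raise ValueError("need at least two points")
    last = len(contour) - 1
    out = []
    for i in range(n):
        pos = i * last / (n - 1)          # fractional index into the contour
        lo = int(math.floor(pos))
        hi = min(lo + 1, last)
        frac = pos - lo
        out.append(contour[lo] * (1 - frac) + contour[hi] * frac)
    return out


def pearson_r(x, y):
    """Pearson correlation: similarity of shape (rises and falls)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        raise ValueError("flat contour: correlation is undefined")
    return cov / (sx * sy)


def mse(x, y):
    """Mean squared error: average squared distance between the curves."""
    return sum((a - b) ** 2 for a, b in zip(x, y)) / len(x)


def intonation_scores(model_hz, learner_hz, n=100):
    """Return (r, mse) for a learner contour compared with a model contour."""
    model = resample(model_hz, n)
    learner = resample(learner_hz, n)
    return pearson_r(model, learner), mse(model, learner)


# Made-up contours: the learner follows the same rise-fall shape, 20 Hz higher.
r, err = intonation_scores([200, 220, 240, 220, 200], [220, 240, 260, 240, 220])
print(f"r = {r:.2f}, MSE = {err:.1f}")
```

In this made-up example the learner reproduces the model’s shape exactly, so r is approximately 1, while the constant 20 Hz offset yields an MSE of approximately 400. The two numbers therefore capture different things: r is insensitive to a constant difference in pitch height, whereas MSE is not.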
Evaluation
This section aims to evaluate the strengths and limitations of the application based on its intended goals as outlined in the introduction, categorized here as: (1) technological features, (2) pedagogical features, and (3) learner fit and usability.
Technological features
Existing pitch detection algorithms come with many restrictions that limit their usefulness in a classroom setting. Common developer struggles are that pitch arrays are oversensitive and returned as a series of dots, resulting in unreadable visuals, and that it is exceptionally difficult to normalize input data based on a speaker’s gender, timbre, and speaking rate without a significant trade-off in accuracy (Lezhenin et al., 2017). In this regard, one of the biggest strengths of Study Intonation is its pitch detection algorithm. The application not only handles silences and noise well, but it also does an exceptional job of generating learner-friendly pitch curves despite the variance in pitch and duration of a speaker’s voice. Furthermore, Study Intonation uses the Pearson correlation coefficient (r) and the mean squared error (MSE) as metrics to evaluate how close a learner’s pitch direction and pitch height are to the model’s, opting to give users contrastive feedback rather than the prescriptive “correct-or-incorrect” approach common to other applications. This allows users to make their own interpretation of their performance, and it also allows the algorithm to be used with multiple languages.
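The readability and normalization problems described above can be illustrated with a short Python sketch. It is an illustration under stated assumptions, not the algorithm of Lezhenin et al. (2017): it assumes that unvoiced frames arrive as 0 or None, bridges them by linear interpolation, expresses pitch in semitones relative to each speaker’s own median, and applies a simple moving average. All function names are hypothetical.

```python
import math
from statistics import median


def fill_unvoiced(f0_hz):
    """Bridge unvoiced frames (0 or None) by linear interpolation.

    Turns a "series of dots" into one continuous curve. Gaps at the start
    or end are filled with the nearest voiced value.
    """
    voiced = [i for i, v in enumerate(f0_hz) if v]
    if not voiced:
        raise ValueError("no voiced frames")
    filled = list(f0_hz)
    for i in range(voiced[0]):
        filled[i] = f0_hz[voiced[0]]
    for i in range(voiced[-1] + 1, len(f0_hz)):
        filled[i] = f0_hz[voiced[-1]]
    for a, b in zip(voiced, voiced[1:]):
        for i in range(a + 1, b):
            t = (i - a) / (b - a)
            filled[i] = f0_hz[a] * (1 - t) + f0_hz[b] * t
    return filled


def to_semitones(f0_hz):
    """Express pitch in semitones relative to the speaker's median pitch.

    Assumed normalization: it removes differences in overall pitch range
    between speakers, so low and high voices share one scale.
    """
    ref = median(f0_hz)
    return [12 * math.log2(v / ref) for v in f0_hz]


def smooth(values, window=3):
    """Centered moving average; the window shrinks at the edges."""
    half = window // 2
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - half): i + half + 1]
        out.append(sum(chunk) / len(chunk))
    return out


def drawable_contour(f0_hz, window=3):
    """Gap-filled, speaker-normalized, smoothed contour ready for plotting."""
    return smooth(to_semitones(fill_unvoiced(f0_hz)), window)


# Made-up frames: the same rise-fall pattern, spoken one octave apart.
low = drawable_contour([100, 0, 120, None, 100])
high = drawable_contour([200, 0, 240, None, 200])
print(all(abs(a - b) < 1e-9 for a, b in zip(low, high)))
```

Under these assumptions, the two made-up voices produce the same curve even though one is an octave higher. Normalizing for speaking rate would need an additional time-alignment step, which this sketch does not attempt.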
Pedagogical features
Another strength of this application lies in its real-time pitch drawing for the model speaker audio clips. This real-time drawing, along with the “karaoke-style” highlighting of the corresponding words, adheres to the noticing hypothesis (Schmidt, 1990), in which attention is called to the most important features of a target. Content words are also bolded to further emphasize what learners should pay attention to. However, a limitation of the app is that the user’s pitch is visualized as one long curve without text segmentation and without audio playback capabilities, which makes it hard for the user to identify problem segments when trying to self-correct. With regard to the developers’ stated goal of creating a display that provides specific, contrastive feedback, Study Intonation’s display is contrastive but not specific. This could result in guesswork when trying to learn a new pitch pattern, which in turn could be detrimental to learning (Rogerson-Revell, 2021).
Another issue with Study Intonation is its use of raw data (Pearson r and MSE scores) as part of its contrastive feedback. This is unhelpful to learners for several reasons. First, most users are not statisticians, and these scores, presented without an explanation, will mean very little to the average learner. Second, even if a learner did understand the technical implications of these scores (e.g., a higher Pearson r score means a closer approximation to the model speaker), the app leaves the judgement of the performance up to the learner’s own interpretation. As L1 influence can impact the way learners perceive interlanguage productions (Flege, 1999), much of SLA research agrees that learners need feedback that does not rely on the student’s own perceptions. Finally, these scores do not tell the student why they got the score, nor do they provide a solution for correcting the presumed errors.
Learner fit and usability
Learner fit is typically defined as the suitability of an application to an individual learner’s needs and interests (Cotos & Huffman, 2013), and one of the main goals of the developers is to take into account the diversity in learning styles (Lezhenin et al., 2017). To that end, audio, visual, and verbal components are offered as practice options when working with the intonation of a sentence. Students can listen to the model as many times as needed, watch real-time pitch drawings of the model sample, and use the training page to practice verbal output. In terms of usability, Study Intonation requires minimal onboarding or tutorials because it is fairly intuitive to navigate. The application utilizes a minimalist design and excludes waveforms, spectrograms, and other complicated graphical information (common in other pitch detection applications) so as not to overload or confuse the user.
Conclusion
Study Intonation is promising and consistent in its goal to serve as a MAPT application for classrooms and learning purposes. Its strengths include its powerful pitch detection algorithm and its non-prescriptive scoring system, which allow it to be used by any learner and applied to any foreign intonation learning context. It is also free to download, open source to allow for content creation, and intuitive to use. Every application, however, comes with its drawbacks. The major limitations of Study Intonation are that (1) it does not provide specific enough feedback for users to pinpoint problem segments and self-correct effectively, (2) its pedagogical model relies on a direct comparison method, which presupposes one “correct” way to pronounce a sentence, and (3) it uses technical scores that are neither meaningful nor constructively corrective. As such, Study Intonation would be better used as a supplementary tool rather than a standalone intonation learning application. For example, the app will serve well as an introductory aid for learning intonation, for seeing how content words interact with stress and pitch prominence, and for understanding the fall-rise movements of one’s voice. Overall, a suggestion for future iterations of this application would be to report the numbers in a more learner-friendly format. For example, instead of reporting “Pearson r = 0.44”, the app could say “Pitch direction falls towards the end. Consider a rising pitch instead.” This type of feedback could be more beneficial to the learner.
References
Bozoglan, H., & Gok, D. (2017). Effect of mobile-assisted dialect awareness training on the dialect attitudes of prospective English language teachers. Journal of Multilingual and Multicultural Development, 38(9), 772-787. https://doi.org/10.1080/01434632.2016.1260572
Bradley, L., Lindström, N. B., & Hashemi, S. S. (2017). Integration and language learning of newly arrived migrants using mobile technology. Journal of Interactive Media in Education, 1(3), 1-9.
Chun, D. M. (2013). Computer-assisted pronunciation teaching. In C. Chapelle (Ed.), The encyclopedia of applied linguistics. Blackwell.
Cotos, E., & Huffman, S. (2013). Learner fit in scaling up automated writing evaluation. International Journal of Computer-Assisted Language Learning and Teaching (IJCALLT), 3(3), 77-98.
Crompton, H., Olszewski, B., & Bielefeldt, T. (2016). The mobile learning training needs of educators in technology enabled environments. Professional Development in Education, 42(3), 482-501. https://doi.org/10.1080/19415257.2014.1001033
Flege, J. E. (1999). Age of learning and second language speech. In D. Birdsong (Ed.), Second Language Acquisition and the Critical Period Hypothesis (pp. 101-131). Erlbaum.
Kaiser, D. (2018). Mobile-assisted pronunciation training: The iPhone pronunciation app project. IATEFL Pronunciation Special Interest Group Journal, 58, 38-52.
Kang, M., Ito, A., & Hatano, H. (2018). Development and Evaluation of a Chinese Tone Perceptual Training Mobile App. In 2018 IEEE Conference on e-Learning, e-Management and e-Services (IC3e) (pp. 40-45). IEEE.
Lezhenin, Y., Lamtev, A., Dyachkov, V., Boitsova, E., Vylegzhanina, K., & Bogach, N. (2017). Study intonation: Mobile environment for prosody teaching. In 2017 3rd IEEE International Conference on Cybernetics (CYBCONF) (pp. 1-2). IEEE.
Rogerson-Revell, P. M. (2021). Computer-Assisted Pronunciation Training (CAPT): Current Issues and Future Directions. RELC Journal, 52(1), 189-205.
Schmidt, R. W. (1990). The role of consciousness in second language learning. Applied Linguistics, 11(2), 129-158.
Yang, S. J. H. (2006). Context aware ubiquitous learning environments for peer to peer collaborative learning. Educational Technology and Society, 9(1), 188-201.
About the Reviewer
April Tan is a second-year PhD student co-majoring in Applied Linguistics and Technology and Human-Computer Interaction at Iowa State University. Her research interests include computer-assisted pronunciation teaching, game-based learning, and UX/UI design. Currently, she is working on developing an intonation learning application that extracts a learner’s pitch contour in real-time. <apriltan@iastate.edu>
To cite this article
Tan, A. (2021). Study Intonation: A mobile-assisted pronunciation training application. Teaching English as a Second Language Electronic Journal (TESL-EJ), 25(3). https://tesl-ej.org/pdf/ej99/m3.pdf
© Copyright rests with authors. Please cite TESL-EJ appropriately.
Editor’s Note: The HTML version contains no page numbers. Please use the PDF version of this article for citations.