The Electronic Journal for English as a Second Language

The Language of Massively Multiplayer Online Gamers: A Study of Vocabulary in Minecraft Gameplay

* * * On the Internet * * *

November 2019 — Volume 23, Number 3

Ya-Chen Chien
National Taipei University of Education, Taiwan
<ychienatmarkmail.ntue.edu.tw>

Abstract

This study is a lexical analysis of spoken discourse and text supporting vocabulary development for EFL (English as a foreign language) students who watch gameplay videos, with corroborating evidence that the vocabulary of two EFL learners was indeed enriched in an analysis of their spoken discourse while engaged in a gameplay task. The study first addresses whether the popular video game Minecraft provides the much needed diverse context and situated learning for EFL learners to engage in conversations. It next examines the vocabulary encountered by players in Minecraft itself to demonstrate the potential vocabulary coverage that learners would be exposed to while playing Minecraft. And finally, it analyzes the spoken language produced by a pair of L2 learners to corroborate that their vocabulary coverage during gameplay recapitulated the language found in these videos and in the game itself, to an extent greater than would be expected of their peers.

The study shows that L2 learners need to know 4000-6000 word families to understand 95% of the words in Minecraft gameplay videos. About 60% of the vocabulary in the names of Minecraft blocks and items comes from the first 3,000 word families (3k) of the British National Corpus (BNC) and about 30% from the 4k-14k lists, whereas 9% of the words encountered in Minecraft are not found in the 14k BNC list. These are words that may be common to native speakers of English but that L2 learners are not likely to encounter outside of class. In this case, learners may increase the breadth of their vocabulary knowledge more easily by being exposed to these less frequent words from watching Minecraft videos and playing the game. Finally, the results from analyzing second language learner Minecraft gameplay show that 95% of the vocabulary, including proper names and marginal words, is from the 2,000 most frequent word lists and 5% is from the 3k-14k word lists. The study concludes that, for L2 learners of English, Minecraft YouTube videos can serve as authentic language input sources with rich vocabulary coverage.

Key Words: Minecraft, YouTube, frequency of occurrence, lexical analysis, gamification, games-based learning, video games, language learning

Introduction

According to the official Minecraft wiki (https://minecraft.gamepedia.com/Minecraft_Wiki) and Valentin (2019), Minecraft has sold more than 176 million copies worldwide across different platforms. It has become so popular amongst kids that educators have begun to explore its possibilities for educational purposes. Many researchers believe that games are well suited to, and hold substantial potential for, language learning (Kuhn, 2017; Young et al., 2012). Specifically, there has been a steady increase in claims valuing the use of network-based gaming for language learning. Peterson (2010) suggested that learner participation in network-based gaming provides valuable opportunities for vocabulary acquisition and also a context in which to develop communicative competence while interacting with online players, a context otherwise not available in countries where English is a foreign language.

In the TEFL field, many parents of young EFL (English as a foreign language) learners who play Minecraft have reported observing language acquired through watching YouTube videos. According to the lead author in Smolčec, Smolčec, and Stevens (2014):

I learned how Filip had acquired a high level of English through watching YouTube videos, mainly ones about Minecraft, and interacting with players from other countries, often to exchange expertise on the game, but more recently in making his own videos of gameplay and tutorials to explain his techniques to others. (p. 6)

However, there has been little research to support such claims. Therefore, this study seeks to provide a vocabulary coverage analysis to explore the vocabulary learning potential of learners who play Minecraft and watch Minecraft gameplay videos.

Minecraft Videos Provide Authentic Language Input

Minecraft YouTube videos provide authentic language input, arguably even more so than television programs for young learners. Many research studies have focused on using television programs as a source of authentic language input and argued that these programs, although scripted, still count as authentic language because they are made for native speakers of the language and are “authentically representative of the input English speakers regularly come into contact with” (Rodgers and Webb, 2011, p. 690). Furthermore, Minecraft YouTube channels mostly consist of people talking about what they are doing while streaming gameplay, which means they comprise spontaneous and naturally occurring monologues and dialogues. They can therefore be considered to contain authentic language: the language players use while they engage in gameplay. There have been studies of the language used in multiplayer online games; however, to my knowledge, none has focused on examining Minecraft, which is effectively a sandbox where players often use voice or text to converse authentically in order to decide what to build on their own.

Gaming videos appear to be conducive to English vocabulary learning. Peters’ study (2019) on the effect of imagery and on-screen text on foreign language vocabulary learning from audio-visual input shows that words accompanied by on-screen imagery are three times more likely to be learned incidentally than words without imagery. Peters also found that students showed higher meaning recall and form recognition for words learned from audio-visual input with words and on-screen imagery. In Rogers’ study (2018), the words that students recalled most frequently were those with on-screen imagery or clear visual support, for example, words and phrases such as ‘pack of wolves’, ‘cubs’, or ‘den’. Rogers also found that, with visual support, there was conceptual learning of words his students had not known prior to his study. Gee (2010) explained that games provide learners a situated context where they can learn vocabulary by experiencing and interacting in the virtual environment, as opposed to learning words by getting definitions containing even more words. In his view, the gaming environment supplies the visual context that lets learners picture, and thus better understand, the words.

The Language Triptych for Gaming

Understanding how Minecraft gamers communicate and use specific language is valuable for language educators, particularly with respect to game-based learning as a pedagogical approach (Bawa, 2018). Coyle (2010) proposed the language triptych in raising awareness of three types of language needed for a successful CLIL lesson. This triptych comprises the three notions of language — of, for, and through — learning. In Coyle’s view, language of learning focuses on language related to understanding the subject; language for learning includes functional language for carrying out language tasks; language through learning is the new language acquired because of the process. By adapting the language triptych, the language needed in playing massively multiplayer online games can also be seen to include three aspects: language of gaming, language for gaming, and language through gaming, as explained below. 

Language of gaming is the language involved in the virtual world in-game. Minecraft (Mojang, 2015) is a multiplayer sandbox building game which, in the beginning, was simply a game about breaking and placing blocks. However, its functionality became more complex with the introduction of different types of blocks and materials such as redstone to transport signals and create complex functionality such as automated circuit devices to make the gameplay tasks easier. Therefore, the language of gaming includes the names of the blocks, items, and actions that can be done during gameplay in the virtual world, such as ‘crafting’, ‘smelting’, ‘enchanting’, ‘brewing’, and ‘trading’. There are also different types of mobs, or moving entities in-game. There are passive mobs such as animals; hostile mobs such as ‘creepers’, ‘ghasts’, and ‘zombies’; and tamable mobs such as ‘cats’, ‘donkeys’, ‘llamas’, ‘mules’, ‘parrots’, and ‘wolves’.

An additional affordance of Minecraft in terms of the vocabulary players are exposed to is the variety of biomes that can be used as ecological representations. Minecraft provides a variety of ecological systems with different forms of plants and animals, such as the desert biome, the forest biome, and the recently introduced ocean biome, with its own unique set of oceanic creatures. Each ecosystem has its unique set of vocabulary particular to that domain. 

Language for gaming comprises the language functions needed to interact with online players and perform game quests or tasks. Parents and researchers have reported that players appear to be endlessly engaged in communication and cooperation during gameplay in Minecraft. Players may play on Minecraft servers either with their neighbors or with other players around the world. They may work together and pool resources, build structures, defeat hostile mobs, and/or trade tips pertaining to gameplay; thereby fostering social skills in communication. Furthermore, Minecraft is a virtual world that relies on its players’ creativity and problem-solving skills; thus, it’s a virtual world that elicits from learners the language needed for problem-solving, creativity, and collaboration. Collaborative language is mostly modeled by other players in the game. By listening to other players online, the learners are able to acquire the language they need for gaming.

Language through gaming is the new language learned in the process of playing. This aspect of language learning is almost unpredictable and without boundaries. Given that Minecraft is a multiplayer online gaming platform, while players engage in playing with people from different parts of the world, cross-cultural communication enriches their language repertoire and new target language and culture are learned in the process of playing. One example from my own data is where the author’s son, Mattie, did not understand why Jacob in England was going to have tea during dinner time and later found out tea is their way of saying dinner time. Another example occurred during a discussion of llamas and alpacas in Minecraft. In this instance, the kids were not only acquiring new vocabulary but also learning new concepts such as the differences between llamas and alpacas.

In Zheng, Bischoff, and Gilliland’s (2015) investigation of a Japanese learner, Conan, and his vocabulary learning in the massively multiplayer online game (MMOG) World of Warcraft, the researchers observed that Conan had actively taken advantage of his co-player by asking about the words ‘repop’ and ‘looting’. The researchers argued that the sequence of vocabulary learning in MMOGs reflects the participatory, collaborative, and distributed nature of learning, giving learners the ability to simultaneously see and do in the course of gameplay, in an enriching context through which to acquire the vocabulary. In this way, language is learned through gaming and interacting with online players in the virtual world; thus learners are engaged in first-order languaging, “a whole-body sense-making activity that enables persons to engage with each other in forms of coaction and to integrate themselves with and to take part in social activities” (Thibault, 2011, p. 215), and not just learning about language, which is a second-order construct.

Lexical Demand for Comprehension

Understanding the vocabulary coverage in Minecraft videos would help debunk the notion that watching Minecraft streaming videos is just a waste of time (Vrabel, 2017) and show that, on the contrary, it can be beneficial for language learning. There has been little research investigating the lexical coverage of the language used by players in gaming videos, and none so far has determined the lexical coverage necessary for adequate comprehension of YouTube Minecraft gameplay videos. Therefore, studies of lexical coverage and reading comprehension, an area that has been studied more extensively, may shed some light on how much vocabulary L2 learners need to comprehend Minecraft gaming videos.

Nation (2006) examined the coverage of a variety of spoken and written texts with the 14 frequency lists developed on the basis of the British National Corpus (BNC). The data show that in written texts, the first thousand most frequent word families provide a coverage of 81%, the second thousand an additional 9%, and the third thousand 5%. Thus, these data show that readers with a knowledge of 3,000 word families can reach a coverage of 95% of the words used in the BNC. Laufer and Ravenhorst-Kalovski (2010) revisited the lexical threshold for reading comprehension and suggested that if corpus analysis showed that 8,000 word families cover 98% of a text, and if learners obtained a score of 71% in a reading comprehension test when they understood 98% of a text, then that level of adequate reading comprehension would require the knowledge of 8,000 word families. Therefore, Laufer and Ravenhorst-Kalovski proposed two thresholds: knowledge of 8,000 word families, yielding a coverage of 98%, would be an optimal threshold, and minimal comprehension would require a vocabulary size of 4,000–5,000 word families, resulting in a coverage of 95%.

Since these figures were based on the results of text comprehension, we might project that for comprehending gaming videos, assuming that the learners might be able to understand more with visuals in the videos, an estimate of 4,000 word families with 95% coverage may be sufficient for learners to comprehend the Minecraft YouTube videos.

Meanwhile, Adolphs and Schmitt (2003) conducted a study on lexical coverage of spoken discourse to investigate the number of words that are required to speak conversationally in the L2. The authors indicated that the common consensus that learners need to acquire 2,000 word families to be able to engage in daily conversation is based on a study by Schonell et al. (1956; as cited in Adolphs & Schmitt, 2003) of a corpus of spoken discourse that consisted of only 512,647 words collected from Australian workers at the time. Since much larger and more diverse spoken corpora are now available, e.g., the Cambridge and Nottingham Corpus of Discourse in English (CANCODE), Adolphs and Schmitt analyzed the CANCODE, which contains 5 million words, and found that 2,000 word families made up less than 95 percent coverage, and that the most frequent 3,000 word families covered nearly 96 percent of the spoken corpus. Their study, which performed a second analysis on the CANCODE and the spoken component of the British National Corpus (BNC), shows that approximately 5,000 words were required to achieve about a 96 percent coverage figure.

This result was an indication that more vocabulary is necessary to engage in daily spoken discourse than was previously thought. Therefore, performing an analysis of spoken discourse in Minecraft YouTube videos would allow us to know whether Minecraft gameplay videos reflect daily spoken discourse, as characterized by the results of Adolphs and Schmitt’s study, and so to determine whether these videos would serve as sufficient language input for L2 learners.

Purpose of the Study

The present study focuses on the vocabulary coverage of Minecraft YouTube gaming channels in order to gauge the vocabulary learning potential for L2 learners when they watch the videos. The purposes are three-fold: 1) to investigate the lexical coverage of Minecraft videos; 2) to analyze the lexical coverage of Minecraft block and item names and thereby provide information on the language of gaming; and 3) to analyze L2 learner gameplay vocabulary.

The spoken corpus-based study by Adolphs and Schmitt (2003) determined that 3,000 word families cover nearly 96% of the words that occurred in the spoken corpus, and text comprehension studies (Nation, 2006; Laufer and Ravenhorst-Kalovski, 2010) suggest that vocabulary size needs to reach at least 95% lexical coverage of a text for minimal comprehension.

So to address the first purpose of this research, this study seeks to investigate whether 3,000 word families in the BNC cover 96% of the Minecraft gameplay spoken corpus and what vocabulary size is necessary to reach 95% coverage of the Minecraft gameplay spoken corpus. Regarding the language of gaming, the study then analyzes the vocabulary frequency of the names of blocks and items. And finally, the study ends by analyzing a 10-minute building challenge gameplay of two young players and reports the vocabulary range of L2 learner gameplay, classifying the words by their frequency against the BNC wordlist.

Method

Materials

For research purposes one and two, the automated closed captions of 106 Minecraft videos were randomly selected, downloaded, and analyzed in this study. The videos used came from the following YouTube channels and were selected based on the popularity and purpose of the channels:

  • PopularMMOs, with 16 million subscribers
  • DanTDM, with 22.1 million subscribers
  • OMGcraft, with 895K subscribers

PopularMMOs videos are natural dialogues between two American players, Pat and Jenn, filmed as they uncover new challenges while playing Minecraft together. The closed captions of sixty-one of the most popular videos were downloaded from the PopularMMOs channel. For example, the most viewed video, with 54 million views, was Minecraft more tnt mod (35 tnt explosives and dynamite!) too much tnt mod showcase (Popularmmos, 2013), uploaded in 2013. The video is 52 minutes long and contains 4,176 words.

Twenty-five videos of The best Minecraft Series Ever w/DanTDM (DanTDM, 2018) were used, and for OMGcraft (Johnson, 2012), twenty videos were selected. Both the DanTDM and OMGcraft channels feature monologues by the hosts, who either play the game or create tutorials informing listeners how to play.

All in all, the Minecraft video gameplay corpus consists of 464,605 words of automatically transcribed closed captions from the YouTube videos, broken down as follows: PopularMMOs (61 videos, 226k words), DanTDM (25 videos, 205k words), and OMGcraft (20 videos, 32k words). DanTDM and PopularMMOs were selected based on the popularity of the channel, but OMGcraft was selected because of the clarity and pace of the speech. The OMGcraft channel host talks more slowly and more clearly than Pat and Jenn in PopularMMOs; therefore, OMGcraft may be more suitable for L2 learners.

For research purpose three, a 10-minute audio recording of a conversation between two young L2 learners during a Minecraft building challenge was transcribed and used for analysis.

Data Analysis

AntWordProfiler was used to analyze the transcripts. AntWordProfiler is a “freeware tool for profiling the vocabulary level and complexity of texts” (Anthony, 2014, n.p.) that lists the words that occur in a text according to their frequency. Nation’s (2006) BNC word family lists, fourteen lists of 1,000 word families each, were used with the AntWordProfiler software to show the percentage of lexical coverage that each of the 14 lists provided for the words in the Minecraft YouTube videos. The lists are based on the frequency and range of occurrence of words in the spoken section of the British National Corpus.

In addition, two other lists were used for analysis of the spoken corpus: a list of proper nouns and a list of marginal words containing four headwords which include mostly interjections, exclamations, and hesitation procedures, all of which are common in spoken English. Items that are not listed in the most frequent 14,000 word families are classified as ‘Not in the lists’. The BNC word lists can be downloaded from Paul Nation’s website: https://www.victoria.ac.nz/lals/resources/range
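The profiling step described above can be illustrated with a minimal sketch. It is not AntWordProfiler's actual implementation: the function name `coverage_profile`, the toy word lists, and the sample sentence are all hypothetical, and the sketch matches plain word forms, whereas the BNC lists group inflected and derived forms into word families.

```python
import re
from collections import Counter

def coverage_profile(text, band_lists):
    """Return cumulative token coverage (%) per band and the off-list share (%).

    band_lists is an ordered list of (band_name, set_of_word_forms),
    most frequent band first. Hypothetical helper for illustration only.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return [], 0.0
    # Map each word form to the first (most frequent) band that lists it.
    band_of = {}
    for band, words in band_lists:
        for word in words:
            band_of.setdefault(word, band)
    counts = Counter(band_of.get(tok, "off-list") for tok in tokens)
    total = len(tokens)
    cumulative, running = [], 0
    for band, _ in band_lists:
        running += counts[band]
        cumulative.append((band, round(100 * running / total, 2)))
    off_list = round(100 * counts["off-list"] / total, 2)
    return cumulative, off_list

# Toy stand-ins for the 1k, 2k, and 4k lists (assumed contents, not the real lists).
TOY_BANDS = [
    ("1k", {"i", "need", "to", "a", "and", "some", "find", "the"}),
    ("2k", {"iron", "tool"}),
    ("4k", {"diamond", "sword"}),
]
sample = "I need to find some iron and a diamond sword"
profile, off_list = coverage_profile(sample, TOY_BANDS)
print(profile)   # [('1k', 70.0), ('2k', 80.0), ('4k', 100.0)]
print(off_list)  # 0.0
```

In the toy sentence, seven of the ten running words are on the first list, so a learner who knows only that list would reach 70% coverage; the cumulative figures in Table 1 are read the same way.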

Results and Discussion

Lexical Coverage of Minecraft Videos

To address research purpose one regarding the vocabulary learning potential for L2 learners when watching gameplay videos, a corpus consisting of 464,605 words from three Minecraft YouTube channels was used for analysis against the BNC 14K word lists. More specifically, this study examined whether 3,000 word families would cover 96% of the Minecraft gameplay spoken corpus and what vocabulary size is necessary to reach 95% coverage.

Table 1 shows the cumulative coverage, in percent, of the complete corpus of dialog from the Minecraft gameplay videos arrayed against the 14k most frequent words in the BNC corpus. In the second column the percent coverage is shown for the three channels combined, and then each is shown separately in the last three columns. As can be seen in the totals at the bottom of the table, the complete spoken corpus of gameplay videos consisted of 464,605 tokens. The last two rows of the table show the total numbers of word types and word families found for each of the columns; e.g., the combined channels consisted of 10,359 word types from 6,977 word families.

The results in Table 1 show that a vocabulary size of 3,000 word families provided only 94% coverage of the videos in the three channels combined. Across all three channels, the first 1,000 word families accounted for 88.11% of the running words in the videos, and the first 2,000 for 92.37%. To reach 95% coverage across the three channels combined, learners would need a vocabulary size of 5,000 word families. As indicated by the figures marked with an asterisk in the table below, the vocabulary threshold is lower for DanTDM’s videos, where 4,000 word families cover 95.42% of the vocabulary, and for OMGcraft’s videos, where 4,000 word families cover 95.06%; with PopularMMOs, by contrast, viewers would need to be familiar with 6,000 word families to achieve that level of coverage.

Table 1
Cumulative Coverage for Three Minecraft YouTube Channels

BNC word list      All channels combined %   PopularMMOs %   DanTDM %   OMGcraft %
1,000              88.11                     87.79           88.62      87.17
2,000              92.37                     92.13           92.69      92.03
3,000              94                        93.6            94.46      93.85
4,000              94.86                     94.52           95.42*     95.06*
5,000              95.44*                    94.97           95.92      95.69
6,000              95.8                      95.28*          96.32      96.1
7,000              96.04                     95.55           96.53      96.33
8,000              96.21                     95.71           96.69      96.61
9,000              96.44                     96.02           96.83      96.84
10,000             96.66                     96.29           96.99      97.09
11,000             96.73                     96.37           97.04      97.19
12,000             96.77                     96.41           97.08      97.29
13,000             96.82                     96.44           97.14      97.38
14,000             96.88                     96.5            97.19      97.8
Marginal words     1.27                      1.75            0.87       0.42
Not in the lists   1.87                      1.74            1.94       2.20
Tokens             464605                    226693          205847     32065
Types              10359                     6338            6833       2451
Families           6977                      4276            4751       1664

*The lexical coverage recommended for minimal video comprehension
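
The threshold reading of Table 1 can be reproduced with a short sketch. The cumulative figures are copied from Table 1; the 95% cut-off is the minimal-comprehension level discussed earlier, and the function name `smallest_sufficient_band` is a hypothetical helper introduced only for this illustration.

```python
THRESHOLD = 95.0  # minimal-comprehension coverage level used in this study
BANDS = [1000 * k for k in range(1, 15)]  # 1,000 ... 14,000 word families

# Cumulative coverage (%) per channel, copied from Table 1.
CUMULATIVE = {
    "All channels": [88.11, 92.37, 94, 94.86, 95.44, 95.8, 96.04,
                     96.21, 96.44, 96.66, 96.73, 96.77, 96.82, 96.88],
    "PopularMMOs": [87.79, 92.13, 93.6, 94.52, 94.97, 95.28, 95.55,
                    95.71, 96.02, 96.29, 96.37, 96.41, 96.44, 96.5],
    "DanTDM": [88.62, 92.69, 94.46, 95.42, 95.92, 96.32, 96.53,
               96.69, 96.83, 96.99, 97.04, 97.08, 97.14, 97.19],
    "OMGcraft": [87.17, 92.03, 93.85, 95.06, 95.69, 96.1, 96.33,
                 96.61, 96.84, 97.09, 97.19, 97.29, 97.38, 97.8],
}

def smallest_sufficient_band(coverages, threshold=THRESHOLD):
    """Return the first vocabulary size whose cumulative coverage meets the threshold."""
    for band, pct in zip(BANDS, coverages):
        if pct >= threshold:
            return band
    return None  # threshold never reached within the 14k lists

result = {channel: smallest_sufficient_band(values)
          for channel, values in CUMULATIVE.items()}
print(result)
# {'All channels': 5000, 'PopularMMOs': 6000, 'DanTDM': 4000, 'OMGcraft': 4000}
```

The output matches the asterisked cells: 5,000 word families for the combined corpus, 6,000 for PopularMMOs, and 4,000 for DanTDM and OMGcraft.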

So, to address the first research purpose, the results indicate that a vocabulary size of 3,000 word families provided only 94% coverage of the videos in the three channels combined. In other words, learners who know the 3,000 most frequent word families from the British National Corpus are likely to understand only 94 percent of the vocabulary in Minecraft videos. To understand Minecraft YouTube videos, learners need a vocabulary size of 4000-6000 word families to reach 95% coverage. That is to say, either learners gain more receptive vocabulary by watching Minecraft YouTube videos, or they may understand less when they watch the videos, but they watch them anyway. The results can explain the experience of Filip, the son of the lead author in Smolčec, Smolčec, and Stevens (2014):

I was particularly impressed by Filip’s saying that when he and his brother were children they used to watch YouTube with no idea what people were saying until the wall gradually dissolved as the language somehow became comprehensible. (pp. 6-7)

We now understand that in order to comprehend 95% of the vocabulary in Minecraft videos, learners like Filip would need to acquire at least 4000-6000 word families, which is equivalent to the vocabulary size set for the high school English curriculum in many EFL settings. For Filip to be able to achieve comprehension of the videos at such a young age as an EFL learner is quite impressive.

These findings are similar to Adolphs and Schmitt’s (2003) finding that 3,000 word families form approximately 94% of the subcorpus genre of intimate, socio-cultural discourse. However, the coverage that 3,000 word families provide in Minecraft gameplay videos falls short of the 96% that Adolphs and Schmitt found for 3,000 word families in the CANCODE spoken corpus. That 3,000 word families are not enough to cover 96% of the Minecraft spoken corpus suggests that Minecraft players use less frequent words when they play, which may be an indication that gameplay involves the use of more technical terms than does daily conversation.

Language of Gaming in Minecraft: Vocabulary Analysis of Blocks and Items

Blocks are standard-sized cube units which make up the landscapes of the Minecraft world. The Minecraft Wiki lists the names of over 150 different types of blocks that can be deployed in Minecraft (https://minecraft.gamepedia.com/Block). Items are objects that exist only within the player’s inventory and hands. About 361 items are listed at https://minecraft.gamepedia.com/Item.

In order to address research purpose two, to analyze the lexical coverage of Minecraft block and item names and thereby provide information on the language of gaming, the complete list of names of all the blocks and items was analyzed with AntWordProfiler against the BNC 14k word list.

Table 2 below shows the number of word tokens and types, and the cumulative coverage of the names of blocks and items, analyzed against each 1k level of the 14k BNC wordlist. The names of blocks and items combined contain 2,312 tokens and 488 word types, and to master the language of gaming for Minecraft, players must become familiar with many of these words. The first 3,000 word families account for 60% of the tokens (see the 3,000 row). About 30% of the tokens fall into the less frequent 4k to 14k word lists, and the remaining 9.15% are not on the lists at all; these off-list words comprise 59 word types.

Table 2
Cumulative Coverage for Names of Minecraft Blocks and Items

Word list Tokens Cumulative Token % Type Cumulative Type %
1,000 599 25.91 82 18
2,000 497 47.41 86 37
3,000 291 60 57 50.22
4,000 206 68.91 41 59.37
5,000 104 73.41 28 65.62
6,000 39 75.1 18 69.64
7,000 36 76.66 12 72.32
8,000 91 80.6 19 76.56
9,000 47 82.63 15 79.91
10,000 74 85.83 9 81.92
11,000 14 86.44 4 82.81
12,000 64 89.21 7 84.37
13,000 15 89.86 3 85.04
14,000 23 90.85 7 86.6
Not in the lists 195 100 59 100
Total 2312 488
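The cumulative token percentages in Table 2 are running sums of the per-band token counts divided by the total number of tokens. A minimal sketch of that calculation follows; the function name `cumulative_coverage` is hypothetical, and only the first three bands are used, with the counts and total as printed in the table.

```python
def cumulative_coverage(counts, total):
    """Return running coverage percentages, rounded to two decimals."""
    running = 0
    coverage = []
    for count in counts:
        running += count
        coverage.append(round(100 * running / total, 2))
    return coverage


# Token counts for the 1,000, 2,000 and 3,000 bands, and the token total, from Table 2.
first_three_bands = [599, 497, 291]
total_tokens = 2312

coverage = cumulative_coverage(first_three_bands, total_tokens)
```

The three values come out as 25.91, 47.4, and 59.99, which agree with the 25.91, 47.41, and 60 printed in the table to within about 0.01.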

As indicated earlier, learners would require 4000-6000 word families in order to understand 95% of the vocabulary in the Minecraft videos. Table 3 below lists the block and item names that fall into the 4000-6000 word lists. Some of these words are nouns that we encounter fairly often in our daily lives, e.g. ‘bubble’, ‘mushroom’, ‘spider’, ‘carrot’, ‘ink’, and ‘fox’. Moreover, although these nouns are less frequently used in the general spoken BNC corpus, a frequency list created from the Minecraft spoken corpus would differ from the BNC list: Minecraft block and item names would appear more often. One of the key factors that enhances vocabulary learning is the number of times a word is retrieved (Folse, 2004). In other words, while learners are engaged in Minecraft play, even if they are not talking to someone else, they retrieve these nouns in their minds when they search for the items and blocks listed below, and they eventually look at the items while they play.

Table 3
Names of Blocks and Items in the 4000-6000 Word Lists

4000 word list: banner, jungle, diamond, lime, disc, bubble, mossy, horn, mushroom, leather, planks, armor, axe, stem, sword, helmet, salmon, spider, carrot, rod, stew, cave, clay, guardian, pillar, valley, blast, brewing, carved, explorer, flesh, fox, globe, gravel, ink, pants, pickle, steak, stray, witch

5000 word list: slab, pane, skull, cod, dragon, blaze, fern, lily, tropical, wheat, cane, cocoa, daisy, firework, sponge, void, activator, berries, dispenser, donkey, gateway, jigsaw, loom, parrot, pearl, saddle, shears, tulip, zombie

6000 word list: chorus, piston, wart, phantom, bale, coarse, cobweb, cookie, dolphin, farmland, petrified, sac, scaffolding, snowball, vex, vines

The following are some of the 9.15% of words in the block and item lists that fall outside the 14k word lists: 

acacia, shulker, cyan, magenta, prismarine, cobblestone, redstone, andesite, diorite, minecart, kelp, pickaxe, chainmail, chestplate, purpur, ender, pufferfish, allium, bluet, cornflower, ghast, glowstone, ingot, exeye, porkchop, seagrass, TNT, tripwire, campfire, cartography, chirp, comparator, composter, dropper, elytra, enderman, endermite, evoker, fletching, glistering, grindstone, mellohi, mooshroom, mycelium, nautilus, netherrack, ocelot, peony, pigman, podzol, scute, silverfish, slimeball, smithing, spawner, stal, stonecutter, strad, vindicator

Some of these words may be familiar to some native speakers. Words such as ‘acacia’, a tree common in Arab and African countries; ‘cyan’ and ‘magenta’, colors well known to artists; ‘cartography’, which means map making; and ‘cornflower’, which is used in cooking, are not part of the 14k BNC word list. These words are more technical or not used as often in daily life; therefore, they are not likely to make it into students’ textbooks. Some of the block and item names are relatively easy for NNS (non-native English speaking) learners. Words such as ‘turtle’, ‘carpet’, ‘torch’, ‘sunflower’, ‘cane’, ‘pumpkin’, and ‘melon’ are used more frequently than others. Words such as ‘slab’, ‘birch’, ‘scaffolding’, and ‘terracotta’ are more difficult because they are less frequently encountered in most people’s lives.

Because these words occur frequently when playing Minecraft, they are learned more quickly through “situated meaning” (Gee, 2010); i.e., the need to recall the names of the blocks or items in order to use them. Thus, the interactive nature of the game makes it likely that learners will acquire these vocabulary items more easily than by encountering them in textbooks and learning their meaning from text definitions (Gee, 2010; Zheng et al., 2015). A significant advantage for vocabulary development, then, is that players encounter less frequently occurring words during gameplay in Minecraft. This allows them to acquire the vocabulary in an embodied, interactive context in which meaning is conveyed through visuals and actions, with multi-channeled meanings that would otherwise be impossible to acquire in classroom settings.

Vocabulary Analysis of L2 Learners’ Gameplay in Minecraft

To address research purpose three, which is to analyze L2 learner gameplay vocabulary, a transcript of a recording of two young L2 players engaged in a building challenge was used. The building challenge was staged during Electronic Village Online Minecraft MOOC 2019 (Stevens, 2019). As shown in Figure 1, participants were given a theme and asked to build the best creation they could with the available resources. The theme was ‘pig’ since it was the year of the pig. There were three teams, and each was asked to go into a different Discord channel for 10 minutes to collaborate with teammates on building a pig (Discord is an online VoIP environment highly popular with gamers). A 10-minute audio recording was used for the vocabulary analysis. It comprises a conversation between Mattie, a 10-year-old boy from Taiwan whose primary language is Chinese, although he attended elementary school in Florida for seven months, and Emanuel, a 9-year-old boy living in Brazil with Brazilian parents, who might be described as a bilingual with English as his dominant language. This part of the analysis serves as an exploratory study of language for learning as the two children, Mattie and Emanuel, were engaged in building, creating, collaborating, and problem-solving.

Figure 1. EVO Minecraft MOOC 2019 building challenge: The year of the pig.

As shown in Table 4 below, the ten-minute gameplay contained a total of 986 word tokens and 251 word types. Of the words in the subjects’ dialogue, 83.57% came from Paul Nation’s first 1,000 word list and 6.29% from the second 1,000 word list, so that 89.86% of the words used by the subjects were from the first 2,000 word list. Adding proper nouns (3.85%) and marginal words such as oh and ah (1.72%), which are included in the spoken word list, brings the coverage to 95.43% for the most frequently used 2k words in the young learners’ gameplay interaction.

Table 4
Vocabulary Analysis of Gameplay between Mattie and Emanuel

Word Lists Token Token % Cumulative Token % Type Type % Cumulative Type %
1,000 824 83.57 83.57 190 75.70 75.7
2,000 62 6.29 89.86 23 9.16 84.86
3,000 5 0.51 90.37 4 1.59 86.46
4,000 2 0.20 90.57 2 0.80 87.25
5,000 3 0.30 90.87 1 0.40 87.65
6,000 2 0.20 91.07 2 0.80 88.45
9,000 4 0.41 91.48 2 0.80 89.25
10,000 3 0.30 91.78 1 0.40 89.65
14,000 4 0.41 92.19 1 0.40 90.05
Proper nouns 38 3.85 96.04 10 3.98 94.03
Marginal words 17 1.72 97.76 3 1.20 95.23
Off list words 22 2.23 99.99 12 4.78 100
TOTAL: 986 251
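The coverage figures quoted from Table 4 are simple sums, which can be checked with a short sketch. The variable names are illustrative only; the values are those printed in the table.

```python
# Token percentages as printed in Table 4.
first_1000 = 83.57
second_1000 = 6.29
proper_nouns = 3.85
marginal_words = 1.72

# Coverage of the first 2,000 words, then with proper nouns and marginal words added.
first_2000 = round(first_1000 + second_1000, 2)
with_extras = round(first_2000 + proper_nouns + marginal_words, 2)

# Type counts for the 3,000 to 14,000 rows of Table 4.
low_frequency_types = [4, 2, 1, 2, 2, 1, 1]
low_frequency_total = sum(low_frequency_types)
```

This gives 89.86 for the first 2,000 words, 95.43 once proper nouns and marginal words are added, and 13 word types across the 3,000 to 14,000 rows.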

It is surprising to see that 13 words from the 3k-14k BNC word lists were used during the conversation (see bold text in Table 4, above). In Taiwan, the English vocabulary listed in the curriculum is 300 words for elementary school students and 1200 words for junior high school students. It seems highly improbable that an L2 learner in elementary school would comfortably be able to use words from the 3k-14k word lists as part of his productive vocabulary without Minecraft providing the opportunity and incentive to use such words.

Minecraft In-game L2 Learner Interaction

During the building challenge, Mattie came up with an idea for the design of their pig, drawing inspiration from a large-sized pig which had been placed on the stage in advance as an example by the building challenge creators. He asked Emanuel to look at the pig and suggested that they copy the build. Later they modified it by adding a body, a tail, and four legs with hooves, two on each side as if the pig were being served on a silver platter. 

M: Emanuel, I have an idea?
E: What is it?
M: Just look, that’s a pig over there!
E: I know that.
M: Wait. Oh my God. One, two, three, four….
M: Emanuel, just look at the pig thing.
E: I thought we’re making a full sized pig.
E: We’ve got 10 minutes to build that pig.
M: I’m gonna go count the blocks, okay?
E: Okay.

It was apparent that Emanuel might not have agreed with Mattie’s idea to just copy the big pig face that was displayed on the stage. He questioned Mattie: “I thought we’re making a full-sized pig.” However, Mattie did not explain much but went over to the big pig and started counting the blocks. Emanuel stayed at their team building area and followed Mattie’s instructions to build their pig: “Nine on the very bottom…one straight line.” Emanuel confirmed what he had done: “A line of nine blocks.” They coordinated their work, with one giving instructions and the other carrying them out.

M: One, two, three, four, five, six, seven, eight, nine.
Nine on the very bottom. Destroy every block.
One straight line.
E: One, two, three, four, five, six, seven, eight, nine. A line of nine blocks.
M: One two three four five. Destroy four blocks.
E: Destroying four blocks.
M: I have an idea…
E: Done…

During the building process, the boys evaluated their work and used the language that accompanies such evaluation. Both Mattie and Emanuel constantly evaluated their build, making sure it looked right: “Are you sure this is how it looks like? Doesn’t look very good!” Prompted by this evaluative comment, which identified a problem, Mattie tried to fix their build. Emanuel then suggested, “It needs one touch to it.” Mattie responded that they needed coal blocks for the final touch. This time, the two agreed on Emanuel’s evaluative comment and took action to fix the problem.

M: Okay, we need something dark.
E: Are you sure this is how it looks like? Doesn’t look very good. There we go!
M: Because you started building!
E: Okay, let’s just go! Go! Go! Make some more things!
E: We’re making the pig face up there.
M: duh, duh, duh…(singing)
E: That looks good!

M: Oh, no, I forgot.
E: That looks good! It needs one touch to it.
M: Yes, I noticed that we need something very important. We need coal block.
E: Mattie, Mattie, be fast!
M: We need three coal blocks.
E: I’m gonna get coal.
M: What are you doing?
E: You’ve already got coal?
M: Yes.
E: We’re doing pretty good Mattie? Anybody like our pig? We’re making the pig up there.

As an EFL learner, Mattie had not been given much chance to converse in English at school in Taiwan. The speaking activities held in class at school are mainly role plays or skits based on the dialogues in the textbook. Students are encouraged to come up with different endings to the dialogues or skits, but these activities are designed for more controlled, scripted practice. Echoing Smolčec, Smolčec, and Stevens (2014), Mattie is another example of a student who has learned a lot from watching YouTubers play Minecraft and from reading books about Minecraft, and I believe these helped tremendously in building his listening and reading comprehension and his receptive language skills.

Speaking English to communicate ideas was very difficult for Mattie at the beginning. Although Mattie understood the contents of the YouTube videos when he was seven, he was not able to converse effortlessly during the first year that he joined EVO Minecraft MOOC in 2017. By the time of the year of the pig building challenge during EVO Minecraft MOOC 2019, his language had improved, and this transcript of Mattie building with Emanuel shows evidence of how much he had progressed. The words ‘pan’, ‘craft’, and ‘stacks’ are 3k words; ‘spider’ and ‘ugly’ are 4k words; the word ‘playground’ is a 5k word; ‘exotic’ and ‘zombie’ are 8k words; ‘inventory’ and ‘reindeer’ are 9k words; and the word ‘lava’ is a 10k word on the BNC frequency word list. These words seem to be commonly used by native speakers; however, words such as ‘craft’, ‘stacks’, ‘exotic’, or ‘inventory’ are rarely used in an EFL setting.

This is why Minecraft is much valued as a virtual world that allows kids from different countries to connect and interact with one another in English. Having a virtual environment such as the one provided in Minecraft enhances learners’ motivation to communicate in English with online collaborators from different parts of the world.

YouTube videos serve as much-needed language input for EFL learners. Minecraft offers ample in-game support for vocabulary learning. For example, when learners open their inventory in Minecraft, they can hover over an item and its corresponding word appears, linking images to the names of the items in the text. Although there is language support embedded in the game itself, authentic language input from YouTube videos helps with the language learning process. Learners need to learn how to pronounce the names of the items and blocks in English in order to interact with online players, and they hear these words pronounced in the YouTube videos. Minecraft YouTube videos facilitate L2 vocabulary learning by providing authentic language input at a normal speech rate from native English speakers with different accents, British and American.

Limitations

There are some limitations to the present study. The first concerns the Minecraft spoken corpus. The corpus was compiled from randomly selected YouTube videos from the three different channels mentioned above. However, no attempt was made to draw equal portions from each channel. Thus, the number of words was not balanced across the channels.

Secondly, in order to provide an idea of what vocabulary is involved during NNS young learners’ gameplay, a limited portion of gameplay was included in that part of the analysis. This was done in order to provide some evidence of how YouTube videos may serve as authentic language input and how learners are able to reinforce their vocabulary not only by watching the videos but by using the vocabulary while they play. A larger corpus of spoken discourse from recordings of the Minecraft gameplay of young L2 learners would be required to provide more generalizable results.

Another limitation is that the learners under study had both been living in English-as-a-native-language environments. There could have been some overlap between what they had learned from playing Minecraft and from watching gameplay videos, and what they had acquired during their time spent with native speakers of English. To prove that learners acquire vocabulary exclusively through Minecraft would require a quasi-experimental research design, which is beyond the scope of the present study. However, the present study provides valuable insights into the vocabulary used by L2 learners when they are engaged in gameplay. Therefore, it would be worthwhile to replicate this study with a broader spectrum of language learners.

Conclusion

Analyzing YouTube Minecraft videos, blocks and items appearing in Minecraft, and player discourse during gameplay in the ways described above allows us to understand what vocabulary learners are likely to learn from watching such videos and playing Minecraft. The present study draws on research investigating the effects of lexical coverage on comprehension (Hu & Nation, 2000; Rodgers & Webb, 2011) and spoken corpus-based studies examining the number of words necessary for learners to engage in everyday conversation (e.g. Adolphs & Schmitt, 2003). The results indicate that knowledge of 4000-6000 word families is required to reach 95% coverage of the lexical items in the videos. From this we see that Minecraft gameplay involves more technical terms than daily conversation does. For the language of gaming in Minecraft, in terms of block and item names, 30% of the words are from the BNC 4k-14k word lists, and an additional 9% are off-list words, indicating that one significant affordance of Minecraft is that it enables learners to gain knowledge of less frequent words during gameplay or through watching Minecraft videos. Finally, the L2 learner gameplay analysis shows that 95% of the young L2 learners’ gameplay vocabulary came from the 2,000 most frequent words, together with proper nouns and marginal words, and that 5% of the words that the learners produced during their gameplay came from the lists of less frequently used words in English. Therefore, Minecraft appears to provide an excellent learning environment for enhancing vocabulary development in NNS learners of English.

Acknowledgements

I wish to express my most sincere gratitude to the On-the-Internet section editor, Vance Stevens, for his insightful comments in refining my manuscript. I would also like to thank the following three co-moderators of EVO Minecraft MOOC, Rosemere Damasio Bard and Dakotah Redstone for hosting the building challenge with me, and Dr. Donald Carroll for his valuable input and discussions on language learning in the virtual world.

References

Adolphs, S., & Schmitt, N. (2003). Lexical coverage of spoken discourse. Applied Linguistics, 24, 425-438. doi:10.1093/applin/24.4.425

Anthony, L. (2014). AntWordProfiler (Version 1.4.1) [Computer Software]. Tokyo, Japan: Waseda University. Available from https://www.laurenceanthony.net/software

Bawa, P. (2018). Massively multiplayer online gamers’ language: Argument for an m-gamer corpus. The Qualitative Report, 23(11), 2714-2753. Retrieved from https://nsuworks.nova.edu/tqr/vol23/iss11/8/

Coyle, D., Hood, P., & Marsh, D. (2010). CLIL: Content and language integrated learning. Cambridge University Press.

DanTDM (2018). The best Minecraft series ever w/Dan. [Video files] Retrieved from https://www.youtube.com/playlist?list=PLUR-PCZCUv7TYq9OIcshdbY6SP07-L0yE

Folse, K. (2004). Vocabulary Myths: Applying Second Language Research to Classroom Teaching. Ann Arbor: University of Michigan Press.

Gee, J. P. (2010). A situated social-cultural approach to literacy and technology. In E. A. Baker (Ed.), The new literacies: Multiple perspectives on research and practice (pp. 165-193). New York: The Guilford Press.

Hu, M., & Nation, I. S. P. (2000). Unknown vocabulary density and reading comprehension. Reading in a Foreign Language, 13(1), 403-430.

Johnson, C. (2012). OMGcraft – Minecraft Tips and Tutorials. [Video files] Retrieved from https://www.youtube.com/user/OMGcraftShow/

Kuhn, J. (2017). Minecraft: Education edition. CALICO Journal, 35(2), 214-223. doi:10.1558/cj.34600

Laufer, B., & Ravenhorst-Kalovski, G. C. (2010). Lexical threshold revisited: Lexical text coverage, learners’ vocabulary size and reading comprehension. Reading in a Foreign Language, 22(1), 15–30.

Mojang. (2015). What is minecraft? Retrieved from https://www.minecraft.net/en-us/what-is-minecraft/

Nation, I. S. P. (2006). How large a vocabulary is needed for reading and listening? Canadian Modern Language Review, 63, 59-82. doi:10.3138/cmlr.63.1.59

Peters, E. (2019). The effect of imagery and on-screen text on foreign language vocabulary learning from audio-visual input. TESOL Quarterly. DOI: 10.1002/tesq.531 [Online Version of Record before inclusion in an issue].

Peterson, M. (2010). Digital gaming and second language development: Japanese learner interactions in a MMORPG. Digital Culture & Education, 3(1), 56-73.

Popularmmos. (2013, October 2). Minecraft more tnt mod (35 tnt explosives and dynamite!) too much tnt mod showcase. [Video file] Retrieved from https://www.youtube.com/watch?v=1NEkzAo6ULE

Rodgers, M. P. H. (2018). The images in television programs and the potential for learning unknown words: The relationship between on-screen imagery and vocabulary. ITL-International Journal of Applied Linguistics, 169(1), 191-211.

Rodgers, M. P. H., & Webb, S. (2011). Narrow viewing: The vocabulary in related television programs. TESOL Quarterly, 45(4), 689-717.

Smolčec, M., Smolčec, F. & Stevens, V. (2014). Using Minecraft for Learning English. TESL-EJ, 18(2), 1-15. Retrieved from http://www.tesl-ej.org/pdf/ej70/int.pdf

Stevens, V. (2019, January 27). Rose Bard, Dakota Redstone, and Jane Chien host Building Challenge on EVO Minecraft 1.12.2 server [Blog post]. Retrieved from https://learning2gether.net/2019/01/27/rose-bard-dakota-redstone-and-jane-chien-host-building-challenge-on-evo-minecraft-1-12-2-server/

Thibault, P. J. (2011). First-order languaging dynamics and second-order language: The distributed language view. Ecological Psychology, 23(3), 210-245. doi:10.1080/10407413.2011.591274

Valentine, R. (2019, May 17). Minecraft has sold 176 million copies worldwide [Blog post]. Retrieved from https://www.gamesindustry.biz/articles/2019-05-17-minecraft-has-sold-176-million-copies-worldwide

Vrabel, J. (2017, October 13). Why do my kids waste time watching millennials play video games on YouTube? Retrieved from https://www.washingtonpost.com/news/parenting/wp/2017/10/12/why-do-my-kids-waste-hours-watching-millennials-play-video-games-on-youtube/

Young, M. F., Slota, S., Cutter, A. B., Jalette, G., Mullin, G., Lai, B., … Yukhymenko, M. (2012). Our princess is in another castle: A review of trends in serious gaming for education. Review of Educational Research, 82(1), 61–89. doi:10.3102/0034654312436980

Zheng, D., Bischoff, M., & Gilliland, B. (2015). Vocabulary learning in massively multiplayer online games: Context and action before words. Educational Technology Research and Development, 63, 771-790. doi:10.1007/s11423-015-9387-4

© Copyright rests with authors. Please cite TESL-EJ appropriately.

Editor’s Note: The HTML version contains no page numbers. Please use the PDF version of this article for citations.

© 1994–2023 TESL-EJ, ISSN 1072-4303