The Electronic Journal for English as a Second Language

Corpus Linguistics for Education: A Guide for Research

August 2022 – Volume 26, Number 2

Author: Pascual Pérez-Paredes (2021)
Publisher: Routledge Taylor & Francis Group
Pages: xviii + 175
ISBN: 9780367198435
Price: $44.95 U.S.

Education research, with its large bodies of richly contextualized text, is a field ripe for exploration using corpus methods. In Corpus Linguistics for Education: A Guide for Research, Pérez-Paredes provides a practical, entry-level guide to integrating corpus linguistics into educational research. He starts from the ultimate goal of corpus analysis: to generalize empirical descriptions of language use in a target discourse domain through the use of corpora (Egbert et al., 2022). The author shows how corpus methods can improve the authenticity and validity of traditional textual data and their analysis, leading to better-supported interpretations of research findings. Corpus design principles, in turn, help reduce bias in the data and allow a wider range of questions to be answered. The book thus aims to show that corpus linguistics can tap into research data and that rigorous analysis of linguistic features can help explore questions in education. Without assuming specialized knowledge, Pérez-Paredes introduces several key corpus skills for education researchers and entry-level graduate students of any discipline who want to explore corpus methods.

This book is divided into eight chapters, each presenting several skills relevant to educational research, such as why frequency matters, exploring available corpora, and transcribing interview data, among others. Chapter 1 thoroughly introduces concepts of corpus methods and their potential in education. The author emphasizes the role of corpora as samples of language use and why frequencies matter. He then discusses two approaches to using corpora. In the first approach, researchers can use a corpus as their primary or secondary dataset to complement their data resources. In the second approach, researchers can take their existing dataset and analyze it using corpus techniques.

Chapters 2-7 are devoted to corpus-skill building. Chapter 2 provides an overview of several textual analysis methods and their accompanying research paradigms, such as theme analysis with grounded theory and discourse analysis with sociocultural theory. To clarify that corpus methods include qualitative as well as quantitative analyses (e.g., frequency counts), the author includes a discussion of register analysis and situational characteristics (see Biber & Conrad, 2009). Chapter 3 covers the most frequently used corpus methods. Fest’s (2015) qualitative study of online self-assessment tools is used to illustrate how corpus analysis can complement the qualitative analysis of interview transcripts. In this example, the transcripts were part-of-speech tagged and then analyzed using a corpus tool (i.e., AntConc). In Chapter 4, the author describes comparative methods using two corpora, exemplified through case studies on educational policies. Basic design considerations for building corpora are then discussed. Chapter 5 is devoted to the use of corpus methods in education. Its highly detailed discussion of how interview transcriptions are structured as corpora (i.e., metadata and corpus annotations) shows how complex corpus searches can exploit language structure within a transcription, going beyond the exploration of the transcribed words alone.
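For readers unfamiliar with what a search over a part-of-speech tagged transcript involves, the following is a minimal sketch in Python. It is not taken from the book or from Fest (2015): the excerpt, the word_TAG format, the Penn Treebank-style tags, and the helper name `find_word_tag` are all assumptions made for illustration, and a tool such as AntConc would normally handle this step.

```python
import re

# Hypothetical, hand-tagged interview excerpt in word_TAG format.
# The tags follow Penn Treebank conventions (an assumption for this sketch).
TAGGED = (
    "I_PRP think_VBP the_DT feedback_NN was_VBD really_RB useful_JJ "
    "but_CC I_PRP thought_VBD the_DT test_NN was_VBD too_RB long_JJ"
)

def find_word_tag(tagged_text, word_pattern, tag_pattern):
    """Return (word, tag) pairs whose word and tag both match the given regexes."""
    token_re = re.compile(r"(\S+)_([A-Z$]+)")
    hits = []
    for word, tag in token_re.findall(tagged_text):
        word_ok = re.fullmatch(word_pattern, word, re.IGNORECASE)
        tag_ok = re.fullmatch(tag_pattern, tag)
        if word_ok and tag_ok:
            hits.append((word, tag))
    return hits

# Words starting with "th" that are tagged as verbs: this keeps "think" and
# "thought" but skips the determiner "the", which a plain word search would hit.
verbs = find_word_tag(TAGGED, r"th\w+", r"VB[DGNPZ]?")

# An adverb immediately followed by an adjective, e.g. an intensified evaluation.
adv_adj = re.findall(r"([A-Za-z']+)_RB ([A-Za-z']+)_JJ", TAGGED)

print(verbs)
print(adv_adj)
```

The point of the sketch is only that a tag constraint narrows a word search to the grammatical category of interest, which is what makes searches on annotated transcripts more precise than searches on raw text.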

Chapter 6 focuses on the role of vocabulary and its distinct implications for language use. Keyword analyses are illustrated using corpora of peace treaties and children’s fiction. The statistical analyses built into corpus software are discussed to introduce the concept of “keyness” and its implications. Chapter 7 discusses how to run complex searches on spoken language. Drawing on the overarching goal of educational research to investigate how meaning is constructed (Cohen et al., 2018), the author shows that combining lexis (a word) and part-of-speech in a tagged corpus enables more detailed searches for linguistic features, allowing for more accurate interpretations.
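To make the idea of keyness concrete, here is a minimal sketch in Python using the log-likelihood statistic, one commonly used keyness measure. The review does not state which statistic the book relies on, so this choice is an assumption, as are the two toy "corpora" and the function names; corpus software computes this automatically.

```python
import math
from collections import Counter

def log_likelihood(freq_target, freq_ref, size_target, size_ref):
    """Log-likelihood keyness of a word, from its frequency in each corpus."""
    total = freq_target + freq_ref
    expected_target = size_target * total / (size_target + size_ref)
    expected_ref = size_ref * total / (size_target + size_ref)
    ll = 0.0
    if freq_target > 0:
        ll += freq_target * math.log(freq_target / expected_target)
    if freq_ref > 0:
        ll += freq_ref * math.log(freq_ref / expected_ref)
    return 2 * ll

def keywords(target_tokens, ref_tokens, top_n=5):
    """Rank words that are relatively more frequent in the target corpus."""
    target, ref = Counter(target_tokens), Counter(ref_tokens)
    n_target, n_ref = len(target_tokens), len(ref_tokens)
    scored = []
    for word, f_target in target.items():
        f_ref = ref[word]
        # Keep only positive keywords (overused in the target corpus).
        if f_target / n_target > f_ref / n_ref:
            scored.append((word, log_likelihood(f_target, f_ref, n_target, n_ref)))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_n]

# Invented toy texts standing in for a target and a reference corpus.
target_tokens = "the treaty shall bind the parties and the parties shall keep the peace".split()
ref_tokens = "the rabbit ran to the garden and the children ran after the rabbit".split()

top = keywords(target_tokens, ref_tokens)
for word, score in top:
    print(f"{word}\t{score:.2f}")
```

Words such as "the", which occur at the same rate in both texts, receive a score of zero and drop out, while words concentrated in the target text rise to the top. With corpora this small the scores carry no statistical weight; the sketch only shows the mechanics.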

In Chapter 8, the author provides a critical review of discussions presented throughout the book and acknowledges some challenges of incorporating corpus methods into education research due to disciplinary boundaries and different research paradigms. However, he concludes by arguing that corpus analyses could help data triangulation and validation within education research.

Overall, this book provides a valuable and informative discussion on using corpus methods for education research. All the skills presented in the book are clearly described in a way that is easy to follow. Step-by-step guides to explore each skill are given to help readers to visualize corpus analyses and practice independently. Helpful summaries of the skills and additional notes (e.g., a list of useful existing corpora and corpus software) are presented at the end of each chapter. The detailed discussions throughout the chapters, specifically on data exploration presented in Chapters 5-7, illustrate the distinctive contributions of corpus analyses. The book’s demonstration of corpus queries, use of common corpus management tools, chapter summaries, notes for further reading, tables containing terms, figures accompanying the hands-on activities, and chapter-review questions facilitate readers’ understanding.

However, this book also comes with some shortcomings. First, in Chapter 1, the author mentions two competing research paradigms (i.e., positivism and constructivism) and some qualitative research frameworks (i.e., phenomenology and grounded theory). However, situating corpus linguistics among the many other interpretive frameworks that exist, such as transformative frameworks and pragmatism (see Creswell & Poth, 2018), would help researchers better understand how corpus linguistics relates to their own research perspectives. Second, while there is a brief discussion of the difference between corpus-based and corpus-driven studies, further discussion of how those two approaches differ in their practices and implications would be useful. Additionally, supplementary information about part-of-speech tagging and regular expressions would help readers new to corpus methods better follow the practices presented in Chapters 5-7.

Despite its shortcomings, this book offers fresh insight into incorporating corpus methods within education research, positioning them as a versatile research methodology that can address the need for more detailed data exploration and lead to better research interpretations. The selected corpus skills are relevant to education research, and the guided practice promotes understanding of the discussions. Therefore, this book is useful for education researchers and entry-level graduate students who wish to expand their research skillset by exploring and analyzing corpora.

To cite this article

Lestari, F. (2022). Corpus linguistics for education: A guide for research, Pascual Pérez-Paredes (2021). Teaching English as a Second Language Electronic Journal (TESL-EJ), 26(2). https://doi.org/10.55593/ej.26102r1

References

Biber, D., & Conrad, S. (2009). Register, genre, and style. Cambridge University Press.

Cohen, L., Manion, L., & Morrison, K. (2018). Research methods in education. Taylor & Francis.

Creswell, J. W., & Poth, C. N. (2018). Qualitative inquiry and research design: Choosing among five approaches (4th ed.). SAGE Publications.

Egbert, J., Biber, D., & Gray, B. (2022). Designing and evaluating language corpora. Cambridge University Press.

Fest, J. (2015). Corpora in the social sciences—How corpus-based approaches can support qualitative interview analyses. Revista de Lenguas para Fines Específicos, 21(2), 48–69.

About the reviewer

Febriana Lestari is a Ph.D. student in the Applied Linguistics and Technology program at Iowa State University. Her main research interests center around second language acquisition and corpus linguistics. <febriatmarkiastate.edu> ORCID ID: 0000-0002-6544-2542

© Copyright rests with authors. Please cite TESL-EJ appropriately.
Editor’s Note: The HTML version contains no page numbers. Please use the PDF version of this article for citations.

© 1994–2026 TESL-EJ, ISSN 1072-4303