Ling 120: Introduction to Corpus Linguistics

Course description

This class is a hands-on introduction to corpus linguistics. Corpora are collections of spoken and written language that are available in digital format. In this class, we will learn how to use corpora to address questions as diverse as the following:

  • What are the main differences between British and American English?
  • What are children's first verbs? How do children move on from single words to complex utterances?
  • Are text messages really that different from emails or letters?

We will start out with a very basic introduction to different kinds of corpora, different corpus-linguistic methods, and how to compile our own corpora that are tailored to the research questions we want to investigate. The second third of the class will be devoted to a practical, step-by-step introduction to different corpus software tools that we can use to browse and search corpora in order to extract relevant data. In the last third of the class, we will deepen our corpus-linguistic expertise by reading a variety of corpus-linguistics papers that address questions like the ones above; we will then replicate, extend, or modify the analyses described in these papers to do our own little research projects.


Course evaluation