2015 Linguistic Summer Institute
Linguistic theory in a world of big data
The Institute runs for four weeks, divided into two sessions of two weeks each, as follows:
- Four-week classes: July 6-31
- Session 1 (2-week classes): July 6-17
- Session 2 (2-week classes): July 20-31
Linguistic Theory in a World of Big Data
This theme highlights a growing interest within linguistics in testing theory against increasingly large data sets, such as data from extensive linguistic fieldwork and documentation, data from crowd-sourcing over the web, and corpus data from archival recordings and/or written sources.
It captures several areas of emerging disciplinary interest, such as the breakdown of a longstanding fundamental distinction between the categorical, discrete nature of linguistic competence and usage-based gradient variation.
With increased scholarly attention being directed toward the gradient characteristics of both linguistic knowledge and use, and toward the probabilistic properties that language knowledge and use may exhibit, some of the necessary groundwork is in place for moving toward a unifying theoretical approach.
News and Updates
Friday, June 19, 2015
We are still accepting applications to register for Field Methods (taught by Lenore Grenoble and Hale Professor Anthony Woodbury). If you would like to be considered for this class, please fill out the following survey to apply: https://www.surveymonkey.com/s/86V3C7D. The new application deadline is July 3. If your application is successful, you will be contacted to discuss how to adjust your schedule to accommodate Field Methods.
The web is full of freely available data that is just waiting to be explored by the capable analyst. In this course, we will survey some of the freely available web data sources and discuss linguistic research projects that have been conducted with them. We will emphasize the Buckeye Corpus, the Lexicon Projects, dictionary data, Google Ngram and the TV News Archive, as well as resources from less-studied languages. We will discuss projects that relate to a broad range of linguistic topics, including speech/phonetics, semantics and gesture.