2015 Linguistic Summer Institute
Linguistic theory in a world of big data
The Institute comprises four-week classes and two sessions of two-week classes, as follows:
- Four-week classes: July 6-31
- Session 1 (two-week classes): July 6-17
- Session 2 (two-week classes): July 20-31
Linguistic Theory in a World of Big Data
This theme highlights a growing interest within linguistics in testing theory against ever-larger data sets, such as data from extensive linguistic fieldwork and documentation, data crowd-sourced over the web, and corpus data drawn from archival recordings and written sources.
It captures several emerging disciplinary interests, such as the breakdown of a longstanding fundamental distinction between the categorical, discrete nature of linguistic competence and usage-based gradient variation.
With increased scholarly attention directed toward the gradient characteristics of linguistic knowledge and use, and toward the probabilistic properties that both may exhibit, some of the necessary groundwork is now in place for moving toward a unifying theoretical approach.
News and Updates
Friday, June 19, 2015
We are still accepting applications for Field Methods (taught by Lenore Grenoble and Hale Professor Anthony Woodbury). If you would like to be considered for this class, please apply by filling out the following survey: https://www.surveymonkey.com/s/86V3C7D. The new application deadline is July 3. If your application is successful, you will be contacted to discuss how to adjust your schedule to accommodate Field Methods.
With the availability of large corpora and the computational power to process them efficiently, modern lexicography has undergone a revolution. Statistical techniques, now central to the field, enable lexicographers to produce higher-quality dictionaries at lower cost. This course introduces modern corpus lexicography, focusing on monolingual dictionaries and using English as an exemplar. We discuss the need for large corpora, how to build them, and the key statistical corpus methods used in modern lexicography.