This course is a practical tour of computational methods useful to practicing linguists. It covers the basics of how text files are encoded on computers, along with common methods for manipulating these files, extracting their contents, and converting between formats. It also covers modern tools and algorithms for indexing and search, including regular expression grammars and databases. Students will learn how to build or train some common tools (e.g., concordances, indices, part-of-speech taggers, chunkers, and parsers) in a variety of languages. The tools, methods, and metrics of modern annotation in computational linguistics will be discussed in considerable depth, including the mechanisms for crowdsourced annotation. After completing this course, students will be able to construct simple programs in Python and JavaScript. No background in programming or UNIX is assumed.
This GitHub repo has sections for:
- classNotes
- logistical information (see the 144 syllabus and 244 syllabus here)
- scripts
- external resources
Pranav Anand
Oliver Northrup
Class: Cowell 134, MW 17.00-18.45
Section: Crown Mac Lab, T 12.00-13.10, 16.00-17.10
Pranav's Office Hours: Stevenson 260, W 15.30-16.30
Oliver's Office Hours: Stevenson 223, M 15.30-16.30