Yeah, it’s a long time from now (or so it seems) … I will be teaching a three-week class on corpus-based research on syntactic processing and syntactic representation at the LSA Institute 2009, to be held at UC Berkeley. Right now, I only have a bunch of vague ideas for that class, so if you have ideas, you have about a year to leave them here in the form of a comment ;-). Some topics I think should be covered:
- syntactic searches in syntactically annotated corpora and corpora that lack such annotation
- trade-offs of corpus-based research on language processing (compared to experiments), such as a priori balancing vs. planned control via statistical models
- statistical analysis of clustered data (mixed models, bootstrapping)
- one or two research themes that are of theoretical interest (e.g. the nature of structural representations; optimal production: efficiency and communicative pressures)
I am thinking of two evening tutorials to get people started on corpus work. The class itself would be at a more advanced level, doing actual research (as much as that is possible with 6 meetings).