R code for Jaeger, Graff, Croft and Pontillo (2011): Mixed effect models for genetic and areal dependencies in linguistic typology: Commentary on Atkinson
- Jaeger, Graff, Croft, and Pontillo. 2011. Mixed effect models for genetic and areal dependencies in linguistic typology: Commentary on Atkinson. Linguistic Typology 15(2), 281–319. [if you’re not subscribed to Linguistic Typology, check out this pre-final draft or contact me for an offprint].
For a paper I am currently working on, I started to think about Simpson’s paradox, which Wikipedia succinctly defines as
“a paradox in which a correlation (trend) present in different groups is reversed when the groups are combined. This result is often encountered in social-science […]”
The Wikipedia page also gives a nice visual illustration. Here’s my own version of it. The plot shows 15 groups, each with 20 data points. The groups happen to be ordered along the x-axis (“Pseudo distance from origin”) in a way that suggests a negative trend of the pseudo distance from origin against the outcome (“Pseudo normalized phonological diversity”). However, this trend does not hold within groups. In fact, in this particular sample, most groups show the opposite of the global trend (10 out of 15 within-group slopes are clearly positive). If this data set is analyzed with an ordinary linear regression (which has no access to the grouping structure), the result is a significant negative slope for the pseudo distance from origin. So, I got curious: what about linear mixed models?
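To make the pattern concrete, here is a minimal sketch in plain Python (standard library only) that simulates data of the kind described above and compares the pooled slope with the within-group slopes. It is not the code behind the plot: the group centers, the slope distribution, the noise level, and the seed are all assumptions chosen for illustration, and the last estimate is a simple group-mean-centered regression, not a linear mixed model.

```python
import random


def ols_slope(xs, ys):
    """Least-squares slope of y regressed on x (with an intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def simulate(n_groups=15, n_per_group=20, seed=1):
    """Simulate groups whose means fall along x while within-group slopes
    are positive on average. All numeric settings are illustrative assumptions."""
    rng = random.Random(seed)
    groups = []
    for g in range(n_groups):
        center_x = float(g)          # groups are ordered along the x-axis
        center_y = -1.0 * g          # group means decrease as x increases
        slope = rng.gauss(0.5, 0.4)  # within-group slope, usually positive
        xs, ys = [], []
        for _ in range(n_per_group):
            dx = rng.uniform(-0.5, 0.5)
            xs.append(center_x + dx)
            ys.append(center_y + slope * dx + rng.gauss(0.0, 0.1))
        groups.append((xs, ys))
    return groups


def center(values):
    """Subtract the mean from every value."""
    m = sum(values) / len(values)
    return [v - m for v in values]


groups = simulate()

# Pooled regression: ignores the grouping structure entirely.
all_x = [x for xs, _ in groups for x in xs]
all_y = [y for _, ys in groups for y in ys]
pooled_slope = ols_slope(all_x, all_y)

# One regression per group.
within_slopes = [ols_slope(xs, ys) for xs, ys in groups]
n_positive = sum(s > 0 for s in within_slopes)

# Single slope after removing each group's mean from x and y.
centered_x = [x for xs, _ in groups for x in center(xs)]
centered_y = [y for _, ys in groups for y in center(ys)]
centered_slope = ols_slope(centered_x, centered_y)

print("pooled slope:", round(pooled_slope, 3))
print("groups with positive slope:", n_positive, "of", len(groups))
print("group-mean-centered slope:", round(centered_slope, 3))
```

The pooled estimate is dominated by the differences between group means, so its sign follows the between-group trend. Once the group means are removed, only the within-group relationship is left.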
Hal Tily gave an interesting talk about how processing factors that influence synchronic word order variation ultimately lead to diachronic word order change, focusing on the SOV-SVO variation in Old English. Locality-based processing constraints lead speakers to prefer orders that minimize the distance between syntactically dependent elements (Hawkins 2004, Gibson 2000). All things being equal, then, SVO structures will be preferable to SOV structures, and this preference will be more pronounced as the weight of the object increases. With no additional pressure making SOV structures preferable in circumstances where weight does not play much of a role, SVO structures will become more frequent (and in fact they do).
The next question is how the synchronic production preference eventually leads to diachronic change. Hal suggests that language learners induce structure from their input and do not correct for processing factors such as weight. A simulation shows that VO order will become dominant (and effectively grammaticalized) in a situation where language users select utterances in a way that is sensitive to weight, but where language learners essentially estimate a regression without a term for weight.
Of course, questions remain: why was there SOV in Old English to begin with, if there is an over-arching preference for SVO? Why aren’t all languages SVO? Maybe SOV structures had the same sorts of processing advantages (anti-locality effects) that modern verb-final languages seem to have, and these gradually dried up as the English case system disappeared. In any case, I thought this study nicely illustrated a way of linking processing phenomena and language change, and it raises some interesting questions.
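The logic of the simulation described above can be illustrated with a toy iterated-learning sketch. This is not Hal's simulation, whose details are not given here. Everything in it is an assumption: the logistic form of the speaker's choice, the object weights, the size of the weight effect, the starting VO rate, and the number of generations. Speakers choose VO with a probability that rises with object weight. Learners fit an intercept-only model, which amounts to taking the overall VO rate of their input as the weight-independent baseline.

```python
import math


def sigmoid(z):
    """Logistic function: maps a log-odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def logit(p):
    """Inverse of sigmoid: maps a probability to log-odds."""
    return math.log(p / (1.0 - p))


def next_generation(base_logit, weight_effect, weights):
    """One round of production followed by learning (toy assumptions)."""
    # Speakers: P(VO | weight) = sigmoid(base_logit + weight_effect * weight)
    p_vo = sum(sigmoid(base_logit + weight_effect * w) for w in weights) / len(weights)
    # Learners: no term for weight, so the overall VO rate becomes the new baseline
    return logit(p_vo)


def simulate_change(initial_p_vo=0.2, weight_effect=0.5,
                    weights=(0, 1, 2, 3, 4), generations=10):
    """Return the baseline VO probability for each generation.

    All default values are hypothetical and chosen only for illustration.
    """
    base = logit(initial_p_vo)
    history = [initial_p_vo]
    for _ in range(generations):
        base = next_generation(base, weight_effect, weights)
        history.append(sigmoid(base))
    return history


history = simulate_change()
for generation, p in enumerate(history):
    print(generation, round(p, 4))
```

Because heavier objects push speakers toward VO and learners attribute that surplus to the baseline, the baseline VO rate rises in every generation in this toy setting. It says nothing about why SOV was there in the first place.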