R code for Jaeger, Graff, Croft and Pontillo (2011): Mixed effect models for genetic and areal dependencies in linguistic typology: Commentary on Atkinson

Below I am sharing the R code for our paper on the serial founder effect (Jaeger, Graff, Croft and Pontillo 2011).
This paper is a commentary on Atkinson’s 2011 Science article on the serial founder model (see also this interview with ScienceNews, in which parts of our comment in Linguistic Typology and follow-up work are summarized). In the commentary, we provide an introduction to linear mixed effect models for typological research. We discuss how to fit and evaluate these models, using Atkinson’s data as an example. We illustrate the use of crossed random effects to control for genetic and areal relations between languages. We also introduce a (novel?) way to model areal dependencies based on an exponential decay function over migration distances between languages.
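To make the idea of an exponential decay over distances concrete, here is a minimal Python sketch (not the R code from the paper). The functional form `exp(-d / decay_km)`, the parameter name `decay_km`, its default value, and the three-language distance matrix are all assumptions made up for illustration; the paper's actual parameterization may differ.

```python
import math

def areal_weight(distance_km, decay_km=1000.0):
    """Hypothetical areal weight: 1 at distance 0, shrinking exponentially.

    decay_km is an assumed scale parameter (distance at which the
    weight has dropped to exp(-1)); it is not taken from the paper.
    """
    if distance_km < 0:
        raise ValueError("distance must be non-negative")
    return math.exp(-distance_km / decay_km)

def areal_weight_matrix(distances, decay_km=1000.0):
    """Apply areal_weight to every cell of a pairwise distance matrix."""
    return [[areal_weight(d, decay_km) for d in row] for row in distances]

# Made-up pairwise migration distances (km) among three languages.
distances = [
    [0, 500, 4000],
    [500, 0, 3500],
    [4000, 3500, 0],
]

weights = areal_weight_matrix(distances)
for row in weights:
    print(["%.3f" % w for w in row])
```

Under these assumptions, nearby languages receive weights close to 1 and distant languages receive weights close to 0, which is the qualitative behavior one would want from an areal dependency that fades with distance.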
Finally, we discuss limits to the statistical analysis due to data sparseness. In particular, we show that the data available to Atkinson did not contain enough language families with sufficiently many languages to test whether the observed effect holds once random by-family slopes (for the effect) are included in the model. We also present simulations that show that the Type I error rate (false rejections) of the approach taken in Atkinson is many times higher than conventionally accepted (i.e. above .2 when .05 is the conventionally accepted rate of Type I errors).
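The general mechanism behind inflated Type I error rates can be illustrated with a toy simulation in Python. This is not the simulation from the paper and does not use Atkinson's data: the number of families, languages per family, variances, and the use of a normal-approximation cutoff of 1.96 are all assumptions. The predictor has no true effect, but both predictor and outcome share family-level structure that an ordinary regression ignores.

```python
import math
import random

def slope_t_stat(xs, ys):
    """t statistic for the slope of a simple ordinary least squares fit."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((y - intercept - slope * x) ** 2 for x, y in zip(xs, ys))
    std_err = math.sqrt(rss / (n - 2) / sxx)
    return slope / std_err

def false_positive_rate(n_sims=500, n_families=20, n_per_family=10,
                        crit=1.96, seed=1):
    """Share of null simulations in which OLS 'finds' an effect.

    Assumed toy setup: the predictor is constant within a family, and
    the outcome has a random family intercept but NO effect of x.
    """
    rng = random.Random(seed)
    rejections = 0
    for _sim in range(n_sims):
        xs, ys = [], []
        for _family in range(n_families):
            family_x = rng.gauss(0, 1)          # family-level predictor
            family_intercept = rng.gauss(0, 1)  # ignored by OLS
            for _language in range(n_per_family):
                xs.append(family_x)
                ys.append(family_intercept + rng.gauss(0, 1))
        if abs(slope_t_stat(xs, ys)) > crit:
            rejections += 1
    return rejections / n_sims

rate = false_positive_rate()
print("false positive rate ignoring families: %.2f" % rate)
```

Because the regression treats all 200 data points as independent when there are effectively only 20 independent families, the printed rate should come out well above the nominal .05 in this toy setting.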
The scripts presented below are not intended to allow full replication of our analyses (they lack annotation and we are not allowed to share the WALS data employed by Atkinson on this site anyway). However, there are many plots and tests in the paper that might be useful for typologists or other users of mixed models. For that reason, I am for now posting the raw code. Please comment below if you have questions and we will try to provide additional annotation for the scripts as needed and as time permits. If you find (parts of the) script(s) useful, please consider citing our article in Linguistic Typology.

Mixed models and Simpson’s paradox

For a paper I am currently working on, I started to think about Simpson’s paradox, which Wikipedia succinctly defines as

“a paradox in which a correlation (trend) present in different groups is reversed when the groups are combined. This result is often encountered in social-science […]”

The Wikipedia page also gives a nice visual illustration. Here’s my own version of it. The plot shows 15 groups, each with 20 data points. The groups happen to be ordered along the x-axis (“Pseudo distance from origin”) in a way that suggests a negative trend of the Pseudo distance from origin against the outcome (“Pseudo normalized phonological diversity”). However, this trend does not hold within groups. As a matter of fact, in this particular sample, most groups show the opposite of the global trend (10 out of 15 within-group slopes are clearly positive). If this data set is analyzed by an ordinary linear regression (which does not have access to the grouping structure), the result will be a significant negative slope for the Pseudo distance from origin. So, I got curious: what about linear mixed models?
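For readers who want to reproduce the flavor of this setup, here is a minimal Python sketch (not the code behind the plot). Only the 15 groups of 20 points come from the description above; the group centers, the within-group slope of 0.8, and the noise level are assumptions chosen so that every group trends upward while the pooled data trend downward.

```python
import random

def ols_slope(xs, ys):
    """Slope of a simple ordinary least squares regression of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

def simulate_groups(n_groups=15, n_per_group=20, seed=1):
    """Groups whose centers fall along a negative line, with a positive
    trend inside each group (all numeric choices are assumptions)."""
    rng = random.Random(seed)
    groups = []
    for g in range(n_groups):
        xs, ys = [], []
        for _ in range(n_per_group):
            offset = rng.uniform(-0.5, 0.5)  # position within the group
            xs.append(g + offset)
            # group center drops by 1 per group; within-group slope is +0.8
            ys.append(-1.0 * g + 0.8 * offset + rng.gauss(0, 0.1))
        groups.append((xs, ys))
    return groups

groups = simulate_groups()
pooled_x = [x for xs, _ in groups for x in xs]
pooled_y = [y for _, ys in groups for y in ys]

pooled_slope = ols_slope(pooled_x, pooled_y)
within_slopes = [ols_slope(xs, ys) for xs, ys in groups]

print("pooled slope: %.2f" % pooled_slope)
print("positive within-group slopes: %d of %d"
      % (sum(s > 0 for s in within_slopes), len(within_slopes)))
```

The pooled regression sees only the downward drift of the group centers, whereas each per-group regression recovers the upward trend inside that group.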

more from LSA 2009

Hal Tily gave an interesting talk about how processing factors that influence synchronic word order variation ultimately lead to diachronic word order change, focusing on the SOV-SVO variation in Old English. Locality-based processing constraints lead speakers to prefer orders that minimize the distance between syntactically dependent elements (Hawkins 2004, Gibson 2000). All things being equal, then, SVO structures will be preferable to SOV structures, and this preference will be more pronounced as the weight of the object increases. With no additional pressure making SOV structures preferable in circumstances where weight does not play much of a role, SVO structures will become more frequent (and in fact they do). The next question is how the synchronic production preference eventually leads to diachronic change. Hal suggests that language learners induce structure from their input and do not correct for processing factors such as weight. A simulation shows that VO order will become dominant (and effectively grammaticalized) in a situation where language users select utterances in a way that is sensitive to weight, but where language learners essentially estimate a regression without a term for weight.

Of course, questions remain: why was there SOV in Old English to begin with, if there is an overarching preference for SVO? Why aren’t all languages SVO? Maybe SOV structures had the same sorts of processing advantages (anti-locality effects) that modern verb-final languages seem to have, and these gradually dried up as the English case system disappeared. In any case, I thought this study nicely illustrated a way of linking processing phenomena and language change, and it raises some interesting questions.
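The logic of such a simulation can be sketched in a few lines of Python. This is my own toy reconstruction of the idea, not the simulation presented in the talk: the logistic choice rule, the starting VO rate of 0.2, the size of the weight effect, the exponential distribution of object weights, and the intercept-only learner are all assumptions.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def logit(p):
    return math.log(p / (1.0 - p))

def simulate_generations(p_vo=0.2, weight_effect=0.5, n_generations=30,
                         n_utterances=2000, seed=1):
    """Track the learned baseline rate of VO order across generations.

    Speakers: choose VO with probability sigmoid(baseline + effect * weight),
    so heavier objects favor VO. Learners: estimate only the overall VO
    rate (a regression with an intercept but no weight term) and adopt it
    as their own baseline.
    """
    rng = random.Random(seed)
    history = [p_vo]
    for _generation in range(n_generations):
        baseline = logit(p_vo)
        n_vo = 0
        for _utterance in range(n_utterances):
            weight = rng.expovariate(1.0)  # assumed object weight, >= 0
            if rng.random() < sigmoid(baseline + weight_effect * weight):
                n_vo += 1
        # Learner's intercept-only estimate, clamped to keep logit finite.
        p_vo = min(max(n_vo / n_utterances, 1e-6), 1.0 - 1e-6)
        history.append(p_vo)
    return history

history = simulate_generations()
print("VO rate, first generation: %.2f" % history[0])
print("VO rate, last generation:  %.2f" % history[-1])
```

In this toy version, each generation of learners folds the weight-driven surplus of VO utterances into its baseline, so the baseline ratchets upward even though no individual speaker's sensitivity to weight changes.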