multilevel models

Updated slides on GLM, GLMM, plyr, etc. available

Posted on

Some of you asked for the slides to the Mixed effect regression class I taught at the 2013 LSA Summer Institute in Ann Arbor, MI. The class covered some Generalized Linear Model, Generalized Linear Mixed Models, extensions beyond the linear model, simulation-based approaches to assessing the validity (or power) of your analysis, data summarization and visualization, and reporting of results. The class included slides from Maureen Gillespie, Dave Kleinschmidt, and Judith Degen (see above link). Dave even came by to Ann Arbor and gave his lecture on the awesome power of plyr (and reshape etc.), which I recommend. You might also just browse through them to get an idea of some new libraries (such as Stargazer for quick and nice looking latex tables). There’s also a small example to work through for time series analysis (for beginners).

Almost all slides were created in knitr and latex (very conveniently integrated into RStudio — I know some purists hate it, but comm’on), so that the code on the slides is the code that generated the output on the slides. Feedback welcome.



New R library for multilevel modeling

Posted on

This might be of interest to many of you. MLwiN, a software package for multilevel modeling developed at Bristol that includes functions beyond those present in, e.g., lmer, now has an interface for R (kinda like WinBugs, etc.), so that you can continue to use R while taking advantage of the powerful tools in MLwiN. The package is called R2MLwiN. For more details, see below.

Dear all,
We are pleased to announce a new R package, R2MLwiN (Zhang et al. 2012)
that allows R users access to the functionality within MLwiN directly from
within the R package. This package has been developed as part of the e-STAT
ESRC digital social research programme grant along with the Stat-JR package.
See <> for more details
including examples taken from the book MCMC Estimation in MLwiN.

Feedback gratefully received by either me or Zhengzheng Zhang (

Best wishes,
  Bill Browne. 

New R resource for ordinary and multilevel regression modeling

Posted on

Here’ s what I received from the Center of Multilevel Modeling at Bristol (I haven’t checked it out yet; registration seems to be free but required):

The Centre for Multilevel Modelling is very pleased to announce the addition of
R practicals to our free on-line multilevel modelling course. These give
detailed instructions of how to carry out a range of analyses in R, starting
from multiple regression and progressing through to multilevel modelling of
continuous and binary data using the lmer and glmer functions.

MLwiN and Stata versions of these practicals are already available.
You will need to log on or register onto the course to view these


R code for Jaeger, Graff, Croft and Pontillo (2011): Mixed effect models for genetic and areal dependencies in linguistic typology: Commentary on Atkinson

Posted on Updated on

Below I am sharing the R code for our paper on the serial founder effect:
This paper is a commentary on Atkinson’s 2011 Science article on the serial founder model (see also this interview with ScienceNews, in which parts of our comment in Linguistic Typology and follow-up work are summarized). In the commentary, we provide an introduction to linear mixed effect models for typological research. We discuss how to fit and to evaluate these models, using Atkinson’s data as an example.We illustrate the use of crossed random effects to control for genetic and areal relations between languages. We also introduce a (novel?) way to model areal dependencies based on an exponential decay function over migration distances between languages.
Finally, we discuss limits to the statistical analysis due to data sparseness. In particular, we show that the data available to Atkinson did not contain enough language families with sufficiently many languages to test whether the observed effect holds once random by-family slopes (for the effect) are included in the model. We also present simulations that show that the Type I error rate (false rejections) of the approach taken in Atkinson is many times higher than conventionally accepted (i.e. above .2 when .05 is the conventionally accepted rate of Type errors).
The scripts presented below are not intended to allow full replication of our analyses (they lack annotation and we are not allowed to share the WALS data employed by Atkinson on this site anyway). However, there are many plots and tests in the paper that might be useful for typologists or other users of mixed models. For that reason, I am for now posting the raw code. Please comment below if you have questions and we will try to provide additional annotation for the scripts as needed and as time permits. If you find (parts of the) script(s) useful, please consider citing our article in Linguistic Typology.

Diagnosing collinearity in mixed models from lme4

Posted on Updated on

I’ve just uploaded files containing some useful functions to a public git repository. You can see the files directly without worrying about git at all by visiting regression-utils.R (direct download) and mer-utils.R (direct download). Read the rest of this entry »

Annotated example analysis using mixed models

Posted on Updated on

Jessica Nelson (Learning Research and Development Center, University of Pittsburgh) uploaded a step-by-step example analysis using mixed models to her blog. Each step is nicely annotated and Jessica also discusses some common problems she encountered while trying to analyze her data using mixed models. I think this is a nice example for anyone trying to learn to use mixed models. It goes through all/most of the steps outlined in Victor Kuperman and my WOMM tutorial (click on the graph to see it full size):

Random effect: Should I stay or should I go?

Posted on Updated on

One of the more common questions I get about mixed models is whether there are any standards regarding the removal of random effects from the model. When should a random effect be included in the model? This was also one of the questions we had hope to answer for our field (psycholinguistics) in the pre-CUNY Workshop on Ordinary and Multilevel Models (WOMM), but I don’t think we got anywhere close to a “standard” (see Harald Baayen’s presentation on understanding random effect correlations though for a very insightful discussion).

That being said, I find most of us would probably agree on a set of rules of thumb, at least for factorial analyses of balanced data: Read the rest of this entry »