14 May 09

Random effect: Should I stay or should I go?

One of the more common questions I get about mixed models is whether there are any standards for removing random effects from a model. When should a random effect be included? This was also one of the questions we had hoped to answer for our field (psycholinguistics) in the pre-CUNY Workshop on Ordinary and Multilevel Models (WOMM), but I don’t think we got anywhere close to a “standard” (though see Harald Baayen’s presentation on understanding random effect correlations for a very insightful discussion).

That being said, I think most of us would probably agree on a set of rules of thumb, at least for factorial analyses of balanced data:

  • for balanced data sets, start with fully crossed and fully specified random effects, e.g. for y ~ a * b fit lmer(y ~ a * b + (1 + a * b | subject) + (1 + a * b | item), data)
  • if that does not converge because any of the to-be-estimated variances of the random effects are effectively zero, then simplify, e.g.
    • lmer(y ~ a * b + (1 + a + b | subject) + (1 + a + b | item), data)
    • lmer(y ~ a * b + (1 + a * b | subject) + (1 + a | item), data) or lmer(y ~ a * b + (1 + a * b | subject) + (1 + b | item), data)
    • lmer(y ~ a * b + (1 + a * b | subject) + (1 | item), data)
    • etc. I usually reduce the item effects first, because (at least in researcher-made experiments) item variances are usually much smaller than subject variances.
    • at some point one of these simpler models will converge
  • check the correlations between random effects (see also Baayen’s WOMM presentation, available on the blog [google: hlp lab womm, linked to schedule]). If there are high correlations, check whether you can remove further random effect terms (following the hierarchy principle). Use the procedures outlined in, e.g., Baayen, Davidson, and Bates (2008, JML) or Baayen (2008, book) for random effect model comparisons. I use REML-fitted models when I want to test whether the removal of a random term is significant for a linear mixed effect model (because REML is less biased in estimating variances than ML), but apparently the parameter estimates are usually very similar for both estimation methods anyway.
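
To make the steps above concrete, here is a minimal sketch in R using lme4. The data are simulated purely for illustration (the design size, effect sizes, and variable names subj_int, subj_a, and item_int are my assumptions, not from any real experiment), and I skip straight from the full model to the item-intercept-only model rather than walking through every intermediate step. The full model may well produce convergence or singular-fit messages here, which is exactly the cue to simplify.

```r
library(lme4)

# Simulated balanced 2 x 2 design (hypothetical data, illustration only):
# 24 subjects crossed with 16 items, each combination seen in all four cells.
set.seed(1)
d <- expand.grid(subject = factor(1:24), item = factor(1:16),
                 a = c(-0.5, 0.5), b = c(-0.5, 0.5))
subj_int <- rnorm(24, sd = 1.0)  # by-subject intercepts
subj_a   <- rnorm(24, sd = 0.5)  # by-subject slopes for a
item_int <- rnorm(16, sd = 0.3)  # by-item intercepts
d$y <- 1 + 0.4 * d$a + 0.2 * d$b +
  subj_int[d$subject] + subj_a[d$subject] * d$a +
  item_int[d$item] + rnorm(nrow(d), sd = 1)

# Step 1: fully specified random effects for subjects and items.
m_full <- lmer(y ~ a * b + (1 + a * b | subject) + (1 + a * b | item),
               data = d, REML = TRUE)

# Step 2: reduce the item terms first, keeping the subject terms.
m_red <- lmer(y ~ a * b + (1 + a * b | subject) + (1 | item),
              data = d, REML = TRUE)

# Step 3: inspect variances and correlations of the random effects.
print(VarCorr(m_red))
print(isSingular(m_red))  # TRUE if a variance sits at (or near) zero

# Step 4: likelihood-ratio comparison of the two REML fits.
# refit = FALSE keeps the REML fits instead of refitting with ML.
cmp <- anova(m_red, m_full, refit = FALSE)
print(cmp)
```

A non-significant comparison in the last step would suggest that the extra item terms are not pulling their weight. Both models share the same fixed effects, which is what makes comparing the REML fits reasonable here.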

The function aovlmer.fnc() in Baayen’s languageR library (for R) allows comparisons of models that differ only in terms of random effects. I also expect there to be functions pretty soon that automate this process somewhat.
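
Until such functions exist, the search itself is easy to script. The sketch below is a hypothetical helper of my own (first_clean_fit is not part of lme4 or languageR): it tries a list of candidate formulas in order, from most to least complex, and returns the first fit that raises no warning and is not singular. The usage example runs on the sleepstudy data that ships with lme4.

```r
library(lme4)

# Hypothetical helper (not part of lme4 or languageR): fit candidate
# formulas in the given order and return the first model that fits
# without a warning or error and is not singular.
first_clean_fit <- function(formulas, data) {
  for (f in formulas) {
    fit <- tryCatch(lmer(f, data = data, REML = TRUE),
                    warning = function(w) NULL,
                    error   = function(e) NULL)
    if (!is.null(fit) && !isSingular(fit)) return(fit)
  }
  NULL  # no candidate fit cleanly
}

# Usage: candidates ordered from most to least complex random effects.
candidates <- list(
  Reaction ~ Days + (1 + Days | Subject),
  Reaction ~ Days + (1 | Subject)
)
fit <- first_clean_fit(candidates, sleepstudy)
print(formula(fit))
```

Treating every warning as a failure is a deliberately blunt choice. It only finds a model that fits cleanly; whether the dropped terms matter should still be checked with a model comparison.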

As always, updates, comments, and questions are welcome.

