I was recently pointed to this nice and very accessible tutorial on linear mixed effects regression and how to run these models in R by Bodo Winter (at UC Merced). If you have little or no background in this type of model, I recommend you pair it with a good conceptual introduction to these models like Gelman and Hill 2007 and perhaps some slides from our LSA 2013 tutorial.
There are a few things I’d like to add to Bodo’s suggestions regarding how to report your results:
- be clear about how you coded the variables, since this changes the interpretation of the coefficients (the betas that are often reported). E.g., say whether you sum- or treatment-coded your factors, whether you centered or standardized continuous predictors, etc. As part of this, also be clear about the direction of the coding. For example, state that you “sum-coded gender as female (1) vs. male (-1)”. Alternatively, report your results in a way that clearly states the directionality (e.g., “Gender=male, beta = XXX”).
- please also report whether collinearity was an issue. E.g., report the highest fixed effect correlations.
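To make the point about coding concrete, here is a minimal sketch on made-up data. Everything in it is an assumption for illustration: the response times, the variable names, and the helper `fit_simple_ols` are hypothetical, and it fits an ordinary least-squares line by hand in plain Python rather than a mixed model in R. The same data yield different intercepts and betas depending only on how the factor is coded.

```python
from statistics import mean


def fit_simple_ols(x, y):
    """Return (intercept, slope) of the least-squares line for one predictor."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Hypothetical toy data: three observations per gender (values are made up).
gender = ["female", "female", "female", "male", "male", "male"]
rt = [500.0, 520.0, 510.0, 560.0, 570.0, 550.0]

# Treatment coding: female is the reference level (0), male is 1.
treatment = [0.0 if g == "female" else 1.0 for g in gender]

# Sum coding: female (1) vs. male (-1), as in the reporting example above.
sum_coded = [1.0 if g == "female" else -1.0 for g in gender]

b0_t, b1_t = fit_simple_ols(treatment, rt)
b0_s, b1_s = fit_simple_ols(sum_coded, rt)

print("treatment coding: intercept =", b0_t, "beta =", b1_t)
print("sum coding:       intercept =", b0_s, "beta =", b1_s)
```

With treatment coding, the intercept is the female mean (510) and the beta is the full male minus female difference (50). With sum coding, the intercept is the grand mean (535) and the beta is half the difference with the opposite sign (-25), because female was coded as 1. A reader who sees only “beta = -25” cannot tell which of these situations they are looking at unless the coding and its direction are reported.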
Some of you asked for the slides to the mixed effects regression class I taught at the 2013 LSA Summer Institute in Ann Arbor, MI. The class covered Generalized Linear Models, Generalized Linear Mixed Models, extensions beyond the linear model, simulation-based approaches to assessing the validity (or power) of your analysis, data summarization and visualization, and reporting of results. The class included slides from Maureen Gillespie, Dave Kleinschmidt, and Judith Degen (see above link). Dave even came to Ann Arbor and gave his lecture on the awesome power of plyr (and reshape etc.), which I recommend. You might also just browse through the slides to get an idea of some new libraries (such as Stargazer for quick and nice-looking LaTeX tables). There’s also a small example to work through for time series analysis (for beginners).
Almost all slides were created in knitr and LaTeX (very conveniently integrated into RStudio; I know some purists hate it, but c’mon), so that the code on the slides is the code that generated the output on the slides. Feedback welcome.
This post is in reply to a recent question on ling-R-lang by Meredith Tamminga. Meredith was wondering whether an analysis she had in mind for her project was circular, that is, whether the analysis itself would cause the pattern of results predicted by the hypothesis she was interested in testing. I felt her question (described below in more detail) was an interesting example that might best be answered with some simulations. Reasoning through an analysis can, of course, help a lot in understanding (or better, as in Meredith’s case, anticipating) problems with the interpretation of the results. Not infrequently, however, I find that intuition fails or isn’t sufficiently conclusive. In those cases, simulations can be a powerful tool for understanding your analysis. So, I decided to give it a go and use this as an example of how one might approach this type of question.
I thought this was worth reposting from ling-R-lang: Sven Hohenstein from the University of Potsdam prepared this script to obtain CIs for, e.g., bar charts from lmer() output. I haven’t tried it yet, but it looks like it will be useful.
R code for Jaeger, Graff, Croft and Pontillo (2011): Mixed effect models for genetic and areal dependencies in linguistic typology: Commentary on Atkinson
- Jaeger, Graff, Croft, and Pontillo. 2011. Mixed effect models for genetic and areal dependencies in linguistic typology: Commentary on Atkinson. Linguistic Typology 15(2), 281–319. [if you’re not subscribed to Linguistic Typology, check out this pre-final draft or contact me for an offprint].
While searching for something else, I just came across two papers that should be of interest to folks working with mixed models.
- Schielzeth, H. and Forstmeier, W. 2009. Conclusions beyond support: overconfident estimates in mixed models. Behavioral Ecology, Volume 20, Issue 2, 416-420. I have seen the same point being made in several papers under review and at a recent CUNY (e.g. Doug Roland’s 2009? CUNY poster). On the one hand, it should be absolutely clear that random intercepts alone are often insufficient to account for violations of independence (this is a point I make every time I teach a tutorial). On the other hand, I have reviewed quite a number of papers where this mistake was made. So, here you go, in black and white. The moral is (once again) that no statistical procedure does what you think it should do if you don’t use it the way it was intended to be used.
- The second paper takes on a more advanced issue, but one that is becoming more and more relevant. How can we test whether a random effect is unnecessary, i.e. whether it has a variance of 0? Currently, most people conduct model comparison (following Baayen, Davidson and Bates, 2008). But this approach is not recommended (nor do Baayen et al. recommend it) if we want to test whether all random effects can be completely removed from the model (cf. the very useful R FAQ list, which states “do not compare lmer models with the corresponding lm fits, or glmer/glm; the log-likelihoods […] include different additive terms”). This issue is taken on in Scheipl, F., Greven, S. and Küchenhoff, H. 2008. Size and power of tests for a zero random effect variance or polynomial regression in additive and linear mixed models. Computational Statistics & Data Analysis, Volume 52, Issue 7, 3283-3299. They present power comparisons of various tests.
I’ve just uploaded files containing some useful functions to a public git repository. You can see the files directly without worrying about git at all by visiting regression-utils.R (direct download) and mer-utils.R (direct download).