# lmer

### The ‘softer kind’ of tutorial on linear mixed effect regression

I recently was pointed to this nice and very accessible tutorial on linear mixed effects regression and how to run these models in R by **Bodo Winter** (at UC Merced). If you don’t have much or any background in this type of model, I recommend pairing it with a good conceptual introduction to these models, such as Gelman and Hill (2007), and perhaps some slides from our LSA 2013 tutorial.

There are a few things I’d like to add to Bodo’s suggestions regarding how to report your results:

- Be clear about how you coded the variables, since the coding changes the interpretation of the coefficients (the betas that are often reported). E.g., say whether you sum- or treatment-coded your factors, whether you centered or standardized continuous predictors, etc. As part of this, also be clear about the *direction* of the coding. For example, state that you “sum-coded gender as female (1) vs. male (-1)”. Alternatively, report your results in a way that clearly states the directionality (e.g., “Gender=male, beta = XXX”).
- Please also report whether collinearity was an issue, e.g., by reporting the highest fixed effect correlations.
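
For instance, here is a minimal base-R sketch (with a made-up data frame; the variable names are hypothetical) of how one might set the coding explicitly so it can be reported accurately:

```r
# Hypothetical data: coding choices should be set explicitly and reported.
set.seed(1)
d <- data.frame(gender = factor(rep(c("female", "male"), 50)),
                freq   = rexp(100))

# Sum-code gender as female (1) vs. male (-1), and say so in the paper.
contrasts(d$gender) <- contr.sum(2)
contrasts(d$gender)            # female = 1, male = -1

# Center the (log-transformed) continuous predictor, and report that too.
d$logfreq.c <- log(d$freq) - mean(log(d$freq))
round(mean(d$logfreq.c), 10)   # 0: the centered predictor has mean 0
```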

Happy reading.

### Updated slides on GLM, GLMM, plyr, etc. available

Some of you asked for the slides to the **Mixed effects regression class** I taught at the **2013 LSA Summer Institute in Ann Arbor, MI**. The class covered Generalized Linear Models, Generalized Linear Mixed Models, extensions beyond the linear model, simulation-based approaches to assessing the validity (or power) of your analysis, data summarization and visualization, and reporting of results. The class included slides from Maureen Gillespie, Dave Kleinschmidt, and Judith Degen (see above link). Dave even came by Ann Arbor and gave his lecture on the awesome power of plyr (and reshape etc.), which I recommend. You might also just browse through the slides to get an idea of some new libraries (such as stargazer for quick and nice-looking LaTeX tables). There’s also a small example to work through for time series analysis (for beginners).

Almost all slides were created in knitr and LaTeX (very conveniently integrated into RStudio; I know some purists hate it, but c’mon), so that the code on the slides is the code that generated the output on the slides. Feedback welcome.

### Is my analysis problematic? A simulation-based example

This post is in reply to a recent question on ling-R-lang by Meredith Tamminga. Meredith was wondering whether an analysis she had in mind for her project was circular, i.e., whether it would by itself produce the pattern of results predicted by the hypothesis she was interested in testing. I felt her question (described below in more detail) was an interesting example that might best be answered with some simulations. Reasoning through an analysis can, of course, help a lot in understanding (or better, as in Meredith’s case, anticipating) problems with the interpretation of the results. Not infrequently, however, I find that intuition fails or isn’t sufficiently conclusive. In those cases, **simulations can be a powerful tool in understanding your analysis**. So, I decided to give it a go and use this as an example of how one might approach this type of question.
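
To preview the general recipe with a minimal base-R sketch (made-up data, and `lm()` rather than a mixed model for simplicity): simulate data in which you know the hypothesis is false, run the planned analysis on each simulated data set, and check how often it nevertheless “finds” the predicted pattern.

```r
# Simulate under the null: y is unrelated to x by construction.
set.seed(42)
n.sims <- 200
p.values <- replicate(n.sims, {
  d <- data.frame(x = rnorm(50), y = rnorm(50))
  summary(lm(y ~ x, data = d))$coefficients["x", "Pr(>|t|)"]
})

# If the analysis is not circular or anti-conservative, roughly 5% of
# the simulations should come out significant at alpha = .05.
mean(p.values < .05)
```

If the proportion of significant simulations is much higher than the nominal alpha level even though the effect was absent by construction, the analysis is producing the predicted pattern on its own.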

### R code for Jaeger, Graff, Croft and Pontillo (2011): Mixed effect models for genetic and areal dependencies in linguistic typology: Commentary on Atkinson

- Jaeger, Graff, Croft, and Pontillo. 2011. Mixed effect models for genetic and areal dependencies in linguistic typology: Commentary on Atkinson. *Linguistic Typology* 15(2), 281–319. [If you’re not subscribed to Linguistic Typology, check out this pre-final draft or contact me for an offprint.]

The scripts are *not* intended to allow full replication of our analyses (they lack annotation, and we are not allowed to share the WALS data employed by Atkinson on this site anyway). However, there are many plots and tests in the paper that might be useful for typologists or other users of mixed models. For that reason, I am posting the raw code for now. Please comment below if you have questions, and we will try to provide additional annotation for the scripts as needed and as time permits.

**If you find (parts of the) script(s) useful, please consider citing our article in Linguistic Typology.**

### Two interesting papers on mixed models

While searching for something else, I just came across two papers that should be of interest to folks working with mixed models.

- Schielzeth, H. and Forstmeier, W. 2009.
**Conclusions beyond support: overconfident estimates in mixed models**. Behavioral Ecology Volume 20, Issue 2, 416-420. I have seen the same point being made in several papers under review and at a recent CUNY (e.g. Doug Roland’s 2009? CUNY poster). On the one hand, it should be absolutely clear that random intercepts alone are often insufficient to account for violations of independence (this is a point, I make every time I am teaching a tutorial). On the other hand, I have reviewed quite a number of papers, where this mistake was made. So, here you go. Black on white. The moral is (once again) that no statistical procedure does what you think it should do*if you don’t use it the way it was intended to*. - The second paper takes on a more advanced issue, but one that is becoming more and more relevant.
**How can we test whether a random effect is essentially non-necessary – i.e. that it has a variance of 0?**Currently, most people conduct model comparison (following Baayen, Davidson and Bates, 2008). But this approach is not recommended (and neither do Baayen et al recommend it) if we want to test whether*all*random effects can be completely removed from the model (cf. the very useful R FAQ list, which states “*do not*compare lmer models with the corresponding lm fits, or glmer/glm; the log-likelihoods […] include different additive terms”). This issue is taken on in Scheipl, F., Grevena, S. and Küchenhoff, H. 2008.**Size and power of tests for a zero random effect variance or polynomial regression in additive and linear mixed models.**Computational Statistics & Data Analysis.Volume 52, Issue 7, 3283-3299. They present power comparisons of various tests.
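
To illustrate the distinction with a sketch (assuming the *lme4* package and its built-in `sleepstudy` data): when testing a *single* random effect, one can compare two nested *mixed* models, both fit with `lmer()`, rather than comparing an lmer fit against an lm fit.

```r
library(lme4)

# Does the by-subject random slope for Days improve the model?
m.full <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy, REML = FALSE)
m.red  <- lmer(Reaction ~ Days + (1 | Subject),   sleepstudy, REML = FALSE)

# Likelihood-ratio test of the nested mixed models. Note that such
# boundary tests (variance = 0) tend to be conservative, which is part
# of what Scheipl et al. investigate.
anova(m.red, m.full)
```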

### Diagnosing collinearity in mixed models from lme4

I’ve just uploaded files containing some useful functions to a public git repository. You can see the files directly without worrying about git at all by visiting regression-utils.R (direct download) and mer-utils.R (direct download).
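
As a taste of what such diagnostics can look like, here is a hand-rolled sketch (not necessarily the code in those files; it assumes *lme4* and its `sleepstudy` data): inspect the correlations among the fixed-effect estimates via `vcov()`.

```r
library(lme4)

# Deliberately collinear predictors: Days and Days^2, uncentered.
m <- lmer(Reaction ~ Days + I(Days^2) + (1 | Subject), sleepstudy)

# Correlations among the fixed-effect estimates; large off-diagonal
# values flag collinearity.
corr.fixef <- cov2cor(as.matrix(vcov(m)))
round(corr.fixef, 2)
```

Centering `Days` before squaring it would substantially reduce the off-diagonal correlations, which is one standard remedy.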

### R code for LaTeX tables of lmer model effects

Here’s some R code that outputs text to the console which you can copy-paste into a .tex file to create nice LaTeX tables of the fixed effects of lmer models (it only works for family=”binomial”). Effects with p < .05 will appear in bold. The following code produces the table pasted below. It assumes the model is named `mod.all`. `prednames` creates a mapping from predictor names in the model to the predictor names you want to appear in the table. Note that for the TeX to work, you need to include `\usepackage{booktabs}` in the preamble.
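
The gist of such a script (a minimal sketch, not the posted code, using an `lm()` coefficient table and made-up formatting choices) looks like this:

```r
# Turn a coefficient table into booktabs-style LaTeX rows, bolding
# all cells of effects with p < .05.
coefs <- summary(lm(mpg ~ wt + hp, data = mtcars))$coefficients
rows <- apply(cbind(rownames(coefs), coefs), 1, function(r) {
  cells <- sprintf("%.2f", as.numeric(r[2:5]))
  if (as.numeric(r[5]) < .05) cells <- sprintf("\\textbf{%s}", cells)
  paste(r[1], paste(cells, collapse = " & "), sep = " & ")
})
cat("\\begin{tabular}{lrrrr}\n\\toprule\n",
    paste0(rows, " \\\\\n"),
    "\\bottomrule\n\\end{tabular}\n", sep = "")
```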


### Plotting effects for glmer(, family=”binomial”) models

**UPDATE 12/15/10:** Bug fix. Thanks to Christian Pietsch.

**UPDATE 10/31/10:** Some further updates and bug fixes. The code below is the updated one.

**UPDATE 05/20/10: I’ve updated the code with a couple of extensions (both linear and binomial models should now work; the plot now uses ggplot2) and minor fixes (the code didn’t work if the model only had one fixed effect predictor). I also wanted to be clear that the dashed lines in the plots aren’t confidence intervals. They are multiples of the standard error of the effect.**

Here’s a new function for plotting the effect of predictors in multilevel logit models fitted in R using *lmer()* from the *lme4* package. It’s based on code by Austin Frank, and I also borrowed from Harald Baayen’s *plotLMER.fnc()* (package *languageR*). First, a cool pic:

These plots show the distribution of the predictor (x-axis) against the predicted values (based on the entire model; y-axis), using *hexbinplot()* from the package *hexbin*. On top of that, you see the model prediction for the selected predictor, along with confidence intervals. Note that the predictor is given in its original form (here, speech rate), although it entered the model as the centered log-transformed speech rate. The plot takes this into account. Of course, you can configure things.
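
The core idea can be sketched in base R (made-up data and variable names, and a plain `glm()` fit; the actual function does more, including the hexbin layer):

```r
# Simulate a "speech rate" predictor that enters the model centered
# and log-transformed, then plot its effect on the original predictor
# scale and the probability scale.
set.seed(7)
rate <- exp(rnorm(500, mean = 1.6, sd = 0.3))
x.c  <- log(rate) - mean(log(rate))
y    <- rbinom(500, 1, plogis(-0.5 + 1.2 * x.c))
m    <- glm(y ~ x.c, family = binomial)

newx <- seq(min(rate), max(rate), length.out = 100)
pred <- predict(m, data.frame(x.c = log(newx) - mean(log(rate))),
                type = "link", se.fit = TRUE)

plot(newx, plogis(pred$fit), type = "l",
     xlab = "speech rate", ylab = "predicted probability")
# Dashed lines: +/- 1.96 standard errors of the prediction, computed
# on the logit scale and back-transformed.
lines(newx, plogis(pred$fit + 1.96 * pred$se.fit), lty = 2)
lines(newx, plogis(pred$fit - 1.96 * pred$se.fit), lty = 2)
```

The key trick is that predictions are computed on the (centered, log) scale the model was fit on, but the x-axis shows the predictor in its original units.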