Diagnosing collinearity in mixed models from lme4

I’ve just uploaded files containing some useful functions to a public git repository. You can view the files directly, without using git at all, at regression-utils.R (direct download) and mer-utils.R (direct download).
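As a rough illustration of what a collinearity diagnostic computes, here is a minimal base-R sketch of variance inflation factors (VIFs). It is not the code from the files above: the simulated data, the variable names, and the helper `vif_from_fit` are all made up for this example. It works on the design matrix of an ordinary `lm` fit; the same arithmetic could in principle be applied to the fixed-effects design matrix of a mixed model.

```r
# Simulated data (hypothetical): x2 is built to correlate strongly with x1,
# while x3 is generated independently of both.
set.seed(42)
n  <- 200
x1 <- rnorm(n)
x2 <- 0.9 * x1 + sqrt(1 - 0.9^2) * rnorm(n)
x3 <- rnorm(n)
y  <- 1 + x1 + x2 + x3 + rnorm(n)

fit <- lm(y ~ x1 + x2 + x3)

# Hypothetical helper: VIFs are the diagonal of the inverse of the
# correlation matrix of the predictors (intercept column dropped).
# This assumes each model term corresponds to a single numeric column.
vif_from_fit <- function(model) {
  X <- model.matrix(model)
  X <- X[, colnames(X) != "(Intercept)", drop = FALSE]
  diag(solve(cor(X)))
}

vifs <- vif_from_fit(fit)
print(round(vifs, 2))

# Cross-check for x1: VIF = 1 / (1 - R^2), where R^2 comes from
# regressing x1 on the remaining predictors.
r2_x1 <- summary(lm(x1 ~ x2 + x3))$r.squared
print(1 / (1 - r2_x1))
```

A VIF near 1 means a predictor is nearly uncorrelated with the others; larger values mean its coefficient's variance is inflated by collinearity. Any cutoff for "too large" is a rule of thumb and should be treated as such.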

Some R code to understand the difference between treatment and sum (ANOVA-style) coding

Below, I’ve posted some code that

  1. generates an artificial data set
  2. creates both treatment (a.k.a. dummy) and sum (a.k.a. contrast or ANOVA-style) coding for the data set
  3. compares the lmer output for the two coding systems
  4. suggests a way to test simple effects in a linear mixed model

Mostly, though, the code is just meant as a starting point for people who want to play with a balanced (non-Latin-square design) data set to understand the consequences of coding predictor variables in different ways.
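To give a flavor of the comparison, here is a minimal sketch using only base R. It is not the code from the post: the data set, factor names, and effect sizes are invented, and it fits an ordinary `lm` rather than an `lmer` model so that it runs without extra packages. The point about how the coefficients change with coding carries over to the fixed effects of a mixed model.

```r
# Hypothetical balanced 2 x 2 design with 25 observations per cell.
set.seed(1)
d <- expand.grid(A = c("a1", "a2"), B = c("b1", "b2"), rep = 1:25)
d$A <- factor(d$A)
d$B <- factor(d$B)
d$y <- 10 + 2 * (d$A == "a2") + 1 * (d$B == "b2") +
  3 * (d$A == "a2" & d$B == "b2") + rnorm(nrow(d))

# Treatment (dummy) coding: the reference levels a1 and b1 are coded 0.
d_treat <- d
contrasts(d_treat$A) <- contr.treatment(2)
contrasts(d_treat$B) <- contr.treatment(2)
m_treat <- lm(y ~ A * B, data = d_treat)

# Sum coding: the two levels are coded +1 and -1.
d_sum <- d
contrasts(d_sum$A) <- contr.sum(2)
contrasts(d_sum$B) <- contr.sum(2)
m_sum <- lm(y ~ A * B, data = d_sum)

cell_means <- tapply(d$y, list(d$A, d$B), mean)

# Treatment coding: the intercept is the mean of the reference cell (a1, b1),
# and the A coefficient is the simple effect of A at B = b1.
print(coef(m_treat))
print(cell_means["a1", "b1"])

# Sum coding: the intercept is the grand mean of the four cell means,
# and the A coefficient is half the a1 - a2 difference, averaged over B.
print(coef(m_sum))
print(mean(cell_means))

# Both codings describe the same model: the fitted values are identical.
print(all.equal(fitted(m_treat), fitted(m_sum)))
```

In a balanced design like this one, the two parameterizations are just different ways of slicing the same four cell means, which is why the fitted values agree even though the individual coefficients (and their tests) answer different questions.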

One slide on developing a regression model with interpretable coefficients

While Victor Kuperman and I are preparing our slides for WOMM, I’ve been thinking about how to visualize the process that leads from input variables to a full model. Many of the steps depend heavily on the type of regression model, which in turn depends on the type of outcome (dependent) variable. Still, there are a number of steps that we always need to go through if we want interpretable coefficient estimates (as well as unbiased standard error estimates for those coefficients).
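The excerpt does not list the steps themselves. As one illustration of a transformation commonly used toward this goal, the sketch below centers two continuous predictors before fitting an interaction. The data and variable names are invented for the example, and centering is shown here as an assumption about the kind of step meant, not as the content of the slide.

```r
# Simulated predictors with means far from zero (hypothetical values).
set.seed(7)
n <- 300
x <- rnorm(n, mean = 50, sd = 5)
z <- rnorm(n, mean = 20, sd = 2)
y <- 5 + 0.3 * x + 0.5 * z + 0.02 * x * z + rnorm(n)

# Uncentered fit: the coefficient for x is its slope at z = 0, a value
# far outside the observed range of z.
raw <- lm(y ~ x * z)

# Centered fit: the coefficient for xc is the slope of x at the mean of z.
xc  <- x - mean(x)
zc  <- z - mean(z)
cen <- lm(y ~ xc * zc)

# Centering removes most of the correlation between a predictor and
# its interaction term.
print(cor(x, x * z))
print(cor(xc, xc * zc))

# The interaction coefficient itself is unchanged by centering;
# only the lower-order terms change their meaning.
print(coef(raw))
print(coef(cen))
```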


Mini-tutorial on regression and mixed (linear & logit) models in R

This summer, Austin Frank and I organized a tutorial on regression and mixed models, consisting of six 3-hour sessions. It is posted on our HLP lab wiki and consists of reading suggestions and commented R scripts that we went through in class. Among the topics (also listed for each session on the wiki) are:

  • linear & logistic regression
  • linear & logit mixed/multilevel/hierarchical models
  • model evaluation (residuals, outliers, distributions)
  • collinearity tests and dealing with collinearity
  • coding of variables (contrasts)
  • visualization

We used both Baayen’s 2008 textbook Analyzing Linguistic Data: A Practical Introduction to Statistics using R (available online) and Gelman and Hill’s 2007 book Data Analysis Using Regression and Multilevel/Hierarchical Models. We can recommend both, and they complement each other nicely. If you have questions about this class or suggestions for improvement, please send us an email or leave a comment on this page (we’ll get notified).