Diagnosing collinearity in mixed models from lme4

Posted on February 24, 2011 Updated on February 24, 2011

I’ve just uploaded files containing some useful functions to a public git repository. You can see the files directly without worrying about git at all by visiting regression-utils.R (direct download) and mer-utils.R (direct download).

The first file contains functions that I use in building models. We all know that centering and standardizing regression predictors can reduce collinearity. This often leads to code like

library(lme4)
d <- within(sleepstudy, {
  c.Days <- scale(Days, center = TRUE, scale = FALSE)
})

m <- lmer(Reaction ~ c.Days + (1 + c.Days | Subject), data = d)

That works fine, until you want to fit the same model to a subset of the data, like

m2 <- update(m, data = subset(d, Days > 4))
mean(m2@frame$c.Days) # != 0

In this case, c.Days will no longer be properly centered (its mean is now 2.5, not 0). If, however, we do the centering when we build the model, everything works nicely

m3 <- lmer(Reaction ~ scale(Days, center = TRUE, scale = FALSE) +
             (1 + scale(Days, center = TRUE, scale = FALSE) | Subject),
           data = d)
m4 <- update(m3, data = subset(d, Days > 4))
mean(m4@frame$`scale(Days, center = TRUE, scale = FALSE)`) # == 0

… except that the names of our predictors are now incredibly ugly. So regression-utils.R includes some shorthand functions for transformations commonly used in regression models. Using those functions, we can write

m5 <- lmer(Reaction ~ c.(Days) + (1 + c.(Days) | Subject), data = d)
m6 <- update(m5, data = subset(d, Days > 4))
mean(m6@frame$`c.(Days)`) # == 0

To me, this looks much nicer and makes it feasible to do transformations inside of formulas.
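To see why an in-formula helper stays centered on a subset, here is a minimal sketch in base R. It makes several assumptions: a toy data frame stands in for sleepstudy, lm() replaces lmer() so the example runs without lme4, and the one-line c.() below is a stand-in written for this illustration; the actual definition in regression-utils.R may differ.

```r
# Hypothetical stand-in for the c.() helper; NOT the definition from regression-utils.R.
c. <- function(x) x - mean(x, na.rm = TRUE)

# Toy data (an assumption): 3 subjects measured on Days 0..9.
set.seed(1)
toy <- data.frame(Subject = factor(rep(1:3, each = 10)),
                  Days = rep(0:9, times = 3))
toy$Reaction <- 250 + 10 * toy$Days + rnorm(nrow(toy), sd = 5)

# Centering once, up front: the stored column keeps the full-data mean (4.5).
toy$c.Days <- c.(toy$Days)
fit.pre <- lm(Reaction ~ c.Days, data = toy)
fit.pre.sub <- update(fit.pre, data = subset(toy, Days > 4))
mean.pre <- mean(model.frame(fit.pre.sub)$c.Days)        # 2.5, no longer centered

# Centering inside the formula: c.() is re-evaluated on whatever data is supplied.
fit.in <- lm(Reaction ~ c.(Days), data = toy)
fit.in.sub <- update(fit.in, data = subset(toy, Days > 4))
mean.in <- mean(model.frame(fit.in.sub)[["c.(Days)"]])   # 0
```

The difference comes from when the transformation runs: a precomputed column is frozen at the time it was created, while a call inside the formula is evaluated again each time the model frame is rebuilt.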

The functions contained in the file are:

c.(x) : center a predictor
z.(x) : standardize (z-transform) a predictor
r.(formula, ...) : return standardized residuals from regressing a predictor against at least one other predictor
s.(x) : apply a transformation from Seber 1977 that puts the data in the range [-1,1]
p.(x, ...) : polynomial terms around x (uses orthogonal polynomials by default, see ?poly)
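
For intuition, two of these helpers can be approximated in a few lines of base R. The names z_sketch and r_sketch are hypothetical, and their bodies are assumptions rather than the code in regression-utils.R. In particular, r_sketch takes an explicit data argument and simply z-scores ordinary lm() residuals, which may not be how r.() standardizes them.

```r
# Hypothetical sketches; the real z.() and r.() may be implemented differently.
z_sketch <- function(x) (x - mean(x, na.rm = TRUE)) / sd(x, na.rm = TRUE)

# Regress the left-hand side of `formula` on its right-hand side,
# then z-score the residuals.
r_sketch <- function(formula, data) {
  res <- resid(lm(formula, data = data))
  as.numeric((res - mean(res)) / sd(res))
}

# Toy data (an assumption): x2 is strongly correlated with x1.
set.seed(42)
x1 <- rnorm(200)
x2 <- 0.9 * x1 + rnorm(200, sd = 0.3)
toy <- data.frame(x1 = x1, x2 = x2)

cor.before <- cor(toy$x1, toy$x2)            # high
x2.resid <- r_sketch(x2 ~ x1, data = toy)    # the part of x2 not explained by x1
cor.after <- cor(toy$x1, x2.resid)           # essentially 0
```

Residualizing removes the shared variance from one predictor, at the cost of changing what its coefficient means: it now describes the effect of x2 beyond what x1 already accounts for.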

Now that we have a convenient way to reduce collinearity within our models (that can be reused on models fit to different subsets of the data), we want to measure the collinearity between the predictors. I’ve adapted three standard collinearity diagnostics to work directly on predictors in lmer and glmer models. Let’s look at the effects of using orthogonal vs. natural polynomials on collinearity.

m.natural <- lmer(Reaction ~ p.(Days, 4, raw = TRUE) + (1 | Subject), data = sleepstudy)
m.orthogonal <- lmer(Reaction ~ p.(Days, 4) + (1 | Subject), data = sleepstudy)

## kappa, aka condition number.
## kappa < 10 is reasonable collinearity,
## kappa < 30 is moderate collinearity,
## kappa >= 30 is troubling collinearity
kappa.mer(m.natural) # 12.53
kappa.mer(m6) # 1.00, properly centered

## variance inflation factor, aka VIF
## values over 5 are troubling.
## should probably investigate anything over 2.5.
max(vif.mer(m.natural)) # 14.47
max(vif.mer(m.orthogonal)) # 1

## condition index and variance decomposition proportions,
## see ?colldiag from the package perturb
colldiag.mer(m.natural) # the quartic term has a high condition index, and shares a large portion of variance with the quadratic term
colldiag.mer(m.orthogonal) # all condition indices are low, no need to worry about variance proportions

## highest correlation among predictors, can be found in triangle matrix output by summary() on a mer object
## investigate further for any absolute values greater than .4
maxcorr.mer(m.natural) # -0.96
maxcorr.mer(m.orthogonal) # 0.00
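
For readers curious what these numbers measure, the same kinds of quantities can be computed by hand from a matrix of predictors. The sketch below uses only base R on a toy Days variable. The helper names are hypothetical and the formulas are generic textbook versions (VIF as the diagonal of the inverse correlation matrix, condition number as the square root of the ratio of the extreme eigenvalues of the correlation matrix). The functions in mer-utils.R may compute their diagnostics differently, so the values need not match the ones above.

```r
# Toy predictor (an assumption): 18 subjects on Days 0..9.
Days <- rep(0:9, times = 18)

X.raw  <- poly(Days, 4, raw = TRUE)   # natural polynomials
X.orth <- poly(Days, 4)               # orthogonal polynomials

# Hypothetical helpers operating on a matrix of predictors (no intercept column).
vif_sketch <- function(X) diag(solve(cor(X)))

maxcorr_sketch <- function(X) {
  R <- cor(X)
  max(abs(R[upper.tri(R)]))           # largest absolute pairwise correlation
}

kappa_sketch <- function(X) {
  ev <- eigen(cor(X), symmetric = TRUE, only.values = TRUE)$values
  sqrt(max(ev) / min(ev))
}

vif.raw    <- max(vif_sketch(X.raw))    # large
vif.orth   <- max(vif_sketch(X.orth))   # 1
corr.raw   <- maxcorr_sketch(X.raw)     # close to 1
corr.orth  <- maxcorr_sketch(X.orth)    # 0, up to rounding
kappa.raw  <- kappa_sketch(X.raw)       # large
kappa.orth <- kappa_sketch(X.orth)      # 1
```

Orthogonal polynomial columns are uncorrelated by construction, so their correlation matrix is the identity and every diagnostic sits at its minimum; the raw powers of Days are nearly linear combinations of one another, which is what the inflated values reflect.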

Patches and pull requests welcome!

This entry was posted in Statistics & Methodology, statistics/R and tagged collinearity, lme4, lmer, mixed models, multilevel models, R code, vif.

23 thoughts on “Diagnosing collinearity in mixed models from lme4”

Michael Becker said:
March 15, 2011 at 6:34 pm

mer-utils.R is looking good!

Please tell us how you would like this to be referenced in a paper. Better still, give a sample paragraph from a real or hypothetical paper that shows how to report the use of mer-utils.
