Diagnosing collinearity in mixed models from lme4
I’ve just uploaded files containing some useful functions to a public git repository. You can see the files directly without worrying about git at all by visiting regressionutils.R (direct download) and merutils.R (direct download).
The first file contains functions that I use in building models. We all know that centering and standardizing regression predictors can reduce collinearity. This often leads to code like
library(lme4)
d < within(sleepstudy, {
c.Days < scale(Days, center = TRUE, scale = FALSE)
})
m < lmer(Reaction ~ c.Days + (1 + c.Days  Subject), data = d)
That works fine, until you want to fit the same model to a subset of the data, like
m2 < update(m, data = subset(d, Days > 4))
mean(m2@frame$c.Days) # != 0
In this case, c.Days
will no longer be properly centered (its mean is now 2.5, not 0). If, however, we do the centering when we build the model, everything works nicely
m3 < lmer(Reaction ~ scale(Days, center = TRUE, scale = FALSE) +
(1 + scale(Days, center = TRUE, scale = FALSE)  Subject),
data = d)
m4 < update(m3, data = subset(d, Days > 4))
mean(m4@frame$`scale(Days, center = TRUE, scale = FALSE)`) # == 0
… except that the names of our predictors are now incredibly ugly. So regressionutils.R includes some shorthand functions for transformations commonly used in regression models. Using those functions, we can write
m5 < lmer(Reaction ~ c.(Days) + (1 + c.(Days)  Subject), data = d)
m6 < update(m5, data = subset(d, Days > 4))
mean(m6@frame$`c.(Days)`) # == 0
To me, this looks much nicer and makes it feasible to do transformations inside of formulas.
The functions contained in the file are:

c.(x)
: center a predictor 
z.(x)
: standardize (ztransform) a predictor 
r.(formula, ...)
: return standardized residuals from regressing a predictor against at least one other predictor 
s.(x)
: apply a transformation from Seber 1977 that puts the data in the range [1,1] 
p.(x, ...)
: polynomial terms around x (uses orthogonal polynomials by default, see ?poly)
Now that we have a convenient way to reduce collinearity within our models (that can be reused on models fit to different subsets of the data), we want to measure the collinearity between the predictors. I’ve adapted three standard collinearity diagnostics to work directly on predictors in lmer glmer models. Let’s look at the effects of using orthogonal vs. natural polynomials on collinearity.
m.natural < lmer(Reaction ~ p.(Days, 4, raw = TRUE) + (1  Subject), data = sleepstudy)
m.orthogonal < lmer(Reaction ~ p.(Days, 4) + (1  Subject), data =sleepstudy)
## kappa, aka condition number.
## kappa < 10 is reasonable collinearity,
## kappa < 30 is moderate collinearity,
## kappa >= 30 is troubling collinearity
kappa.mer(m.natural) # 12.53
kappa.mer(m6) # 1.00, properly centered
## variance inflation factor, aka VIF
## values over 5 are troubling.
## should probably investigate anything over 2.5.
max(vif.mer(m.natural)) # 14.47
max(vif.mer(m.orthogonal)) # 1
## condition index and variance decomposition proportions,
## see ?colldiag from the package perturb
colldiag.mer(m.natural) # the quartic term has a high condition index, and shares a large portion of variance with the quadratic term
colldiag.mer(m.orthogonal) # all condition indeces are low, no need to worry about variance proportions
## highest correlation among predictors, can be found in triangle matrix output by summary() on a mer object
## investigate further for any absolute values greater than .4
maxcorr.mer(m.natural) # 0.96
maxcorr.mer(m.orthogonal) # 0.00
Patches and pull requests welcome!
March 15, 2011 at 6:34 pm
merutils.R is looking good!
Please tell us how you would like this to be referenced in a paper. Better still, give a sample paragraph from a real or hypothetical paper that shows how to report the use of merutils.
LikeLike
March 15, 2011 at 6:39 pm
Hi, I let Austin answer this, but I don’t think there is a paper (nor is there one planned, but maybe Austin will correct me). Maybe you can give Austin Frank (now at Haskins Labs, working with Jim Magnuson) a big shout out in the acknowledgments? Referring to the blog may help other folks to find this and other useful R code. Thank you!
LikeLike
March 15, 2011 at 9:11 pm
Thanks for the input, Florian. I will do as you suggest for now, but note that we regularly reference code, as we do for R, R packages, and other software projects. I think that would look even better than an acknowledgment or footnote.
LikeLike
March 15, 2011 at 9:54 pm
Hi Michael,
good point. Maybe Austin can put together a package with his functions. I’ll ask him.
Florian
LikeLike
March 17, 2011 at 12:55 pm
Thanks for all of this, Austin! Question: You refer to kappa < 10 as reasonable. Baayen (2008:182) suggests < 6 is no collinearity; 6<K<30 medium. So about that 610 range (which, er, just happens to be where a model I'm looking at right now falls!)… Is there a definitive source on interpreting kappa to which you might point me?
Thanks!
Robin
LikeLike
September 25, 2011 at 5:18 am
Thanks for this Austin!
One question. You write:
“We all know that centering and standardizing regression predictors can reduce collinearity.”
What exactly do you mean here? Centering just shifts the values of a variable. This presumably makes interpretation of any interaction coefficient easier, but I am not sure it will alleviate problems of collinearity.
LikeLike
September 26, 2011 at 10:00 am
Hi Niels,
thanks for posting this question. Centering does reduce collinearity between predictors and their higherorder terms (such as interactions or e.g. polynomials).
Let me know whether you need more information or whether this makes sense. You might find some of the tutorials we have posted on the blog useful (e.g. Maureen Gillespie’s tutorial on coding, i think, talks about this, too https://hlplab.wordpress.com/2010/05/10/miniwomm/)
HTH,
Florian
LikeLike
January 3, 2012 at 1:25 pm
[…] austin frank’s post on the hlplab blog […]
LikeLike
September 3, 2012 at 8:52 am
Used your code to determine VIF for some multilevel modeling I’m doing. It worked like a charm. Many thanks!
LikeLike
December 5, 2013 at 6:37 am
Hi Austin,
thanks for the post and the code, both very useful for me!
Anyway some functions don’t work any more with lme4 version 1.05. In detail, kappa.mer doesn’t find the the fit@x, maybe the author changed the definition of the lmer.fit part… It would be nice if you could fix these little problems…
Thanks!
Matteo
LikeLike
December 7, 2013 at 8:37 am
Hi Matteo,
unfortunately, Austin has left the field. I can have a look at this at some point, but it might be faster if you give it a try yourself. Often it’s just a matter of changing a few lines of code. If you find a solution, it’d be great if you could post it here for the benefit of others.
Florian
LikeLike
December 27, 2012 at 11:28 pm
[…] The “car” package in R will calculate VIFs for a linear model. I’ve written a quick function that will identify if any VIFs > cutoff, remove the largest value, recalculate, and repeat until all VIFS < cutoff. It produces a final model with the same name as the original model. I’ve also included a function for calculating VIFs for linear mixed effects models from the “lmer” function in the “lme4″ package (From: https://hlplab.wordpress.com/2011/02/24/diagnosingcollinearityinlme4/). […]
LikeLike
October 13, 2013 at 4:33 pm
Extremely arcane R subtlety alert: The shorthand c., z., etc. functions here don’t actually work the same as the ones in R, because there’s no makepredictcall method defined for them. This means that fitting and model analysis will work fine, but predictions (using predict()) will be wrong.
The reason is that when you use one of the builtin functions like ‘scale’ directly, then lm() or whoever (technically, model.frame.default) notices this, and memorizes the offset/scaling needed for your original data. Then when you want to predict, it reuses the *original* data’s offset/scaling to transform the *new* data, which is correct. But it only knows how to give this special treatment to functions that have a makepredictcall override.
LikeLike
December 20, 2013 at 3:22 pm
Hi,
Indeed lme4 1.05 changed the model class so what was available before through the @ sign is now possible to get with getME(model, optionyouwant).
Incorporating these changes gives the usable merutils.r for lme4 1.05 pasted below:
#########
vif.mer < function (fit) {
## adapted from rms::vif
v < vcov(fit)
nam < names(fixef(fit))
## exclude intercepts
ns 0) {
v < v[(1:ns), (1:ns), drop = FALSE]
nam < nam[(1:ns)]
}
d < diag(v)^0.5
v < diag(solve(v/(d %o% d)))
names(v) < nam
v
}
kappa.mer < function (fit,
scale = TRUE, center = FALSE,
add.intercept = TRUE,
exact = FALSE) {
X < getME(fit, "X")
nam < names(fixef(fit))
## exclude intercepts
nrp 0) {
X < X[, (1:nrp), drop = FALSE]
nam < nam[(1:nrp)]
}
if (add.intercept) {
X < cbind(rep(1), scale(X, scale = scale, center = center))
kappa(X, exact = exact)
} else {
kappa(scale(X, scale = scale, center = scale), exact = exact)
}
}
colldiag.mer 30) with
## more than one high variance propotion. see ?colldiag for more
## tips.
result < NULL
if (center)
add.intercept < FALSE
if (is.matrix(fit)  is.data.frame(fit)) {
X < as.matrix(fit)
nms < colnames(fit)
}
else if (class(fit) == "glmerMod") {
nms < names(fixef(fit))
X < getME(fit, "X")
if (any(grepl("(Intercept)", nms))) {
add.intercept < FALSE
}
}
X < X[!is.na(apply(X, 1, all)), ]
if (add.intercept) {
X < cbind(1, X)
colnames(X)[1] < "(Intercept)"
}
X < scale(X, scale = scale, center = center)
svdX < svd(X)
svdX$d
condindx < max(svdX$d)/svdX$d
dim(condindx) < c(length(condindx), 1)
Phi = svdX$v %*% diag(1/svdX$d)
Phi < t(Phi^2)
pi < prop.table(Phi, 2)
colnames(condindx) < "cond.index"
if (!is.null(nms)) {
rownames(condindx) < nms
colnames(pi) < nms
rownames(pi) < nms
} else {
rownames(condindx) < 1:length(condindx)
colnames(pi) < 1:ncol(pi)
rownames(pi) < 1:nrow(pi)
}
result < data.frame(cbind(condindx, pi))
zapsmall(result)
}
maxcorr.mer < function (fit,
exclude.intercept = TRUE) {
corF < vcov(fit)
nam < names(fixef(fit))
## exclude intercepts
ns 0 & exclude.intercept) {
corF < corF[(1:ns), (1:ns), drop = FALSE]
nam < nam[(1:ns)]
}
corF[!lower.tri(corF)] < 0
maxCor < max(corF)
minCor abs(minCor)) {
zapsmall(maxCor)
} else {
zapsmall(minCor)
}
}
#########
Cheers,
LikeLike
December 20, 2013 at 5:24 pm
Hi Isabelle,
Thank you!
Florian
LikeLike
February 25, 2015 at 7:00 pm
Hi folks,
I have a rather basic question, I think. Is there any statistical reason that car::vif or usdm::vif don’t work on mixed models? In other words, are vif estimates for mixed models unreliable and thus deliberately omitted from those packages, or are those packages just not built to handle vif for mixed models?
Thanks!
Joe
LikeLike
February 25, 2015 at 11:39 pm
Dear Joe, the short answer is (I think) that the VIF of predictor P_k is defined based on the R2 of the linear model predicting P_k from all other predictors. The idea behind this is that this captures the redundancy of this predictor in the other predictors. However, if we have clustered data, this simple relation doesn’t hold anymore, since the redundancy of the different predictors (with regard to all other predictors) should now be assessed conditionally on the random effects.
LikeLike
February 26, 2015 at 11:31 am
Thanks so much for the response.
So, to ask one more obvious question, does vif.mer assesses the VIF of predictors conditionally on the random effects? In the code for vif.mer, I see that it’s based on vcov(fit). Is the vcov(fit) for a mixed model already conditional on the random effects?
Sorry for more questions, I’m just trying to better understand vif.mer, since it’s the only way I’ve found to assess VIF for a lmer model.
Thanks for the code and clarification!
Joe
LikeLike
February 26, 2015 at 5:30 pm
I’m pretty sure that vcov is the variancecovariance matrix of the fixed effects conditional on the random effects. So, in this sense
vif.mer
provides the most straightforward extension for thevif
function from therms
package (formerly known asDesign
) to mixed models.LikeLike