Posts Tagged ‘multilevel models’

14 May 09

Random effect: Should I stay or should I go?

One of the more common questions I get about mixed models is whether there are any standards regarding the removal of random effects from the model. When should a random effect be included in the model? This was also one of the questions we had hoped to answer for our field (psycholinguistics) in the pre-CUNY Workshop on Ordinary and Multilevel Models (WOMM), but I don’t think we got anywhere close to a “standard” (though see Harald Baayen’s presentation on understanding random effect correlations for a very insightful discussion).

That being said, I find most of us would probably agree on a set of rules of thumb, at least for factorial analyses of balanced data: Continue reading ‘Random effect: Should I stay or should I go?’
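One common way to put this question to a particular model is a likelihood-ratio comparison of two fits that differ only in the random effect under consideration. The sketch below is not taken from the post and makes no claim about which rules of thumb follow. It assumes the nlme package and its Orthodont example data, neither of which the post mentions; they are used here only because both ship with standard R installations.

```r
library(nlme)  # assumption: nlme and its Orthodont data stand in for a real data set

# Model with a by-subject random intercept only
m_int <- lme(distance ~ age, random = ~ 1 | Subject,
             data = Orthodont, method = "REML")

# Same fixed effects, plus a by-subject random slope for age
m_slope <- lme(distance ~ age, random = ~ age | Subject,
               data = Orthodont, method = "REML")

# Likelihood-ratio comparison of the two nested random-effect structures
cmp <- anova(m_int, m_slope)
print(cmp)
```

Because a variance of zero lies on the boundary of the parameter space, the p-value from this kind of comparison tends to be conservative, so it is better read as a rough guide than as a strict test.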

07 May 09

Multinomial random effects models in R

This post is partly a response to this message. The author of that question is working on ordered categorical data. For that specific case, there are several packages in R that might work, none of which I’ve tried. The most promising is the function DPolmm() from DPpackage. It’s worth noting, though, that this package commits you to a Dirichlet Process prior for the random effects (instead of the more standard Gaussian). A different package, mprobit, allows one clustering factor, which could be suitable depending on the data set. MNP, mlogit, multinomRob, vbmp, nnet, and msm all offer some capability for modeling ordered categorical data, and it’s possible that one of them allows for random effects (though I haven’t found one that does yet). MCMCpack may also be useful, as it provides MCMC implementations for a large class of regression models. lrm() from the Design package handles ordered categorical data, and clustered bootstrap sampling can be used for a single cluster effect.
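To illustrate the last idea, here is a minimal sketch of clustered bootstrap sampling around an ordered-categorical model. It swaps in polr() from MASS for lrm(), since MASS ships with R, and runs on simulated toy data. The variable names, the cluster sizes, and the helper boot_coef() are all made up for illustration.

```r
library(MASS)  # polr(): proportional-odds model without random effects

set.seed(1)

# Toy data (simulated, purely illustrative): 30 clusters of 20 observations
n_clusters <- 30
n_per <- 20
cluster <- rep(seq_len(n_clusters), each = n_per)
x <- rnorm(n_clusters * n_per)
u <- rnorm(n_clusters, sd = 0.5)[cluster]        # shared shift within a cluster
latent <- 0.8 * x + u + rlogis(n_clusters * n_per)
y <- cut(latent, breaks = c(-Inf, -1, 1, Inf),
         labels = c("low", "mid", "high"), ordered_result = TRUE)
d <- data.frame(y = y, x = x, cluster = cluster)

# Fit on the full data, ignoring the clustering
fit <- polr(y ~ x, data = d)

# Hypothetical helper: resample whole clusters with replacement, then refit
boot_coef <- function(data) {
  ids <- sample(unique(data$cluster), replace = TRUE)
  resampled <- do.call(rbind, lapply(ids, function(i) data[data$cluster == i, ]))
  coef(polr(y ~ x, data = resampled))[["x"]]
}

boot <- replicate(200, boot_coef(d))

# Percentile interval for the slope that respects the clustering
print(quantile(boot, c(0.025, 0.975)))
```

Resampling whole clusters rather than single rows keeps the within-cluster dependence intact in every bootstrap sample, which is why the approach only handles a single clustering factor.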

I’ve recently had some success using MCMCglmm for the analysis of unordered multinomial data, and want to post a quick annotated example here. It should be noted that the tutorial on the CRAN page is extremely useful, and I encourage anyone using the package to work through it.

I’m going to cheat a bit in my choice of data sets, in that I won’t be using data from a real experiment with a multinomial (or polychotomous) outcome. Instead, I want to use a publicly available data set with some relevance to language research. I also need a categorical dependent variable with more than two levels for this demo to be interesting. Looking through the data sets provided in the languageR package, I noticed that the dative data set has a column SemanticClass which has five levels. We’ll use this as our dependent variable for this example. We’ll investigate whether the semantic class of a ditransitive event is influenced by the modality in which it is produced (spoken or written).

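library(MCMCglmm)
data("dative", package = "languageR")

# k outcome categories; I and J are (k-1) x (k-1) building blocks for the prior
k <- length(levels(dative$SemanticClass))
I <- diag(k-1)                              # identity matrix
J <- matrix(rep(1, (k-1)^2), c(k-1, k-1))   # matrix of ones

# Note: newer MCMCglmm releases appear to name the prior parameter `nu`
# instead of `n`; check ?MCMCglmm for the version you have installed.
m <- MCMCglmm(SemanticClass ~ -1 + trait + Modality,
              # by-verb random effects across traits and across modalities
              random = ~ us(trait):Verb + us(Modality):Verb,
              rcov = ~ us(trait):units,
              prior = list(
                # residual covariance structure, held fixed via fix=1
                R = list(fix=1, V=0.5 * (I + J), n = 4),
                G = list(
                  G1 = list(V = diag(4), n = 4),
                  G2 = list(V = diag(2), n = 2))),
              burnin = 15000,
              nitt = 40000,
              family = "categorical",
              data = dative)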
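To make the residual prior concrete, the following base-R sketch rebuilds the matrix passed as V in the R part of the prior. It hard-codes k = 5, the number of SemanticClass levels stated above, so it runs without either package.

```r
k <- 5                         # SemanticClass has five levels, per the text
I <- diag(k - 1)               # 4 x 4 identity matrix
J <- matrix(1, k - 1, k - 1)   # 4 x 4 matrix of ones
V <- 0.5 * (I + J)             # the matrix used as V in the R prior

print(V)

# A covariance matrix must be symmetric and positive definite
print(isSymmetric(V))
print(all(eigen(V)$values > 0))
```

The result has 1 on the diagonal and 0.5 everywhere else.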

Read on for an explanation of this model specification, along with some functions for evaluating the model fit.

Continue reading ‘Multinomial random effects models in R’

03 May 09

Multilevel model tutorial at Haskins lab

Austin Frank and I just gave a workshop on multilevel models at Haskins Lab, in two 3-hour sessions (thanks to Tine Mooshammer for organizing!). We had a great audience with pretty diverse backgrounds (researchers working on longitudinal nutrition studies, speech, clinical studies, psycholinguistics, and fMRI), which made for lots of interesting conversations on topics I don’t usually get to think about. Thanks to everyone who attended =). We had a great time.

We may post the recordings once we receive them, if they turn out to be useful. For now, here are many of the slides we used. A substantial subset of them was created by Roger Levy (UC San Diego) and/or in collaboration with Victor Kuperman (Stanford University) for WOMM’09 at the CUNY Sentence Processing Conference, as indicated on the slides. No guarantees for the R code. Please do not distribute the slides (refer to this page instead), and ask before citing.

Questions and comments are welcome, preferably using the comment box at the bottom of this page. R-related questions should be sent to the very friendly email support list for language researchers using R (see the R-lang link in the navigation bar to the right).

08 Nov 08

Pre-cuny workshop on regression and multilevel modeling (cntd)

Some time ago, I announced that some folks have been thinking about organizing a small workshop on common issues and standards in regression modeling (including multilevel models) in psycholinguistic research to be held the day before CUNY 2009 (i.e. 03/25 at UC Davis). Here’s an update on this “workshop” along with some thoughts for planning. Continue reading ‘Pre-cuny workshop on regression and multilevel modeling (cntd)’

09 Sep 08

Mini-tutorial on regression and mixed (linear & logit) models in R

This summer, Austin Frank and I organized a tutorial on regression and mixed models, consisting of six 3-hour sessions. It is posted on our HLP lab wiki and consists of reading suggestions and commented R scripts that we went through in class. Among the topics (also listed for each session on the wiki) are:

  • linear & logistic regression
  • linear & logit mixed/multilevel/hierarchical models
  • model evaluation (residuals, outliers, distributions)
  • collinearity tests and dealing with collinearity
  • coding of variables (contrasts)
  • visualization

We used both Baayen’s 2008 textbook Analyzing Linguistic Data: A Practical Introduction to Statistics using R (available online) and Gelman and Hill’s 2007 book Data Analysis Using Regression and Multilevel/Hierarchical Models. We can recommend both, and they complement each other nicely. If you have questions about this class or suggestions for improvement, please send us an email or leave a comment on this page (we’ll get notified).



