Posts Tagged ‘multilevel models’

27
Jul
11

New R resource for ordinary and multilevel regression modeling

Here’s what I received from the Centre for Multilevel Modelling at Bristol (I haven’t checked it out yet; registration seems to be free but required):

The Centre for Multilevel Modelling is very pleased to announce the addition of
R practicals to our free on-line multilevel modelling course. These give
detailed instructions of how to carry out a range of analyses in R, starting
from multiple regression and progressing through to multilevel modelling of
continuous and binary data using the lmer and glmer functions.

MLwiN and Stata versions of these practicals are already available.
You will need to log on or register onto the course to view these
practicals.

Read More...
http://www.cmm.bris.ac.uk/lemma/course/view.php?id=13
13
Jul
11

R code for Jaeger, Graff, Croft and Pontillo (2011): Mixed effect models for genetic and areal dependencies in linguistic typology: Commentary on Atkinson

Below I am sharing the R code for our paper on the serial founder effect:
This paper is a commentary on Atkinson’s 2011 Science article on the serial founder model (see also this interview with ScienceNews, in which parts of our comment in Linguistic Typology and follow-up work are summarized). In the commentary, we provide an introduction to linear mixed effect models for typological research. We discuss how to fit and to evaluate these models, using Atkinson’s data as an example. We illustrate the use of crossed random effects to control for genetic and areal relations between languages. We also introduce a (novel?) way to model areal dependencies based on an exponential decay function over migration distances between languages.
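To make the idea of an exponential decay over migration distances concrete, here is a minimal sketch in base R. It is not the code from the paper: the four language labels, the distances, and the decay scale of 1000 km are made up for illustration, and exp(-d / scale) is just one simple form such a decay function could take.

```r
# Hypothetical migration distances (km) between four made-up languages
langs <- c("A", "B", "C", "D")
dist_km <- matrix(c(   0,  500, 2000, 8000,
                     500,    0, 1500, 7500,
                    2000, 1500,    0, 6000,
                    8000, 7500, 6000,    0),
                  nrow = 4, byrow = TRUE,
                  dimnames = list(langs, langs))

# Assumed decay scale: similarity shrinks by a factor of e every 1000 km
decay_scale <- 1000

# Exponential decay: 1 at distance 0, approaching 0 for distant pairs
areal_similarity <- exp(-dist_km / decay_scale)
round(areal_similarity, 3)
```

A matrix like this could then stand in for an assumed dependency structure between languages, with nearby languages expected to share more of an areal effect than distant ones.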
Finally, we discuss limits to the statistical analysis due to data sparseness. In particular, we show that the data available to Atkinson did not contain enough language families with sufficiently many languages to test whether the observed effect holds once random by-family slopes (for the effect) are included in the model. We also present simulations that show that the Type I error rate (false rejections) of the approach taken in Atkinson is many times higher than conventionally accepted (i.e. above .2 when .05 is the conventionally accepted rate of Type I errors).
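Why ignoring grouping inflates the Type I error rate is easy to see in a toy setting. The sketch below is not one of the simulations from the paper; the number of groups, the group size, and the variances are arbitrary assumptions. It generates data in which a group-level predictor has no effect at all, while observations share a random group intercept (think of languages within families), and then fits an ordinary regression that ignores the grouping.

```r
set.seed(1)

n_sims      <- 200   # assumed number of simulated data sets
n_groups    <- 20    # assumed number of groups ("families")
n_per_group <- 20    # assumed observations per group ("languages")

simulate_p <- function() {
  group   <- rep(seq_len(n_groups), each = n_per_group)
  x_group <- rnorm(n_groups)   # group-level predictor with NO true effect
  u_group <- rnorm(n_groups)   # random group intercepts
  d <- data.frame(
    x = x_group[group],
    y = u_group[group] + rnorm(n_groups * n_per_group)
  )
  # Ordinary regression that wrongly treats all rows as independent
  fit <- lm(y ~ x, data = d)
  summary(fit)$coefficients["x", "Pr(>|t|)"]
}

p_values   <- replicate(n_sims, simulate_p())
type1_rate <- mean(p_values < 0.05)   # proportion of false rejections
type1_rate
```

With these settings the proportion of false rejections should come out far above the nominal .05, because the model counts every observation within a group as independent evidence.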
The scripts presented below are not intended to allow full replication of our analyses (they lack annotation and we are not allowed to share the WALS data employed by Atkinson on this site anyway). However, there are many plots and tests in the paper that might be useful for typologists or other users of mixed models. For that reason, I am for now posting the raw code. Please comment below if you have questions and we will try to provide additional annotation for the scripts as needed and as time permits. If you find (parts of the) script(s) useful, please consider citing our article in Linguistic Typology.
24
Feb
11

Diagnosing collinearity in mixed models from lme4

I’ve just uploaded files containing some useful functions to a public git repository. You can see the files directly without worrying about git at all by visiting regression-utils.R (direct download) and mer-utils.R (direct download). Continue reading ‘Diagnosing collinearity in mixed models from lme4’

16
Apr
10

Annotated example analysis using mixed models

Jessica Nelson (Learning Research and Development Center, University of Pittsburgh) uploaded a step-by-step example analysis using mixed models to her blog. Each step is nicely annotated and Jessica also discusses some common problems she encountered while trying to analyze her data using mixed models. I think this is a nice example for anyone trying to learn to use mixed models. It goes through most, if not all, of the steps outlined in the WOMM tutorial by Victor Kuperman and me.

14
May
09

Random effect: Should I stay or should I go?

One of the more common questions I get about mixed models is whether there are any standards regarding the removal of random effects from the model. When should a random effect be included in the model? This was also one of the questions we had hoped to answer for our field (psycholinguistics) in the pre-CUNY Workshop on Ordinary and Multilevel Models (WOMM), but I don’t think we got anywhere close to a “standard” (though see Harald Baayen’s presentation on understanding random effect correlations for a very insightful discussion).

That being said, I find most of us would probably agree on a set of rules of thumb, at least for factorial analyses of balanced data: Continue reading ‘Random effect: Should I stay or should I go?’

07
May
09

Multinomial random effects models in R

This post is partly a response to this message. The author of that question is working on ordered categorical data. For that specific case, there are several packages in R that might work, none of which I’ve tried. The most promising is the function DPolmm() from DPpackage. It’s worth noting, though, that in that package you are committed to a Dirichlet Process prior for the random effects (instead of the more standard Gaussian). A different package, mprobit, allows one clustering factor. This could be suitable, depending on the data set. MNP, mlogit, multinomRob, vbmp, nnet, and msm all offer some capability of modeling ordered categorical data, and it’s possible that one of them allows for random effects (though I haven’t discovered any yet). MCMCpack may also be useful, as it provides MCMC implementations for a large class of regression models. lrm() from the Design package handles ordered categorical data, and clustered bootstrap sampling can be used for a single cluster effect.

I’ve recently had some success using MCMCglmm for the analysis of unordered multinomial data, and want to post a quick annotated example here. It should be noted that the tutorial on the CRAN page is extremely useful, and I encourage anyone using the package to work through it.

I’m going to cheat a bit in my choice of data sets, in that I won’t be using data from a real experiment with a multinomial (or polychotomous) outcome. Instead, I want to use a publicly available data set with some relevance to language research. I also need a categorical dependent variable with more than two levels for this demo to be interesting. Looking through the data sets provided in the languageR package, I noticed that the dative data set has a column SemanticClass which has five levels. We’ll use this as our dependent variable for this example. We’ll investigate whether the semantic class of a ditransitive event is influenced by the modality in which it is produced (spoken or written).

library(MCMCglmm)
data("dative", package = "languageR")

# Number of outcome categories; the model has k - 1 latent traits
k <- length(levels(dative$SemanticClass))
I <- diag(k - 1)                # identity matrix
J <- matrix(1, k - 1, k - 1)    # matrix of ones

m <- MCMCglmm(SemanticClass ~ -1 + trait + Modality,
              random = ~ us(trait):Verb + us(Modality):Verb,
              rcov = ~ us(trait):units,
              # Current MCMCglmm releases name the degree-of-belief
              # parameter nu; older releases called it n
              prior = list(
                R = list(fix = 1, V = 0.5 * (I + J), nu = 4),
                G = list(
                  G1 = list(V = diag(4), nu = 4),
                  G2 = list(V = diag(2), nu = 2))),
              burnin = 15000,
              nitt = 40000,
              family = "categorical",
              data = dative)
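
To see what the residual prior in this call looks like, the following base R sketch rebuilds the matrix 0.5 * (I + J). The value k = 5 is hard-coded as an assumption (SemanticClass has five levels) so that the snippet runs without loading the data or the package.

```r
k <- 5                          # assumed: SemanticClass has five levels
I <- diag(k - 1)                # 4 x 4 identity matrix
J <- matrix(1, k - 1, k - 1)    # 4 x 4 matrix of ones
V_resid <- 0.5 * (I + J)        # residual covariance passed as V in prior$R
V_resid
```

The result is a 4 × 4 matrix with 1 on the diagonal and 0.5 everywhere else. As far as I understand the prior specification, fix = 1 holds the residual covariance at exactly this matrix rather than estimating it.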

Read on for an explanation of this model specification, along with some functions for evaluating the model fit.

Continue reading ‘Multinomial random effects models in R’

03
May
09

Multilevel model tutorial at Haskins lab

Austin Frank and I just gave a 2×3-hour workshop on multilevel models at Haskins Lab (thanks to Tine Mooshammer for organizing!). We had a great audience with pretty diverse backgrounds (ranging from longitudinal studies on nutrition, through speech research, clinical studies, and psycholinguistics, to fMRI research), which made for lots of interesting conversations on topics I don’t usually get to think about. Thanks to everyone attending =). We had a great time.

We may post the recordings once we receive them, if it turns out they may be useful. But for now, here are many of the slides we used, a substantial subset of which were created by Roger Levy (UC San Diego) and/or in collaboration with Victor Kuperman (Stanford University) for WOMM’09 at the CUNY Sentence Processing Conference, as indicated on the slides. No guarantees for the R-code and please do not distribute (rather: refer to this page) and ask before citing.

Questions and comments welcome, preferably using the comment box at the bottom of this page. R-related questions should be sent to the very friendly email support list for language researchers using R (see R-lang link in the navigation bar to the right).

08
Nov
08

Pre-cuny workshop on regression and multilevel modeling (cntd)

Some time ago, I announced that some folks have been thinking about organizing a small workshop on common issues and standards in regression modeling (including multilevel models) in psycholinguistic research to be held the day before CUNY 2009 (i.e. 03/25 at UC Davis). Here’s an update on this “workshop” along with some thoughts for planning. Continue reading ‘Pre-cuny workshop on regression and multilevel modeling (cntd)’

09
Sep
08

Mini-tutorial on regression and mixed (linear & logit) models in R

This summer, Austin Frank and I organized a tutorial on regression and mixed models, consisting of six 3-hour sessions. It is posted on our HLP lab wiki and consists of reading suggestions and commented R scripts that we went through in class. Among the topics (also listed for each session on the wiki) are:

  • linear & logistic regression
  • linear & logit mixed/multilevel/hierarchical models
  • model evaluation (residuals, outliers, distributions)
  • collinearity tests and dealing with collinearity
  • coding of variables (contrasts)
  • visualization

We used both Baayen’s 2008 textbook Analyzing Linguistic Data: A Practical Introduction to Statistics using R (available online) and Gelman and Hill’s 2007 book on Data Analysis using Regression and Multilevel/Hierarchical Models, both of which we can recommend (they also complement each other nicely). If you have questions about this class or suggestions for improvement, please send us an email or leave a comment on this page (we’ll get notified).



