Ever noticed?

The (in)dependence of pronunciation variation on the time course of lexical planning

Posted on Updated on

Language, Cognition, and Neuroscience just published Esteban Buz’s paper on the relation between the time course of lexical planning and the detail of articulation (as hypothesized by production ease accounts).

Several recent proposals hold that much, if not all, of explainable pronunciation variation (variation in the realization of a word) can be reduced to effects on the ease of lexical planning. Such production ease accounts have been proposed, for example, for the effects of frequency, predictability, givenness, and phonological overlap with recently produced words on the articulation of a word. According to these accounts, these effects on articulation are mediated by parallel effects on the time course of lexical planning (e.g., recent research by Jennifer Arnold, Jason Kahn, Duane Watson, and others; see references in the paper).


This would indeed offer a parsimonious explanation of pronunciation variation. However, the critical test for this claim is a mediation analysis. Read the rest of this entry »

Ways of plotting map data in R (and python)


Thanks to Scott Jackson, Daniel Ezra Johnson, David Morris, Michael Shvartzman, and Nathanial Smith for the recommendations and pointers to the packages mentioned below.

  • R:
    • The maps, mapsextra, and maptools packages provide data and tools to plot world, US, and a variety of regional maps (see also mapproj and mapdata). This, combined with ggplot2, is also what we used in Jaeger et al. (2011, 2012) to plot distributions over world maps. Here’s an example from ggplot2 with maps.
    Example use of ggplot2 combined with the maps package (similar to the graphs created for Jaeger et al., 2011, 2012).
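As a quick illustration of the maps-plus-ggplot2 combination, here is a minimal sketch. The plotted points and all aesthetic choices are made up for illustration; this is not the code behind the Jaeger et al. graphs.

```r
library(ggplot2)
library(maps)   # provides the polygon data that map_data() converts

# turn the maps package's world map into a data frame ggplot2 understands
world <- map_data("world")

p <- ggplot(world, aes(x = long, y = lat, group = group)) +
  geom_polygon(fill = "grey90", colour = "grey50") +
  coord_quickmap() +   # sensible aspect ratio without a full projection
  theme_minimal()

# hypothetical field sites, layered on top of the base map
sites <- data.frame(long = c(-77.6, 8.5), lat = c(43.2, 47.4))
p + geom_point(data = sites, aes(x = long, y = lat),
               inherit.aes = FALSE, colour = "red", size = 2)
```

From here, `geom_point()` (or `geom_tile()` for gridded data) layers can encode whatever distribution you want to display over the map.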

Is my analysis problematic? A simulation-based example


This post is a reply to a recent question on ling-R-lang by Meredith Tamminga. Meredith was wondering whether an analysis she had in mind for her project was circular, i.e., whether it could itself cause the pattern of results predicted by the hypothesis she was interested in testing. I felt her question (described below in more detail) was an interesting example that might best be answered with some simulations. Reasoning through an analysis can, of course, help a lot in understanding (or better, as in Meredith’s case, anticipating) problems with the interpretation of the results. Not infrequently, however, I find that intuition fails or isn’t sufficiently conclusive. In those cases, simulations can be a powerful tool for understanding your analysis. So I decided to give it a go and use this as an example of how one might approach this type of question.

Figure 1: Results of 16 simulated priming experiments with a robust priming effect (see title for the true relative frequency of each variant in the population). For explanation see text below.
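To give a flavor of what such a simulation looks like, here is a stripped-down sketch in base R. The sample size, baseline rate, and effect size are invented for illustration; this is not Meredith’s analysis, just the general recipe of simulating data from a known generative model and checking what the analysis recovers.

```r
set.seed(1)

# simulate one priming experiment: a binary choice between two variants,
# where a prime raises the log-odds of producing the primed variant
simulate_experiment <- function(n_trials = 200, p_base = 0.3, prime_effect = 1) {
  primed <- rep(c(0, 1), each = n_trials / 2)
  p <- plogis(qlogis(p_base) + prime_effect * primed)
  response <- rbinom(n_trials, size = 1, prob = p)
  glm(response ~ primed, family = binomial)
}

# run 16 simulated experiments and extract the estimated priming effect
fits <- replicate(16, simulate_experiment(), simplify = FALSE)
effects <- sapply(fits, function(m) coef(m)["primed"])
summary(effects)  # estimates scatter around the true log-odds effect of 1
```

Because the true effect is known here, any systematic deviation of the estimates from it would indicate a problem with the analysis itself.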

Read the rest of this entry »

A few reflections on “Gradience in Grammar”


In my earlier post I provided a summary of last week’s workshop on Gradience in Grammar at Stanford. The workshop prompted many interesting discussions, but here I want to talk about an (admittedly long-running) discussion it didn’t prompt. Several of the presentations at the workshop talked about prediction/expectation and how they are a critical part of language understanding. One implication of these talks is that understanding the nature and structure of our implicit knowledge of linguistic distributions (linguistic statistics) is crucial to advancing linguistics. As I was told later, there were, however, a number of people in the audience who thought that this type of data doesn’t tell us anything about linguistics and, in particular, grammar (unfortunately, this opinion was expressed outside the Q&A session and not to the people giving the talks, so it didn’t contribute to the discussion). Read the rest of this entry »

Congratulations to Dr. Alex B. Fine


It’s my great pleasure to announce to the world (i.e., all 4 readers subscribed to this blog) that Alex B. Fine successfully defended his thesis, entitled “Prediction, Error, and Adaptation During Online Sentence Comprehension”, jointly advised by Jeff Runner and me. Alex is the first HLP lab graduate (who started his graduate studies in the lab), so we gave him a very proper send-off and roasted the heck out of him. Alex will be starting his post-doc at the University of Illinois Psychology Department in June, working with Gary Dell, Sarah Brown-Schmidt, and Duane Watson.

Dr. Fine defending

Read the rest of this entry »

NSF post-doctoral funding opportunities


Thanks to Jeff Runner, I just became aware of this post-doctoral program of the NSF (in SBE, i.e., the Social, Behavioral & Economic Sciences, which includes psychology, cognitive science, and linguistics). The program also recently underwent some changes. It provides 2 years of funding. As for eligibility, let me quote the linked page: “Ph.D. degree of the fellowship candidate must have been obtained within 24 months before application deadline (previously was within 30 months) or within 10 months after the application deadline (previously was 12 months).”

Good luck to everyone interested.

transferring installed packages to a different installation of R


It used to take me a while to reinstall all the R packages that I use after upgrading to a new version of R.  I couldn’t think of another way to do this than to create a list of installed packages by examining the R package directory, and to manually select and install each one of those packages in the new version of R.  In order to ensure that my home and office installation of R had the same packages installed, I did something similar.

I recently discovered that there is a much, much easier way to transfer the packages that you have installed to a different installation of R.  I found some R code on the web that I adapted to my needs.  Here is what you need to do:

1. Run the script “store_packages.R” in your current version of R.

# store_packages.R
# stores a list of your currently installed packages

tmp <- installed.packages()

# keep only packages without a "Priority", i.e., exclude the base and
# recommended packages that ship with R itself
installedpackages <- as.vector(tmp[is.na(tmp[, "Priority"]), 1])
save(installedpackages, file="~/Desktop/installed_packages.rda")

(Make sure that all the quotation marks in the script are straight.  The scripts will generate an error if they include any curly quotation marks.  For some reason, when I saved this blog entry, some quotation marks changed to curly ones.  WordPress is probably to blame for this problem, which I have not been able to fix.)

2. Close R.  Open the installation of R that you want the packages to be installed in.

3. Run the script “restore_packages.R”.

# restore_packages.R
# installs each package from the stored list of packages

load("~/Desktop/installed_packages.rda")
for (count in 1:length(installedpackages)) install.packages(installedpackages[count])

Note that if you want to install the list of packages in an installation of R on a different computer, you should transfer the .rda file that is created by the store_packages script to that computer, and make sure that the path for the “load” command in the restore_packages script is set to the right location.

The reproducibility project


Thanks to Anne Pier Salverda, who made me aware of this project to replicate all studies in certain psych journals, including APA journals that publish psycholinguistic work, such as JEP:LMC. This might be a fine, slightly delayed April Fools’ joke, but it sure is a great idea! In a similar effort, researchers apparently found that only 6 out of 53 cancer studies replicated (see linked article).

And while we are at it, here’s an article that, if followed, is guaranteed to increase the proportion of replications (whereas power, effect sizes, lower p-values, family-wise error corrections, min-F and all the other favorites out there are pretty much guaranteed not to do the job). Simmons et al. (2011), published in Psychological Science, show what we should all know but all too often forget or belittle: lax criteria for excluding data, adding additional subjects, transforming data, and adding or removing covariates inflate the Type I error rate (in combination easily to over 80% false positives at p<.05!).  Enjoy.
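The core of the Simmons et al. point is easy to demonstrate with a few lines of base R. The toy simulation below implements just one researcher degree of freedom, optional stopping with a second look at the data; the sample sizes and number of simulations are arbitrary, not those from the paper.

```r
set.seed(42)

# one null "study": no true effect, both groups from the same distribution
one_study <- function() {
  a <- rnorm(20); b <- rnorm(20)
  p <- t.test(a, b)$p.value
  if (p >= .05) {                         # not significant? add 10 subjects
    a <- c(a, rnorm(10)); b <- c(b, rnorm(10))
    p <- min(p, t.test(a, b)$p.value)     # report whichever test "worked"
  }
  p < .05                                 # was a significant result reported?
}

rate <- mean(replicate(5000, one_study()))
rate  # noticeably above the nominal .05
```

Even this single, seemingly innocent practice inflates the false positive rate; stacking several such degrees of freedom is what drives the rate toward the figures Simmons et al. report.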

Correlation plot matrices using the ellipse library


My new favorite library is the ellipse library. It includes functions for creating ellipses from various objects. It has a function, plotcorr(), to create a correlation matrix plot in which each correlation is represented by an ellipse approximating the shape of a bivariate normal distribution with the same correlation. While the function itself works well, I wanted a bit more redundancy in my plots and modified the code. I kept (most of) the main features provided by the function and added a few: the ability to plot ellipses and correlation values in the same plot, to manipulate what is placed along the diagonal, and to control the rounding of the numbers plotted. Here is an example with some color manipulations. The colors represent the strength and direction of the correlation, from -1 to 0 to 1, in University of Rochester-approved red to white to blue.

First the function code:

Read the rest of this entry »
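For comparison, here is what the stock plotcorr() gives you with a red-to-white-to-blue color mapping. This is a minimal sketch using the unmodified function, not the modified code from this post, and the palette construction is mine.

```r
library(ellipse)

# correlation matrix for a handful of mtcars variables
corr <- cor(mtcars[, c("mpg", "disp", "hp", "wt", "qsec")])

# map each correlation in [-1, 1] onto a 200-step red-white-blue palette
pal  <- colorRampPalette(c("red", "white", "blue"))(200)
cols <- pal[round((corr + 1) / 2 * 199) + 1]

# one ellipse per cell, filled with the color for that correlation
plotcorr(corr, col = cols, mar = c(1, 1, 1, 1))
```

The modified function discussed in this post adds the numeric correlation values and diagonal control on top of this basic display.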

Google scholar now provides detailed citation report


This might be of interest to some of you: Google Scholar now allows you to correct links or citations to your work. It also provides a complete summary of all your citations, by article, by year, etc. It’s functionality similar to academia.edu, but it lets you remove wrong links to your work (e.g., to old prepublished manuscripts).

The interface is rather convenient since it allows you to import all references from Scholar, which are almost 95% correct. Overall, it’s actually much more convenient than academia.edu (though I’d say it serves a slightly different purpose). It also generates a list of all your co-authors and other schnick-schnack ;). Check it out. Sweet.

Read the rest of this entry »

some (relatively) new funding mechanisms through NSF


This might be of interest to folks, in case you haven’t seen it. First, there are RAPID and EAGER. RAPID is a mechanism for research that requires fast funding decisions (e.g., b/c the first language with only one phoneme was just discovered but its last speaker is about to enter a vow of silence). EAGERs are “Early-concept Grants for Exploratory Research”, i.e., high-risk research with a high potential for high pay-off. One important property of both mechanisms is that submissions do not have to be sent out for external review, which should substantially shorten the time until you hear back from NSF.

Second, there is now a new type of proposal that is specifically aimed at interdisciplinary work that would not usually be funded by any of the existing NSF panels alone – CREATIV: Creative Research Awards for Transformative Interdisciplinary Ventures.

Note that all three of these funding types allow no re-submission.

Bayesian Data Analysis, p-values, and more: What do we need?


Some of you might find this open letter by John Kruschke (Indiana University) interesting. He is making a passionate argument to abandon traditional “20th century” data analysis in favor of Bayesian approaches.

Interesting article about Google Scholar, H-index, etc.


If you’re interested in different measures of impact (for journals, people, or articles), have a look at this article comparing ISI’s Web of Science against Google Scholar results. They also discuss different measures like the h-index and one I hadn’t heard of (the g-index), which accounts for the potential impact of highly cited articles beyond what they contribute to the h-index.

2010 in review


Crunchy numbers


In 2010, there were 22 new posts, growing the total archive of this blog to 114 posts. There were 16 pictures uploaded, taking up a total of 21MB. That’s about a picture per month.

The busiest day of the year was December 21st with 316 views. The most popular post that day was Watch the OCP rock Google’s ngram viewer.

Where did they come from?

Some visitors came searching, mostly for nagelkerke, glmer, mcmcglmm, nagelkerke r square, and nagelkerke r2.

Attractions in 2010

These are the posts and pages that got the most views in 2010.


  • Watch the OCP rock Google’s ngram viewer (December 2010)
  • Information on applying for a waiver of the J1-visa Foreign Residence Requirement (April 2009)
  • Multinomial random effects models in R (May 2009)
  • Nagelkerke and CoxSnell Pseudo R2 for Mixed Logit Models (August 2009; 1 comment)
  • Plotting effects for glmer(, family=”binomial”) models (January 2009)

R code for LaTeX tables of lmer model effects


Here’s some R code that outputs text on the console that you can copy-paste into a .tex file to create nice LaTeX tables of the fixed effects of lmer models (it only works for family=”binomial”). Effects with p < .05 will appear in bold. The following code produces the table pasted below. It assumes a fitted model mod.all. prednames creates a mapping from predictor names in the model to the predictor names you want to appear in the table. Note that for the TeX to work you need to include \usepackage{booktabs} in the preamble.
Read the rest of this entry »
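The general recipe can be sketched in base R without lme4: pull the fixed-effects table out of the fitted model (e.g., summary(mod)$coefficients), bold the significant rows, and glue the pieces into a booktabs tabular. The function and argument names below are mine for illustration, not those of the original code.

```r
# turn a fixed-effects coefficient matrix (columns: Estimate, Std. Error,
# z value, Pr(>|z|)) into a booktabs LaTeX table, bolding p < .05 effects
coef_table_to_latex <- function(coefs, prednames = rownames(coefs)) {
  rows <- sapply(seq_len(nrow(coefs)), function(i) {
    est <- sprintf("%.2f", coefs[i, 1])
    se  <- sprintf("%.2f", coefs[i, 2])
    p   <- coefs[i, 4]
    if (p < .05) est <- paste0("\\textbf{", est, "}")
    paste(prednames[i], est, se, sprintf("%.3f", p), sep = " & ")
  })
  paste0("\\begin{tabular}{lrrr}\n\\toprule\n",
         "Predictor & Coef. & SE & $p$ \\\\\n\\midrule\n",
         paste0(rows, " \\\\", collapse = "\n"), "\n",
         "\\bottomrule\n\\end{tabular}\n")
}

# usage with a made-up coefficient matrix (in practice, pass
# summary(mod.all)$coefficients from a fitted binomial model)
m <- matrix(c(0.8, 0.2, 4.0, 0.0001,
              -0.1, 0.3, -0.33, 0.74),
            nrow = 2, byrow = TRUE,
            dimnames = list(c("(Intercept)", "Frequency"), NULL))
cat(coef_table_to_latex(m))
```

The cat() output can be pasted directly into a .tex document that loads booktabs.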