Ever noticed?

The reproducibility project


Thanks to Anne Pier Salverda, who made me aware of this project to replicate all studies in certain psych journals, including APA journals that publish psycholinguistic work, such as JEP:LMC. This might be a fine April Fools’ joke, slightly delayed, but it sure is a great idea! In a similar study, researchers apparently found that only 6 out of 53 cancer studies replicated (see linked article).

And while we are at it, here’s an article that, if followed, is guaranteed to increase the proportion of replications (whereas power, effect sizes, lower p-values, family-wise error corrections, min-F and all the other favorites out there are pretty much guaranteed to not do the job). Simmons et al. (2011), published in Psychological Science, show what we should all know but what is all too often forgotten or belittled: lax criteria for excluding data, adding additional subjects, transforming data, and adding or removing covariates all inflate the Type I error rate (in combination easily to over 60% false positives for p<.05, and over 80% for p<.1!!!). Enjoy.
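To get a feel for just one of these practices, here is a minimal simulation sketch of "adding additional subjects": test after 20 subjects per group and, if the result is not significant, add 10 more and test again. This is not the procedure from the article; the sample sizes, the number of simulations, and the use of a z-test with known variance (chosen to keep the example free of external libraries) are all assumptions of mine.

```python
import random
from statistics import NormalDist, mean

def two_sided_p(a, b):
    """Two-sided p-value of a z-test for equal means (equal group sizes, variance known to be 1)."""
    z = (mean(a) - mean(b)) / (2 / len(a)) ** 0.5
    return 2 * (1 - NormalDist().cdf(abs(z)))

def simulate(n_sims=5000, n_start=20, n_extra=10, alpha=0.05, seed=1):
    """Return the false positive rates without and with one extra look at the data."""
    rng = random.Random(seed)
    fixed_hits = peeking_hits = 0
    for _ in range(n_sims):
        # Both groups come from the same distribution, so every "effect" is a false positive.
        a = [rng.gauss(0, 1) for _ in range(n_start)]
        b = [rng.gauss(0, 1) for _ in range(n_start)]
        significant_first = two_sided_p(a, b) < alpha
        fixed_hits += significant_first
        if not significant_first:
            # Not significant yet: collect more subjects and test again.
            a += [rng.gauss(0, 1) for _ in range(n_extra)]
            b += [rng.gauss(0, 1) for _ in range(n_extra)]
        peeking_hits += two_sided_p(a, b) < alpha
    return fixed_hits / n_sims, peeking_hits / n_sims

fixed_rate, peeking_rate = simulate()
print(f"fixed sample size:  {fixed_rate:.3f}")
print(f"one extra look:     {peeking_rate:.3f}")
```

Since every data set that is significant at the first look also counts under the second rule, the rate with the extra look can never be lower than the fixed-sample rate; the simulation just shows by how much it grows.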


Correlation plot matrices using the ellipse library


My new favorite library is the ellipse library. It includes functions for creating ellipses from various objects. One of them, plotcorr(), creates a correlation matrix plot in which each correlation is represented by an ellipse approximating the shape of a bivariate normal distribution with the same correlation. While the function itself works well, I wanted a bit more redundancy in my plots and modified the code. I kept (most of) the main features provided by the function and added a few: the ability to plot ellipses and correlation values on the same plot, the ability to control what is placed along the diagonal, and control over the rounding of the numbers plotted. Here is an example with some color manipulations. The colors represent the strength and direction of the correlation, from -1 through 0 to 1, in University of Rochester approved red to white to blue.
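As an aside, and not the modified function itself: the idea behind the ellipses and the color scale can be sketched in a few lines. The sketch is in Python rather than R, and the function names are hypothetical. It assumes the common parametrization x = cos(a + d/2), y = cos(a - d/2) with d = acos(r), which gives a circle for r = 0 and collapses onto a diagonal line as r approaches 1 or -1, plus a simple linear fade to white for the colors.

```python
import math

def correlation_ellipse(r, n_points=100):
    """Points on a unit-scale ellipse whose shape reflects the correlation r."""
    r = max(-1.0, min(1.0, r))  # guard against rounding just outside [-1, 1]
    d = math.acos(r)
    points = []
    for i in range(n_points):
        a = 2 * math.pi * i / n_points
        points.append((math.cos(a + d / 2), math.cos(a - d / 2)))
    return points

def correlation_color(r):
    """Map r in [-1, 1] to an (R, G, B) tuple: red at -1, white at 0, blue at +1."""
    fade = 1.0 - abs(r)
    if r < 0:
        return (1.0, fade, fade)
    return (fade, fade, 1.0)

for r in (-0.9, 0.0, 0.9):
    xs, ys = zip(*correlation_ellipse(r))
    print(r, correlation_color(r), round(max(xs), 3), round(max(ys), 3))
```

The points can be handed to any plotting routine as a closed polygon filled with the matching color.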

First the function code:

Read the rest of this entry »

Google Scholar now provides detailed citation reports


This might be of interest to some of you: Google Scholar now allows you to correct links or citations to your work. It also provides a complete summary of all your citations, by article, by year, etc. The functionality is similar to academia.edu, but it lets you remove wrong links to your work (e.g. to old prepublication manuscripts).

The interface is rather convenient since it allows you to import all your references from Scholar, and the imported list is about 95% correct. Overall, it’s actually much more convenient than academia.edu (though I’d say it serves a slightly different purpose). It also generates a list of all your co-authors and other schnick-schnack ;). Check it out. Sweet.


Some (relatively) new funding mechanisms through NSF


This might be of interest to folks, in case you haven’t seen it. First, there’s RAPID and EAGER. RAPID is a mechanism for research that requires fast funding decisions (e.g. because the first language with only one phoneme was just discovered but its last speaker is just about to take a vow of silence). EAGERs are “Early-concept Grants for Exploratory Research”, i.e. grants for high-risk research with a high potential pay-off. One important property of both mechanisms is that submissions do not have to be sent out for external review, which should substantially shorten the time until you hear back from NSF.

Second, there is now a new type of proposal specifically aimed at interdisciplinary work that would not usually be funded by any of the existing NSF panels alone: CREATIV, or Creative Research Awards for Transformative Interdisciplinary Ventures.

Note that none of these three funding types allows resubmission.

Bayesian Data Analysis, p-values, and more: What do we need?


Some of you might find this open letter by John Kruschke (Indiana University) interesting. He makes a passionate argument for abandoning traditional “20th century” data analysis in favor of Bayesian approaches.

Interesting article about Google Scholar, H-index, etc.


If you’re interested in different measures of impact (for journals, people, or articles), have a look at this article comparing ISI’s Web of Science against Google Scholar results. The authors also discuss different measures, like the h-index and one I hadn’t heard of, the g-index, which accounts for the effect of highly cited articles beyond what they contribute to the h-index.
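For those who want to see the difference between the two measures, here is a small sketch in Python. It uses the usual definitions: the h-index is the largest h such that h articles have at least h citations each, and the g-index is the largest g such that the g most cited articles together have at least g squared citations. The citation counts are made up for illustration, and capping g at the number of articles is an assumption of this sketch.

```python
def h_index(citations):
    """Largest h such that h articles have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h

def g_index(citations):
    """Largest g such that the top g articles together have at least g*g citations."""
    ranked = sorted(citations, reverse=True)
    g = 0
    total = 0
    for rank, cites in enumerate(ranked, start=1):
        total += cites
        if total >= rank * rank:
            g = rank  # capped at the number of articles in this sketch
    return g

# Hypothetical citation counts for two authors with ten articles each.
steady = [6, 5, 5, 4, 4, 4, 3, 3, 2, 2]
one_hit = [50, 4, 3, 2, 1, 0, 0, 0, 0, 0]

print("steady: ", h_index(steady), g_index(steady))    # 4 4
print("one hit:", h_index(one_hit), g_index(one_hit))  # 3 7
```

The author with the single highly cited article has the lower h-index but the clearly higher g-index, which is exactly the effect the g-index is meant to capture.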

2010 in review


Crunchy numbers


In 2010, there were 22 new posts, growing the total archive of this blog to 114 posts. There were 16 pictures uploaded, taking up a total of 21 MB. That’s about a picture per month.

The busiest day of the year was December 21st with 316 views. The most popular post that day was Watch the OCP rock Google’s ngram viewer.

Where did they come from?

Some visitors came searching, mostly for nagelkerke, glmer, mcmcglmm, nagelkerke r square, and nagelkerke r2.

Attractions in 2010

These are the posts and pages that got the most views in 2010.


Watch the OCP rock Google’s ngram viewer December 2010


Information on applying for a waiver of the J1-visa Foreign Residence Requirement April 2009


Multinomial random effects models in R May 2009


Nagelkerke and CoxSnell Pseudo R2 for Mixed Logit Models August 2009


Plotting effects for glmer(, family="binomial") models January 2009