Watch the OCP rock Google’s ngram viewer


Thanks to Zach Warren for pointing me to this cool tool by Google: the ngram viewer.

And just as a demonstration of how cool this is: watch how the OCP rocks that-mention in complement clauses to the verb believe. Of course, we would have to check for other verbs that embed complement clauses and for the a priori probability of this vs. that determiner and pronoun uses. And no, that will be my paper. So don’t you dare write it. If you are into OCP effects on optional function word use (e.g. because they might be taken to argue for phonological effects on grammatical encoding), see the references below.

And here are some more verbs for those complement clause lovers out there:


Ever noticed? Have a closer look at Google counts


The other day, Anne Pier Salverda made me aware of the following strange coincidence. While googling for “two women were having dinner” yields only a handful of hits (3 when I performed the search today), the search for “two men were having dinner” yields tens of thousands of hits (about 220,000 when I performed the search). Let’s not ask why Anne Pier was searching for these strings to begin with ;), but it was curious, so I looked for more.

This does not seem to be driven by highly repeated mentions of only a few stories. Neither does it seem to be gender-specific. Searches for “two x were having dinner”, where x is one of “boys”, “girls”, “guys”, yield no hits or only a few.

So, is this really just a weird coincidence of this particular string “two men were having dinner”? Curiously, the same set of patterns in the present progressive yields the same asymmetry: “two men are having dinner” yielded about 185,000 hits, whereas the other searches (“women”, “girls”, “boys”, “guys”) yielded no hits or only a handful. The same asymmetry also holds for “two x have dinner”. Huh?!?

Then I had a closer look at these thousands of hits. You can do it yourself for any of the searches (provided that Google hasn’t changed this already): As soon as you click to see any of the hits beyond page 1, Google suddenly claims that there are only about 40-50 hits (depending on the particular “Two men HAVE dinner” string used). These few hits in turn seem to come from only a few news sources. So, actually, all that seems to be going on is that there are a few more hits for the “men” examples, which is perfectly consistent with the fact that the internet (surprisingly) seems to talk more about “men” than “women” (a ratio of about 10:7 when I last looked).

Great. Have you ever noticed anything comparable? Is this a bug? I just checked and they seem to return the right count. What a sad day.