Watch the OCP rock Google’s ngram viewer


Thanks to Zach Warren for pointing me to this cool tool by Google: the ngram viewer.

And just as a demonstration of how cool this is: watch how the OCP (Obligatory Contour Principle) rocks that-mention in complement clauses to the verb believe. Of course, we would have to check other complement-clause-embedding verbs and the a priori probability of this vs. that in their determiner and pronoun uses. And no, that will be my paper. So don’t you dare write it. If you are into OCP effects on optional function word use (e.g. because they might be taken to argue for phonological effects on grammatical encoding), see the references below.

And here are some more verbs for those complement clause lovers out there:

Interesting how the effect seems to get stronger for some verbs over time. At first I thought this was due to small-sample bias (keep in mind that there is more data overall for later years, which gives more accurate estimates of the two probabilities, both of which are low). But it also seems to hold for some high-frequency verbs (e.g. think above), albeit to a much subtler extent.

Verbs with rather similar meanings seem to show different historic trends:

Ah, how sweet this would be if it came with confidence intervals. I guess if we had the absolute amount of data for each bin, we could calculate them. Anybody listening at Google?
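For what it's worth, here is a rough sketch of how such intervals could be computed once per-year counts are available. It assumes the tab-separated layout quoted in the comments below (ngram, year, match_count, page_count, volume_count) and uses a Wilson score interval, which is one reasonable choice for low probabilities but not the only one. The function names, the n-gram strings, and the counts are all made up for illustration; they are not real Google data.

```python
import math


def parse_line(line):
    """Split one tab-separated record into (ngram, year, match_count).

    Assumes the layout: ngram TAB year TAB match_count TAB page_count TAB volume_count.
    """
    ngram, year, match_count, _page_count, _volume_count = line.rstrip("\n").split("\t")
    return ngram, int(year), int(match_count)


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 gives roughly 95%)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z ** 2 / total
    centre = (p + z ** 2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    return centre - half, centre + half


# Toy records for a single year (hypothetical strings and counts).
with_that = parse_line("verb that X\t1950\t30\t30\t25")
without_that = parse_line("verb X\t1950\t70\t68\t60")

k = with_that[2]                 # occurrences with the complementizer
n = k + without_that[2]          # all occurrences in this year's bin
low, high = wilson_interval(k, n)
print(f"{with_that[1]}: p = {k / n:.3f}, interval = [{low:.3f}, {high:.3f}]")
```

With ten times as much data in a bin and the same proportion, the interval shrinks noticeably, which is exactly why the early, sparse years are hard to interpret without it.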

Related references

  1. Jaeger, T. F. in press. Phonological Optimization and Syntactic Variation: The Case of Optional that. Proceedings of the 32nd Annual Meeting of the Berkeley Linguistics Society.
  2. Lee, M. W. and Gibbons, J. 2007. Rhythmic alternation and the optional complementiser in English: New evidence of phonological influence on grammatical encoding. Cognition 105(2), 446-456.
  3. Walter, M.A. and Jaeger, T. F. 2008. Constraints on English that-drop: A strong lexical OCP effect. In Edwards, R.L., Midtlyng, P.J., Stensrud, K.G., and Sprague, C.L. (eds.) Proceedings of the Main Session of the 41st Meeting of the Chicago Linguistic Society, 505-519. Chicago, IL: CLS.

4 thoughts on “Watch the OCP rock Google’s ngram viewer”

    Dan Pontillo said:
    December 21, 2010 at 11:38 pm

    They’re releasing the datasets. It looks like a pretty nice format:

    ngram TAB year TAB match_count TAB page_count TAB volume_count NEWLINE


      tiflo said:
      December 22, 2010 at 4:03 am

      yep, I think it’s really neat. Robin Melnick sent me a nice graph showing how you can use this tool to beautifully see the rise of do-insertion (or aux-inversion) in English and the decline of the original V-first order.


    tiflo said:
    December 29, 2010 at 8:11 pm

    Awesome: I am a bloke:,com_kunena/Itemid,73/func,view/catid,28/id,468145/#468540 (I made it!). Thanks also, as always, to the LousyLinguist.


    2010 in review « HLP/Jaeger lab blog said:
    January 3, 2011 at 9:48 am

    […] The busiest day of the year was December 21st with 316 views. The most popular post that day was Watch the OCP rock Google’s ngram viewer. […]

