I recently parsed the British National Corpus (BNC) using the latest version of the parser from Charniak’s group at Brown. While running the results through ‘tgrep2 -p’ (i.e., building a corpus file), I ran into some trouble that I thought I’d write up here in case it saves someone a bit of grief.