Posts Tagged ‘syntactic corpora’

29
Mar
11

Celebrating Tom Wasow: A Festschrift

At this year’s CUNY Sentence Processing Conference, Emily Bender and Jennifer Arnold presented a Festschrift celebrating Thomas Wasow. Here’s what the publisher’s site (CSLI) says (picture taken from the publisher’s website, which is hopefully ok; see the book for copyrights):

This book is a collection of papers on language processing, usage, and grammar, written in honor of Tom Wasow to commemorate his career on the occasion of his 65th birthday. Tom is a professor of linguistics and philosophy. But more accurately, he is a renaissance academic, having done work that connects with many different disciplines, including formal linguistics, sociolinguistics, historical linguistics, psycholinguistics, computational linguistics, and philosophy. Appropriately, this book reflects the diversity of Tom’s research and interests, including topics from multiple branches of linguistics and human information processing. These papers are written with minimal background assumed, so they can be used as teaching materials for beginning scholars. As such, this volume is a tribute to what is perhaps Tom’s most lasting contribution to the field—the mentorship and inspiration he provided to his students and collaborators, many of whom have contributed to this volume.

The book contains introductory and overview articles on a variety of topics in cognitive science from Emily M. Bender, Dan Flickinger, Stephan Oepen, Ash Asudeh, Peter Sells, Amy Perfors, James Paul Gee, John R. Rickford, T. Florian Jaeger, Jennifer E. Arnold, Harry J. Tily, Neal Snider, John A. Hawkins, and Susanne Riehemann.

Continue reading ‘Celebrating Tom Wasow: A Festschrift’

09
Feb
11

Compiling TGrep2 on an Intel Macintosh

When TGrep2 (Ed: a search tool for syntactic corpora in Penn Treebank format) was originally written, all Macs were PPC, but since late 2006 all Macs have been x86, and newer ones are even x86_64. To compile TGrep2 on an Intel Mac, you have to make some small modifications to the TGrep2 and DRUtils source directories before you compile. This fix works at least on Mac OS X 10.5 and 10.6. (Apologies for WordPress mangling the <code> blocks.)

First the patch for DRUtils:
Continue reading ‘Compiling TGrep2 on an Intel Macintosh’

21
Jul
09

LSA09-125: Psycholinguistics and Syntactic Corpora

The LSA Summer Institute is almost over and it has been a lot of fun so far. I didn’t get to see nearly as many talks and classes as I had hoped to, but instead there were tons of interesting conversations, new ideas, and just nice moments hanging out in the sun.

Brief update: It couldn’t have been different: I missed my flight. That happens every time I try to leave the Bay Area. I am so used to it, I am not even trying to be on time anymore ;) . Ah well, it gives me a chance to enjoy a cappuccino in my favorite SF Cafe (Ritual Roasters) and even to attend Dan’s party (yippie!). Oh, and to upload some random pictures from the classroom. Yeah, pretty dark, I know. If you have better pictures, can you send them to me so I can upload them? Also, here are some pics from our office hours at Caffe Strada (thanks to Judith and Alex for a great job!):

LSA125-ers — thanks for an enjoyable class, for all the questions, and I hope you keep enjoying your projects (or, if nothing else, now know for certain that you really really never want to work with corpora ;) . Send us an update about your papers as they progress.

To everyone else out there: If you’re interested in the use of syntactic corpora to investigate language production, you may find our LSA125 class webpage useful (see especially the links and information on the corpus pages, but also the slides). If you use material from this page, please let us know. Thanks to Judith, we now have a nicely documented version of the TGrep2 Database Tools, which we have dubbed TDTlite. Alex and Judith have also prepared example projects. TDTlite allows you to combine the output of TGrep2 searches on syntactic corpora into a nice tab-delimited database that can be imported into R, Excel, or the stats program of your choice. While it doesn’t give you the full flexibility of scripting things yourself, it makes it considerably easier to start your own corpus-based project. We’re in the process of polishing things up for distribution (thanks to all the brave members of our class who helped us to understand which parts still need further improvement!). So, if something like that might be of interest to you, let us know whether you would like further information. We hope to have a beta release by the end of August.
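For readers who would rather script this step themselves, here is a rough Python sketch of the general idea of merging several search results into one tab-delimited table. This is not TDTlite's code or its file format. The function names (combine_matches, to_tsv), the column names, the item IDs, and the assumption that each search result is already available as a mapping from item ID to value are all made up for illustration.

```python
import csv
import io


def combine_matches(columns):
    """Merge {column_name: {item_id: value}} into a list of table rows.

    Hypothetical helper: each inner dict stands for the result of one
    corpus search, keyed by an item ID that identifies the matched case.
    """
    # Collect every item ID that appears in at least one search result.
    item_ids = sorted({item for values in columns.values() for item in values})
    rows = [["Item_ID"] + list(columns)]
    for item in item_ids:
        # Cases that a given search did not match get "NA" (R's missing value).
        rows.append([item] + [columns[name].get(item, "NA") for name in columns])
    return rows


def to_tsv(rows):
    """Serialize rows as tab-delimited text that R or Excel can import."""
    buffer = io.StringIO()
    csv.writer(buffer, delimiter="\t", lineterminator="\n").writerows(rows)
    return buffer.getvalue()


# Made-up example data: two searches over the same set of cases.
example = {
    "Complementizer": {"s1": "that", "s2": "none"},
    "VerbLemma": {"s1": "think"},
}
table = combine_matches(example)
print(to_tsv(table), end="")
```

Each search contributes one column, and cases that a search did not match are filled with NA, so the resulting file can be read in R with read.delim() without further cleanup.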



