January 1, 2012

I Say Coca, You Think...

Coca no longer means just the plant from which cocaine is derived, with its many associations, or the name of a gifted comedienne. Spelled COCA, it stands for the Corpus of Contemporary American English, a database that tracks new and emerging meanings of English words. Did Agent Lisbon on "The Mentalist" just mention that she has requested a "bolo" for some suspect? What is a "bolo," anyway? Search COCA and discover that it is a "be on the lookout" alert (or notice) for a car or suspect, as well as a kind of tie. "Be on the lookout" makes more sense in the context of an episode of "The Mentalist." COCA lets you search for phrases and related words as well. I searched the two-word phrase "Socratic method" and got 39 hits; the system gave me context (about 20 words), dates (1990s to today), and classification (academic, popular, or fictional). Results are laid out in columns and the screen can look a little cluttered, but that's not fatal. Guided tour here.

Another database for emerging words is Wordnik, which presents new terms in their unvarnished glory. Wordnik uses many new sources, such as blogs and Twitter, to find its candidates. When I searched "bolo," I got the usual definitions of "bolo tie." When I capitalized "BOLO," I got "be on the lookout" as a definition. Having to know enough about terminology and searching to try capitalizing the word may be a drawback of this database, since neither writers nor searchers can be counted on to capitalize the term. When I searched "socratic method," the system found a limited number of hits, but suggested searching "socratic method" as a phrase. I re-ran the search that way and found many more hits.

The system also offers alternatives for a search when it doesn't find many hits. Other features include "random word" and "word of the day." The database is set up differently from COCA. It more closely resembles a traditional print dictionary, with definitions on the left and examples on the right. Results are easy to read.

More here from the New York Times about both Wordnik and COCA, and online dictionaries in general.

