As promised, I'm doing a collocation competition. Collocation is all about which words generally appear next to other words. So if you took a few million written words and put them all in a database, you could search for which words appear most frequently next to a given word. Those words collocate.
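For the curious, here is a rough sketch of the idea in Python. It is only an illustration and not how any particular corpus engine works: the function name `collocates`, the tiny stopword list, the one-word window and the sample text are all made up for this example, and real tools usually rank by a statistical association measure rather than raw counts.

```python
import re
from collections import Counter

# Hypothetical, very small stopword list; real tools use much longer ones.
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in"}


def collocates(text, target, window=1, stopwords=STOPWORDS):
    """Count the words that appear within `window` words of `target`.

    Returns (word, count) pairs, most frequent first.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for i, token in enumerate(tokens):
        if token != target:
            continue
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            # Skip the target itself and overly frequent words.
            if j != i and tokens[j] not in stopwords:
                counts[tokens[j]] += 1
    return counts.most_common()


# Made-up sample text, standing in for a real corpus.
sample = (
    "The property developer bought land. A software developer wrote code. "
    "The property developer sold the land."
)
ranked = collocates(sample, "developer")
print(ranked[0])
```

In this toy text 'property' sits next to 'developer' twice, so it comes out on top. A real corpus of millions of words works on the same principle, just with far more data.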
So the way the game works is that I will give you a word and you have to try to guess the top word that collocates with it. Using a collocation or corpus engine is strictly forbidden - that's cheating! I will award points based on the top 100 matches for that word, according to the order in which the words appear. So the higher up the matching list your guess is, the more points you get. Note that words like 'the' and 'an' will be ignored as being too frequent and obvious. For example, if I said 'Dell', then 'laptop' and 'computer' might be winning matches. I'll do a couple of words a day since there is a lag in response times. Guesses are first come, first served. You can't repeat a word someone else has guessed for that entry, and you can only guess one word per entry.
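If it helps to see the scoring spelled out, here is a minimal sketch. The only figures stated anywhere in this post are the ones in the round one results: 10 points for a guess in the top 10% of the list and 2 extra for the top word. Everything else here is an assumption made for illustration, including the function name `score_guess` and the idea that each lower band of ten is worth one point less.

```python
def score_guess(guess, ranked_words, band_points=10, top_bonus=2):
    """Score one guess against a ranked list of collocates (best first).

    Only the 10 points for the first band and the 2-point bonus for the
    top word come from the post; the one-point drop per band of ten is
    an assumed scheme, not the actual one.
    """
    top100 = [word.lower() for word in ranked_words[:100]]
    guess = guess.lower()
    if guess not in top100:
        return 0
    rank = top100.index(guess)      # 0 means the top match
    band = rank // 10               # 0 for places 1-10, 1 for 11-20, ...
    points = band_points - band     # assumed: one point less per band
    if rank == 0:
        points += top_bonus
    return points


# Top collocates reported for 'anthropology' in the round one results.
anthropology = [
    "social", "cultural", "london", "university", "history",
    "sociology", "philosophical", "scientific", "student", "professor",
]
print(score_guess("social", anthropology))
print(score_guess("web", anthropology))
```

With that list, 'social' gets the 10 band points plus the 2-point bonus, and a word that is not on the list gets nothing.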
Monday's words
#1: developer
#2: anthropology
#3: timeline
I'll reveal the winning answers the following day.
Monday, 14 July 2008
13 comments:
er...OK? not sure whether we are supposed to be doing any sort of investigation (though I note the strictly vorbodden comment), educated guesses, top of head guesses, collocation at the conference? - also any extra points for a comedy guess?
My entries are:
#1 web..developer
#2 social..anthropology
#3 interactive..timeline
(have a feeling I'm completely rubbish at this, but perhaps I need to do a few warm up laps)
Extra credit entries(?):
#1 developer..geekfest
#2 klingon..anthropology
#3 dungeonmaster..timeline
My entries are:
#1: object..developer
#2: web..anthropology
#3: historical..timeline
and for my extra credit guesses ;)
#1: transcendental*...developer
#2: bandit..anthropology
#3: doom...timeline
*McCAAAAIINNNN!
I will give extra points for humorous answers, though not as much as regularly correct answers, how's that?
You can do a limited amount of research, just no collocation or corpus tool searches. Top of the head is probably more fun and less time consuming though.
2 bonus points to Louise for geekfest and Klingon, but I just did not get yours Robin, sorry - no points for you. Actual entry points awarded later today.
Never mind! To explain -
1- John McCain/Stephen Colbert joke
2- bandit as in one armed bandit.
3- doom timeline, as in credit crunch media reaction.
Still a bit confused about what a one-armed bandit has to do with anthropology.
People in Las Vegas don't interact with people, just slot machines don't they? ;)
Ok, here are the results from round one:
Developer:
Top words are property, late, software, land, millionaire, kit, trump, donald, wealthy
Unfortunately neither of your entries came back in the top 100. I think 'web' might have in a different corpus, but this particular corpus is 56 million words of written and spoken English, and I guess it wasn't one of the top matches there. 0 points to both.
Anthropology:
Top word is social, scoring a whopping 12 points for Louise (10 for being in the top 10%, and an extra 2 for getting the top choice). Way to go Louise.
Other words were cultural, london, university, history, sociology, philosophical, scientific, student, professor. Sorry Robin, no 'web'
Timeline:
Few entries for this one unfortunately; I guess it's not a common word. Top words were sample, reading, curriculum, following, pinpointing, bibliography, chronological. No points there I'm afraid. Good guesses, I would've thought, though.
Oh, most of the slot machines are push button, but I see what you mean. Still not sure of the anthropology connection though. I'm looking for relevant humor here! Get your game up Robin, you're down in the scorecard.
oh I'm so very, very proud - I'd like to thank my friends, family and everyone who has blah blah (aka I think I may have peaked early, gonna make the most of it).
annoyingly I'd thought of property for developer but thought it must have been overtaken by the huuge amount of online stuff (as per wesch) and that on the web people mostly talk about web people...
anyway, feeling chuffed to be in double figures - more than happy if you want to close the comp now ;-)
I should note that the corpus does not just come from the web, but is spoken and written samples of English from letters, speeches, etc.
yeah I get that (well, I get it now, anyway) - it was just that if you follow the exponential growth of information premise of wesch's exabytes etc then even though the non-web stuff had been around for a lot longer the proportions should be tipping in favour of the web stuff, that was my thought - obviously wrong (very wrong) but... I was trying
There are some collocation and concordance tools that can use a search engine like Google to generate the corpus of words. Those might yield different results.