Rob Speer's blog

New names for our Python packages

Years ago, we made a decision to put all our Python packages in a common namespace called csc. To put it simply, this did not work well. Today, we have finally undone this decision by deprecating the csc namespace and renaming every single one of our modules. Python programmers, please learn from our mistake and never make a namespace package.

If you upgrade our software, you'll get more straightforward names for our modules. The new releases are ConceptNet 4.0.0, Divisi2 2.2.0, csc-utils 0.6, and a new package called simplenlp 0.9.

The new names to import are:

  • csc.conceptnet → conceptnet
  • csc.divisi2 → divisi2
  • csc.nl → simplenlp
  • csc.util → csc_utils
  • csc.corpus → conceptnet.corpus
  • csc.lib → conceptnet.lib
  • csc.django_settings → conceptnet.django_settings
  • csc.pseudo_auth → conceptnet.pseudo_auth
  • csc.webapi → conceptnet.webapi

There's still a package called csc, but it exists only to keep your old code working. When you import something from csc, this package finds the new name of the package and imports everything from there.
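If you would rather update your imports than rely on the compatibility package, the renames listed above can be written down as a small lookup table. This is only a sketch for translating names in your own code, not how the csc package itself is implemented. The names `RENAMES` and `new_module_name` are made up for illustration, and the handling of submodules (for example, treating `csc.conceptnet.models` as `conceptnet.models`) is an assumption that the layout inside each package did not change.

```python
# Old-to-new module names, copied from the list in this post.
RENAMES = {
    "csc.conceptnet": "conceptnet",
    "csc.divisi2": "divisi2",
    "csc.nl": "simplenlp",
    "csc.util": "csc_utils",
    "csc.corpus": "conceptnet.corpus",
    "csc.lib": "conceptnet.lib",
    "csc.django_settings": "conceptnet.django_settings",
    "csc.pseudo_auth": "conceptnet.pseudo_auth",
    "csc.webapi": "conceptnet.webapi",
}


def new_module_name(old_name):
    """Translate an old csc.* module path to its new name.

    Hypothetical helper. Assumes submodules keep the same relative
    path under the renamed package.
    """
    for old, new in RENAMES.items():
        # Match the package itself or anything below it, but not a
        # different package that merely shares a prefix.
        if old_name == old or old_name.startswith(old + "."):
            return new + old_name[len(old):]
    # Anything that was never under csc is left alone.
    return old_name
```

For example, `new_module_name("csc.nl")` gives `"simplenlp"`, and a name outside the old namespace, such as `"os.path"`, comes back unchanged.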

Read on for more details, including how to update your installation.

Applying common-sense computing to an "NLP Challenge"

A blogger named Joseph Turian recently posted a challenge about finding semantically-related terms based on a very noisy data set. The results will be evaluated by humans.

This sounded like exactly what our techniques are good at.

The input data was certainly as noisy as promised. It was basically a corpus of .uk Web pages, run through a very unreliable phrase detector and reduced down to unique phrases.

Divisi at SciPy 2010

Catherine and I are currently at SciPy 2010, presenting Divisi, our Python SVD toolkit.

Building APIs

I've been spending some time working on backend code -- the code that can make everything else we do easier and smoother.

For one thing, I want to draw attention to the ConceptNet Web API. This has actually been around for a while, but I've recently added a couple of new commands to it, and I consider it stable enough that it deserves more publicity.

If you want to build an application that uses information from ConceptNet, the "traditional" way is to install our Python API, download a copy of the database, set up that database, and write Django queries that give you what you want. This is overkill for a lot of situations, and of course doesn't work very well for programs that aren't written in Python.

The ConceptNet Web API is there as a more flexible alternative. Instead of having to set up anything, you can use it right now: Hey ConceptNet, tell me three things about penguins.

New SQLite database

We finally have an updated SQLite database for ConceptNet. It's designed for use with ConceptNet 4.0b8 (just released). This will fix the long-standing "best_raw_id" bug.

Verbosity, and one meeeelion sentences

How did we just get nearly 200,000 new statements in Open Mind Common Sense?

We've just imported a whole lot of data from Verbosity, one of Luis von Ahn's Games with a Purpose. Verbosity collects common sense knowledge through a game: one person is given a word, and needs to get the other person to guess that word by listing common-sense facts about it.

Welcome back, Catherine Havasi!

Catherine Havasi co-created the Open Mind Common Sense project as an undergraduate researcher working with Push Singh way back in 1999. For the last five years, she's been working on a doctorate in computational linguistics at Brandeis University. She's been doing a lot of cross-campus research with this group.

Last month, she finally earned her Ph.D. (congratulations!). Now, she's returned to the Media Lab as a post-doc, where she'll once again be able to work on Open Mind and its applications full time. It's great to have her as an official part of the group again!

Divisi for Windows

The theme of this week is "make it so that our underlying code can actually be run by other people". One recent accomplishment: I finally figured out how to make a Windows installer of Divisi, our machine learning library. (The hard work to make Divisi compile on Windows at all was done by contributor Akshay Bhat. Thanks, Akshay.)

Speed issues

The Open Mind Common Sense website is currently really, really slow, and I'm sorry about that.

As we acquire more users and try to do more complicated reasoning behind the scenes, clearly what we need to do is spend the piles of money that we have just sitting around on a huge fancy server.

Sorry, I meant to say: clearly what we need to do is keep finding ways to cache lots of stuff and using whatever computing power we can find. Anyway, I'm working on it.

New site.

We've got a new version of the Open Mind Common Sense site: openmind.media.mit.edu

The big changes: