Episode 1: Scavenger hunt!

By Jacob Kaplan-Moss

Let’s kick things off with something a bit unusual: a virtual scavenger hunt.

At some point, nearly every web geek gets a chance to hack on some open data, usually from a government source. The buzzword here is “mashup,” but knowing how to find and consume openly available data will remain a valuable skill long after its faddishness ends.

Unfortunately, governments, and especially the US government, are often incredibly awful at providing this data. Sure, it’s available — but you’ve got to find it first.

So this question is all about finding that data. Since I’m most familiar with the USA, this question is USA-specific (but I’d love to see answers to equivalent questions for other nations).

In each case, the answer should be a URL where you can either download the data in question, or at least find a direct link to the data. There may be multiple sources for each, including ones that could be screen-scraped for the data. I’m not looking for those scrape-only sources, however, just the ones with easily downloadable data in a format that can be easily parsed by a computer (e.g. CSV, XML, plain text). “Friendly” formats, in other words.

So, where can I download data to:

  1. Analyze the nutritional content of foods?
  2. Find the population (and other basic demographics) of my city?
  3. Analyze the latest SEC filings by public companies?
  4. Look at historical gas prices?
  5. Look for trends in juvenile arrest rates?

Post your answers in the comments. For extra brownie points, tell us how you located each piece of data: did The Google serve you well, or were you forced to turn elsewhere?

If you really want to stretch your brain, try to write a tool to import each chunk of data into your favorite relational database. There will be a related question in a couple of weeks involving modeling one of these pieces of data, so you overachievers can start thinking about it now…
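
For the overachievers, here is a minimal sketch of what such an import tool might look like in Python. It assumes the data arrives as CSV with a header row, and it uses SQLite as a stand-in for your favorite relational database. The function name `import_csv`, the table name, and the sample rows are all made up for illustration; real datasets will want proper column types and a lot more cleanup.

```python
import csv
import io
import sqlite3


def quote_ident(name):
    """Quote a table or column name so odd characters don't break the SQL."""
    return '"{}"'.format(name.replace('"', '""'))


def import_csv(conn, table, csv_file):
    """Load a CSV file object (with a header row) into a new SQLite table.

    Every column is stored as TEXT; rows whose length does not match the
    header are skipped. Returns the number of rows inserted.
    """
    reader = csv.reader(csv_file)
    header = next(reader)
    columns = ", ".join("{} TEXT".format(quote_ident(name)) for name in header)
    placeholders = ", ".join("?" for _ in header)
    conn.execute("CREATE TABLE {} ({})".format(quote_ident(table), columns))
    rows = [row for row in reader if len(row) == len(header)]
    conn.executemany(
        "INSERT INTO {} VALUES ({})".format(quote_ident(table), placeholders),
        rows,
    )
    conn.commit()
    return len(rows)


# Made-up sample data standing in for a downloaded file.
SAMPLE = "city,population\nSpringfield,1000\nShelbyville,2000\n"

conn = sqlite3.connect(":memory:")
count = import_csv(conn, "cities", io.StringIO(SAMPLE))
```

To point it at a real download, pass an open file instead of the in-memory sample, e.g. `open(path, newline="")`, and connect to a database file rather than `:memory:`.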

Good luck, and check back this weekend for the answers.

