Episode 1: Scavenger hunt!
By Jacob Kaplan-Moss
Let’s kick things off with something a bit unusual: a virtual scavenger hunt.
At some point, nearly every web geek gets a chance to hack on some open data, usually from a government source. The buzzword here is “mashup,” but knowing how to find and consume openly available data will remain a valuable skill long after its faddishness ends.
Unfortunately, governments, and especially the US government, are often incredibly awful at providing this data. Sure, it’s available — but you’ve got to find it first.
So this question is all about finding that data. Since I’m most familiar with the USA, this question is USA-specific (but I’d love to see answers to the same questions for other nations).
In each case, the answer should be a URL where you can either download the data in question, or at least find a direct link to the data. There may be multiple sources for each, including ones that could be screen-scraped for the data. I’m not looking for those sources, however; I only want the ones with easily downloadable data in a format that a computer can parse easily (e.g. CSV, XML, plain text). “Friendly” formats, in other words.
So, where can I download data to:
- Analyze the nutritional content of foods?
- Find the population (and other basic demographics) of my city?
- Analyze the latest SEC filings by public companies?
- Look at historical gas prices?
- Look for trends in juvenile arrest rates?
Post your answers into the comments. For extra brownie points, tell us how you located each piece of data — did The Google serve you well, or were you forced to turn elsewhere?
If you really want to stretch your brain, try to write a tool to import each chunk of data into your favorite relational database. There will be a related question in a couple of weeks involving modeling one of these pieces of data, so you overachievers can start thinking about it now…
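If you're wondering what such an import tool might look like, here is a minimal sketch in Python using only the standard library. It assumes the data arrives as CSV with a header row, and it loads everything into SQLite as text columns. The function name `import_csv`, the table name, and the sample rows are all made up for illustration; real government data will almost certainly need more cleanup and proper column types.

```python
import csv
import io
import sqlite3


def import_csv(conn, table, csv_file):
    """Load a CSV file object into a new SQLite table (hypothetical helper).

    Every column is stored as TEXT; the header row supplies the column names.
    Returns the number of rows inserted.
    """
    reader = csv.reader(csv_file)
    headers = next(reader)

    # Quote identifiers so headers containing spaces or punctuation still work.
    def quote(name):
        return '"{}"'.format(name.replace('"', '""'))

    columns = ", ".join("{} TEXT".format(quote(h)) for h in headers)
    placeholders = ", ".join("?" for _ in headers)
    conn.execute("CREATE TABLE {} ({})".format(quote(table), columns))

    # Skip rows whose field count does not match the header (an assumption;
    # a real importer might log or repair them instead).
    rows = [row for row in reader if len(row) == len(headers)]
    conn.executemany(
        "INSERT INTO {} VALUES ({})".format(quote(table), placeholders), rows
    )
    conn.commit()
    return len(rows)


# Placeholder data standing in for a downloaded file; not real figures.
SAMPLE = "name,value\nfirst,1\nsecond,2\n"

conn = sqlite3.connect(":memory:")
count = import_csv(conn, "sample", io.StringIO(SAMPLE))
```

To use it on a real download, you would open the file with `open(path, newline="")` and pass the file object in place of the `io.StringIO` sample.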
Good luck, and check back this weekend for the answers.