Episode 1: Scavenger hunt!

by Jacob Kaplan-Moss

Let’s kick things off with something a bit unusual: a virtual scavenger hunt.

At some point, nearly every web geek gets a chance to hack on some open data, usually from a government source. The buzzword here is “mashup,” but knowing how to find and consume openly available data will remain a valuable skill long after its faddishness ends.

Unfortunately, governments, and especially the US government, are often incredibly awful at providing this data. Sure, it’s available — but you’ve got to find it first.

So this question is all about finding that data. Since I’m most familiar with the USA, this question is USA-specific (but I’d love to see answers to any questions that apply to other nations).

In each case, the answer should be a URL where you can either download the data in question or at least find a direct link to it. There may be multiple sources for each, including ones that could be screen-scraped for the data. I’m not looking for those sources, though; I want only the ones with easily downloadable data in a format that a computer can parse easily (e.g. CSV, XML, plain text). “Friendly” formats, in other words.

So, where can I download data to:

  1. Analyze the nutritional content of foods?
  2. Find the population (and other basic demographics) of my city?
  3. Analyze the latest SEC filings by public companies?
  4. Look at historical gas prices?
  5. Look for trends in juvenile arrest rates?

Post your answers into the comments. For extra brownie points, tell us how you located each piece of data — did The Google serve you well, or were you forced to turn elsewhere?

If you really want to stretch your brain, try to write a tool to import each chunk of data into your favorite relational database. There will be a related question in a couple of weeks involving modeling one of these pieces of data, so you overachievers can start thinking about it now…
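For anyone taking up that challenge, here is one rough way such an import tool could start. It is only a sketch in Python with SQLite, and it assumes the download is a CSV file whose first row holds the column names. The `import_csv` helper, the `cities` table, and the sample rows are all made up for illustration and are not taken from any of the real datasets.

```python
import csv
import io
import sqlite3


def quote(name):
    """Quote an SQL identifier so names with spaces or quotes still work."""
    return '"' + name.replace('"', '""') + '"'


def import_csv(conn, table, csv_text):
    """Load CSV text (first row = column names) into a new SQLite table."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, data = rows[0], rows[1:]
    columns = ", ".join(quote(name) for name in header)
    placeholders = ", ".join("?" for _ in header)
    # Columns are left untyped; every value is stored as text.
    conn.execute(f"CREATE TABLE {quote(table)} ({columns})")
    conn.executemany(
        f"INSERT INTO {quote(table)} VALUES ({placeholders})", data
    )
    conn.commit()
    return len(data)


# Made-up sample rows standing in for a downloaded file.
SAMPLE = "city,state,population\nSpringfield,XX,1000\nShelbyville,XX,250\n"

conn = sqlite3.connect(":memory:")
count = import_csv(conn, "cities", SAMPLE)
total = conn.execute(
    "SELECT SUM(CAST(population AS INTEGER)) FROM cities"
).fetchone()[0]
print(count, total)
```

Real government files will usually need more care than this (odd encodings, fixed-width columns, footnote rows), so treat it as a starting point rather than a finished importer.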

Good luck, and check back this weekend for the answers.

This post has 12 responses so far

  1. Nutritional Content DB http://www.ars.usda.gov/Services/docs.htm?docid=13746 found via google for ‘nutritional content of foods’

    Census Information (for Orlando Florida, but you can choose any other city, I just like Florida) http://factfinder.census.gov/servlet/QTTable?_bm=y&-context=qt&-qr_name=DEC_1990_STF1_DP1&-ds_name=DEC_1990_STF1_&-CONTEXT=qt&-tree_id=100&-all_geo_types=N&-redoLog=true&-_caller=geoselect&-currentselections=DEC_1990_STF1_DP1&-geo_id=label&-geo_id=16000US121600&-search_results=16000US121600&-format=&-_lang=en found via google for ‘census data download’

    Juvenile Arrest Rates http://ojjdp.ncjrs.org/ojstatbb/ezaucr/asp/ucr_display.asp found via google for ‘juvenile arrest rates’

     
  2. I feel so slow now. :( I only just found the Nutritional Content. :P Same URL as you have. There are other sources, however. I took time to read through some things (which slowed me down), and even the USDA gets its information from 3+ original sources.

     
  3. 1. http://www.ars.usda.gov/Services/docs.htm?docid=13746
    2. http://www.census.gov/popest/datasets.html
       Note: the “Subcounty population dataset” lists population by city.
    3. ftp://ftp.sec.gov/edgar/daily-index/
    4. http://www.eia.doe.gov/oil_gas/petroleum/data_publications/wrgp/mogas_history.html
    5. http://ojjdp.ncjrs.gov/ojstatbb/dat.html#downloadable

     
  4. Historical Gas Prices:
    http://www.eia.doe.gov/oil_gas/petroleum/data_publications/wrgp/mogas_history.html

    Select how you want the info grouped and it is delivered as an XLS file.

    Found Via Google using: “Historical gas prices”

     
  5. SEC Filings

    http://www.sec.gov/cgi-bin/browse-edgar?action=getcurrent

    Good times.

     
  6. Whoops, forgot to mention that The Google helped with “latest SEC filings.”

     
  7. I can only have a go at these once every couple hours since I’m at work. As far as the SEC Filings (google “SEC filings”, traversed first hit subcategory…) are concerned, this is what I’ve found thus far:

    http://www.sec.gov/cgi-bin/browse-edgar?action=getcurrent
    Allows a search of most recent filings, about as real-time as you can get. I have not found any way to gather a full listing of this data without scraping. The following links may assist with this, but my 5 min search break is over. :)
    - Link 1
    - Link 2
    I think these are in the right direction, but again, my time’s up.

    http://www.sec.gov/Archives/edgar/xbrlrss.xml
    An RSS feed which is updated daily.

    http://www.sec.gov/edgar/searchedgar/webusers.htm
    Main page that delivered me to the two “main” links from above.

    As an aside, when using a search engine, using advanced filters can help quite a bit when you know what form of information you’re looking for, especially if a government organization is most likely involved. With google, for instance, you can specify in the search terms: site:.gov “sec filings” or site:.org “sec filings” — limiting your search results goes a long way in removing unimportant data.

     
  8. Juvenile arrest rates:
    http://www.ojp.usdoj.gov/bjs/data/violarr.wk1
    Linked from http://www.ojp.usdoj.gov/bjs/dtdata.htm

    Probably not the most friendly format though!

     
  9. http://www.sitepoint.com/blogs/2006/11/14/scavenger-hunt/

    This page has direct links to most of the resources required for the hunt.

     
  10. You’re good, cranial-bore! :D

     
  11. 1. Analyze the nutritional content of foods?
    http://www.ars.usda.gov/SP2UserFiles/Place/12355000/apps/fndds1_ascii.exe
    This one was the hardest for me to find ’cause I kind of geek out at Food & Nutrition sites.

    2. Find the population (and other basic demographics) of my city?
    Here’s all of Wisconsin.
    http://www.doa.state.wi.us/dir/wisconsin/WIsf3_demo_profiles.xls

    My girlfriend works for the city, so finding this one was easy. No The Google necessary.

    3. Analyze the latest SEC filings by public companies?
    http://www.sec.gov/Archives/edgar/xbrlrss.xml

    4. Look at historical gas prices?
    http://tonto.eia.doe.gov/oog/ftparea/wogirs/xls/pswrgvwnus.xls
    First thing in The Google.

    5. Look for trends in juvenile arrest rates?
    http://ojjdp.ncjrs.org/ojstatbb/ezaucr/asp/ucr_export.asp?Select_State=0&Select_County=0&rdoData=1c&rdoYear=99&Print=no

    Thanks for doing this - I had a blast!

    Ruth

     
  12. I didn’t look for each item specifically, but this site has always helped me when searching for US government information: http://www.firstgov.gov/

     
