SitePoint Sponsor

User Tag List

Results 1 to 9 of 9
  1. #1
    SitePoint Evangelist
    Join Date
    Mar 2001
    Location
    New York
    Posts
    502
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    Fixing Zip Code Database

    I've just bought a zip code database and I'm very happy with it. It only has 5 digit US zip codes but it comes with many other things and I'm happy with it. However, I have about 1400 entries in a database that appear randomnly as 12345-1234 format or 12345 format. I was wondering if there is a query I could use to check whether the entry contains a 9 digit zip code and then truncate it after 5 digits. I can think of how to do this with a php string function (I think so) but I'm sure there is an easier way to do this and I'm hoping that someone with experience reads this!

    Thanks in advance.
    My law forum and legal information website -- I also buy websites you may wish to sell

  2. #2
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,275
    Mentioned
    60 Post(s)
    Tagged
    3 Thread(s)
    why would you want to change the data you have? presumably it's more accurate to use 9 rather than 5, so you'd be destroying data (never a good idea)

    whenever you need to join your data to the zipcode table, just join on the leftmost 5 characters
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  3. #3
    SitePoint Wizard stereofrog's Avatar
    Join Date
    Apr 2004
    Location
    germany
    Posts
    4,324
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by r937
    whenever you need to join your data to the zipcode table, just join on the leftmost 5 characters
    Wouldn't this prevent mysql from using index on this column?

  4. #4
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,275
    Mentioned
    60 Post(s)
    Tagged
    3 Thread(s)
    no, i don't think so, not if it's the leftmost 5 characters

    would be worth a test to be sure, eh

    (and of course it would have to be a test on a alrge enough table)
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  5. #5
    Knowledge is key 2 progression Tryst's Avatar
    Join Date
    Sep 2003
    Location
    Wales
    Posts
    1,181
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Rudy, you told me yesterday that FUNCTIONS prevent indexes being used.

    Wouldn't LEFT (assuming this is the name) be a Function?

    Tryst

  6. #6
    SitePoint Wizard stereofrog's Avatar
    Join Date
    Apr 2004
    Location
    germany
    Posts
    4,324
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Well I've made the test... Sorry, it's in php

    PHP Code:
    mysql_connect("localhost""root""");

    mysql_query("USE test");
    mysql_query("DROP TABLE IF EXISTS abc");
    mysql_query("CREATE TABLE abc (
        id int NOT NULL auto_increment,
        foo varchar(31),
        PRIMARY KEY (id),
        KEY foo(foo)
    )"
    );
    $CNT 10000;
    while(--
    $CNT)
        
    mysql_query("INSERT IGNORE INTO abc(foo) VALUES('X" . ($CNT 100). "')");

    echo 
    "<pre>";

    print_r(mysql_get_server_info() . "\n");

    print_r(mysql_fetch_assoc(
        
    mysql_query("EXPLAIN SELECT * FROM abc WHERE foo = 'X12'")
    ));
    print_r(mysql_fetch_assoc(
        
    mysql_query("EXPLAIN SELECT * FROM abc WHERE LEFT(foo,3) = 'X12'")
    )); 
    And here is what I have

    Code:
    4.1.9-nt
    Array
    (
        [id] => 1
        [select_type] => SIMPLE
        [table] => abc
        [type] => ref
        [possible_keys] => foo
        [key] => foo
        [key_len] => 32
        [ref] => const
        [rows] => 80
        [Extra] => Using where
    )
    Array
    (
        [id] => 1
        [select_type] => SIMPLE
        [table] => abc
        [type] => ALL
        [possible_keys] => 
        [key] => 
        [key_len] => 
        [ref] => 
        [rows] => 9999
        [Extra] => Using where
    )
    Seems index isn't used with function... even if it's LEFT().
    Please comment.

  7. #7
    Chessplayer kleineme's Avatar
    Join Date
    Apr 2004
    Location
    Germany
    Posts
    608
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Hi,

    I can't test it for myself at the moment, but what about this one:

    EXPLAIN SELECT * FROM abc WHERE foo LIKE 'X12%';

    Shouldn't this result in an index range scan?
    Never ascribe to malice,
    that which can be explained by incompetence.
    Your code should not look unmaintainable, just be that way.

  8. #8
    SitePoint Evangelist
    Join Date
    Mar 2001
    Location
    New York
    Posts
    502
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    I'm only a relative neophyte regarding mysql. My experience has mostly been performing calculations in PHP to make matches in mysql. I surmise that the 9 digit zip codes are correct but it's really not so important to have them all since the 5 digits will work fine for the mailing lists and even for distance calculations which use lattitude and longitudinal points and not necessarily the 9 digit zip code that the post office uses.

    Regarding indexes, there are no primary keys for these tables that I can see. Since these databases are updated regularly one needs to pop in the new and any hard indexing will not work. To explain, there can be several area codes in one city and several cities in one zip code. Zoning may even change which corresponds with the other.

    Thus I get back to my original question. Either (1) Is there a simple method of truncating the data to only the five leftmost characters or (2) what is the simple method of searching for comparisons to the 5 leftmost characters in the 9 digit zip code entered into the database?

    For now, I can easily see how to compare 5 digit (or character) zip codes.
    My law forum and legal information website -- I also buy websites you may wish to sell

  9. #9
    SitePoint Evangelist
    Join Date
    Mar 2001
    Location
    New York
    Posts
    502
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Silly me. I could always change the varchar length of the zip code field to 5 and it will truncate the rightmost 5 characters including the dash. I have kept the old database but this makes everything easier. I still would enjoy seeing code that deals with the 5 leftmost (first five) characters and can be used for comparison functions. The way I see this being accomplished is with arrays although it adds extra processing to read the first 5 chars into an array.
    My law forum and legal information website -- I also buy websites you may wish to sell


Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •