SitePoint Sponsor

User Tag List

Results 1 to 13 of 13

Hybrid View

  1. #1
    SitePoint Zealot CdeMky's Avatar
    Join Date
    Sep 2004
    Location
    United States
    Posts
    123
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    Multiple Table Joins - Advice Needed

    I have a possible 4 table join, and things are ugly.

    I'm trying to implement a zipcode based search and i can't seem to narrow down the rows properly. Now here are my tables (i took out several irrelevant columns though)

    Code:
    CREATE TABLE `member` (
      `member_id` int(10) unsigned NOT NULL auto_increment,
      `member_firstname` varchar(45) NOT NULL default '',
      `member_lastname` varchar(45) NOT NULL default '',
      `member_email` varchar(45) NOT NULL default '',
      `member_password` varchar(45) NOT NULL default '',
      `member_status  varchar(1) NOT NULL default '',
      `member_country` varchar(45) NOT NULL default '',
      `member_lastlogin` datetime default '0000-00-00 00:00:00',
      `member_state` varchar(45) NOT NULL default '',
      `member_city` varchar(45) NOT NULL default '',
      `member_moniker` varchar(45) default '',
      `member_zipcode` varchar(5) NOT NULL default '',
      PRIMARY KEY  (`member_id`),
      KEY `idx_email` (`member_email`),
      KEY `idx_password` (`member_password`),
      KEY `idx_zipcode` (`member_zipcode`)
    ) ENGINE=InnoDB DEFAULT CHARSET=latin1;
    
    
    CREATE TABLE `profile` (
      `profile_memberid` int(10) unsigned NOT NULL default '0',
      `profile_businessid` int(10) unsigned NOT NULL default '0',
      `profile_status` enum('I','C','P') NOT NULL default 'I',
      `profile_views` int(10) unsigned NOT NULL default '0',
      `profile_aboutme` text NOT NULL,
      `profile_headline` varchar(250) NOT NULL default '',
      `pofile_url` text,
      PRIMARY KEY  (`profile_memberid`),
      KEY `profile_memberid` (`profile_memberid`),
      KEY `profile_businessid` (`profile_businessid`),
      KEY `mpbusmemid` TYPE BTREE (`profile_businessid`,` profile_status`,` profile_memberid`),
      KEY `indx_status` (` profile_status`)) ENGINE=InnoDB DEFAULT CHARSET=latin1 ROW_FORMAT=DYNAMIC;
    
    CREATE TABLE `zipcodes` (
      `zipcode` varchar(5) NOT NULL default '0',
      `latitude` float NOT NULL default '0',
      `longitude` float NOT NULL default '0',
      `city` varchar(45) NOT NULL default '0',
      `state` varchar(45) NOT NULL default '0',
      `abbr` char(2) NOT NULL default '0',
      PRIMARY KEY  (`zipcode`),
      KEY `indx_latitude` (`latitude`),
      KEY `indx_longitude` (`longitude`)
    ) ENGINE=InnoDB DEFAULT CHARSET=latin1 COMMENT='InnoDB free: 230400 kB';
    
    CREATE TABLE `business` (
      `bus_id` int(11) NOT NULL default '0',
      `bus_name` varchar(50) NOT NULL default '',
      `bus_address` varchar(100) default NULL,  
      PRIMARY KEY  (`bus_id),
    ) ENGINE=MyISAM DEFAULT CHARSET=latin1;

    In pseudo SQL here is what I am trying to accomplish:

    "Given a zipcode, find all active & members, their profile information, and an associated business within x miles radius "

    Here is one of the queries I tried running with no success, I AM ABSOLUTELY TERRIBLE at multi-table joins for some reason, I don't ever remember having this kind of SQL trouble before, it's like I'm in a slump or something The subquery is basically "borrowed" from a thread here at SP and simply does a quick lookup of several zipcodes that match the criteria I'm looking for based on a starting lat and lon, and those hard coded floating point numbers were just taken right out of my query right before it ran in my page. Anyhow, it's not the zipcode lookup I'm having trouble with, it's basically joining all these tables & derived tables together properly. It's driving me insane!


    Code:
    SELECT 	m.member_id, 
    		m.member_firstname, 
    		m.member_lastname, 
    		m.member_country, 
    		m.member_city, 
    		m.member_state,  
    		mp.profile_businessid, 
    		mp.profile_aboutme, 
    		mp.profile_headline, 
    		b.bus_name 
    FROM 	( 	SELECT	zipcode, 
    					sqrt(pow(latitude - 42.1819, 2) + pow(longitude - -73.3664, 2))*66.6666666667 as distance 
    			FROM 	zipcodes 			
    			WHERE	latitude between 42.1819 - 0.75 and 42.1819 + 0.75 and longitude between -73.3664 - 0.75 and -73.3664 + 0.75 
    		) as zip,
    			
    		member m INNER JOIN profile  p on m.member_id = mp.profile_memberid , 
    		business b 
    WHERE 	m.member_status = 'A' 
      		AND	b.bus_id IN('36') 
    		AND b.bus_id = mp.profile_businessid 
    		AND zip.zipcode = m.member_zipcode 
    LIMIT 0, 10
    As always, any help, insight, suggestions, etc. are always appreciated.

  2. #2
    reads the ********* Crier silver trophybronze trophy longneck's Avatar
    Join Date
    Feb 2004
    Location
    Tampa, FL (US)
    Posts
    9,854
    Mentioned
    1 Post(s)
    Tagged
    0 Thread(s)
    my god, that SQL you have is a mess.
    Code:
    SELECT 	m.member_id, 
    		m.member_firstname, 
    		m.member_lastname, 
    		m.member_country, 
    		m.member_city, 
    		m.member_state,  
    		mp.profile_businessid, 
    		mp.profile_aboutme, 
    		mp.profile_headline, 
    		b.bus_name 
    FROM member m
    INNER JOIN profile mp ON
    	m.member_id = mp.profile_memberid
    INNER JOIN business b ON
    	b.bus_id = mp.profile_businessid
    WHERE
    	m.member_status = 'A'
    	AND m.member_zipcode in (
    		SELECT zipcode
    		FROM zipcodes 			
    		WHERE latitude between 42.1819 - 0.75 and 42.1819 + 0.75 and longitude between -73.3664 - 0.75 and -73.3664 + 0.75 
    	)
    LIMIT 0, 10

  3. #3
    SitePoint Zealot CdeMky's Avatar
    Join Date
    Sep 2004
    Location
    United States
    Posts
    123
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Thanks for the reply,

    Well it seems that the query above that you gave runs very slowly, like 13 seconds, of course taking out the subselect speeds it up by quite a bit. Is there any way around this? Is there another way to join these up? What bothers me is that the

    "where m.member_zipcode in(select ...)" should simply be an index lookup right?

    Here is what explain says if that helps: (Notice the 2nd row, that says ALL and then estimates over 50k rows. I'm not sure how to optimize that though.)

    1, 'PRIMARY', 'b', 'const', 'PRIMARY', 'PRIMARY', 4, 'const', 1, ''
    1, 'PRIMARY', 'm', 'ALL', '', '', , '', 54995, 'Using where'
    1, 'PRIMARY', 'mp', 'eq_ref', 'PRIMARY,profile_memberid,profile_businessid,mpbusmemid', 'PRIMARY', 4, 'db.m.member_id', 1, 'Using where'
    2, 'DEPENDENT SUBQUERY', 'zipcodes', 'unique_subquery', 'PRIMARY,indx_latitude,indx_longitude', 'PRIMARY', 5, 'func', 1, 'Using index; Using where'

  4. #4
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,247
    Mentioned
    59 Post(s)
    Tagged
    3 Thread(s)
    i think that WHERE condition refers to the member_status, which it has to retrieve via table scan

    what if you made the primary key (member_id,member_status) instead?
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  5. #5
    SitePoint Zealot CdeMky's Avatar
    Join Date
    Sep 2004
    Location
    United States
    Posts
    123
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    Thumbs down

    Quote Originally Posted by r937
    i think that WHERE condition refers to the member_status, which it has to retrieve via table scan

    what if you made the primary key (member_id,member_status) instead?
    Yeah, I tried that, but no joy. This is really, really annoying.

  6. #6
    reads the ********* Crier silver trophybronze trophy longneck's Avatar
    Join Date
    Feb 2004
    Location
    Tampa, FL (US)
    Posts
    9,854
    Mentioned
    1 Post(s)
    Tagged
    0 Thread(s)
    good call. didn't notice that.

  7. #7
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,247
    Mentioned
    59 Post(s)
    Tagged
    3 Thread(s)
    well, if you're only going to show LIMIT 0,10 anyway...

    run a simpler query, joining the member table to the zipcodes table, do the proximity thing, and select only the member ids

    then plunk those member ids (there's only 10 of them, right?) into an IN List in a second query which retrieves the other columns

    CAVEAT: i am not a dba, and i don't normally tune queries for performance
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  8. #8
    SitePoint Zealot CdeMky's Avatar
    Join Date
    Sep 2004
    Location
    United States
    Posts
    123
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by r937
    well, if you're only going to show LIMIT 0,10 anyway...

    run a simpler query, joining the member table to the zipcodes table, do the proximity thing, and select only the member ids

    then plunk those member ids (there's only 10 of them, right?) into an IN List in a second query which retrieves the other columns

    CAVEAT: i am not a dba, and i don't normally tune queries for performance
    even running a *simple* query gives a table scan, query as follows:

    Code:
    SELECT 	 m.member_id
    FROM member m
    	
    WHERE
     m.member_zipcode in (
    		SELECT zipcode
    		FROM zipcodes 			
    		WHERE latitude
        between 42.1819 - 0.75 and 42.1819 + 0.75
        and longitude between -73.3664 - 0.75 and -73.3664 + 0.75
    I have an index on member_zipcode, yet it does a full table scan? I just don't get it.


    1, 'PRIMARY', 'm', 'ALL', '', '', , '', 54995, 'Using where'
    2, 'DEPENDENT SUBQUERY', 'zipcodes', 'unique_subquery', 'PRIMARY,indx_latitude,indx_longitude', 'PRIMARY', 5, 'func', 1, 'Using index; Using where'

  9. #9
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,247
    Mentioned
    59 Post(s)
    Tagged
    3 Thread(s)
    like i said, i'm no dba, and tuning isn't my skill...
    Code:
    select m.member_id
      from zipcodes   
    inner
      join member m
        on m.member_zipcode 
         = zipcodes.zipcode
     where latitude
            between 42.1819 - 0.75 
                and 42.1819 + 0.75
       and longitude 
            between -73.3664 - 0.75 
                and -73.3664 + 0.75
    order by something
    limit 0,10
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  10. #10
    SitePoint Zealot CdeMky's Avatar
    Join Date
    Sep 2004
    Location
    United States
    Posts
    123
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Right before you posted, I realized something. Grabbing the ID's in the zipcodes first won't work simply because those ID's have to match not only the zipcode, but also a certain status. So if I grab 10 ID's and all of those members are "inactive", i'm hosed. I tell you what, DB tuning isn't my area either, that's for sure!


    Quote Originally Posted by r937
    like i said, i'm no dba, and tuning isn't my skill...
    Code:
    select m.member_id
      from zipcodes   
    inner
      join member m
        on m.member_zipcode 
         = zipcodes.zipcode
     where latitude
            between 42.1819 - 0.75 
                and 42.1819 + 0.75
       and longitude 
            between -73.3664 - 0.75 
                and -73.3664 + 0.75
    order by something
    limit 0,10

  11. #11
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,247
    Mentioned
    59 Post(s)
    Tagged
    3 Thread(s)
    so what's your ratio of A members to all members? a third?

    then grab 30, of which you expect to find 10 A's...
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  12. #12
    SitePoint Zealot CdeMky's Avatar
    Join Date
    Sep 2004
    Location
    United States
    Posts
    123
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by r937
    so what's your ratio of A members to all members? a third?

    then grab 30, of which you expect to find 10 A's...
    Well, I need to page the results as well as this is a search interface, i.e. after you run the sql it shows 10 results, and you can page through the rest. This would seem quite complicated to do with said ration scheme. I swear, I am losing a grip here.

  13. #13
    reads the ********* Crier silver trophybronze trophy longneck's Avatar
    Join Date
    Feb 2004
    Location
    Tampa, FL (US)
    Posts
    9,854
    Mentioned
    1 Post(s)
    Tagged
    0 Thread(s)
    try running just the zipcode distance query to get the zipcode. then use IN with those results as a replacement for those zipcodes in the query i gave you.

    mysql doesn't seem to optimize subqueries correctly (yet).


Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •