SitePoint Sponsor |
|
User Tag List
Results 1 to 13 of 13
-
Jul 26, 2005, 21:33 #1
- Join Date
- Sep 2004
- Location
- United States
- Posts
- 123
- Mentioned
- 0 Post(s)
- Tagged
- 0 Thread(s)
Multiple Table Joins - Advice Needed
I have a possible 4 table join, and things are ugly.
I'm trying to implement a zipcode based search and i can't seem to narrow down the rows properly. Now here are my tables (i took out several irrelevant columns though)
Code:CREATE TABLE `member` ( `member_id` int(10) unsigned NOT NULL auto_increment, `member_firstname` varchar(45) NOT NULL default '', `member_lastname` varchar(45) NOT NULL default '', `member_email` varchar(45) NOT NULL default '', `member_password` varchar(45) NOT NULL default '', `member_status varchar(1) NOT NULL default '', `member_country` varchar(45) NOT NULL default '', `member_lastlogin` datetime default '0000-00-00 00:00:00', `member_state` varchar(45) NOT NULL default '', `member_city` varchar(45) NOT NULL default '', `member_moniker` varchar(45) default '', `member_zipcode` varchar(5) NOT NULL default '', PRIMARY KEY (`member_id`), KEY `idx_email` (`member_email`), KEY `idx_password` (`member_password`), KEY `idx_zipcode` (`member_zipcode`) ) ENGINE=InnoDB DEFAULT CHARSET=latin1; CREATE TABLE `profile` ( `profile_memberid` int(10) unsigned NOT NULL default '0', `profile_businessid` int(10) unsigned NOT NULL default '0', `profile_status` enum('I','C','P') NOT NULL default 'I', `profile_views` int(10) unsigned NOT NULL default '0', `profile_aboutme` text NOT NULL, `profile_headline` varchar(250) NOT NULL default '', `pofile_url` text, PRIMARY KEY (`profile_memberid`), KEY `profile_memberid` (`profile_memberid`), KEY `profile_businessid` (`profile_businessid`), KEY `mpbusmemid` TYPE BTREE (`profile_businessid`,` profile_status`,` profile_memberid`), KEY `indx_status` (` profile_status`)) ENGINE=InnoDB DEFAULT CHARSET=latin1 ROW_FORMAT=DYNAMIC; CREATE TABLE `zipcodes` ( `zipcode` varchar(5) NOT NULL default '0', `latitude` float NOT NULL default '0', `longitude` float NOT NULL default '0', `city` varchar(45) NOT NULL default '0', `state` varchar(45) NOT NULL default '0', `abbr` char(2) NOT NULL default '0', PRIMARY KEY (`zipcode`), KEY `indx_latitude` (`latitude`), KEY `indx_longitude` (`longitude`) ) ENGINE=InnoDB DEFAULT CHARSET=latin1 COMMENT='InnoDB free: 230400 kB'; CREATE TABLE `business` ( `bus_id` int(11) NOT NULL default '0', `bus_name` varchar(50) NOT NULL default '', `bus_address` varchar(100) default NULL, PRIMARY KEY (`bus_id), ) ENGINE=MyISAM DEFAULT CHARSET=latin1;
In pseudo SQL here is what I am trying to accomplish:
"Given a zipcode, find all active & members, their profile information, and an associated business within x miles radius "
Here is one of the queries I tried running with no success, I AM ABSOLUTELY TERRIBLE at multi-table joins for some reason, I don't ever remember having this kind of SQL trouble before, it's like I'm in a slump or somethingThe subquery is basically "borrowed" from a thread here at SP and simply does a quick lookup of several zipcodes that match the criteria I'm looking for based on a starting lat and lon, and those hard coded floating point numbers were just taken right out of my query right before it ran in my page. Anyhow, it's not the zipcode lookup I'm having trouble with, it's basically joining all these tables & derived tables together properly. It's driving me insane!
Code:SELECT m.member_id, m.member_firstname, m.member_lastname, m.member_country, m.member_city, m.member_state, mp.profile_businessid, mp.profile_aboutme, mp.profile_headline, b.bus_name FROM ( SELECT zipcode, sqrt(pow(latitude - 42.1819, 2) + pow(longitude - -73.3664, 2))*66.6666666667 as distance FROM zipcodes WHERE latitude between 42.1819 - 0.75 and 42.1819 + 0.75 and longitude between -73.3664 - 0.75 and -73.3664 + 0.75 ) as zip, member m INNER JOIN profile p on m.member_id = mp.profile_memberid , business b WHERE m.member_status = 'A' AND b.bus_id IN('36') AND b.bus_id = mp.profile_businessid AND zip.zipcode = m.member_zipcode LIMIT 0, 10
-
Jul 27, 2005, 07:57 #2
- Join Date
- Feb 2004
- Location
- Tampa, FL (US)
- Posts
- 9,854
- Mentioned
- 1 Post(s)
- Tagged
- 0 Thread(s)
my god, that SQL you have is a mess.
Code:SELECT m.member_id, m.member_firstname, m.member_lastname, m.member_country, m.member_city, m.member_state, mp.profile_businessid, mp.profile_aboutme, mp.profile_headline, b.bus_name FROM member m INNER JOIN profile mp ON m.member_id = mp.profile_memberid INNER JOIN business b ON b.bus_id = mp.profile_businessid WHERE m.member_status = 'A' AND m.member_zipcode in ( SELECT zipcode FROM zipcodes WHERE latitude between 42.1819 - 0.75 and 42.1819 + 0.75 and longitude between -73.3664 - 0.75 and -73.3664 + 0.75 ) LIMIT 0, 10
-
Jul 27, 2005, 17:19 #3
- Join Date
- Sep 2004
- Location
- United States
- Posts
- 123
- Mentioned
- 0 Post(s)
- Tagged
- 0 Thread(s)
Thanks for the reply,
Well it seems that the query above that you gave runs very slowly, like 13 seconds, of course taking out the subselect speeds it up by quite a bit. Is there any way around this? Is there another way to join these up? What bothers me is that the
"where m.member_zipcode in(select ...)" should simply be an index lookup right?
Here is what explain says if that helps: (Notice the 2nd row, that says ALL and then estimates over 50k rows. I'm not sure how to optimize that though.)
1, 'PRIMARY', 'b', 'const', 'PRIMARY', 'PRIMARY', 4, 'const', 1, ''
1, 'PRIMARY', 'm', 'ALL', '', '', , '', 54995, 'Using where'
1, 'PRIMARY', 'mp', 'eq_ref', 'PRIMARY,profile_memberid,profile_businessid,mpbusmemid', 'PRIMARY', 4, 'db.m.member_id', 1, 'Using where'
2, 'DEPENDENT SUBQUERY', 'zipcodes', 'unique_subquery', 'PRIMARY,indx_latitude,indx_longitude', 'PRIMARY', 5, 'func', 1, 'Using index; Using where'
-
Jul 27, 2005, 17:54 #4
- Join Date
- Jul 2002
- Location
- Toronto, Canada
- Posts
- 39,347
- Mentioned
- 63 Post(s)
- Tagged
- 3 Thread(s)
i think that WHERE condition refers to the member_status, which it has to retrieve via table scan
what if you made the primary key (member_id,member_status) instead?
-
Jul 27, 2005, 18:16 #5
- Join Date
- Feb 2004
- Location
- Tampa, FL (US)
- Posts
- 9,854
- Mentioned
- 1 Post(s)
- Tagged
- 0 Thread(s)
good call. didn't notice that.
-
Jul 27, 2005, 18:55 #6
- Join Date
- Sep 2004
- Location
- United States
- Posts
- 123
- Mentioned
- 0 Post(s)
- Tagged
- 0 Thread(s)
Originally Posted by r937
-
Jul 27, 2005, 19:22 #7
- Join Date
- Jul 2002
- Location
- Toronto, Canada
- Posts
- 39,347
- Mentioned
- 63 Post(s)
- Tagged
- 3 Thread(s)
well, if you're only going to show LIMIT 0,10 anyway...
run a simpler query, joining the member table to the zipcodes table, do the proximity thing, and select only the member ids
then plunk those member ids (there's only 10 of them, right?) into an IN List in a second query which retrieves the other columns
CAVEAT: i am not a dba, and i don't normally tune queries for performance
-
Jul 27, 2005, 19:51 #8
- Join Date
- Sep 2004
- Location
- United States
- Posts
- 123
- Mentioned
- 0 Post(s)
- Tagged
- 0 Thread(s)
Originally Posted by r937
Code:SELECT m.member_id FROM member m WHERE m.member_zipcode in ( SELECT zipcode FROM zipcodes WHERE latitude between 42.1819 - 0.75 and 42.1819 + 0.75 and longitude between -73.3664 - 0.75 and -73.3664 + 0.75
1, 'PRIMARY', 'm', 'ALL', '', '', , '', 54995, 'Using where'
2, 'DEPENDENT SUBQUERY', 'zipcodes', 'unique_subquery', 'PRIMARY,indx_latitude,indx_longitude', 'PRIMARY', 5, 'func', 1, 'Using index; Using where'
-
Jul 27, 2005, 20:36 #9
- Join Date
- Jul 2002
- Location
- Toronto, Canada
- Posts
- 39,347
- Mentioned
- 63 Post(s)
- Tagged
- 3 Thread(s)
like i said, i'm no dba, and tuning isn't my skill...
Code:select m.member_id from zipcodes inner join member m on m.member_zipcode = zipcodes.zipcode where latitude between 42.1819 - 0.75 and 42.1819 + 0.75 and longitude between -73.3664 - 0.75 and -73.3664 + 0.75 order by something limit 0,10
-
Jul 27, 2005, 20:47 #10
- Join Date
- Sep 2004
- Location
- United States
- Posts
- 123
- Mentioned
- 0 Post(s)
- Tagged
- 0 Thread(s)
Right before you posted, I realized something. Grabbing the ID's in the zipcodes first won't work simply because those ID's have to match not only the zipcode, but also a certain status. So if I grab 10 ID's and all of those members are "inactive", i'm hosed.
I tell you what, DB tuning isn't my area either, that's for sure!
Originally Posted by r937
-
Jul 27, 2005, 20:48 #11
- Join Date
- Jul 2002
- Location
- Toronto, Canada
- Posts
- 39,347
- Mentioned
- 63 Post(s)
- Tagged
- 3 Thread(s)
so what's your ratio of A members to all members? a third?
then grab 30, of which you expect to find 10 A's...
-
Jul 27, 2005, 20:54 #12
- Join Date
- Sep 2004
- Location
- United States
- Posts
- 123
- Mentioned
- 0 Post(s)
- Tagged
- 0 Thread(s)
Originally Posted by r937
-
Jul 28, 2005, 06:15 #13
- Join Date
- Feb 2004
- Location
- Tampa, FL (US)
- Posts
- 9,854
- Mentioned
- 1 Post(s)
- Tagged
- 0 Thread(s)
try running just the zipcode distance query to get the zipcode. then use IN with those results as a replacement for those zipcodes in the query i gave you.
mysql doesn't seem to optimize subqueries correctly (yet).
Bookmarks