Large table join

Hi

Im stuck with a large table join, thats giving me max_join_size issues.

I have a table of properties (around 2500 entries), a table of towns & postcodes with Longitude & Latitude (38000 entries), and a table of images relating to the properties (42000 entries). The status table only has about 50 entries.

Im using google maps API to draw a circle on a map and fetching all properties within that circle, properties are only referenced with postcodes in my table, google maps uses GPS LatLng, hence the lookup table. I also need to pull one image for each prop from the images table and the current status

My sql with real data is this


SELECT prop_t.Prop_ID,  prop_pics_t.Pic_ID,  prop_t.Prop_Status_Code, 
               prop_t.Map_Pin_Postcode,  prop_t.Map_Pin_Town,  prop_t.Prop_Title_En, 
               prop_t.Area_Land,  prop_t.Agents_Prop_Code, Publish
 FROM prop_t 
    LEFT JOIN prop_pics_t ON prop_t.Prop_ID = prop_pics_t.Prop_ID 
    LEFT JOIN prop_stat_t ON prop_t.Prop_Status_Code = prop_stat_t.Property_Status_Code 
 JOIN (
     SELECT postal_code FROM frenchmap 
        WHERE Lat BETWEEN '43.236572693911285' AND '47.54848605768498' 
           AND Lng BETWEEN '-0.7042013034011916' AND '5.437465803401324' 
     GROUP BY postal_code
         ) AS MAP ON MAP.postal_code = Map_Pin_Postcode 

WHERE (Primary_Image = 'Y' OR Primary_Image IS NULL) 
     AND HQ_Image = 'N' 
     AND prop_t.Deleted = 'N' 
     AND Publish = 'Y' 
     AND Prop_Dept < '96'  
     AND Price_Euro BETWEEN '100000' AND '1000000' 
     AND Area_Land BETWEEN '0' AND '62000' 

I can get it to work if I drop the image join, but that would mean Id need to access the DB again just to get the image reference, Id rather do it cleanly in one query.

Im assuming that Im attacking this wrongly in the sql :blush:

Thanks for any pointers

please identify all your primary keys

selecting postal codes in a subquery seems to invalidate the rectangle (not,as you posted, circle) concept

postal codes can be overlap your rectangle, yes?

aside: why is it a “frenchmap”?

HI Rudy, thanks

Primary keys are

prop_t > Prop_ID
prop_images_t > Prop_ID
frenchmap > town_name

Yes visually its a circle but the coord’s returned form a rectangle, I may code the edges out later on if needed to return more of a circle.

Yes again its not exact and postcodes will overlap the circle edges, everyone knows their postcode, but not many their GPS location.

frenchmap as thats basically what the table provides for me, it holds all the info of french territories has a whole lot of other data in there too, polygons of town limits, postcodes, population details, official names of all towns and their administration statuses etc along with the gps latlng that Im using in this case … and it was the first name I came up with :wink:

really? so how come there are 2500 properties but 42000 images?

that means you have about 39500 images for unknown properties, i.e. properties that aren’t in your properties table

really?

Sorry Rudy

Ive rechecked, the image table primary key is on an auto inc field called Pic_ID not on the Prop_ID field of that table

so you’re going to have to do something about selecting only one image per property, likely with another subquery

also, make sure each column used in a join is indexed

OK so Ive modified the query to this


[FONT=Courier New][FONT=Courier New]SELECT prop_t.Prop_ID, Images.Pic_ID, prop_t.Prop_Status_Code, 
       prop_t.Map_Pin_Postcode,  prop_t.Map_Pin_Town, prop_t.Prop_Title_En, 
       prop_t.Area_Land, prop_t.Agents_Prop_Code, Publish
  FROM prop_t

  JOIN ([/FONT]    
    SELECT Pic_ID, Prop_ID
      FROM prop_pics_t
        WHERE Primary_Image =  'Y'
          AND HQ_Image =  'N'
  ) AS Images ON prop_t.Prop_ID = Images.Prop_ID

  JOIN (
    SELECT postal_code
      FROM frenchmap
        WHERE Lat BETWEEN  '43.236572693911285' AND  '47.54848605768498'
          AND Lng BETWEEN  '-0.7042013034011916' AND  '5.437465803401324'
      GROUP  BY postal_code
  ) AS MAP ON MAP.postal_code = Map_Pin_Postcode

  LEFT JOIN prop_stat_t ON prop_t.Prop_Status_Code = prop_stat_t.Property_Status_Code

WHERE prop_t.Deleted =  'N'
  AND Publish =  'Y'
  AND Prop_Dept <  '96'
  AND Price_Euro BETWEEN  '100000' AND  '1000000'
  AND Area_Land BETWEEN  '0' AND  '62000'
 
 
[/FONT] 

which has made the error go away but is still slow (1.4 secs) despite indexing all the cols in the Joins … anything else I should be doing ?

that stuff in your WHERE clause… could you qualify those columns using their table names please

if all the join columns are indexed then the slowness could still be coming from a table scan somewhere

could you do an EXPLAIN please

All where clauses are on prop_t

heres the result of EXPLAIN

Edit: the query is running faster today, though ive changed nothing except qualify the where clauses, its now giving me 0.6 secs which is more acceptable … I am using a shared host on this so suspect the timing is down to their server loads/issues etc

looks like the lat/lng subquery isn’t using any index, any chance you can fix that?

postal_code, lat and lng all have indexes, what have I missed ?

could you do a SHOW CREATE TABLE for that table please

Sorry for delay … heres the show create table for frenchmap


[TABLE="class: data"]
[TR="class: odd"]
[TD]CREATE TABLE `frenchmap` (
  `insee_com` varchar(12) NOT NULL default '',
  `postal_code` varchar(60) default NULL,
  `nom_comm` varchar(75) default NULL,
  `nom_dept` varchar(75) default NULL,
  `nom_region` varchar(75) default NULL,
  `statut` varchar(75) default NULL,
  `z_moyen` decimal(6,1) default NULL,
  `superficie` decimal(6,1) default NULL,
  `population` decimal(6,1) default NULL,
  `geo_point_2d` text,
  `Lat` decimal(10,8) default NULL,
  `Lng` decimal(11,8) default NULL,
  `geo_shape` text,
  `id_geofla` text,
  `code_comm` bigint(20) default NULL,
  `code_cant` int(11) default NULL,
  `code_arr` int(11) default NULL,
  `code_dept` int(11) default NULL,
  `code_reg` int(11) default NULL,
  PRIMARY KEY  (`insee_com`),
  KEY `postal_code` (`postal_code`),
  KEY `Lat` (`Lat`),
  KEY `Lng` (`Lng`),
  KEY `code_dept` (`code_dept`),
  KEY `nom_comm` (`nom_comm`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8[/TD]
[/TR]
[/TABLE]



that subquery that operates on the frenchmap table, perhaps you can run it by itself with EXPLAINs to see whether you can get it to utilize an index

the separate indexes on lat and lng don’t seem to be much use – try a composite index on bofadem