Optimize query speed


SELECT * FROM `table1` as A LEFT JOIN `table2` as B ON
				A.file = B.file WHERE
				A.hash <> B.hash

There are 4000+ records in both tables, how can I speed up the comparison? It takes 6.0655 sec now. Anything I can do with smart indexes or something, I’m not too familiar with optimizing mysql.

please do a SHOW CREATE TABLE for both tables

also, do you really mean to use an outer join? or would an inner join suffice?


CREATE TABLE `table1` (
  `ID` int(11) NOT NULL AUTO_INCREMENT,
  `file` varchar(255) NOT NULL DEFAULT '',
  `hash` varchar(255) NOT NULL DEFAULT '',
  `module` varchar(255) NOT NULL DEFAULT '',
  PRIMARY KEY (`ID`),
  KEY `ID` (`ID`,`file`,`hash`,`module`)
) ENGINE=MyISAM AUTO_INCREMENT=140317 DEFAULT CHARSET=latin1


CREATE TABLE `table2` (
  `ID` int(11) NOT NULL AUTO_INCREMENT,
  `file` varchar(255) NOT NULL DEFAULT '',
  `hash` varchar(255) NOT NULL DEFAULT '',
  `module` varchar(255) NOT NULL DEFAULT '',
  PRIMARY KEY (`ID`),
  KEY `ID` (`ID`,`file`,`hash`,`module`)
) ENGINE=MyISAM AUTO_INCREMENT=4010 DEFAULT CHARSET=latin1

Outer join was chosen because i know it works :slight_smile: nothing more, nothing less

it’s unlikey either of the two ID indexes will ever be used, because in order for the optimizer to use one of them, you have to feed it an ID value, and the PK in both tables already covers those situations

you’re joining on file, so you need an index on that column at least

make it an INNER JOIN

Holy moly,

Query took 0.0676 sec!

What does an index do exactly then? And why is an inner join so much quicker than a LEFT or RIGHT join? Which is an OUTER join and that I didn’t know :wink:

an index provides the engine with a means of locating a specific row quickly

imagine if you had a list of names that you had to look up their numbers in the phone book

and now imagine that the phone book wasn’t in alphabetic order

you’d have to read the phone book from front to back to find each name, and you’d be doing it as many times as you had names to look up

as for the difference between inner and outer joins, please see any competent sql tutorial…

… or perhaps http://www.sitepoint.com/simply-sql-the-from-clause/ (and please disregard the big FAIL in the code samples where they’ve inadvertently stuck <br> tags into the sql code)

in your case you didn’t need an outer join, but the difference in performance was due entirely to the indexes, not the join type