SitePoint Sponsor

User Tag List

Results 1 to 15 of 15
  1. #1
    SitePoint Wizard
    Join Date
    Nov 2003
    Location
    United Kingdom
    Posts
    2,120
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    Help! I have a problem doing a mysql index

    Hi,

    Could you please help me on a mysql query that I am doing. I need to do an index for it so that I can search through my database quickly, but seem to be having problems with the index.

    I think that my query may be the problem, which is shown as follows:

    select descrip,id,merchant,cat,price from products where cat like '%home & garden%' and price between 0 and 9999 order by cat asc

    I have used one index in all that contains cat and price in the exact order as just written.

    I have also tried using a full text index on just cat, but it just dosent seem to make things any faster.

    Could you please tell me a way that I could query my database the same way as I am querying it but instead of using the 'LIKE' part, I could use something else and it will still bring up the same results and also work with my index.

    Thanks!

  2. #2
    SitePoint Zealot CdeMky's Avatar
    Join Date
    Sep 2004
    Location
    United States
    Posts
    123
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    I would have a category table with a "description", so that I could search for a category id's using "%like%" or using the fulltext features of MyISAM.

    Then, once I got category ID's that matched what I was looking for, I would plug them into my query as so:

    pseudo code:
    select descrip,id,merchant,cat,price .......where cat in (found_category_id's)

    Essentially, this is how I would approach this. That way, you can do an index lookup using a category without using the pattern matching.

  3. #3
    SitePoint Wizard
    Join Date
    Nov 2003
    Location
    United Kingdom
    Posts
    2,120
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    That sounds good, but it seems like it will take to long and will involved to much work as all my data is coming from online merchants datafeeds, which means my categories are changing all of the time.

    Is the another way where I could keep all of my fields in one table, but do a better query so that I could use an index.

  4. #4
    SitePoint Wizard
    Join Date
    Nov 2003
    Location
    United Kingdom
    Posts
    2,120
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Please look at the next post for my next problem
    Last edited by john278; Aug 3, 2005 at 19:28.

  5. #5
    SitePoint Wizard
    Join Date
    Nov 2003
    Location
    United Kingdom
    Posts
    2,120
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    OK!

    It looks as if it still isn't working correctly. The Fulltext index is working. It is just the price column that isn't working. I have done a normal index on the price column, which is a decimal type field.

    I am using the follwoing query and it is just giving me 0 results, when it should be giving me quite a few.

    The query:

    select descrip,id,merchant,cat,price from products where cat like '%home & garden%' and price between 0 and 9999 order by cat asc

    I hope you can help!

    Thanks!

  6. #6
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,341
    Mentioned
    63 Post(s)
    Tagged
    3 Thread(s)
    Quote Originally Posted by john278
    I am using the follwoing query and it is just giving me 0 results, when it should be giving me quite a few.
    the best way to debug a query like this is to use a small number of sample rows in a sample table, then script them out and show these to us (the CREATE TABLE and INSERT INTO statements), so that we can set up the same sample table and test the query here
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  7. #7
    SitePoint Wizard
    Join Date
    Nov 2003
    Location
    United Kingdom
    Posts
    2,120
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Heres more information:

    I don't quite know how to show the table so I have just done an export.

    Here it is:

    PHP Code:
    CREATE TABLE `products` (
      `
    idbigint(15NOT NULL auto_increment,
      `
    midsmallint(6NOT NULL default '0',
      `
    fidsmallint(6NOT NULL default '0',
      `
    pidvarchar(100collate latin1_general_ci NOT NULL default '',
      `
    mervarchar(50collate latin1_general_ci NOT NULL default '',
      `
    urlvarchar(255collate latin1_general_ci NOT NULL default '',
      `
    titlevarchar(255collate latin1_general_ci NOT NULL default '',
      `
    descripvarchar(255collate latin1_general_ci NOT NULL default '',
      `
    iurlvarchar(255collate latin1_general_ci NOT NULL default '',
      `
    catvarchar(255collate latin1_general_ci NOT NULL default '',
      `
    pricedecimal(10,2NOT NULL default '0.00',
      `
    randombigint(10NOT NULL default '0',
      `
    isbnvarchar(150collate latin1_general_ci NOT NULL default '',
      `
    manuvarchar(250collate latin1_general_ci NOT NULL default '',
      `
    apptinyint(1NOT NULL default '1',
      `
    hitssmallint(6NOT NULL default '0',
      `
    datedate NOT NULL default '0000-00-00',
      
    PRIMARY KEY  (`id`),
      
    KEY `price` (`price`(3)),
      
    FULLTEXT KEY `cat` (`cat`),
    ENGINE=MyISAM DEFAULT CHARSET=latin1 COLLATE=latin1_general_ci AUTO_INCREMENT=886887 
    The query that I want to do is somethine like the following:

    PHP Code:
    select substring_index(descrip,' ',30) as descrip,id,mid,mer,url,title,manu,isbn,iurl,cat,price from products where cat like '%health & beauty%' and Price between '0' AND '999999999' and app='1' order by random asc 
    So that my query will then search through the category column and get all of the results with the "health & beauty" phrase in it and a price of 0 and 99999999.

    At the moment I have just over 650,000 listings and the query is taking about 0.12 - 0.25 seconds, which is rather slow as some of my queries will be similar to below meaning that the query has to search through mysql three times or so to return the results.:

    PHP Code:
    select substring_index(descrip,' ',30) as descrip,id,mid,mer,url,title,manu,isbn,iurl,cat,price from products where cat like '%health & beauty%' and Price between '0' AND '999999999' and app='1' or cat like '%home & garden%' and Price between '0' AND '999999999' and app='1' or cat like '%toys & games%' and Price between '0' AND '999999999' and app='1' order by random asc 
    The table is also about 287 MB's in size.

    Could you please help me.

    Thanks!

  8. #8
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,341
    Mentioned
    63 Post(s)
    Tagged
    3 Thread(s)
    your query is going to do a table scan any time you do LIKE '%something%', i.e. with a wildcard at the front

    what's the purpose of indexing the leftmost three characters of the string conversion of the price?

    also, Price between '0' AND '999999999' seems kind of pointless -- why would you want to convert a number to a strinbg to see if it fits between two other strings, when a numeric comparison would be much faster, assuming it wasn't totally irrelevant (by this i mean, can you tell me what prices are not between those two values?)
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  9. #9
    SitePoint Wizard
    Join Date
    Nov 2003
    Location
    United Kingdom
    Posts
    2,120
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    I need the wildcard as it is looking through the cat field to see if a phrase or keyword matches any words within the cat field.

    The price section is "between 0 and 9999999999 for defualt. But if people want to narrow down things and search for products on my site that have a price of between $10 and $20 then they will type that into a text box and it will then change the "between 0 and 9999999999" to "between 10 and 20".

    Also, I don't quite understand what you mean by when you say "when a numeric comparison would be much faster"

    Could you give me a bit more information on that please.

    Thanks for all your help.

  10. #10
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,341
    Mentioned
    63 Post(s)
    Tagged
    3 Thread(s)
    Quote Originally Posted by john278
    I need the wildcard as it is looking through the cat field to see if a phrase or keyword matches any words within the cat field.
    i understand that, but just be aware that you'll get a table scan for this

    regarding the price default, your default should be to not generate the WHERE condition -- only generate it when there's an actual range to test

    my remarks about numeric comparison were intended to draw your attention to the fact that price is numeric and you were comparing it to strings -- mysql lets you get away with this, but you'll get syntax errors in other databases, and in any case the numeric comparison is faster
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  11. #11
    SitePoint Wizard
    Join Date
    Nov 2003
    Location
    United Kingdom
    Posts
    2,120
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    I don't know or have not heard anything about numeric comparison. Is it a PHP and MySQL thing. If so then could you please give me a place where I can learn it so I know more about it and I can give it ago.

    Also, is there another way that I can look through my cat field for a certain phrase withour using a wildcard as I don't want to do a fill table scan.

    I tried using cat = 'some word' and it is alot faster. But I can use "=" as the results from my query needs to return results no matter if the phrase/keyword is at the begining of the cat field, in the middle of it between other text or even at the end of the cat field for each row within mysql.

    Is there a way that I can use a where statement to do that and then put an index on the cat field.

    Thanks for all your help.

  12. #12
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,341
    Mentioned
    63 Post(s)
    Tagged
    3 Thread(s)
    unfortunately what you want requires LIKE, and the performance will be a scan
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  13. #13
    SitePoint Wizard
    Join Date
    Nov 2003
    Location
    United Kingdom
    Posts
    2,120
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    How does other major sites do things like this as they seem to use search features where they do similar things to me and their search seems really fast.

  14. #14
    SitePoint Wizard
    Join Date
    Nov 2003
    Location
    United Kingdom
    Posts
    2,120
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    At the moment with just over 600,000 records it takes about 15 - 20 seconds for each page to load, which is not good at all as all my visitors will be gone before they even see the fully loaded pages.

  15. #15
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,341
    Mentioned
    63 Post(s)
    Tagged
    3 Thread(s)
    one approach would be to have a predetermined set of categories

    it's the LIKE search that's killing you, so don't allow a LIKE search on a 255-character free-form field
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"


Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •