SitePoint Sponsor

User Tag List

Results 1 to 4 of 4
  1. #1
    SitePoint Evangelist
    Join Date
    May 2006
    Posts
    426
    Mentioned
    2 Post(s)
    Tagged
    0 Thread(s)

    Add indices for an advanced search option

    I've taken over a database that lists products and there is currently around 6k records but it grows by about 2k per year. Broadly speaking, each product has a single category field (GUID) and then six other fields relating to other areas of products.

    When you click on a page it shows a category so I thought it prudent to index that although to be fair it seems to run fine without. The cardinality of the category column is about 60.

    There is an advanced search where you can then optionally choose to filter the other six fields so you always filter by category the other six fields are optional and could be in any number or combination.

    The noteworthy point is there is an int flag called status of 0, 1 or 2 corresponding to invisible, for sale, archived respectively.

    Databases are my weak point so forgive these questions if they're stupid:

    1. With MySQL being quite fast, is there a rule-of-thumb for when to add an index in terms of number of records? Was I right to add one at ~6,000 records?

    2. Am I right in saying the index on the category effectively separates the ~6,000 records into ~60 different sections. If we assumed the categories were evenly spread, performance-wise it's almost as good as querying a single table of 100 records as the index means MySQL “knows” where the correct 100 are. Is that how it works? In which case, there is no need to index any of the other six fields, right?

    3. I've read that an index on the status field would be a bad idea since the cardinality is 3. Why is this? Again, is there any rule-of-thumb as to when to add an index?

    Generally speaking, with MySQL are you okay up to 10,000 records even on a shared server?

    Thanks!

  2. #2
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,220
    Mentioned
    58 Post(s)
    Tagged
    3 Thread(s)
    Quote Originally Posted by DrQuincy View Post
    1. With MySQL being quite fast, is there a rule-of-thumb for when to add an index in terms of number of records? Was I right to add one at ~6,000 records?
    my rule of thumb is to run the query without the index, make note of stats, add index and re-run for new stats, if no improvement then drop index

    and, of course, EXPLAINs on the before- and after-index queries

    Quote Originally Posted by DrQuincy View Post
    2. Am I right in saying the index on the category effectively separates the ~6,000 records into ~60 different sections. If we assumed the categories were evenly spread, performance-wise it's almost as good as querying a single table of 100 records as the index means MySQL “knows” where the correct 100 are. Is that how it works?
    more or less, yes, but there are index reads in addition to row reads, and those rows might be spread across the dataset, making them more expensive in terms of physical-vs-logical reads

    you're probably right that there is no need to index any of the other six fields, because presumably you want to return the entire product row, thus ruling out a covering index

    Quote Originally Posted by DrQuincy View Post
    3. I've read that an index on the status field would be a bad idea since the cardinality is 3. Why is this?
    because the additional number of index reads added to row reads, compared with row reads of the entire table (table scan), are prohibitive
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  3. #3
    SitePoint Evangelist
    Join Date
    May 2006
    Posts
    426
    Mentioned
    2 Post(s)
    Tagged
    0 Thread(s)
    Ah, that makes sense, thanks for that. I am learning to use EXPLAIN to am getting there.

    Very interesting and clever stuff!

  4. #4
    SitePoint Enthusiast
    Join Date
    Nov 2013
    Posts
    26
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    my rule of thumb is to run the query without the index, make note of stats, add index and re-run for new stats, if no improvement then drop index
    Remember to disable the querycache and to test the index's performance through a loadtest. Don't go by one-of measurements.

    [quote]
    Generally speaking, with MySQL are you okay up to 10,000 records even on a shared server?
    [quote]

    10.000 records is tiny and the performance you are seeing now is probably due to the fact that the entire database fits into memory, eliminating most of the drawbacks of sequential scans. Databases are generaly clever enough to know how much of their data they can expect to find in cache and can choose not to use an index even if the querylogic says the index could be beneficial.


Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •