SitePoint Sponsor

User Tag List

Results 1 to 8 of 8
  1. #1
    SitePoint Enthusiast
    Join Date
    May 2008
    Posts
    65
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    help tuning sql query

    Hi,

    I'm working on a message board implemenatation which uses the following mysql query to pull a list of threads (and the posts within each thread) and hope someone can offer some help tuning the query. I've included the explain results below as well.

    The query is returning the first 25 threads (based on create_time) from the threads table, and pulling the associated posts (replies) for each of those threads. The threads are displayed in order with their replies in a nested tree fashion like an old style message board, which is what the lft/rgt/depth dimensions are used for.

    CPU is getting pegged from mysql load right now and I'm trying to find ways to reduce the load. Given the number of rows accessed, I'm guessing my rudimentary sql is the culprit? This is the primary query that runs on the board.

    Thanks for your help.


    THE QUERY:
    ==================
    SELECT threads.id AS thread_id, board_id, num_replies, modify_date, posts.id AS post_id, posts.user_id AS author_id, subject, message, lft, rgt, depth, DATE_FORMAT( posts.create_date, '%c/%e/%y %l:%i%p' ) AS reply_date, NULL AS watch_flag
    FROM (

    SELECT id, board_id, num_replies, create_date, modify_date
    FROM threads
    WHERE board_id
    IN ( 1, 2, 3, 4, 27, 15, 16 )
    ORDER BY create_date DESC
    LIMIT 0 , 25
    ) AS threads, posts

    WHERE threads.id = posts.thread_id
    ORDER BY threads.create_date DESC , thread_id DESC , lft ASC ;


    EXPLAIN
    ==================
    ROW 1:
    id=1
    select_type=PRIMARY
    table=<derived>
    type=ALL
    possible_keys=NULL
    key=NULL
    key_len=NULL
    ref=NULL
    rows=25
    extra=using temporary;using filesort

    ROW 2:
    id=1
    select_type=PRIMARY
    table=posts
    type=ref
    possible_keys=thread_id
    key=thread_id
    key_len=4
    ref=threads.id
    rows=11
    extra=

    ROW 3:
    id=2
    select_type=DERIVED
    table=threads
    type=index
    possible_keys=board_id,board_idxidxcreate_date
    key=create_date
    key_len=4
    ref=NULL
    rows=259986
    extra=
    Last edited by jacksonheights; Aug 14, 2008 at 14:35. Reason: removed extra=using temp/filesort from explain results row3, typo =P

  2. #2
    reads the ********* Crier silver trophybronze trophy longneck's Avatar
    Join Date
    Feb 2004
    Location
    Tampa, FL (US)
    Posts
    9,854
    Mentioned
    1 Post(s)
    Tagged
    0 Thread(s)
    create another index on the thread table like this: (create_date, board_id).

    also, change your join in the outer query to an explicit JOIN.
    Check out our new Industry News forum!
    Keep up-to-date with the latest SP news in the Community Crier

    I edit the SitePoint Podcast

  3. #3
    SitePoint Enthusiast
    Join Date
    May 2008
    Posts
    65
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by longneck View Post
    create another index on the thread table like this: (create_date, board_id).

    also, change your join in the outer query to an explicit JOIN.
    Thanks longneck, I've tried the (create_date,board_id) index as well and mysql continues to use create_date for the query. I think the cardinality of board_id is too low to make it useful, so I just removed the composite index for now to save on rebuilding it.

    I will give the explicit JOIN a try.

  4. #4
    SitePoint Enthusiast
    Join Date
    May 2008
    Posts
    65
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    I made the join explicit in the sql statement, but had no noticeable change. fwiw, I found this in the mysql docs:

    "INNER JOIN and , (comma) are semantically equivalent in the absence of a join condition: both produce a Cartesian product between the specified tables (that is, each and every row in the first table is joined to each and every row in the second table). "

    Do you see something in the sql statement that suggests I need the explicit join? Thanks.

  5. #5
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,215
    Mentioned
    58 Post(s)
    Tagged
    3 Thread(s)
    longneck's suggestion to use explicit joins was not related to the performance issue, it was a recommendation to improve your query readability -- a habit that will pay dividends in future queries

    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  6. #6
    reads the ********* Crier silver trophybronze trophy longneck's Avatar
    Join Date
    Feb 2004
    Location
    Tampa, FL (US)
    Posts
    9,854
    Mentioned
    1 Post(s)
    Tagged
    0 Thread(s)
    try this:
    Code:
    SELECT threads.id AS thread_id, board_id, num_replies, modify_date, posts.id AS post_id, posts.user_id AS author_id, subject, message, lft, rgt, depth, DATE_FORMAT( posts.create_date, '&#37;c/%e/%y %l:%i%p' ) AS reply_date, NULL AS watch_flag
    FROM (
    
    SELECT id, board_id, num_replies, create_date, modify_date
    FROM threads FORCE INDEX create_date
    WHERE board_id
    IN ( 1, 2, 3, 4, 27, 15, 16 )
    ORDER BY create_date DESC
    LIMIT 0 , 25
    ) AS threads, posts
    
    WHERE threads.id = posts.thread_id
    ORDER BY threads.create_date DESC , thread_id DESC , lft ASC ;
    make sure you have an index called crate_date that has only create_date in it, or replace the index name in the query above with one that has create_date as the first element.
    Check out our new Industry News forum!
    Keep up-to-date with the latest SP news in the Community Crier

    I edit the SitePoint Podcast

  7. #7
    SitePoint Enthusiast
    Join Date
    May 2008
    Posts
    65
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    longneck you're awesome, that totally did the trick, cpu util. went from 35% idle to 95% idle in my tests just now.

    Out of curiosity, I had already added a create_date compound index this morning with create_date as the first dimension. When I ran EXPLAIN it said it was using this create_date compound key, but Extra was blank. After adding the FORCE INDEX, Extra changed to "using where".

    So if the key field in the EXPLAIN results already lists the create_date key, how do you know to have to add FORCE INDEX anyway?

    Thanks again!

  8. #8
    reads the ********* Crier silver trophybronze trophy longneck's Avatar
    Join Date
    Feb 2004
    Location
    Tampa, FL (US)
    Posts
    9,854
    Mentioned
    1 Post(s)
    Tagged
    0 Thread(s)
    sometimes mysql will know that a index is applicable to a query, but it won't use it if it decides that reading the entire table sequentially off disk is going to be faster. i think that's what was happening here.
    Check out our new Industry News forum!
    Keep up-to-date with the latest SP news in the Community Crier

    I edit the SitePoint Podcast


Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •