SitePoint Sponsor

User Tag List

Results 1 to 15 of 15
  1. #1
    SitePoint Wizard Zaggs's Avatar
    Join Date
    Feb 2005
    Posts
    1,048
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    painfully slow MySQL query please help

    Hi Guys!

    I have the following MySQL query which loads painfully slow. It performs a search on a database of over 30,000 jobs. It shouldn't be this slow (takes around 30 secs to load).

    PHP Code:
    SELECT jobs.job_titlejobs.job_urljobs.job_salaryjobs.job_otejobs.job_perksjobs.job_descriptionjobs.date_postedcompanies.company_namecompanies.company_urlcompanies.logo_thumbnailcountapplications.id ) AS applicationscountfeatured_jobs.id ) AS featuredl.display_namel.postcodel.display_county
    FROM jobs
    INNER JOIN companies ON companies
    .id jobs.company_id
    INNER JOIN jobs_industries ON jobs_industries
    .job_id jobs.id
    INNER JOIN jobs_job_types ON jobs_job_types
    .job_id jobs.id
    INNER JOIN 
    (

    SELECT job_id, if( countjob_id ) >1'Locations throughout the UK'display_name ) AS display_namepostcodedisplay_countylocation_id
    FROM jobs_locations
    GROUP BY job_id
    ) AS l ON l.job_id jobs.id
    LEFT OUTER JOIN applications ON applications
    .job_id jobs.id
    LEFT OUTER JOIN featured_jobs ON featured_jobs
    .job_id jobs.id
    WHERE jobs
    .status 'on'
    GROUP BY jobs.id
    ORDER BY featured DESC 
    date_posted DESC 
    I did an explain on the query and it produced this:

    PHP Code:
    id     select_type     table     type     possible_keys     key     key_len     ref     rows     Extra
    1     PRIMARY     applications     system     NULL    NULL    NULL    NULL    0     
    const row not found
    1     PRIMARY     
    <derived2>     ALL     NULL    NULL    NULL    NULL    31405     Using temporaryUsing filesort
    1     PRIMARY     jobs     eq_ref     PRIMARY
    ,company_id     PRIMARY     4     l.job_id     1     Using where
    1     PRIMARY     companies     eq_ref     PRIMARY     PRIMARY     4     db_name
    .jobs.company_id     1     Using where
    1     PRIMARY     featured_jobs     ALL     NULL    NULL    NULL    NULL    2     
    1     PRIMARY     jobs_industries     ref     job_id     job_id     4     db_name
    .jobs.id     12     Using whereUsing index
    1     PRIMARY     jobs_job_types     ref     job_id     job_id     4     db_name
    .jobs_industries.job_id     12     Using whereUsing index
    2     DERIVED     jobs_locations     ALL     NULL    NULL    NULL    NULL    31405     Using temporary
    Using filesort 
    Any input is appreciated.

  2. #2
    SitePoint Wizard Zaggs's Avatar
    Join Date
    Feb 2005
    Posts
    1,048
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Any ideas on this guys, still having the same problem!

    And happy new year to all! :-)

  3. #3
    SitePoint Wizard Zaggs's Avatar
    Join Date
    Feb 2005
    Posts
    1,048
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    is my MySQL query slow?

    Hi Guys!

    I am running a MySQL query on my server that seems to be causing a massive spike on the server and I really don't know how to fix it. When I run the query in phpMyadmin it takes 3.75 seconds to execute. The table only has 31,000 rows but my query is doing a few joins (INNER, LEFT).

    Is there anyway to speed it up? Most of my queries run in under a second...I just wanted to know if this is normal for a query to take this long.

    Thanks in advance.

  4. #4
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,216
    Mentioned
    58 Post(s)
    Tagged
    3 Thread(s)
    it's normal if the query has to read an entire table -- at that point, an index won't help and will be ignored

    in your joins, the join columns need to be properly indexed as well

    i remember your other thread, where you had that massive EXPLAIN, and i must confess, it's pretty hard to pinpoint where the issue might be

    have you considered hiring a performance expert? (note: i don't qualify, so this isn't a hint)
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  5. #5
    SitePoint Wizard Zaggs's Avatar
    Join Date
    Feb 2005
    Posts
    1,048
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by r937 View Post
    it's normal if the query has to read an entire table -- at that point, an index won't help and will be ignored

    in your joins, the join columns need to be properly indexed as well

    i remember your other thread, where you had that massive EXPLAIN, and i must confess, it's pretty hard to pinpoint where the issue might be

    have you considered hiring a performance expert? (note: i don't qualify, so this isn't a hint)
    Hi - thanks for your reply.

    I think it's the INNER JOIN SELECT (with concat) part of the query that's taking the time because when I remove it, it loads much quicker (in under a second). Here's the query:

    Code:
    SELECT jobs.id, jobs.job_title, jobs.job_description, jobs.job_salary, jobs.job_url, companies.company_name, companies.company_url, l.display_name
    FROM jobs
    INNER JOIN companies ON companies.id = jobs.company_id
    INNER JOIN (
    
    SELECT job_id, if( count( job_id ) >1, 'Locations throughout the UK', concat( display_town, ', ', display_county ) ) AS display_name
    FROM jobs_locations
    GROUP BY job_id
    ) AS l ON l.job_id = jobs.id
    WHERE companies.status = 'Active'
    AND jobs.status = 'on'
    ORDER BY jobs.date_posted DESC
    LIMIT 3
    I have considered hiring a MySQL performance expert but they are more than likely going to charge me a few hundred bucks just to figure out this one issue. I would rather get my head round it first and then if we are still having issues, i'll try to hire someone.

    Thanks for your help.

  6. #6
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,216
    Mentioned
    58 Post(s)
    Tagged
    3 Thread(s)
    run the subquery by itself and see how long it takes, maybe that's your problem
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  7. #7
    SitePoint Wizard Zaggs's Avatar
    Join Date
    Feb 2005
    Posts
    1,048
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by r937 View Post
    run the subquery by itself and see how long it takes, maybe that's your problem
    Hmmm that's very strange, loads very fast:

    Code:
    Showing rows 0 - 29 ( 31,405 total, Query took 0.0008 sec)
    Does that mean it's the inner joins causing the issue?

  8. #8
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,216
    Mentioned
    58 Post(s)
    Tagged
    3 Thread(s)
    i think all it means is that the subquery itself isn't your issue.... i think
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  9. #9
    SitePoint Wizard TheRedDevil's Avatar
    Join Date
    Sep 2004
    Location
    Norway
    Posts
    1,196
    Mentioned
    4 Post(s)
    Tagged
    0 Thread(s)
    I am by far an SQL expert, but looking at your query and hearing how few rows there is in those tables and its already kneeling, I cant stop thinking that your problem is actually in the database model.

    Have you considered reviewing the database model, and perhaps de-normalize parts of it if it make sense performance wise.
    I.e. when going through this, think how will I get this data, how will I use this data etc.

  10. #10
    Utopia, Inc. silver trophy
    ScallioXTX's Avatar
    Join Date
    Aug 2008
    Location
    The Netherlands
    Posts
    9,031
    Mentioned
    152 Post(s)
    Tagged
    2 Thread(s)
    Yes, this sounds like one of those situations where denormalization would really help. So instead of using the subquery there, add a field location_name to the jobs_table, and set the value of that field to the relevant value when you save the record in your application.

    Before you do that though, could you please post the EXPLAIN for that query?
    Rémon - Hosting Advisor

    Minimal Bookmarks Tree
    My Google Chrome extension: browsing bookmarks made easy

  11. #11
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,216
    Mentioned
    58 Post(s)
    Tagged
    3 Thread(s)
    Quote Originally Posted by ScallioXTX View Post
    ... add a field location_name to the jobs_table, and set the value of that field to the relevant value when you save the record in your application.
    that may not be as easy as you think

    the thing is, a job can have multiple locations, so if you did create this new column, you'd have to set it to the concatenated location name for the first location, and then update it if you add another location for that job

    as for the EXPLAIN, it's in his previous thread... perhaps you could please merge the threads?
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  12. #12
    From space with love silver trophy
    SpacePhoenix's Avatar
    Join Date
    May 2007
    Location
    Poole, UK
    Posts
    5,000
    Mentioned
    101 Post(s)
    Tagged
    0 Thread(s)
    Two threads merged.

    Try the query again but without the LIMIT clause
    Community Team Advisor
    Forum Guidelines: Posting FAQ Signatures FAQ Self Promotion FAQ
    Help the Mods: What's Fluff? Report Fluff/Spam to a Moderator

  13. #13
    SitePoint Wizard TheRedDevil's Avatar
    Join Date
    Sep 2004
    Location
    Norway
    Posts
    1,196
    Mentioned
    4 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by r937 View Post
    that may not be as easy as you think
    But is not that why we love this job? Due to the challenges it bring

    Zaggs:
    I've reviewed the thread again after Rémon was kind enough to merge your topics.

    I'll go out on a limb and say that you are not used to work with high load or high dependency websites.

    After reviewing your first query, I'm 100% sure your problem can be fixed with de-normilization and with proper use of cache techniques.

    Query in main post:
    -You run two LEFT OUTER queries which count how many applications the job has received and one who count featured (not sure what this is for). Both these counts should also have been a separate column on the jobs table. Which is increased/decreased when a record is added/removed in said tables. This can be handled either by server side or triggers in the database.

    -Your sub query is strange, you check to see if the job is only available on one location or multiple, but also pull the postcode etc. While the query is GROUP BY the job id, meaning that if there is more than one location you have no idea which postcode etc. your pulling. Which makes it look like if its in more than one location, you only show "Locations throughout the UK" for the user. With other words, this can ALSO be added as a column to the jobs table to keep track of, similar to what is mentioned above.

    -In addition the pulling of the locations the job is available in should be moved outside the initial query, even if you want to pull all locations. Lets say you want to pull all locations per job, and you pull multiple jobs at a time. What you would do is: First loop over the jobs and in addition store the job ids in an array, then after pull all locations used across the jobs. With other words everything is done with two queries.

    -You have INNER JOINS which you don't use, example: jobs_industries and jobs_job_types Make certain you remove anything from the query if you don't need it.

    -You are using ENOM's or at least string keys like job status "on". Stop doing that and start using integers, that will make it slightly easier to convert to other databases in the future if needed.

    -I assume you do not store the thumbnail image as a blob, referring to the companies table "logo_thumbnail" column. If you do stop with that, its very ineffective. In the event you store only the image file name, stop doing that as well! Instead use the company id for the file name, i.e. "ID_thumb.jpg" etc.

    Second Query:
    -You have a LIMIT on three records, just first running this query first:
    SELECT job_id FROM jobs INNER JOIN companies as comp ON (jobs.company_id=comp.id AND comp.status='Active') WHERE jobs.status='on' ORDER BY jobs.date_posted DESC LIMIT 3
    And then after do WHERE jobs.job_id IN (1, 2, 3) I.e. using the results from the first query, or even using it as a sub query would most probably make that query fast again.

    -Though the other points I mentioned above I would have done here as well.


    What you need to understand is that there is two kinds of people that work with Databases, the ones who want everything normalized, and the ones who understand that while normalized is the ultimate goal, it just does not work to have everything normalized for a high load or high dependability system.

    With other words, a normalized database come at the cost of speed and required resources.

    Now, with that said keep your database model as normalized as possible, as you can apply a lot of caching on the expensive parts as well. Even separating a larger query into two or more queries might change the application from taking 3 seconds, to take 0.00000005 seconds again even with more queries.

  14. #14
    Utopia, Inc. silver trophy
    ScallioXTX's Avatar
    Join Date
    Aug 2008
    Location
    The Netherlands
    Posts
    9,031
    Mentioned
    152 Post(s)
    Tagged
    2 Thread(s)
    Excellent post from TheRedDevil, I applaud you! All very valid points and indeed all of them will improve the speed of that query.

    Having just looked at the EXPLAIN from your query, the problem is with "Using temporary; Using filesort", which means MySQL needs to create a view from what you're selecting in the subquery, write it to disk, and sort it there; which is sloooooooow. Every time you see "Using filesort" in an EXPLAIN that's what's killing it.

    From all solutions given above I'd personally go with

    Quote Originally Posted by TheRedDevil
    Second Query:
    -You have a LIMIT on three records, just first running this query first:
    SELECT job_id FROM jobs INNER JOIN companies as comp ON (jobs.company_id=comp.id AND comp.status='Active') WHERE jobs.status='on' ORDER BY jobs.date_posted DESC LIMIT 3
    And then after do WHERE jobs.job_id IN (1, 2, 3) I.e. using the results from the first query, or even using it as a sub query would most probably make that query fast again.
    because that way you can leave the database tables structure intact (and still normalized), and still get a huge performance gain. Win-win I'd say
    Rémon - Hosting Advisor

    Minimal Bookmarks Tree
    My Google Chrome extension: browsing bookmarks made easy

  15. #15
    Utopia, Inc. silver trophy
    ScallioXTX's Avatar
    Join Date
    Aug 2008
    Location
    The Netherlands
    Posts
    9,031
    Mentioned
    152 Post(s)
    Tagged
    2 Thread(s)
    Quote Originally Posted by r937 View Post
    that may not be as easy as you think

    the thing is, a job can have multiple locations, so if you did create this new column, you'd have to set it to the concatenated location name for the first location, and then update it if you add another location for that job
    You could do that quite easily in your application, or via database triggers. Sure, it takes a little bit of time to create the new value and update the field, but it probably doesn't happen very often and it most likely doesn't touch any indexes (so no b-tree restructure -- which are sloooow -- will be needed), and since it's likely to speed up the SELECT query a lot, it should be worth doing.

    Although now that TheRedDevil just stepped in, his idea is better than mine
    Rémon - Hosting Advisor

    Minimal Bookmarks Tree
    My Google Chrome extension: browsing bookmarks made easy


Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •