SitePoint Sponsor

User Tag List

Results 1 to 10 of 10
  1. #1
    SitePoint Wizard bronze trophy cydewaze's Avatar
    Join Date
    Jan 2006
    Location
    Merry Land, USA
    Posts
    1,096
    Mentioned
    3 Post(s)
    Tagged
    0 Thread(s)

    Problems trying to streamline this query

    I'm trying to combine some queries on a web page and I'm having unexpected results.

    We're using a home-built CMS that manages our website, and it holds things like the page title and the contact person for that page. Our website is broken down into topic areas, so we have a summary page that shows all the topics listed in a table, along with the default contact person for that topic. It also lists total pages per topic, and pages with no contact or not title.

    The summary page is built using 4 queries. One to list out the topics with their default contact, and three more to do the total pages, the pages with no contact, and the pages with no title.

    Here are the queries from the old, working version of the page:

    Main query to get topics and contacts:
    Code SQL:
    SELECT    default_contacts.topic_id
    		, default_contacts.topic_name
    		, default_contacts.contact_id
    		, maindir.first_name & ' ' & maindir.last_name AS fullname
    		, pagecontacts.page_topic
    		, pagecontacts.page_id
    		, pagecontacts.page_title
    		, pagecontacts.contact_id
    FROM ( ( default_contacts
    LEFT JOIN maindir
    		ON maindir.main_id = default_contacts.contact_id )
    LEFT JOIN pagecontacts
    		ON default_contacts.topic_name = pagecontacts.page_topic )
    ORDER BY topic_name

    Then these three to get the totals

    Total pages in topic:
    Code SQL:
    SELECT COUNT(page_id) AS a
    FROM pagecontacts
    WHERE page_topic = '#gettopics.page_topic#'

    Pages with no contact:
    Code SQL:
    SELECT COUNT(page_id) AS b
    FROM pagecontacts
    WHERE page_topic = '#gettopics.page_topic#'
    AND contact_id = 0

    Pages with no title:
    Code SQL:
    SELECT COUNT(page_id) AS c
    FROM pagecontacts
    WHERE page_topic = '#gettopics.page_topic#'
    AND ( page_title IS NULL
    OR page_title = '' )

    Sine we're stuck with MS Access, I've been trying to figure out a way to streamline pages. So one of the things I thought of was to combine the above four queries into one. Well, sort of one. I still have the same number of select statements, so it might not be ideal. This is what I came up with:

    Code SQL:
    SELECT    default_contacts.topic_id
    		, default_contacts.topic_name
    		, default_contacts.contact_id
    		, maindir.first_name & ' ' & maindir.last_name AS fullname
    		, pagecontacts.page_topic
    		, pagecontacts.page_id
    		, pagecontacts.page_title
    		, pagecontacts.contact_id
    		, ( SELECT COUNT(page_id)
    				FROM pagecontacts
    				WHERE page_topic = topic_name ) AS a		
    		, ( SELECT COUNT(page_id)
    				FROM pagecontacts 
    				WHERE page_topic = topic_name
    				AND contact_id = 0 ) AS b
    		, ( SELECT COUNT(page_id)
    				FROM pagecontacts
    				WHERE page_topic = topic_name
    				AND ( page_title IS NULL
    				OR page_title = '' ) ) AS c
    FROM ( ( default_contacts
    LEFT JOIN maindir
    		ON maindir.main_id = default_contacts.contact_id )		
    LEFT JOIN pagecontacts
    		ON default_contacts.topic_name = pagecontacts.page_topic )
    ORDER BY topic_name

    When I run this on my local machine, it takes waaaaaay longer than the four separate queries, and when I copy it to our test server, the query times out. I would expect that at worse, it would perform identical to the separate ones and not worse.

    What am I doing wrong?
    <cfset myblog = "http://cydewaze.org/">

  2. #2
    SitePoint Enthusiast
    Join Date
    Aug 2011
    Location
    OH, USA
    Posts
    72
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    it looks like you confuse db engine the same table names... try that

    Code:
    SELECT    dc.topic_id
    		, dc.topic_name
    		, dc.contact_id
    		, md.first_name & ' ' & md.last_name AS fullname
    		, pc.page_topic
    		, pc.page_id
    		, pc.page_title
    		, pc.contact_id
    		, ( SELECT COUNT(page_id)
    				FROM pagecontacts pp1
    				WHERE pp1.page_topic = dc.topic_name ) AS a		
    		, ( SELECT COUNT(page_id)
    				FROM pagecontacts pp2
    				WHERE pp2.page_topic = dc.topic_name
    				AND pp2.contact_id = 0 ) AS b
    		, ( SELECT COUNT(page_id)
    				FROM pagecontacts pp3
    				WHERE pp3.page_topic = dc.topic_name
    				AND ( pp3.page_title IS NULL
    				OR pp3.page_title = '' ) ) AS c
    FROM ( ( default_contacts as dc
    LEFT JOIN maindir as md
    		ON md.main_id = dc.contact_id )		
    LEFT JOIN pagecontacts as pc
    		ON dc.topic_name = pc.page_topic )
    ORDER BY dc.topic_name

  3. #3
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,247
    Mentioned
    59 Post(s)
    Tagged
    3 Thread(s)
    Quote Originally Posted by cydewaze View Post
    We're using a home-built CMS that manages our website, and it holds things like the page title and the contact person for that page. Our website is broken down into topic areas, so we have a summary page that shows all the topics listed in a table, along with the default contact person for that topic.
    i'm curious about the LEFT JOINs you have

    is it possible for default_contacts.contact_id to have a value which does not exist in the maindir table?

    is it possible for default_contacts.topic_name to have a value which does not exist in the pagercontacts table?

    i think a few words on the purpose of each of your tables would help me a lot

    i can't approach the COUNT queries until i understand the main query...
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  4. #4
    SitePoint Wizard bronze trophy cydewaze's Avatar
    Join Date
    Jan 2006
    Location
    Merry Land, USA
    Posts
    1,096
    Mentioned
    3 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by r937 View Post
    is it possible for default_contacts.contact_id to have a value which does not exist in the maindir table?
    No, but a topic can have a default contact who's not been assigned yet. Without the left join there, the topic wouldn't show up on the list so we couldn't click it to assign a contact.

    Quote Originally Posted by r937 View Post
    is it possible for default_contacts.topic_name to have a value which does not exist in the pagercontacts table?
    Yes. The way the CMS is set up is that you add a topic first. That topic appears in the defaults_contact table, but the entries in pagecontacts may not be there yet because you have to browse the page to add them. We have some topics with thousands (and even tens of thousands) of pages, so adding them manually wasn't an option. The CMS therefore adds the pages as they're browsed (on the dev server) which includes running a link check across them. So one pass of the link checker not only checks links, it also adds all the pages for that topic to the CMS.

    So we have the following tables:

    maindir: contains our complete staff listing (names, email addies, office location, phone number, etc)

    pagecontacts: list of pages (by path), topic name for each page, and ID numbers for the primary, secondary, etc contacts. The contact ID is the same as the ID in the maindir table. Also contains the title for the page (so we can use it in the H1 and title tag with the latter being buried in an include and hard to get to otherwise.

    default_contacts: The ID of the default contact for each topic. Also serves as the list of topics. This ID is the "fallback" contact in the event we neglect to assign an ID to a page in the pagecontacts table.

    It might seem a little quirky, but this database started off just being a way to populate our online office directory, and the maindir table was the only thing in there. We soon realized it could do a lot more, so it's sort of evolved into what it is now. It's basically an evolution over 3 or 4 years, and despite how quirky it seems, it's ended up making the website really easy to update, and the pages load super fast despite the Access DB. Once we get a real database under us, things should really work well.

    Hope that helps explain things!

    Right now I've put the original 4 queries back, but I've changed the three that do the counts to a QofQ of the main query, and that seems to have sped things up a little, but I'm really interested in the best way to do it.
    <cfset myblog = "http://cydewaze.org/">

  5. #5
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,247
    Mentioned
    59 Post(s)
    Tagged
    3 Thread(s)
    sorry for the delay in replying

    i looked at this several times, and i am sorry, i must admit defeat

    despite your explanations, i ~still~ cannot understand the relationships in your tables

    however, the fact that you're using query-of-query on the results of the main query in order to get your counts, well, that tells me that the counts shouldn't have been incorporated into the main query in the first place

    the reason is, since you are returning all the detailed rows anyway, the best place to count them is in the application

    i cannot vouch for the performance of query-of-query in general but i would be genuinely surprised if it's not substantially better to do counts in the app than in the database
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  6. #6
    SitePoint Wizard bronze trophy cydewaze's Avatar
    Join Date
    Jan 2006
    Location
    Merry Land, USA
    Posts
    1,096
    Mentioned
    3 Post(s)
    Tagged
    0 Thread(s)
    No apology needed rudy! I'm the one sorry for not explaining it sufficiently!

    I think one of my problems is that every since I read your book, I seem to go out of my way to do *everything* in the SQL that I can possibly make it do! So I end up with those extremely complex queries and very simple CF code.

    I'll take another look at this app tomorrow and see if there's a way to make the CF code to the counting, although since the counts are sill based on the database results (I need to count how many pages exist for a certain condition in the database), I'm not sure how practical it'll be.
    <cfset myblog = "http://cydewaze.org/">

  7. #7
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,247
    Mentioned
    59 Post(s)
    Tagged
    3 Thread(s)
    Quote Originally Posted by cydewaze View Post
    I'll take another look at this app tomorrow and see if there's a way to make the CF code to the counting...
    there is -- query-of-query
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  8. #8
    SitePoint Wizard bronze trophy cydewaze's Avatar
    Join Date
    Jan 2006
    Location
    Merry Land, USA
    Posts
    1,096
    Mentioned
    3 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by r937 View Post
    there is -- query-of-query
    Well that's what I was doing before! But now I've changed it to increment a counter for each case. It *seems* faster, but EVERYTHING is faster at home. I'll bring the page to work tomorrow and see if it's faster through the coffee stirrer straw that's our connection to the dev server.
    <cfset myblog = "http://cydewaze.org/">

  9. #9
    SitePoint Wizard bronze trophy cydewaze's Avatar
    Join Date
    Jan 2006
    Location
    Merry Land, USA
    Posts
    1,096
    Mentioned
    3 Post(s)
    Tagged
    0 Thread(s)
    OOOH! It actually is faster!

    I uploaded it to the server at work, and new version of the page loads significantly faster. I'll have to remember to keep the math out of the SQL and put it in the app from now on.
    <cfset myblog = "http://cydewaze.org/">

  10. #10
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,247
    Mentioned
    59 Post(s)
    Tagged
    3 Thread(s)
    you can do all kinds of math in SQL, as long as it doesn't add to the data retrieval load

    e.g. GROUP BY with a HAVING restriction is best done in SQL
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"


Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •