SitePoint Sponsor

User Tag List

Results 1 to 10 of 10
  1. #1
    SitePoint Addict phptek's Avatar
    Join Date
    Jun 2002
    Location
    Wellington, NZ
    Posts
    363
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    Multiple LEFT JOINS

    Hi folks:

    I have 3 tables: "PageCat" for category data, "PageDoc" for Document data and "PageImg" for Image data. and I'd like to be able to list category names as well as the number of docs and imgs in each category.

    However I get incorrect results, ie: If I add another file (Img or doc) the count for both imgs and docs increases, when only the doc count should increase.

    The Table Structure:

    * "PageCat" has a primary key auto_increment ID column ("PageCatID")
    * "PageDoc" has a primary key auto_increment ID column ("PageDocID") and a foreign key for joins to the "PageCat" table ("PageDocPageCatID")
    * "PageImg" also has a primary key auto_increment ID column ("PageImgID") and a foreign key for joins to the "PageCat" table ("PageImgPageCatID")

    The Query:

    $sql = "SELECT PageCat.*,COUNT(PageDocPageCatID) AS DocCount,COUNT(PageImgPageCatID) AS ImgCount
    FROM PageCat
    LEFT JOIN PageDoc ON PageDocPageCatID=PageCatID
    LEFT JOIN PageImg ON PageImgPageCatID=PageCatID
    GROUP BY PageCatID
    ORDER BY PageCatName";

    I'm using PHP4.3.1/MySQL4.0.12-nt with Apache 1.3 on Win2k Prof.
    Thanks for any pointers you can give me

  2. #2
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,273
    Mentioned
    60 Post(s)
    Tagged
    3 Thread(s)
    in mysql versions prior to 4.1, your best bet is two separate queries

    select PageCat.*,COUNT(PageDocPageCatID) AS DocCount
    FROM PageCat
    LEFT JOIN PageDoc ON PageCatID = PageDocPageCatID

    select PageCat.*,COUNT(PageImgPageCatID) AS ImgCount
    FROM PageCat
    LEFT JOIN PageImg ON PageCatID = PageImgPageCatID
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  3. #3
    SitePoint Addict phptek's Avatar
    Join Date
    Jun 2002
    Location
    Wellington, NZ
    Posts
    363
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Hey Rudy - thanks for your help but I iterate thru the result set for the display of category data thats why I need to JOIN to some other, linked tables. It works when there's only one table to join to but not when there's more than one. And I don't see an easy way to integrate the results of two separate queries without messing about with more arrays.

    I read somewhere you can use an AND within an ON clause and tried it but MySQL moaned: " Unknown column 'PageImgPageCatID' in 'field list' " when there really is a column called this. The query I tried was:

    SELECT PageCat.*,COUNT(PageDocPageCatID) AS DocCount,COUNT(PageImgPageCatID) AS ImgCount
    FROM PageCat
    LEFT JOIN PageDoc ON PageDocPageCatID=PageCatID AND PageImgPageCatID=PageCatID
    GROUP BY PageCatID
    ORDER BY PageCatName

    Is it really impossible to perform this from one query? Cheers for your help though

  4. #4
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,273
    Mentioned
    60 Post(s)
    Tagged
    3 Thread(s)
    i will not go out on a limb and say impossible without subqueries, but the double-join will involve so many rows that performance will be terrible

    and i have never bothered to figured out the double join syntax, simply because subqueries is the right way to do it

    if you want to join the two counts together, you can use a temp table instead of doing the joining yourself in programming arrays

    hey, don't shoot the messenger, subqueries are a pretty normal part of the standard sql language, and why mysql decided not to implement them until release 4.1 is anybody's guess


    rudy

  5. #5
    SitePoint Zealot
    Join Date
    Dec 2001
    Location
    UK
    Posts
    105
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    I'm not an SQL expert by any stretch of the imagination but how subselects, or multiple queries can be any more efficient that a single sql statement with a join (or even multiple joins) is beyond my comprehension.

    In addition, the only reason a join will return a shed-load of rows, is if the relationships have not been created in an optimal way, therefore having queries which result in cartesian products.

    As I said, I'm not an expert in SQL but give me a little more information and I will see if I can help.

    Can you tell me the relationships between the tables? In other words, does PageCat have a one to many relationship with PageDoc? Does PageDoc have a one to many relationship with PageImgs. If you want to PM me your schema and some sample data I am happy to have a go at constructing the query for you (can't promise to come up with the solution).

  6. #6
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,273
    Mentioned
    60 Post(s)
    Tagged
    3 Thread(s)
    torrent, the problem here is that you cannot -- or, rather, should not -- join both the PageDoc and PageImg tables to the PageCat table, because you will get blown up totals

    simple example:

    person
    3 curly
    5 larry
    7 moe

    cars
    3 buick
    3 ford
    3 volvo
    7 vw
    7 kia

    wives
    3 mary
    3 jane
    5 betty
    7 linda
    7 patty
    7 beth

    how would you propose to join person to both cars and wives?

    just because you can write a simple join, doesn't mean it'll give you anything meaningful in the way of results!!

    if the problem is to produce a result set like this --

    person cars wives
    curly 3 2
    larry 0 1
    moe 2 3

    then i know i can do it with subqueries, and i'm fairly certain that IF it can even be done with joins, THEN it will surely be inefficient (something about cross joins, except i dunno how to deal with 0 cars for larry in a cross-join)

  7. #7
    SitePoint Zealot
    Join Date
    Dec 2001
    Location
    UK
    Posts
    105
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Your experience is greater than mine in this field, of that I have no doubt. However, it is my understanding (albeit limited) that if you have a 3NF compliant schema then a correctly structured SQL join query, along with relevant column indexing, should be more efficient than subselects. Also, just done a quick search in Google (where else?) and found this:

    http://www.sql-server-performance.com/tuning_joins.asp

    which, compromisingly , supports both sides of the coin:
    If you have the choice of using a JOIN or a subquery to perform the same task, generally the JOIN (often an OUTER JOIN) is faster. But this is not always the case. For example, if the returned data is going to be small, or if the are no indexes on the joined columns, then a subquery may indeed be faster.

    The only way to really know for sure is to try both methods and then take a look at their query plans. If this operation is run often, you should seriously consider writing the code both ways, and selecting the code that is most efficient.

  8. #8
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,273
    Mentioned
    60 Post(s)
    Tagged
    3 Thread(s)
    i can't reach that site right now -- maybe they're busy running a join (heh)

    here's the way i would write this with subqueries:
    Code:
    select PageCat.*
    , COUNT(PageDocPageCatID) AS DocCount 
    , X.ImgCount
    FROM PageCat
    LEFT 
    JOIN PageDoc ON PageCatID = PageDocPageCatID
    LEFT 
    JOIN ( 
    select PageCat.PageCatID
    , COUNT(PageImgPageCatID) AS ImgCount 
    FROM PageCat
    LEFT
    JOIN PageImg 
    ON PageCatID = PageImgPageCatID 
    ) as X
    ON PageCatID = X.PageCatID
    i would call this a derived table, even though technically it's still a subquery

    EDIT (number 5): i give up, i just cannot get this board's STUPID wysiwyg to honour my code spacing -- trust me, it looks a lot better properly indented

  9. #9
    SitePoint Zealot
    Join Date
    Dec 2001
    Location
    UK
    Posts
    105
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by r937
    i can't reach that site right now -- maybe they're busy running a join (heh)

  10. #10
    SitePoint Addict phptek's Avatar
    Join Date
    Jun 2002
    Location
    Wellington, NZ
    Posts
    363
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Hi Torrent and thanks to everyone who tried to help me out I'm working from my home PC on this and could conceivably upgrade to > MySQL4.1 but I know that the hosts I use only have < MySQL 4.1 and so it would be a bit of a waste of time.

    I just opted to use a separate category concept for each of Documents and Images (Kinda makes things easier actually)

    But Thanks again folks nevertheless.


Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •