SitePoint Sponsor

User Tag List

Results 1 to 15 of 15
  1. #1
    SitePoint Member
    Join Date
    Sep 2008
    Posts
    24
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    Complex SQL vs PHP Loop Logic

    Im working on an application that will have a huge set of statistics to run through - im trying to figure out the best way to work through it. Is it better to use a complex SQL query or PHP Loop to handle the logic... The SQL would look something like...

    Please note, these are cake queries so they are all prefixed with the model name:
    Code SQL:
    SELECT *, COUNT( TRANSACTION.id ) AS clicks, SUM( IF( is_lead = '1', payout, 0 ) ) AS revenue, SUM( is_lead ) AS leads, SUM( IF( is_lead = '1', payout, 0 ) ) / COUNT( TRANSACTION.id ) AS epc, `Campaign`.`id` FROM `transactions` AS `Transaction` LEFT JOIN `campaigns` AS `Campaign` ON (`Transaction`.`campaign_id` = `Campaign`.`id`) WHERE `Transaction`.`publisher_id` = '1' AND `Transaction`.`created` BETWEEN '2008-09-01 00:00:01' AND '2008-11-16 23:59:59' GROUP BY `Campaign`.`id`

    As you can see there are lots of SUMs, Ifs, Counts, etc

    Would this be better served as a simple SQL query and then loop through all of the results and handle it with PHP?

  2. #2
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,350
    Mentioned
    63 Post(s)
    Tagged
    3 Thread(s)
    the use of the dreaded, evil "select star" is logically incompatible with GROUP BY

    other than that, this isn't a particularly complex SQL query

    you asked "Would this be better served as a simple SQL query" and now i'm wondering if you were thinking of complicating it any further, or trying to simplify it

    what was your question again?

    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  3. #3
    SitePoint Member
    Join Date
    Sep 2008
    Posts
    24
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Well, the reason this came up was because I need to add conditions to the revenue column - So now instead of just is_lead = 1, it also needs to be approved = 1, can I just append AND approved ='1' to the IF clause in the query?

  4. #4
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,350
    Mentioned
    63 Post(s)
    Tagged
    3 Thread(s)
    yes, i suppose... but it would be better to use CASE expressions
    Code:
    SELECT c.id
         , COUNT( t.id ) AS clicks
         , SUM( CASE WHEN is_lead = 1
                      AND approved = 1 
                     THEN payout
                     ELSE 0 END ) AS revenue
         , SUM( is_lead ) AS leads
         , SUM( CASE WHEN is_lead = 1
                      AND approved = 1 
                     THEN payout
                     ELSE 0 END ) / COUNT( t.id ) AS epc
      FROM transactions AS t 
    INNER
      JOIN campaigns AS c 
        ON c.id = t.campaign_id 
     WHERE t.publisher_id = 1
       AND t.created >= '2008-09-01' 
       AND t.created  < '2008-11-17' 
    GROUP 
        BY c.id
    several tips:
    • learn to use line breaks and indentation to write SQL (you will thank me later)
    • use short table alias names, not ones which are almost identical to their table names
    • compare numeric columns to numbers, not strings -- write approved = 1, not approved = '1'
    • use backticks only when needed to delimit names that are reserved words or contain special characters, and use these sparingly or not at all
    • use INNER JOINS when the relationship always exists (your LEFT OUTER JOIN appears to cater for transactions which have campaign ids which don't exist in the campaign table
    • for datetimes, use an upper-open-ended range test, not BETWEEN
    • avoid the dreaded, evil "select star" and never use it with GROUP BY
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  5. #5
    SitePoint Member
    Join Date
    Sep 2008
    Posts
    24
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    r937 - That is some awesome advice, thank you so much.

    A couple of the tip items you mentioned are things I cant fix in particular - The table alias names, backticks, etc. are a result of using Cake and that is the syntax it uses.

    I wasnt aware of the CASE clause and I will definitely use it.

    Quick Edit: As far as the evil select * - How would I, short of defining every single column get all column in addition to the column that im creating as a result of the SUM, Counts, etc.

  6. #6
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,350
    Mentioned
    63 Post(s)
    Tagged
    3 Thread(s)
    Quote Originally Posted by WildFoxMedia View Post
    How would I, short of defining every single row get all rows in addition to the rows that im creating as a result of the SUM, Counts, etc.
    you wouldn't, full stop

    let's see if you can figure out the logical problem here: give me the count of the number of students in each classroom, as well as lastname and age

    the logical problem: do you want aggregates, or details of individual rows?

    they are incompatible
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  7. #7
    SitePoint Member
    Join Date
    Sep 2008
    Posts
    24
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    r937, I understand where your going with this - I guess then, at the very least I need the campaign_id that im grouping by so I can get the Campaign associated with it

  8. #8
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,350
    Mentioned
    63 Post(s)
    Tagged
    3 Thread(s)
    you could actually get more than the campaign id, and still do it in the same query

    watch this --
    Code:
    SELECT Campaign.id
         , Campaign.someothercolumn1
         , Campaign.someothercolumn2
         , Campaign.someothercolumnetc
         , t.clicks
         , t.revenue
         , t.leads
         , t.revenue / t.clicks AS epc
      FROM campaigns AS Campaign 
    INNER
      JOIN ( SELECT campaign_id 
                  , COUNT( id ) AS clicks           
                  , SUM( CASE WHEN is_lead = 1        
                               AND approved = 1       
                              THEN payout             
                              ELSE 0 END ) AS revenue 
                  , SUM( is_lead ) AS leads           
               FROM transactions AS Transaction
              WHERE publisher_id = 1        
                AND created >= '2008-09-01' 
                AND created  < '2008-11-17' 
             GROUP          
                 BY campaign_id ) AS t
        ON t.campaign_id = c.id
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  9. #9
    SitePoint Wizard silver trophy kyberfabrikken's Avatar
    Join Date
    Jun 2004
    Location
    Copenhagen, Denmark
    Posts
    6,157
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by r937 View Post
    learn to use line breaks and indentation to write SQL (you will thank me later)
    Totally off topic here, but I had a discussion with a couple of my colleagues recently, about how to format SQL queries. We all do it a little bit differently, and when we talked about it, it seems that we aren't entirely consistent about it either. Do you know of any standard (de-facto or otherwise) for formatting SQL? I tried googling, but didn't find much useful.

  10. #10
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,350
    Mentioned
    63 Post(s)
    Tagged
    3 Thread(s)
    i don't think there are any standards, but there are plenty of conventions, a couple of which i always incorporate into my SQL --

    - use line breaks (at the very least for each new clause)
    - use indentation (i like to have the keywords sticking out to the left of column 7)

    there is another, which i always use, called leading comma convention, which you will find several times in the forum archives, where people have, by failing to use it, caused themselves syntax errors that were maddeningly difficult to find (so difficult that they had to post on a fourm for help), that would've been completely obvious had they used that convention

    Joe Celko has an entire book on the subject, SQL Programming Style, ISBN 0-12-088797-5
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  11. #11
    SitePoint Wizard silver trophybronze trophy Cups's Avatar
    Join Date
    Oct 2006
    Location
    France, deep rural.
    Posts
    6,869
    Mentioned
    17 Post(s)
    Tagged
    1 Thread(s)
    Maybe I have deviated a bit, but another benefit of using line breaks is that it is just so much easier to move JOIN blocks as I fiddle around trying to get multiple join statements the correct way round.

    Code:
    INNER JOIN table2 as t2
    ON t1.id = t2.cat_ref
    
    INNER JOIN table3 as t3
    ON t2.id = t3.menu_ref
    The leading comma trick has been a life saver.

  12. #12
    SitePoint Guru
    Join Date
    Nov 2004
    Location
    Plano
    Posts
    643
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    performance-wise, generally it's better to pass as little information as possible between your db and your web server. the reason is that your db server is a different server than your web server...so there can be significant overhead in sending more information back and forth than you're actually going to actively use.

    but, from a development standpoint, it's almost always harder to maintain complex SQL than it is a PHP block.

  13. #13
    Spirit Coder allspiritseve's Avatar
    Join Date
    Dec 2002
    Location
    Ann Arbor, MI (USA)
    Posts
    648
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by r937 View Post
    there is another, which i always use, called leading comma convention
    What is the leading comma convention, in a nutshell? A google search and a search of this forum didn't come up with anything.

  14. #14
    SitePoint Wizard silver trophybronze trophy Cups's Avatar
    Join Date
    Oct 2006
    Location
    France, deep rural.
    Posts
    6,869
    Mentioned
    17 Post(s)
    Tagged
    1 Thread(s)
    Simply put, instead of writing this;

    Code:
    SELECT this, that, the_other FROM mytable
    Or this:
    Code:
    SELECT this, 
    that, 
    the_other
    FROM mytable
    You write your selects like so, with the comma first:
    Code:
    SELECT this
    , that
    , the_other
    FROM mytable
    How many times have you found a sql error caused by this:

    Code:
    SELECT this, that, the_other, from mytable
    This especially true for me when changing the column names on complex queries, adding a new line is less error prone.

    There could be other reasons, but I picked it up recently from reading r937's posts.

    For me its just a readability and error-reducing trick that is simple and works.

  15. #15
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,350
    Mentioned
    63 Post(s)
    Tagged
    3 Thread(s)
    Quote Originally Posted by allspiritseve View Post
    What is the leading comma convention, in a nutshell? A google search and a search of this forum didn't come up with anything.
    see this thread, starting at post #25 --
    http://www.sitepoint.com/forums/showthread.php?t=330911

    there are links to several other threads in there too
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"


Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •