SitePoint Sponsor

User Tag List

Results 1 to 4 of 4
  1. #1
    SitePoint Zealot
    Join Date
    Aug 2002
    Posts
    168
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    Question SELECT/COUNT/JOIN on multiple tables?

    Hi there.

    I am having trouble putting together a query to handle the output of website traffic stats using PHP/MYSQL. I am collecting the following cols in a 'traffic' table:

    time [timestamp] | url [$PHP_SELF] | ip [$REMOTE_ADDR]

    I have another table, called 'emails', which has the following cols:

    id [primary key] | collected [timestamp] | address

    I am trying to extract hourly statistics that count the
    raws, uniques and emails collected for each hour on any given day. Something like:

    Hour|Raws|Uniques|Emails
    12 | 100 | 65 | 3
    13 | 150 | 85 | 5
    14 | 250 | 95 | 8 etc...

    I have experimented a bit and the closest I have come is the following query.

    -----------------------------------------

    SELECT
    HOUR(traffic.time) AS Hour,
    COUNT(traffic.url) AS Raws,
    COUNT(DISTINCT traffic.ip) AS Uniques,
    COUNT(DISTINCT emails.id) AS Emails

    FROM traffic

    LEFT JOIN emails
    ON HOUR(traffic.time) = HOUR(emails.collected)

    WHERE DAYOFYEAR(traffic.time) = DAYOFYEAR("2002-7-31")

    GROUP BY
    HOUR(traffic.time)

    -----------------------------------------

    (The "date" ("2002-7-31") is PHP generated.)

    This almost works except that it returns the hourly emails collected for every day on record, not the "date" that is requested.

    I understand that I could forget about the email stats in this query and fire a query at the db on every loop (in PHP) through the traffic stats to count the emails, but this is obviously not a prefered option. I also understand that I could achieve this with a nested query, but thats not an option either.

    I think there must be a way to do this query but I just can't figure it out. Any help?

    Thx in advance.

  2. #2
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,247
    Mentioned
    59 Post(s)
    Tagged
    3 Thread(s)
    you cannot join rows on hour until there's only one row of each hour, but that's not immediately obvious, is it

    if you have mysql 4, try a union instead
    Code:
    SELECT "traffic" AS StatType
         , HOUR(traffic.time) AS Hour
         , COUNT(traffic.url) AS Raws
         , COUNT(DISTINCT traffic.ip) AS Uniques, 
      FROM traffic
     WHERE DAYOFYEAR(traffic.time) = DAYOFYEAR("2002-7-31")
    GROUP 
        BY HOUR(traffic.time)
    UNION ALL
    SELECT "emails" 
         , HOUR(emails.collected)
         , COUNT(DISTINCT emails.id) 
         , null
      FROM emails
     WHERE DAYOFYEAR(emails.collected) = DAYOFYEAR("2002-7-31")
    GROUP 
        BY HOUR(emails.collected)
    ORDER
        BY 1 -- StatType
         , 2 -- Hour
    if you cannot do the union, then just run the two queries in succession

    anything is better than running queries inside a loop
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  3. #3
    SitePoint Zealot
    Join Date
    Aug 2002
    Posts
    168
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Thanks Rudy, I can see that this is a good way to go about it, unfortunately I am using MYSQL version 3.32 though. I don't have a problem upgrading to version 4 but I am not sure where I am going to get hosting yet so I guess its better to stick with 3.23 for now. Or do you think it would be safe to assume that a host would have version 4?

    If I should stick with 3.23 then I will need to "run the two queries in succession". I have been thinking about this and I am not sure of the best way to go about it. Should I be creating a TEMPORARY table and then joining the 'emails' table to it?

    Something like creating a temp table with the following cols:

    DAYOFYEAR(traffic.time) | HOUR(traffic.time) | COUNT(traffic.url) | COUNT(DISTINCT traffic.ip)

    Then joining, somehow, an email count on DAYOFYEAR(emails.collected) with the DAYOFYEAR(traffic.time) col in the temp table?

    I have attempted to do something like this to no avail.

    Or do you mean something different?

    Thanks again.

  4. #4
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,247
    Mentioned
    59 Post(s)
    Tagged
    3 Thread(s)
    what i meant was just run the first query, and then the second, and print them separately

    if getting the combined data onto single rows is important. and you don't want to massage it in your program, then you could combine the rows with a temp table something like this
    Code:
    create temptable 
     (theHour int, Raws int, Uniques int, Emails int)
    
    insert into temptable
    SELECT HOUR(traffic.time)
         , COUNT(traffic.url)
         , COUNT(DISTINCT traffic.ip)
         , 0
      FROM traffic
     WHERE DAYOFYEAR(traffic.time) = DAYOFYEAR("2002-7-31")
    group 
        by HOUR(traffic.time)
    
    insert into temptable
    SELECT HOUR(emails.collected)
         , 0
         , 0
         , COUNT(DISTINCT emails.id) 
      FROM emails
     WHERE DAYOFYEAR(emails.collected) = DAYOFYEAR("2002-7-31")
    group 
        by HOUR(emails.collected)
    
    select theHour
         , sum(Raws) as Raws
         , sum(Uniques) as Uniques
         , sum(Emails) as Emails
      from temptable
    group
        by theHour
    rudy


Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •