Now what I neeed is how many hours were watched in a given time frame. Lets say I want to know how much time in seconds video 1 was watched on 2010-07-14. What would that query look like?
Also, some other conditions if you can help me are:
There will duplicate video id entries. I need to group by the video id and get a sum of how much time in seconds was watched for video 1 on 2010-07-14.
SELECT SUM(
CASE WHEN firstrecordeddatetime >= '2010-07-14'
THEN UNIX_TIMESTAMP(
LEAST(lastrecordeddatetime -- case 1
,firstrecordeddatetime + INTERVAL 1 DAY)) -- case 3
- UNIX_TIMESTAMP(firstrecordeddatetime)
ELSE UNIX_TIMESTAMP(
LEAST(firstrecordeddatetime + INTERVAL 1 DAY -- case 4
,lastrecordeddatetime)) -- case 2
- UNIX_TIMESTAMP('2010-07-14')
END) AS seconds
FROM ...
you may not be familiar with the LEAST function in mysql, and generally i don’t use non-standard functions, but it’s a lot easier to write LEAST(a,b) than CASE WHEN a < b THEN a ELSE b END
if you refer back to the timeline diagram you can confirm how i’ve covered all 4 cases
last step will be the construction of the WHERE clause, as it is very important to pull up only these 4 cases and no others, i.e. we’re only interested in videos which actually overlap ‘2010-07-14’
Thank you very much for all your help. Here is the query again with the WHERE clause how you explained. I also added GROUP BY videoid. Please have a look.
SELECT videoid, SUM(
CASE WHEN firstrecordeddatetime >= '2010-07-14 00:00:00'
THEN UNIX_TIMESTAMP(
LEAST(lastrecordeddatetime -- case 1
,firstrecordeddatetime + INTERVAL 1 DAY)) -- case 3
- UNIX_TIMESTAMP(firstrecordeddatetime)
ELSE UNIX_TIMESTAMP(
LEAST(firstrecordeddatetime + INTERVAL 1 DAY -- case 4
,lastrecordeddatetime)) -- case 2
- UNIX_TIMESTAMP('2010-07-14 00:00:00')
END) AS seconds
FROM useractivity
WHERE lastrecordeddatetime > '2010-07-14 00:00:00' -- rules out case 5
AND firstrecordeddatetime < ('2010-07-14 00:00:00' + INTERVAL 1 DAY) -- rules out case 6
GROUP BY videoid
Assuming that query is correct, if I wanted to get an hourly reading, would I just be able to do D + 1 hour instead of day? D being the start time? Sorry if I am jumping ahead!
yes, of course, D and D+1 are simply two datetimes, and, as i suggested, you should probably be feeding both values in from php (or whatever application language you’re using)
and there’s no jumping ahead, 'cause i think we’re done
OMG OMG OMG i made a couple of huge errors in post #20
(what can i say, i’m sorry, but it was 6am and i hadn’t even had my coffee yet)
it should be like this (changes marked in red) –
SELECT SUM(
CASE WHEN firstrecordeddatetime >= '2010-07-14'
THEN UNIX_TIMESTAMP(
LEAST(lastrecordeddatetime -- case 1
,[COLOR="Red"]'2010-07-14'[/COLOR] + INTERVAL 1 DAY)) -- case 3
- UNIX_TIMESTAMP(firstrecordeddatetime)
ELSE UNIX_TIMESTAMP(
LEAST([COLOR="red"]'2010-07-14'[/COLOR] + INTERVAL 1 DAY -- case 4
,lastrecordeddatetime)) -- case 2
- UNIX_TIMESTAMP('2010-07-14')
END) AS seconds
FROM ...
Ah, LEAST is a new one for me. I read up about it on the MySQL docs. It returns the smallest value out of the given values. What all would we need in the WHERE clause?
Everything is working with the day. The datetime will be inserted by PHP. When I try to edit the query for hours, I am getting incorrect values (even negative values). Here is what the hourly query I did looks like:
SELECT videoid, COUNT(*) AS count, SUM(
CASE WHEN firstrecordeddatetime >= '2010-07-14 00:00:00'
THEN UNIX_TIMESTAMP(
LEAST(lastrecordeddatetime -- case 1
,firstrecordeddatetime + INTERVAL 1 HOUR)) -- case 3
- UNIX_TIMESTAMP(firstrecordeddatetime)
ELSE UNIX_TIMESTAMP(
LEAST(firstrecordeddatetime + INTERVAL 1 HOUR -- case 4
,lastrecordeddatetime)) -- case 2
- UNIX_TIMESTAMP('2010-07-14 00:00:00')
END) AS seconds
FROM useractivity
WHERE lastrecordeddatetime > '2010-07-14 00:00:00' -- rules out case 5
AND firstrecordeddatetime < ('2010-07-14 00:00:00' + INTERVAL 1 HOUR) -- rules out case 6
GROUP BY videoid
presumably the start and end datetimes can be any valid times on the 24-hr clock, yes?
so what you have to do is set up the logic to handle several situations
starts and stops entirely within 2010-07-14
starts before 2010-07-14 and ends within
starts within 2010-07-14 and ends afterwards
starts before 2010-07-14 and ends afterwards
make sure you understand all 4 of those conditions first, yes? and work out the logic for each one in terms of which value to subtract from which value in order to get the duration specifically within 2010-07-14
i’ll do the 4th one for you: the duration is midnight to midnight, i.e. 86400
I understood the diagram. That is exactly what I need. In each case in the diagram, I need to know how many seconds were watched between 2010-07-14 00:00:00 and 2010-07-15 00:00:00.
I think I misunderstood what you wrote. Sorry about that. I think this is what you wanted me to do:
for case no. 1, i want the elapsed time from 2010-07-14 01:00:00 to 2010-07-14 02:00:00 so the duration watched on the 14th would be 1 hour
for case no. 2, i want the elapsed time from 2010-07-13 23:00:00 to 2010-07-14 01:00:00 so the duration watched on the 14th would be 1 hour
for case no. 3, i want the elapsed time from 2010-07-14 23:00:00 to 2010-07-15 01:00:00 so the duration watched on the 14th would be 1 hour
for case no. 4, i want the elapsed time from 2010-07-13 23:00:00 to 2010-07-15 01:00:00 so the duration watched on the 14th would be 24 hours
but that’s what i was trying to get you to do – think about the 4 cases
okay, here is a template for you:[indent]for case no. ___, i want the elapsed time from ________ to _______[/indent]as i mentioned earlier, i already did case no. 4 for you --[indent]for case no. 4, i want the elapsed time from midnight morning on the 14th to midnight evening on the 14th[/indent]you need to figure out the other three, and i’ll take care of putting the SQL together for all 4 cases
okay, let’s go back to the very beginning of the problem
I have an entry like this:
1
2010-07-14 01:00:00
2010-07-15 03:00:00
Lets say I want to know how much time in seconds video 1 was watched on 2010-07-14.
which of the 4 cases is this? no, don’t guess, i will tell you – it’s case 3
3. starts within 2010-07-14 and ends afterwards
correct? i mean, if you can’t follow along now, then you might want to consider outsourcing your programming
so video 1 starts within 2010-07-14 (specifically, at 1 am), and ends afterwards (specifically, on the next day)
so the time difference for the 14th is from 01:00 until midnight – you can’t count anything before 1:00 on the 14th (because the watching hasn’t started yet), and you can’t count anything after midnight on the 14th/15th (because then it’s no longer in the 14th)
I was thinking something like using ‘OR’ in between all the cases? I am not sure I understand what you mean by end-start for the first one. End of the day - start of the day? Or end time of the video - start time?
in the WHERE clause, all you need to do is rule out cases 5 and 6 in the following diagram –
2010-07-14[COLOR="red"](D)[/COLOR] 2010-07-15[COLOR="red"](D+1)[/COLOR]
00:00 00:00
| |
case 1 | [COLOR="Red"]F-------L[/COLOR] |
| |
case 2 F----[COLOR="Red"]x-----L[/COLOR] |
| |
case 3 | [COLOR="Red"]F--------o[/COLOR]-----L
| |
case 4 F-----[COLOR="Red"]x---------------o[/COLOR]-----L
| |
case 5 F-----L | |
| |
case 6 | | F-------L
the WHERE clause is now easy to write –
WHERE L > D -- rules out case 5
AND F < D+1 -- rules out case 6
note that both conditions must be true, i.e. it has to be AND not OR
now all that remains for you to do is substitute for F and L and D, finish writing the entire query, and alternatively substituting actual date values (using php or whatever you’re using) for D and D+1
i’d be interested in seeing the final query when you’re done