Getting Distinct Date from mysql db with timestamp and around half million records

Hi
Wondered if anyone could shed some light on how to go about doing this.
I need to get the distinct date is Sat 3rd June from the DB which is stored as a timestamp and its ID for say the next 7 days from today
ie 4th, 5th, 6th…

There are approx 500,000 records with a mixture of dates from now until several month ahead. I’ve tried using SELECT DISTINCT date but realised the timestamps are all different even though
many are the same date. Also wonder if the DISTINCT is going to be really slow to access even though I have set and index on it.

Is there a better way of doing this ?

Steve

What do you mean " the timestamps are all different…"? Are they NOT in the same field?

There are many here who have sharper SQL skills than me, but I would try simply a “WHERE” clause that names the date(s) you are looking for.

Probably gonna get blasted for this answer, because of the size of the data set, but…

SELECT DISTINCT CAST(your_datetime_field AS DATE) AS thedate FROM…

Hi when I mean timestamps are different, I mean the hours and minutes of each day are different

you won’t get a meaningful id, unless you want the first one or the last one for each date

what do you need the id for?

ok, i’ve not explained this very well. I have a table with around 500,000. These records contain date and order details. I need to display the orders and date for say next seven days like
Sat 3rd June
Order 1
Order 2
Order 3

Sun 4th June
Order 4
Order 5
Order 6

The order date is stored as a timestamp for that particular order time.
I can’t used SELECT DISTINCT orderDate FROM tbl WHERE orderDate>=$today AND orderDate<=$sevenDaysFromNow as the timestamps are not exactly the same as they will have ordered at different times during the day. I can’t think how to split the days and show the orders for that specific day.

Steve

actually, the only ambiguity seems to be whether you want to return dates, or orders :slight_smile:

my question was, why do you want to return any ids?

as for retrieving orders for the next seven days, that’s trivial…

 WHERE orderDate >= UNIX_TIMESTAMP(CURRENT_DATE)
   AND orderDate  < UNIX_TIMESTAMP(CURRENT_DATE + INTERVAL 8 DAY)

Not much of a MySQL man myself, but…

select [whatever], date(orderDate) as days
from tbl
where orderDate between date(now()) and date(now()) + 6
order by date(orderDate) asc

<hr>

I’d also consider this:

If you can afford it, consider making a temp table:

create tbl_orders_ahead
as
select [whatever]
from tbl
where orderDate between date(now()) and date(now()) + 6

It’ll be faster to get reports from the temp table then the full table, when using functions in your query.

Thanks for comments. my table is below, what I’m looking for is to get the distinct orderDates and display each ID which corresponds to that date

ID orderDate
2 1339166700
3 1339249500
4 1339164000
5 1339249500
6 1339254900

select id, date(orderDate) as day
from tbl
where date(orderDate) between date(now()) and date(now()) + 6
order by date(orderDate) asc

okay, i think i (finally) see what you’re looking for

quite frankly, you should do the date conversion and grouping in php

you said you wanted it to print like this –

Sat 3rd June
Order 1
Order 2
Order 3

Sun 4th June
Order 4
Order 5
Order 6

this indicates that you are after the orders, and not distinct dates

i think your first few problem descriptions misled a few people, including me

okay, the query for what you want would be…

SELECT id
     , orderdate
     , other_order_data
  FROM daTable
 WHERE orderDate >= UNIX_TIMESTAMP(CURRENT_DATE)
   AND orderDate  < UNIX_TIMESTAMP(CURRENT_DATE + INTERVAL 8 DAY)
ORDER
    BY orderdate

then when you bring this into php, you have to do “current versus previous” logic on the orderdate, in order to print each new date as a header for the orders for that date

that’s definitely application-side versus database-side logic

UNIX TIMESTAMP… OK!

It really shows I’m not a MySQL fan.

select id, date(orderDate) as day
from
(
select id, orderDate
from tbl
where orderDate between UNIX_TIMESTAMP(CURRENT_DATE) and UNIX_TIMESTAMP(CURRENT_DATE + INTERVAL 6 DAY)
)
order by date(orderDate) asc

Hope you don’t mind Rudy, for stealing the date part.

when you steal the date part, don’t attribute it to me…

because you b0rked it when you changed it to use BETWEEN :slight_smile:

did you understand my recent comments about how the “grouping” should be done in application logic?

Thanks for info, got it sorted

I can’t say I’ve kept up with your recent comments…

But as far as I know there is no performance issue with BETWEEN in MySQL. Then again, I’m not doing much MySQL. So you may be right…

Anyway, what you’re doing is a pre-optimization. Not sure if it’s a scenario fit for all cases. Or for all RDBMSs. Or for all versions of the same RDBMSs.

BETWEEN reads better which is something SQL is intended for. I can agree upon further hints or pre-optimization only on case by case evaluations and after proper testing of each case. Me trying too hard do the job of the RDBMS can also mean me staying in the way of RDBMS doing a proper job. It’s possible MySQL needs that.

I believe the only implication of BETWEEN is that it isn’t “a <= b < c” but is “a <= b <= c” -> hence any order for 12.00am could show up in a list of items for the previous day, rather than its own.

thanks, jake, that’s exactly the problem

made worse because the issue involves a discrepancy of an entire day’s worth of orders, since CURRENT_DATE means 00:00

that, plus the difference between his 6 and my 8 :wink:

So, it’s mathematics and choice of SQL functions, not SQL performance. Not a real problem. I’m sure one can do with BETWEEN what one can do with >= and <, don’t you agree? :wink:

Anyway, I can live with that. As I’ve said, not a MySQL fan. I could do some testings to see if maybe I got it right but…

Also, I interpreted “next 7 days from today” as 7 days total, which means today + another 6 days. If I got it wrong, it’s not really SQL relevant, is it? :wink:

No one was questioning performance or SQL, just merely a logic issue.

If you take UNIX_TIMESTAMP of today’s date, it returns 12am this morning (For me today is Monday 4th June). Add 6 days, you get Sunday 10th June - but that morning. So with BETWEEN UNIX_TIMESTAMP(CURRENT_DATE) AND UNIX_TIMESTAMP(CURRENT_DATE + INTERVAL 6 DAY), you would get all orders from this morning to Sunday 12am (on the dot), but nothing else for Sunday. So if you were (for example) grouping it by date later on, it would claim that there was 1 order on Sunday when there could be any amount.

With the +8 days, it returns everything up until 12am next Tuesday, but with < rather than <=, you get everything up until (and including) 11.59pm on Monday night, so that’d return everything from this morning until a week tonight.

Of course, those requirements are up to the OP to decide upon.