SitePoint Sponsor

User Tag List

Results 1 to 10 of 10
  1. #1
    Always A Novice bronze trophy
    K. Wolfe's Avatar
    Join Date
    Nov 2003
    Location
    Columbus, OH
    Posts
    2,182
    Mentioned
    65 Post(s)
    Tagged
    2 Thread(s)

    Indexes and date field on massive table

    So I'll be creating a clustered index on a massive table (cust_id, action_date). My primary goal will be to keep a query such as this running at its best:

    Code:
    select
    *
    from
    tbl
    where
    concat(month(action_date), year(action_date) = 112
    Should I maybe think about creating a column with MMYY in it and index that?

  2. #2
    SitePoint Enthusiast
    Join Date
    Aug 2011
    Location
    OH, USA
    Posts
    72
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    I think MMYY column mayl help for search, but clustered index should be unique and it is not looks like MMYY is unique...
    you can set database partition by MMYY or action_date and that help significantly

  3. #3
    SitePoint Guru bronze trophy
    Join Date
    Dec 2003
    Location
    Poland
    Posts
    930
    Mentioned
    7 Post(s)
    Tagged
    0 Thread(s)
    The query above will be very slow because using any kind of functions will prevent the index from being used. Better search for date ranges (generated by the client code) like this:

    Code MySQL:
    select
    *
    from
    tbl
    where
    action_date BETWEEN('2012-01-01', '2012-01-31')

    Also, the index (cust_id, action_date) will not work here because action_date must be the first column of the index to work if you search by it. So you may change it to (action_date, cust_id).

  4. #4
    Always A Novice bronze trophy
    K. Wolfe's Avatar
    Join Date
    Nov 2003
    Location
    Columbus, OH
    Posts
    2,182
    Mentioned
    65 Post(s)
    Tagged
    2 Thread(s)
    Sorry if the example was misleading, ill ALWAYS be doing a join on cust_id and running aggregate functions per cust_id, I will want the order to be cust_id, then action_date

  5. #5
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,251
    Mentioned
    59 Post(s)
    Tagged
    3 Thread(s)
    Quote Originally Posted by K. Wolfe View Post
    Should I maybe think about creating a column with MMYY in it and index that?
    no, no, no

    declare an index on action_date

    then return a month's worth of data with an open upper end date range --
    Code:
    WHERE action_date >= '2012-01-01'
      AND action_date  < '2012-02-01'
    this is best practice, because it works on both date and datetime columns

    btw, lemon juice, BETWEEN is not a function
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  6. #6
    SitePoint Guru bronze trophy
    Join Date
    Dec 2003
    Location
    Poland
    Posts
    930
    Mentioned
    7 Post(s)
    Tagged
    0 Thread(s)
    Sorry, my typo, I often make this BETWEEN mistake when I write my own SQL!

    Quote Originally Posted by r937 View Post
    Code:
    WHERE action_date >= '2012-01-01'
      AND action_date  < '2012-02-01'
    this is best practice, because it works on both date and datetime columns
    Is comparing date to datetime documented somewhere, how it works? I used to do this a few years ago and one day MySQL version changed and comparing dates to datetime also changed so I decided it's best not to rely on it and in such cases I always do this:
    Code:
    WHERE action_date BETWEEN '2012-01-01 00:00:00' AND '2012-01-31 23:59:59'

  7. #7
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,251
    Mentioned
    59 Post(s)
    Tagged
    3 Thread(s)
    "comparing dates to datetimes also changed" -- i would like to see proof of this

    as far as i know, and i have tested it often, what i posted works perfectly for both dates and datetimes

    also, your approach using BETWEEN has two inconvenient drawbacks -- first, you will always have to do a leap year calculation (for the last day of the month), and second, you might miss some rows between 23:59:59 and midnight (assuming mysql supports subseconds)


    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  8. #8
    SitePoint Guru bronze trophy
    Join Date
    Dec 2003
    Location
    Poland
    Posts
    930
    Mentioned
    7 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by r937 View Post
    "comparing dates to datetimes also changed" -- i would like to see proof of this

    as far as i know, and i have tested it often, what i posted works perfectly for both dates and datetimes
    This was happening very long time ago so I don't think I can prove anything - the difference might have been between 5.0 and 4.1 or even 4.0, one was on a Unix hosting the other on Windows. Things have changed so for any practical purposes I think what you say is true nowadays.

    Quote Originally Posted by r937 View Post
    also, your approach using BETWEEN has two inconvenient drawbacks
    They are not so inconvenient

    Quote Originally Posted by r937 View Post
    -- first, you will always have to do a leap year calculation (for the last day of the month),
    MySQL will allow me to do this:
    Code:
    WHERE action_date BETWEEN '2012-02-01 00:00:00' AND '2012-02-31 23:59:59'
    Quote Originally Posted by r937 View Post
    and second, you might miss some rows between 23:59:59 and midnight (assuming mysql supports subseconds)
    MySQL 5.6.4 seems to support microseconds - if I have the very unlikely need for them I might do this:
    Code:
    WHERE action_date BETWEEN '2012-01-01 00:00:00' AND '2012-01-31 23:59:59.999999'
    But I'm giving you credit for better elegance of your way of comparing

  9. #9
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,251
    Mentioned
    59 Post(s)
    Tagged
    3 Thread(s)
    Quote Originally Posted by Lemon Juice View Post
    MySQL will allow me to do this:
    Code:
    WHERE action_date BETWEEN '2012-02-01 00:00:00' AND '2012-02-31 23:59:59'
    EEEWWWWW

    just because mysql allows it, doesn't mean it's something you should do

    seriously? you would actually do that? the 31st of february?

    please don't say yes, because... well, just because
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  10. #10
    SitePoint Guru bronze trophy
    Join Date
    Dec 2003
    Location
    Poland
    Posts
    930
    Mentioned
    7 Post(s)
    Tagged
    0 Thread(s)
    Well, you provoked me to it so I found out a working solution, however non-standard I've never used 31st of Feb. because I've never had the problem with calculating leap years


Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •