SitePoint Sponsor

User Tag List

Results 1 to 4 of 4
  1. #1
    SitePoint Guru quenting's Avatar
    Join Date
    Dec 2002
    Location
    Switzerland
    Posts
    735
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    optimizing select distinct ?

    I'm doing a select distinct on a varchar(15) field in tables with few columns but that can have up to several million records.
    The query typically takes around 15 secs to run on the clean table (only a primary key). I tried adding an index on the column, but then the query takes up to 23 secs ??
    How can I optimize a select distinct, if not by using an index ?
    The largest message boards on the web !
    unblog.fr, hosting 700000 french blogs

  2. #2
    SitePoint Wizard HarryR's Avatar
    Join Date
    Dec 2004
    Location
    London, UK
    Posts
    1,376
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    If your open to possibly changing your database model you could consider this:

    Instead of having a varchar(15) column, you could consider having another table which holds the distinct entries, and linking them with a primary/foreign key (with an index on the foreign key).

    How this will effect your data all depends on the variety of entries you have for that column, a 4 byte comparison is always going to be faster than a 15 byte comparison etc.

    I've used this to speed up a database which holds around 2 million records on several different columns, managed to get from 2 minute queries!, down to under a second per query.

    I hope this helps, but if it doesn't try experementing with your data,
    Regards.

  3. #3
    SitePoint Guru puco's Avatar
    Join Date
    Feb 2005
    Location
    Slovakia
    Posts
    785
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    What database server are you using? Are you trying to optimize the following query:
    select distinct col from tab?
    Martin Pernecky

  4. #4
    SitePoint Guru quenting's Avatar
    Join Date
    Dec 2002
    Location
    Switzerland
    Posts
    735
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    mysql 3.23, and yes that's a query as simple as this.

    Thanks harry for your feedback. I was hoping for something that would not impact the data model. Generating a different table might be as resource consuming as running the query itself.
    The largest message boards on the web !
    unblog.fr, hosting 700000 french blogs


Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •