SitePoint Sponsor

User Tag List

Results 1 to 5 of 5
  1. #1
    SitePoint Evangelist hessodreamy's Avatar
    Join Date
    Apr 2005
    Location
    uk
    Posts
    525
    Mentioned
    3 Post(s)
    Tagged
    0 Thread(s)

    Most efficient way to check if value(s) exist in a table?

    Having a brain block day, hence the noobish question.

    Say I wanted to know if a value existed within a table, I'd usually be inclined to use a count(*):
    Code:
    SELECT COUNT(*) FROM myTable WHERE myValue='x'
    (Or even SELECT COUNT(*)>0 if I was bothered about getting a boolean result).

    Which is all well and good but, in cases where I don't care how many there are, just that there is at least one, is there a more efficient way? A way that would stop a table scan as soon as it found the first value, return true, and not need to scan any more?

    I did consider using EXISTS and a subquery:
    Code:
    SELECT EXISTS (
    SELECT * FROM myTable WHERE myValue='x'
    )
    But an explain suggested that this was making the process more complex, not simpler.

    I thought about using a LIMIT 1:
    Code:
    SELECT myValue FROM myTable where myValue='x' LIMIT 1
    But it looks the same as using a COUNT.

    Is there a more efficient way?

    As it happens I've got an index on this column so I'd not be expecting much of a difference (but might do with massive tables), but the question was still bugging me.

  2. #2
    From Italy with love silver trophybronze trophy
    guido2004's Avatar
    Join Date
    Sep 2004
    Posts
    9,506
    Mentioned
    163 Post(s)
    Tagged
    4 Thread(s)
    LIMIT 1 isn't the same as doing a COUNT, because it stops when it finds the first value, just like you want.

  3. #3
    SitePoint Evangelist hessodreamy's Avatar
    Join Date
    Apr 2005
    Location
    uk
    Posts
    525
    Mentioned
    3 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by guido2004 View Post
    LIMIT 1 isn't the same as doing a COUNT, because it stops when it finds the first value, just like you want.
    That's what I thought, but the EXPLAIN didn't seem to bear that out:

    Code:
    EXPLAIN SELECT COUNT(*)>0 FROM myTable WHERE myValue = 'x'
    Code:
        id  select_type  table            type    possible_keys                          key         key_len  ref       rows  Extra        
    ------  -----------  ---------------  ------  -------------------------------------  ----------  -------  ------  ------  -------------
         1  SIMPLE       myTable          ref     myValue                                myValue     253      const        1  Using where
    Code:
    EXPLAIN SELECT * FROM myTable WHERE myValue = 'x' LIMIT 1
    Code:
        id  select_type  table            type    possible_keys                          key         key_len  ref       rows  Extra        
    ------  -----------  ---------------  ------  -------------------------------------  ----------  -------  ------  ------  -------------
         1  SIMPLE       myTable          ref     myValue                               myValue      253      const        1  Using where
    The process seems to be the same in each case. But that could be my limited understanding of the output of EXPLAIN.

    In theory, though, you reckon a LIMIT 1 would run faster than COUNT(*) on massive datasets? (EDIT: it seems to, but not by much, even on tables with 65,000 rows)

  4. #4
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,336
    Mentioned
    63 Post(s)
    Tagged
    3 Thread(s)
    your index is 253 bytes??

    the output is the same, yes, but the process is still different

    in the first query, the actual count is obtained, which requires scanning the index (or other such magic), whereas in the second, a single index entry is retrieved
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  5. #5
    SitePoint Evangelist hessodreamy's Avatar
    Join Date
    Apr 2005
    Location
    uk
    Posts
    525
    Mentioned
    3 Post(s)
    Tagged
    0 Thread(s)
    I get that the LIMIT approach would work in theory. I also see that it seems to make very little difference, even on large tables (when I wrote 65,000 rows, I actually meant 65,000,000).
    So, in the absence of better ideas, I'll count that as the best way. Though in reality I'll probably just carry on using COUNT(*)!
    Cheers.


Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •