SitePoint Sponsor

User Tag List

Results 1 to 10 of 10
  1. #1
    SitePoint Member
    Join Date
    Nov 2008
    Posts
    2
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    Limit Select Count() :Help Need

    Hi I trying to use count() statement to find duplicate content before insert new data into table... to do that I am using the following statement.

    $query = "SELECT COUNT(*) AS total FROM table where picturelink='$picturelink[$i]'";

    $result = mysql_query($query, $db);
    $resrow = mysql_fetch_row($result);
    $duplicate_check[$i] = $resrow[0];

    As my database are getting bigger the select count statement taking longer time than I accepted ... and taking high cpu load.. So I planed to use

    Limit clause with id desc .. as I wanted if the data is not duplicate withing last 5000 items , it could be proceed

    So I write down following

    $query = "SELECT COUNT(*) AS total FROM table where picturelink='$picturelink[$i]' Order By Id Desc Limit 0, 5000";

    $result = mysql_query($query, $db);
    $resrow = mysql_fetch_row($result);
    $duplicate_check[$i] = $resrow[0];

    But Limit clause having no effect on query... So please advice How I can use limit clause in select count

  2. #2
    SitePoint Evangelist ldivinag's Avatar
    Join Date
    Jan 2005
    Location
    N37 33* W122 3*
    Posts
    414
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    1. are you using the right indices? use the EXPLAIN command...

    i.e.: EXPLAIN SELECT COUNT(*)...

    2. do you have a PK? if yes, try SELECT COUNT(<pk field name>)...
    leo d.

  3. #3
    From Italy with love silver trophybronze trophy
    guido2004's Avatar
    Join Date
    Sep 2004
    Posts
    9,496
    Mentioned
    163 Post(s)
    Tagged
    4 Thread(s)
    Quote Originally Posted by desis View Post
    Hi I trying to use count() statement to find duplicate content before insert new data into table [...] as I wanted if the data is not duplicate withing last 5000 items
    You only want to insert the new data if the picturelink doesn't exist yet? In that case there's no need to use the COUNT. You can do a simple SELECT (without the COUNT), which should be much quicker.

    PHP Code:
    $query "SELECT picturelink FROM table where picturelink = '$picturelink[$i]' Order By Id Desc";
    $result mysql_query($query$db);
    $duplicate_check[$i] = mysql_num_rows($result); 
    Btw, LIMIT 0, 5000 doesn't limit the query to the first 5000 rows in the table, but to the first 5000 rows that match the query criteria.
    You can do it with a subquery (don't know if there's a better/faster method):

    PHP Code:
    $query 
      
    "SELECT picturelink 
       FROM 
         (SELECT picturelink 
          FROM table 
          ORDER BY Id DESC
          LIMIT 0, 5000) AS a
       WHERE picturelink = '
    $picturelink[$i]'";
    $result mysql_query($query$db);
    $duplicate_check[$i] = mysql_num_rows($result); 

  4. #4
    SitePoint Wizard
    Join Date
    Mar 2002
    Location
    Bristol, UK
    Posts
    2,240
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Desis, have you tried using INSERT ... ON DUPLICATE KEY UPDATE to avoid duplicate values? The picturelink field would need to be set as a key field, of course

    Quote Originally Posted by guido2004 View Post
    You only want to insert the new data if the picturelink doesn't exist yet? In that case there's no need to use the COUNT. You can do a simple SELECT (without the COUNT), which should be much quicker.
    Really? I was always under the impression that you should always use COUNT() unless you actually want to use the data you're counting, which I don't think the OP is.

  5. #5
    From Italy with love silver trophybronze trophy
    guido2004's Avatar
    Join Date
    Sep 2004
    Posts
    9,496
    Mentioned
    163 Post(s)
    Tagged
    4 Thread(s)
    Quote Originally Posted by SJH View Post
    Desis, have you tried using INSERT ... ON DUPLICATE KEY UPDATE to avoid duplicate values? The picturelink field would need to be set as a key field, of course



    Really? I was always under the impression that you should always use COUNT() unless you actually want to use the data you're counting, which I don't think the OP is.
    But if the COUNT is slowing him down (because the query has to work through the whole table before it can return the result), and he isn't interested in the number of rows found, but only if there is a row or not, then maybe a query without COUNT would be faster.

    And since the OP wants to know if there is a duplicate within the last 5000 rows of the table (and not the entire table), the unique key solution doesn't work (I thought about that too).

  6. #6
    SitePoint Wizard
    Join Date
    Mar 2002
    Location
    Bristol, UK
    Posts
    2,240
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by guido2004 View Post
    But if the COUNT is slowing him down (because the query has to work through the whole table before it can return the result), and he isn't interested in the number of rows found, but only if there is a row or not, then maybe a query without COUNT would be faster.
    Any SELECT query, regardless of whether you're using COUNT() or not, would have to "work through the whole table". SELECTing the field itself will be (marginally) slower because it would have to return the value of the column, rather than a number.

  7. #7
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,247
    Mentioned
    59 Post(s)
    Tagged
    3 Thread(s)
    what is the fastest query possible? the one you don't actually run

    in this particular problem, you shouldn't be doing a SELECT at all



    by the way, SELECT COUNT(*) FROM myisamtable does not actually count the rows, but this is a fluke of the myisam table type
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  8. #8
    From Italy with love silver trophybronze trophy
    guido2004's Avatar
    Join Date
    Sep 2004
    Posts
    9,496
    Mentioned
    163 Post(s)
    Tagged
    4 Thread(s)
    Quote Originally Posted by SJH View Post
    Any SELECT query, regardless of whether you're using COUNT() or not, would have to "work through the whole table". SELECTing the field itself will be (marginally) slower because it would have to return the value of the column, rather than a number.
    Hmm, true. But if I limit the result to 1?

    Quote Originally Posted by r937 View Post
    what is the fastest query possible? the one you don't actually run

    in this particular problem, you shouldn't be doing a SELECT at all

    Then how? Since the OP only cares about a duplicate in the last 5000 rows (older rows may contain duplicates), a unique key doesn't do the trick.

  9. #9
    SitePoint Wizard
    Join Date
    Mar 2002
    Location
    Bristol, UK
    Posts
    2,240
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Hmm, true. But if I limit the result to 1?
    It would still have to look for that row. Granted, if the first table row matched the condition, then it would only need to look at one row, but if the only matching row was the last one in the table, the query would have to traverse all the records.

    I think he only brought in the 5000 row limit as a concession because of performance issues. The way I understood his original post was that it would have been nice to seek out duplicates in every row in the table, in which case ON DUPLICATE KEY UPDATE would work fine. Need clarification from OP, I think...

  10. #10
    SitePoint Member
    Join Date
    Nov 2008
    Posts
    2
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Thanks to all for your kind help..what I did I just alter my table and make the picture field unique index and then dump the query blindly...that solved my problem of mysql overloading...becuase dumping and selecting at the same time using more server resourses..thanks again for your help


Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •