SitePoint Sponsor

User Tag List

Page 1 of 3 123 LastLast
Results 1 to 25 of 59
  1. #1
    SitePoint Wizard silver trophybronze trophy asp_funda's Avatar
    Join Date
    Jun 2003
    Location
    ether
    Posts
    4,497
    Mentioned
    1 Post(s)
    Tagged
    0 Thread(s)

    Cool Looping through 50000+ records?? any efficient way?

    hi
    I've a system where users are assigned 8 character IDs like 'ABCD1234'. These IDs are randomly generated & then its checked whether the ID generated is not already assigned to a user. If its already assigned, then a new ID is generated again & then its checked again & so on till an un-used ID is found.
    I've following function to do this
    PHP Code:
    function getUniqueID() {
        global 
    $oDB;
        
    $isIDUnique false;    //set the loop terminator to false
        
    while($isIDUnique!=true) {
            
    $newID generateID();    //generate the 8-char ID
            //check if ID is assigned to someone
            
    $chkID $oDB->getSingleVar("SELECT usr_name FROM tbl_users WHERE usr_id={$newID} LIMIT 0, 1");
            if(empty(
    $chkID)) {
                
    //ID is not assigned to anyone
                
    $isIDUnique true;    //Break the loop
            
    } else {
                
    //ID is assigned to someone
                
    $isIDUnique false;    //Let the loop continue
            
    }
        }
        return 
    $newID

    Now the problem is that there are more than 50000 users in the table, so if a duplicate ID is generated(quite probable after 20000 records), the loop will run for as many times & hit the DB that many times as well. Now this can be quite a big problem if the loop is running for, say, 10000 times & hitting the DB that many times!!

    Can anyone suggest any improvements? Thanks for any help!!
    Our lives teach us who we are.
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    Me - Photo Blog - Personal Blog - Dev Blog
    iG:Syntax Hiliter -- Colourize your code in WordPress!!

  2. #2
    SitePoint Wizard stereofrog's Avatar
    Join Date
    Apr 2004
    Location
    germany
    Posts
    4,324
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Many possibilities

    - use auto-incement IDs and generate alpha-num IDs from those

    - in the code above, COUNT() query should be faster (assuming you have an index on usr_id)
    Code:
    SELECT COUNT(*) FROM tbl_users WHERE usr_id={$newID}
    - generate several (say 10) IDs and then check them all at once
    Code:
    $ids = implode(",", $array_of_ids)
    SELECT usr_id FROM tbl_users WHERE usr_id IN $ids
    while ... fetch
        remove usr_id from array_of_ids
    array_of_ids contains only available IDs
    - make usr_id UNIQUE KEY and repeat insert until you get no error:
    Code:
    do {
       $id = generateID();
       query("INSERT .... usr_id) VALUES (...... $id)")
    } while(mysql_errno());

  3. #3
    SitePoint Addict
    Join Date
    May 2005
    Posts
    255
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    I'd either:

    a.) Use an auto-incrimenting field. If you don't want users to be able to "guess" user ids, just do a hash on it, i.e. SELECT md5(userid) ...

    b.) generate a GUID instead of that hash. uniqid() should be sufficient for this sort of thing.

    Also, if you have to resort to looking it up that way, 50,000 iterations isn't that big of a deal (your'e doing this at registration time, I assume?), if the field is indexed.

    My personal preference would be for a. Method b is a waste of storage. An auto incrementing INT field will hold billions of values and only requires 4 bytes of storage per record. your 8-digit varchar requires 9 bytes per record (actually, it might even be 17, but i don't recall mysql's character storage size off the top of my head).

  4. #4
    SitePoint Zealot
    Join Date
    Jul 2005
    Location
    Venlo, the Netherlands
    Posts
    141
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Out of interest, how would you generate an alpha-num?

  5. #5
    SitePoint Zealot
    Join Date
    Jul 2005
    Location
    Venlo, the Netherlands
    Posts
    141
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by Etnu
    I'd either:
    a.) Use an auto-incrimenting field. If you don't want users to be able to "guess" user ids, just do a hash on it, i.e. SELECT md5(userid) ...
    MD5 is not an option Ednu, since a md5 hash isn't unique: multiple inputs can all have the same md5 hash

  6. #6
    SitePoint Wizard stereofrog's Avatar
    Join Date
    Apr 2004
    Location
    germany
    Posts
    4,324
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by Vennie
    Out of interest, how would you generate an alpha-num?
    base_convert($numeric_id, 10, 32);

    MD5 is not an option Ednu, since a md5 hash isn't unique: multiple inputs can all have the same md5 hash
    I think, md5s are unique for first ~100,000,000 natural numbers (check it if you have time

  7. #7
    SitePoint Addict
    Join Date
    May 2005
    Posts
    255
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by Vennie
    MD5 is not an option Ednu, since a md5 hash isn't unique: multiple inputs can all have the same md5 hash
    There are no collisions on any hash between 0 and 4294967295 in the md5 hash algorithm. You can test this for yourself, though I wouldn't recommend running this on a server that you need to use for the next 10 years or so:

    PHP Code:
    <?php
    for($i 0$i 4294967295$i++)
    {
        
    $MD5 md5($i);
        echo 
    $MD5."\n";
        for(
    $j $i 1$j 4294967295$j++)
        {
             if(
    $MD5 == md5($j))
             {
                 echo 
    'That was pretty inefficient, wasn\'t it?';
             }
        }
    }
    ?>

  8. #8
    SitePoint Wizard silver trophybronze trophy asp_funda's Avatar
    Join Date
    Jun 2003
    Location
    ether
    Posts
    4,497
    Mentioned
    1 Post(s)
    Tagged
    0 Thread(s)

    Cool

    Quote Originally Posted by stereofrog
    - use auto-incement IDs and generate alpha-num IDs from those
    I know how to use auto_increment but only with Numerical IDs, not with AlphaNum IDs like I'm generating. If you can tell me how, then I think that it'll solve my problem, as the usr_ID here is a PrimaryKey in the table.

    Quote Originally Posted by stereofrog
    - in the code above, COUNT() query should be faster (assuming you have an index on usr_id)
    Code:
    SELECT COUNT(*) FROM tbl_users WHERE usr_id={$newID}
    Will that make a lot of difference? Since I'm getting only 1 var(the username which can be atmost 50 chars).

    Quote Originally Posted by stereofrog
    - generate several (say 10) IDs and then check them all at once
    Code:
    $ids = implode(",", $array_of_ids)
    SELECT usr_id FROM tbl_users WHERE usr_id IN $ids
    while ... fetch
    remove usr_id from array_of_ids
    array_of_ids contains only available IDs
    This is quite possible, but then, the problem is only marginally reduced, since the possibility of generating similar IDs in any consequent batch is quite good. Like suppose I'm checking 100 IDs at a time. The Attempt-1, Attempt-2, Attempt-3 fail, now in the fourth attempt, I can get the same set of IDs as I got in Attempt-1. Ofcourse this can be checked against as I can keep storing the failed IDs in a seperate array so that the new batch can be matched against it, but then, it'll be memory intensive to keep an array with some 20000 or 30000 values & checking against them, chances of the script timing out on the user are quite high, as the script is already running on the timeout limit at 60seconds, it was timing out on the user in the default 30 seconds.

    Quote Originally Posted by stereofrog
    - make usr_id UNIQUE KEY and repeat insert until you get no error:
    Code:
    do {
    $id = generateID();
    query("INSERT .... usr_id) VALUES (...... $id)")
    } while(mysql_errno());
    Are you suggesting improvements or what?? First, the usr_id is a PrimaryKey so no UNIQUE index is required, but the repeat INSERT is worse than repeat SELECT as its more resource intensive!!
    Our lives teach us who we are.
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    Me - Photo Blog - Personal Blog - Dev Blog
    iG:Syntax Hiliter -- Colourize your code in WordPress!!

  9. #9
    SitePoint Wizard silver trophybronze trophy asp_funda's Avatar
    Join Date
    Jun 2003
    Location
    ether
    Posts
    4,497
    Mentioned
    1 Post(s)
    Tagged
    0 Thread(s)

    Cool

    Quote Originally Posted by Etnu
    a.) Use an auto-incrimenting field. If you don't want users to be able to "guess" user ids, just do a hash on it, i.e. SELECT md5(userid) ...
    Well, the IDs aren't hidden from the users, infact the users know the IDs assigned to them, as they need it for various purposes like referrals etc.(don't ask, its just too complicated & I didn't design this system & have no power to override it, in my opinion, its not just right, since why are there usernames if IDs are to be used in their place!!).

    Quote Originally Posted by Etnu
    Also, if you have to resort to looking it up that way, 50,000 iterations isn't that big of a deal (your'e doing this at registration time, I assume?), if the field is indexed.
    Well I don't know what you consider to be a big deal, the field has just a PrimaryKey Index on it, I dunno if another Index(Full-Text??) would help, but hitting the DB 50,000 times repeatedly in space of few seconds is a big deal to me!! And yes, this is happening at the registration time.

    Quote Originally Posted by Etnu
    An auto incrementing INT field will hold billions of values and only requires 4 bytes of storage per record. your 8-digit varchar requires 9 bytes per record (actually, it might even be 17, but i don't recall mysql's character storage size off the top of my head).
    I know that & I prefer it that way as well, an auto-increment with an INT value, but like I said, I didn't design this system, and the alpha-num ID wasn't my idea & I don't have a say in this matter as well, & it can't be changed right now, not with 50000+ users already on board, to say nothing of the fact that I'll not be able to urge the bosses for converting, it would be disastrous!! The storage space is not a problem on the other hand!!
    Our lives teach us who we are.
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    Me - Photo Blog - Personal Blog - Dev Blog
    iG:Syntax Hiliter -- Colourize your code in WordPress!!

  10. #10
    SitePoint Wizard silver trophybronze trophy asp_funda's Avatar
    Join Date
    Jun 2003
    Location
    ether
    Posts
    4,497
    Mentioned
    1 Post(s)
    Tagged
    0 Thread(s)

    Cool

    Quote Originally Posted by Vennie
    Out of interest, how would you generate an alpha-num?
    I use the following function:
    PHP Code:
    function getRandomAlphaNumID() {
        
    $randomAlphabet null;
        
    $randomNumber null;
        for(
    $i=1$i<=4$i++) {
            
    $randomNumber .= rand(0,9);
            
    $randomAlphabet .= chr(rand(65,90));
        }
        return 
    $randomAlphabet.$randomNumber;


    Quote Originally Posted by stereofrog
    base_convert($numeric_id, 10, 32);
    erm, how would you create an alphanum string with that?? What I mean is that this can & will output pure alphabet strings as well, without any numerics, while the IDs in our system needs to be exactly in the same format, ABCD1234.
    Our lives teach us who we are.
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    Me - Photo Blog - Personal Blog - Dev Blog
    iG:Syntax Hiliter -- Colourize your code in WordPress!!

  11. #11
    SitePoint Zealot
    Join Date
    Aug 2005
    Posts
    123
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by asp_funda
    Well I don't know what you consider to be a big deal, the field has just a PrimaryKey Index on it, I dunno if another Index(Full-Text??) would help, but hitting the DB 50,000 times repeatedly in space of few seconds is a big deal to me!! And yes, this is happening at the registration time.
    I'm on my first cup of coffee, so if my questions are stupid, please be nice.

    Your ABCD1234 format has room for 4billion+ unique values. You have 50,000 in use. How many duplicate values are actually being generated on average each time getUniqueID() runs?

    I can't imagine your db is being hit more than 2 or 3 maximum for each getUniqueID run, and I think most of the time it would only need to generate 1 ID. I can't really understand why it would be hit 50,000 times at once with the number of possible ID's and current number of ID's in use.

    It creates a value. It hits the db once to see if it already exists. If it does, it creates another. With so many possible values and so few (relatively) in use, you really shouldn't be creating many duplicates at all at this point. The chances of you creating 50,000 matches in a row out of 4billion+ values is pretty low (reallllly low).

    What's the maximum number of ID's that you think you will actually need as time goes on?

  12. #12
    SitePoint Wizard silver trophybronze trophy asp_funda's Avatar
    Join Date
    Jun 2003
    Location
    ether
    Posts
    4,497
    Mentioned
    1 Post(s)
    Tagged
    0 Thread(s)

    Cool

    Quote Originally Posted by Snortles
    Your ABCD1234 format has room for 4billion+ unique values. You have 50,000 in use. How many duplicate values are actually being generated on average each time getUniqueID() runs?
    Yes, I know the number of combinations possible. I don't know how many duplicates are being generated on average.

    Quote Originally Posted by Snortles
    I can't imagine your db is being hit more than 2 or 3 maximum for each getUniqueID run, and I think most of the time it would only need to generate 1 ID. I can't really understand why it would be hit 50,000 times at once with the number of possible ID's and current number of ID's in use.
    I haven't really tested it, it was just my thoughts on significant increase in time of registration & I tried this on the development server(Win with IIS) by running a script that sent 1 request per 2-seconds & the server froze after running at 90%+ CPU for 20 seconds & fulfilling 17-18 requests!!

    Quote Originally Posted by Snortles
    It creates a value. It hits the db once to see if it already exists. If it does, it creates another. With so many possible values and so few (relatively) in use, you really shouldn't be creating many duplicates at all at this point. The chances of you creating 50,000 matches in a row out of 4billion+ values is pretty low (reallllly low).
    yes, it looks like that in theory but its a bit different in practical.

    Quote Originally Posted by Snortles
    What's the maximum number of ID's that you think you will actually need as time goes on?
    Well, I think that we won't be having any more than 100,000 or 150,000 users, & you can keep 50,000 as reserve on outside chance, its quite unlikely that we will have any more than 200,000 users.

    Quote Originally Posted by Snortles
    I'm on my first cup of coffee, so if my questions are stupid, please be nice.
    See, I was nice, no? But your questions weren't stupid!!
    Our lives teach us who we are.
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    Me - Photo Blog - Personal Blog - Dev Blog
    iG:Syntax Hiliter -- Colourize your code in WordPress!!

  13. #13
    SitePoint Wizard stereofrog's Avatar
    Join Date
    Apr 2004
    Location
    germany
    Posts
    4,324
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by asp_funda
    I know how to use auto_increment but only with Numerical IDs, not with AlphaNum IDs like I'm generating. If you can tell me how, then I think that it'll solve my problem, as the usr_ID here is a PrimaryKey in the table.
    Use auto_increment. Use the following function to generate an ID in your format from a number

    PHP Code:
    function int2alnum($v) {
        static 
    $a '0123456789abcdefghijklmnop';
        static 
    $b 'ABCDEFGHIJKLMNOPQRSTUVWXYZ';
        return
            
    strtr(substr(
                
    '0000' base_convert(intval(substr($v0, -4)), 1026),
                -
    4), $a$b) .
                
    sprintf("%04d"substr($v, -4));

    This works in range from 0 (expressed as AAAA0000) to 4569759999 (=ZZZZ9999).

  14. #14
    Dinah-Moe Humm mudshark's Avatar
    Join Date
    Dec 2003
    Posts
    1,072
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    stereofrog, out of curiousity: what purpose does "static" in front of $a and $b serve? I'm sure there is one...

  15. #15
    SitePoint Wizard stereofrog's Avatar
    Join Date
    Apr 2004
    Location
    germany
    Posts
    4,324
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    static tells php it should allocate memory and initalize the var only once, not everytime you call the function.

  16. #16
    Dinah-Moe Humm mudshark's Avatar
    Join Date
    Dec 2003
    Posts
    1,072
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Aah okay, thanks for clearing that up.

  17. #17
    SitePoint Wizard silver trophybronze trophy asp_funda's Avatar
    Join Date
    Jun 2003
    Location
    ether
    Posts
    4,497
    Mentioned
    1 Post(s)
    Tagged
    0 Thread(s)

    Cool

    Quote Originally Posted by stereofrog
    Use auto_increment. Use the following function to generate an ID in your format from a number

    PHP Code:
    function int2alnum($v) {
        static 
    $a '0123456789abcdefghijklmnop';
        static 
    $b 'ABCDEFGHIJKLMNOPQRSTUVWXYZ';
        return
            
    strtr(substr(
                
    '0000' base_convert(intval(substr($v0, -4)), 1026),
                -
    4), $a$b) .
                
    sprintf("%04d"substr($v, -4));

    This works in range from 0 (expressed as AAAA0000) to 4569759999 (=ZZZZ9999).
    Thanks but I still don't get how to use it!! I mean I will edit the usr_id field in the table to have the auto_increment value, right? Then?
    Our lives teach us who we are.
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    Me - Photo Blog - Personal Blog - Dev Blog
    iG:Syntax Hiliter -- Colourize your code in WordPress!!

  18. #18
    SitePoint Wizard stereofrog's Avatar
    Join Date
    Apr 2004
    Location
    germany
    Posts
    4,324
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    The idea is that you have only auto_increments in DB, thus eliminating the whole problem this topic is about. Whenever you need to show an 'alnum' ID, you just format user's (numeric) ID using the function above, e.g.
    PHP Code:
    print "User ID=" int2alnum($real_user_id); 

  19. #19
    SitePoint Wizard silver trophybronze trophy asp_funda's Avatar
    Join Date
    Jun 2003
    Location
    ether
    Posts
    4,497
    Mentioned
    1 Post(s)
    Tagged
    0 Thread(s)

    Cool

    Quote Originally Posted by stereofrog
    The idea is that you have only auto_increments in DB, thus eliminating the whole problem this topic is about. Whenever you need to show an 'alnum' ID, you just format user's (numeric) ID using the function above, e.g.
    PHP Code:
    print "User ID=" int2alnum($real_user_id); 
    while that's a really good solution, but its a bit late to change the IDs in the system, no? We need to look at some other solution, I'm not going to pull this off, as it'll mean that users won't be able to have the same IDs that they have & thus a number of things will have to be changed to suit this change & this is not possible!!
    Our lives teach us who we are.
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    Me - Photo Blog - Personal Blog - Dev Blog
    iG:Syntax Hiliter -- Colourize your code in WordPress!!

  20. #20
    SitePoint Wizard dreamscape's Avatar
    Join Date
    Aug 2005
    Posts
    1,080
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    How about instead of checking in the db after generating the ID, you select all current IDs from the database into a result array. Then in your ID generator, instead of checking against the database, you check against the result array. That would only be 1 query no matter how many times the ID generator has to try again.

    I've never tested, so I can't be sure, but I would bet that in_array(), even on an array with 50,000 elements, would be faster than query the database 10,000 times

  21. #21
    SitePoint Wizard TheRedDevil's Avatar
    Join Date
    Sep 2004
    Location
    Norway
    Posts
    1,198
    Mentioned
    4 Post(s)
    Tagged
    1 Thread(s)
    Im just wondering why its too late to change the system now?

    You only got approx 50k users, which is not much at all.

    Basically depending on the solution you chose (and you do have alot of solutions for this specifical problem) you can load the current IDs into a new table which only link the ID to its user, then delete the field from the database. Use either the real id field or create a new one.
    Then update the script so it first check for the new id type, if it dont find any match there it check the new "linking" table if it matches the old id type.

    No offence is meant but for me it seems like your dropping all the "easiest" and most "effective" solutions due to you think/feel they wont work due to ... ... ... instead of seeing the most obvious solution to fix the problem/mess (whatever you want to call it).

    On the other hand, as it has been mentioned I dont really think you got a problem at all, just improve your query and ID generator slightly and you will be good to go.

    This is quite possible, but then, the problem is only marginally reduced, since the possibility of generating similar IDs in any consequent batch is quite good. Like suppose I'm checking 100 IDs at a time. The Attempt-1, Attempt-2, Attempt-3 fail, now in the fourth attempt, I can get the same set of IDs as I got in Attempt-1. Ofcourse this can be checked against as I can keep storing the failed IDs in a seperate array so that the new batch can be matched against it, but then, it'll be memory intensive to keep an array with some 20000 or 30000 values & checking against them, chances of the script timing out on the user are quite high, as the script is already running on the timeout limit at 60seconds, it was timing out on the user in the default 30 seconds.
    Actually if you sit down and consider this solution for 2 minutes Im sure you will notice how stupid you sound in that answer...

    As your thinking on it Im sure it will hit you, that why dont I just make sure the query is not run more than x times?

    Just rewrite your generator and you are ready to go, in 98% of the cases the query will only be run once with twenty matches in it. And of course the function can be even more optimized and more ids can be added to the array. Btw please note that I have not tested the function below, so there might be a syntax error.

    PHP Code:
    function RandomID($id_array = array()) {
        
    $id_temp '';
        
    $id_real '(0 ';

        for(
    $nr=1;$nr <= 20;++$nr
            {
        
    $id_temp chr(rand(65,90)).chr(rand(65,90)).chr(rand(65,90)).chr(rand(65,90)).rand(0,9).rand(0,9).rand(0,9).rand(0,9);
                
            if (
    in_array($id_temp ,$id_array)) --$nr;
            else 
    $id_array[] = $id_temp;
            }

        foreach(
    $id_array as $key => $value)
            {
            
    $id_real .= ", '".$value."'";
            }

        
    $id_real .= ')';

      return array(
    $id_real$id_array);
        } 
    And just incase your trying to figure out which of the ids that are in use and which are not, just use the $id_array again to remove the ids found. If every id is found, just include the array to the function again to exclude those ids.

    And just a side note, if your script manage to timeout on those few connections you might want to beat up whoever made the database and did the optimalizing or the guy who wrote the php script. Ive made update scripts for big matrix sites which update up to a few million rows every hour, and that script finishes that job in a few minutes (0-3min depending on the update needed) i.e. first finding out what rows need to be updated, doing the updates in a temp table, then moving those updates to the real table from the temp one.

  22. #22
    SitePoint Zealot
    Join Date
    Aug 2005
    Posts
    123
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    If it was me, before I went rewriting functions or redesigning my database, I would make sure I knew where the problem is. For me, that would mean actually logging how many times duplicate ID's are generated each time getUniqueID is being run.

    You say it's being run 10,000 times each time you try to create a new ID but I don't see how that's possible. If it is only running once or twice (like I think it should be) then the problem isn't createUniqueID, it's somewhere else.

    If it is running 10,000 times, then you aren't creating ID's in the range you think you are. Instead of creating ID's in the range of AAAA0000 - ZZZZ9999 it's creating them in a smaller range that's filling up with those 50,000 ID's you already have.

  23. #23
    SitePoint Evangelist ldivinag's Avatar
    Join Date
    Jan 2005
    Location
    N37 33* W122 3*
    Posts
    414
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    no one's brought up:

    is that an index?

    if no, then index the table on that column. it wont have to search every row, even if it not in the table...

    you can find out how many rows are being "looked at" by using the EXPLAIN command -- well at least in MySQL.

    dont know about the other DBs out there.

    so if you

    Code:
    EXPLAIN SELECT usr_name FROM tbl_users WHERE usr_id="aaaa1234" LIMIT 0, 1
    it will display how many rows were used.

    i had a similar issue. searches took like 20 seconds. really unacceptable. so i decided to EXPLAIN the sql statement. sure enough, it went through the entire table.

    then i looked at the table schema. no index on the column i was searching on.

    made a simple ALTER TABLE...

    searches dropped down to 2-3 secs...

    doh...
    leo d.

  24. #24
    SitePoint Addict
    Join Date
    May 2005
    Posts
    255
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by asp_funda
    Well, the IDs aren't hidden from the users, infact the users know the IDs assigned to them, as they need it for various purposes like referrals etc.(don't ask, its just too complicated & I didn't design this system & have no power to override it, in my opinion, its not just right, since why are there usernames if IDs are to be used in their place!!).

    Well I don't know what you consider to be a big deal, the field has just a PrimaryKey Index on it, I dunno if another Index(Full-Text??) would help, but hitting the DB 50,000 times repeatedly in space of few seconds is a big deal to me!! And yes, this is happening at the registration time.

    I know that & I prefer it that way as well, an auto-increment with an INT value, but like I said, I didn't design this system, and the alpha-num ID wasn't my idea & I don't have a say in this matter as well, & it can't be changed right now, not with 50000+ users already on board, to say nothing of the fact that I'll not be able to urge the bosses for converting, it would be disastrous!! The storage space is not a problem on the other hand!!
    Change it. There's really very little reason why you can't. The things listed here will help, but ultimately you're going to wind up dealing with this in other areas for some time to come. I'd even advocate changing it if it was 3rd party software -- this is just bad design. You don't make your primary key a field that you can't guarantee to be unique!

  25. #25
    SitePoint Wizard silver trophybronze trophy asp_funda's Avatar
    Join Date
    Jun 2003
    Location
    ether
    Posts
    4,497
    Mentioned
    1 Post(s)
    Tagged
    0 Thread(s)

    Cool

    Quote Originally Posted by dreamscape
    How about instead of checking in the db after generating the ID, you select all current IDs from the database into a result array. Then in your ID generator, instead of checking against the database, you check against the result array. That would only be 1 query no matter how many times the ID generator has to try again.
    Yes, that would be quite good as well, but then I thought of it as well & the thing is that in case of a new signups, you'll miss the new IDs in the array & that's risky!! Infact we are getting some signups so fast(like 3-4 signups simultaneously in 1 sec) that sometimes its a problem assigning IDs & I'm thinking of moving the new signups to a seperate table to act as a buffer & then let a cron run every 2 minutes or so assigning IDs to everyone in the buffer table and placing them in the users table(& emptying the buffer). The users can ofcourse be notified of their new IDs by email!!

    Quote Originally Posted by dreamscape
    I would bet that in_array(), even on an array with 50,000 elements, would be faster than query the database 10,000 times
    yeah, quite possibly so, since the DB read will be like reading a file & an array is stored in the memory, so it'll be faster.
    Our lives teach us who we are.
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    Me - Photo Blog - Personal Blog - Dev Blog
    iG:Syntax Hiliter -- Colourize your code in WordPress!!


Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •