  1. #1
    SitePoint Guru mwolfe's Avatar
    Join Date
    Mar 2005
    Posts
    912

    Dumbfounded by this problem (turns out random ain't so random)

    Alright, this is the strangest thing I think I've come across in PHP/MySQL yet, and I can't figure out why it's happening.
    I was trying to work out a solution to a post by someone else in this forum earlier today. I came up with one, but I also found another way of accomplishing the same thing that needs only a single query.
    See this post: http://www.sitepoint.com/forums/showthread.php?t=271457

    Now, I was curious how long the "query only" method took compared to the method I had come up with originally, so I generated some sample data for the tables. Here is the table structure:


    mysql> describe exams;
    +----------+-------------+------+-----+---------+----------------+
    | Field    | Type        | Null | Key | Default | Extra          |
    +----------+-------------+------+-----+---------+----------------+
    | id       | int(11)     | NO   | PRI | NULL    | auto_increment |
    | examname | varchar(32) | NO   | MUL |         |                |
    | date     | date        | NO   |     |         |                |
    +----------+-------------+------+-----+---------+----------------+
    3 rows in set (0.03 sec)

    mysql> describe examusername;
    +-----------+-------------+------+-----+---------+----------------+
    | Field     | Type        | Null | Key | Default | Extra          |
    +-----------+-------------+------+-----+---------+----------------+
    | id        | int(12)     | NO   | PRI | NULL    | auto_increment |
    | username  | varchar(32) | NO   | MUL |         |                |
    | examgiven | varchar(32) | NO   |     |         |                |
    | score     | int(4)      | YES  |     | 0       |                |
    +-----------+-------------+------+-----+---------+----------------+
    4 rows in set (0.00 sec)


    In case you haven't read the other thread, these tables keep track of the exams taken by each user. exams holds a list of all exams, and examusername contains the username and examgiven, which is the name of the exam they took.

    Note that I set a unique index on examname in exams, and a unique key on (username, examgiven) in the examusername table.

    Here is the data generator I came up with (first time doing this):

    PHP Code:
    <?php
    $conn = mysql_connect('localhost', 'matt', 'xxx');
    if (!$conn) {
        die("could not connect to database");
    }
    mysql_select_db('exams') or die("Unable to select database");

    $num_users = 1000;
    $num_exams = 1000;
    // fill the exams table with exam_0 .. exam_999
    for ($i = 0; $i < $num_exams; $i++) {
        mysql_query("INSERT INTO exams values(NULL, 'exam_$i', NOW())");
    }

    // insert random (user, exam, score) rows
    for ($i = 1; $i < 40000; $i++) {
        $rand1 = rand(1, $num_users - 1); //random number for user
        $rand2 = rand(1, $num_exams - 1); //random number for exam
        $score = rand(1, 100);
        $q1 = "INSERT INTO examusername (username, examgiven, score) values ('user_$rand1', 'exam_$rand2', $score)";

        $res = mysql_query($q1);
        if (!$res) {
            echo mysql_error();
        }
    }
    echo "database updated";
    ?>

    I thought for sure I was just doing something wrong with the inserts. After a certain number of entries I would get duplicate key errors, which seemed normal: after 10,000 entries or so you might well get a duplicate across two columns with 1000 distinct values each. However, every insert after a certain point was a duplicate. I knew I did not have nearly every combination, because the table had fewer than 40,000 entries and technically I believe there are about a million combinations. So I picked a pair at random (with my brain, not PHP) and tried to insert it, and sure enough it inserted just fine, but when I ran the script I got a duplicate error for every single insert.

    At the time I was only inserting about 1000 entries each time I loaded the page, but I have now changed that to 40,000. The last 3 times I've tried it, the errors start at insert 32,768.

    So right now it dawns on me: I thought I saw that number somewhere (the PHP website). I thought it meant you could only generate random numbers in the range (1, 32768), not that you couldn't generate a random number more than that many times before the sequence starts over. What does PHP use to seed its random function? It's obviously not very random if this is the case, and I wouldn't want to do any statistical analysis using this function. What does it mean to set rand_max? Is that a value in php.ini or something?
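
On the rand_max question: as far as I know it is not a php.ini setting but a limit of the underlying platform, which PHP reports through getrandmax() (and mt_getrandmax() for mt_rand). The PHP manual notes that on some platforms, Windows for example, getrandmax() is only 32767. A minimal sketch for checking this on a given machine, and for counting how many distinct (user, exam) pairs a generator actually produces, might look like the following. The helper names rand_limits and count_distinct_pairs are made up for illustration, and the code assumes a PHP version with type declarations (7.0 or later), which is newer than what the thread was written against.

```php
<?php
// Hypothetical helper: report the largest value each generator can return here.
function rand_limits(): array
{
    return [
        'rand'    => getrandmax(),
        'mt_rand' => mt_getrandmax(),
    ];
}

// Hypothetical helper: draw $draws (user, exam) pairs with the given generator
// and count how many of them are distinct. No database is involved.
function count_distinct_pairs(callable $rng, int $draws, int $num_users, int $num_exams): int
{
    $seen = [];
    for ($i = 0; $i < $draws; $i++) {
        $user = $rng(1, $num_users - 1);
        $exam = $rng(1, $num_exams - 1);
        $seen["$user:$exam"] = true; // array keys act as a set
    }
    return count($seen);
}

$limits = rand_limits();
echo "getrandmax():    {$limits['rand']}\n";
echo "mt_getrandmax(): {$limits['mt_rand']}\n";
echo "distinct pairs from 40000 draws: ",
    count_distinct_pairs('mt_rand', 40000, 1000, 1000), "\n";
```

Passing 'rand' instead of 'mt_rand' as the first argument gives a direct comparison of the two generators on the same machine. If the distinct count stops growing as the number of draws increases, the generator is repeating itself.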

  2. #2
    SitePoint Guru mwolfe's Avatar
    Update: I changed the script to get random values using mt_rand instead of rand. This works much better. I get a lot of duplicates, but I'm not sure I'm getting any more than the expected amount (which would have been a good problem for the final exam I took yesterday). If anyone wants to calculate the 95% CI for the expected number of duplicate keys from the 80,000th insert to the 120,000th insert, let me know, because I probably couldn't do it.
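
For the expected number of duplicates (not the confidence interval, which needs more work), here is a rough sketch. It assumes every draw is uniform and independent, that "insert" means an attempted insert rather than a successful one, and that there are 999 × 999 possible pairs because the script calls rand(1, 999) for both the user and the exam. Under those assumptions the expected number of distinct pairs after n draws from N combinations is N(1 − (1 − 1/N)^n), and the expected number of duplicates is n minus that. The function name expected_duplicates is made up for this example.

```php
<?php
// Expected number of duplicate draws after $draws uniform, independent draws
// from $combinations equally likely values.
function expected_duplicates(int $draws, int $combinations): float
{
    // E[distinct] = N * (1 - (1 - 1/N)^n)
    $expected_distinct = $combinations * (1.0 - pow(1.0 - 1.0 / $combinations, $draws));
    return $draws - $expected_distinct;
}

// Assumption: 999 possible users times 999 possible exams.
$combinations = 999 * 999;

// Duplicates expected among draws 80,001 through 120,000.
$between = expected_duplicates(120000, $combinations)
         - expected_duplicates(80000, $combinations);
printf("expected duplicates between draw 80,000 and 120,000: %.0f\n", $between);
```

Comparing this figure against the number of duplicate key errors the script reports for the same range gives a quick sanity check on the generator.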

  3. #3
    SitePoint Guru mwolfe's Avatar
    Since my original problem has been fixed, could anyone help me understand what is going on with my benchmarks? I wonder if it has to do with how PHP processes a page.
    I save the time at the beginning of the page using $time = microtime(); then at the end of the page I do $finish_time = microtime() - $time; and echo that.
    However, my results seem so sporadic. One time it will say .01 seconds, then .5 seconds, then .34 seconds, then -.03 seconds, etc., and I'm not hitting the back button or anything like that. It just doesn't seem right. I'm wondering if anyone has ideas about this as well. I added mysql_free_result($result) after I was finished with the queries to make sure that MySQL wasn't "caching" the results or something, not that it even does such a thing, but it usually seems like I get quicker results the second time around.
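
One likely cause of the erratic and negative timings, though nothing in the thread confirms it: called without arguments, microtime() returns a string of the form "msec sec", so subtracting two such strings only compares the fractional parts and ignores the whole seconds. Passing true (available since PHP 5) makes it return a float instead. A minimal sketch, with elapsed_seconds as a made-up helper name and usleep standing in for the real queries:

```php
<?php
// Hypothetical helper: run $work and return the wall-clock time it took, in seconds.
function elapsed_seconds(callable $work): float
{
    $start = microtime(true); // float seconds since the Unix epoch
    $work();
    return microtime(true) - $start;
}

// Placeholder workload; in the thread this would be the queries being compared.
$elapsed = elapsed_seconds(function () {
    usleep(20000); // sleep for roughly 20 ms
});
printf("page took %.4f seconds\n", $elapsed);
```

Timing each method several times and comparing averages should smooth out the remaining run-to-run variation.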

