SitePoint Sponsor

User Tag List

Results 1 to 8 of 8
  1. #1
    SitePoint Zealot Xia's Avatar
    Join Date
    Aug 2003
    Location
    Belgium
    Posts
    189
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    Ordering by Rating

    Hey,

    I have two columns in a item table, rating and numRating

    rating is the sum of all ratings, one rating can range from 1 to 5
    numRating is the number of total ratings

    Now, how can I order this EFFECTIVELY by rating?

    Just doing an order by numRating, or rating won't do the trick since if 1/5 was rated 40 times that'd be at the top for an ORDER BY numRating query, similar for the other ORDER BY clause.

    So how can I do this?

    Thanks!

  2. #2
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,216
    Mentioned
    58 Post(s)
    Tagged
    3 Thread(s)
    something like the true Bayesian estimate formula?

    here's an example: http://www.wowwebdesigns.com/formula.php
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  3. #3
    SitePoint Zealot Xia's Avatar
    Join Date
    Aug 2003
    Location
    Belgium
    Posts
    189
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Right now I have this query :

    Code:
    SELECT 
     tblitem.itemID, 
     tblitem.itemName, 
     tblauthor.authorName, 
     ( ( tblitem.numberofratings / ( tblitem.numberofratings + 5 ) ) * tblitem.rating + ( 5 / ( tblitem.numberofratings + 5 ) ) * $averageRating ) AS bayesianRating 
    FROM 
     tblitem, 
     tblauthor 
    WHERE 
     tblitem.categoryID = $categoryID AND 
     tblitem.authorID = tblauthor.authorID AND 
     tblitem.live = 1
    ORDER BY 
     bayesianRating 
    LIMIT 
     $start,
     $limit
    Where $averageRating = C

    ( Reference: )
    weighted rank (WR) = (v / (v+m)) * R + (m / (v+m)) * C

    where:
    R = average for the design (mean) = (Rating)
    v = number of votes for the design = (votes)
    m = minimum votes required to be listed in top 25 (currently 6)
    C = the mean vote across dimension
    But, when the nr of ratings is 0, bayesianRating will be qual to $averageRating (c), how can I stop this with this query ?

  4. #4
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,216
    Mentioned
    58 Post(s)
    Tagged
    3 Thread(s)
    don't you have a scripting language?

    how did you get $averageRating? that could only
    come from a GROUP BY query that you must have run
    before this query

    so use your scripting langage to do an if=0 test
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  5. #5
    SitePoint Zealot Xia's Avatar
    Join Date
    Aug 2003
    Location
    Belgium
    Posts
    189
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    No, after that query gets executed bayesianRank will equal $averageRating, and since I use a ORDER BY clause it's impossible to use a scripting language afterwards (without crossing the $start, $limit LIMIT)

    Is it only possible if I do it in two queries? first where numberOfRatings <> 0 and second where numberOfRatings == 0?

  6. #6
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,216
    Mentioned
    58 Post(s)
    Tagged
    3 Thread(s)
    sorry, no idea

    i've never done this myself, you know
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  7. #7
    SitePoint Member
    Join Date
    May 2004
    Location
    TX
    Posts
    3
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    I remember having the same problem about 2 years ago on a music site I was working on. It was still early in my programming days but it works fine. I Just Selected Everything out of the rating table and used PHP to handle the calculations and sorting.

    heres the code I used

    Code:
    $query = "SELECT artists.name, artists_ratings.num_votes, artists_ratings.vote_total FROM artists, artists_ratings Where artists.artist_id = artists_ratings.artist_id";
    $result = mysql_query($query) or die (mysql_error());
    
    $artist_name = array();
    $rating = array();
    $votes = array();
    
    while (list($artist, $num_votes, $vote_total) = mysql_fetch_array($result))
    {
    for ($i=0; $i<count($result); $i++)
    	{
    	$average = $vote_total / $num_votes;
    	$average = round($average, 2);
    
    	array_push ($artist_name, $artist);
    	array_push ($rating, $average);
    	array_push ($votes, $num_votes);
    	}
    }
    
    array_multisort($rating, SORT_DESC, $artist_name, $votes);
    
    //  List the Sorted results //
    for ($x=0; $x<count($artist_name); $x++)
    {
         echo "$artist_name[$x] - $rating[$x] - $votes[$x]<br>";
    }

  8. #8
    SitePoint Zealot Xia's Avatar
    Join Date
    Aug 2003
    Location
    Belgium
    Posts
    189
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    I was able to get it to work using this as part of my query :

    Code:
    $sql = ...
    if ( tblitem.numberOfRatings > 0, 
    ( ( tblitem.numberOfRatings / ( tblitem.numberofratings + 5 ) ) * tblitem.rating + ( 5 / ( tblitem.numberofratings + 5 ) ) * $averageRating ), 
    0 ) 
    AS bayesianRating...;
    So if numberOfRatings = 0 bayesianRating will be 0

    Now however, I find the results awkward:

    - Ratings range from 1 to 10.
    - Average Rating (C) for a particular category is 8.06

    As you can see in the query, 5 is the least amount of votes (m), so if a particular item has less votes, it's rating should be skewed downhill, and if it has more votes, it should go up.

    "(In other words, if a particular design has only a few votes above the minimum required votes to be listed in top 25 (m), the average score is decreased a little if it is above the mean, or increased a little if it is below the mean in accordance with the normal distribution rule of statistics.)"
    I will show you the first 10 results of this category: (this is the bayesian rating) (rounded to two decimals)

    1) 8,54 (rated 3 times)
    2) 8,26 (rated 4 times)
    3) 8,23 (rated 5 times)
    4) 8,16 (rated 9 times
    5) 8,05 (rated 1 times)
    6) 8,05 (rated 1 times)
    7) 8,04 (rated 2 times)
    8) 8,02 (rated 15 times)
    9) 7,85 (rated 3 times)
    10) 7,43 (rated 4 times)

    Here are the actual AVERAGES (R) of these items from the db:
    1) 9,34 (rated 3 times)
    2) 8,5 (rated 4 times)
    3) 8,4 (rated 5 times)
    4) 8,22 (rated 9 times
    5) 8 (rated 1 times)
    6) 8 (rated 1 times)
    7) 8 (rated 2 times)
    8) 8 (rated 15 times)
    9) 7.5 (rated 3 times)
    10) 6.64 (rated 4 times)

    In the case of nr 8, here are the parameters:

    C = 8.06
    R = 8
    v = 15
    m = 5

    weighted rank (WR) = (v / (v+m)) * R + (m / (v+m)) * C

    WR = (15 / 20 ) * 8 + ( 5 / 20 ) * 8.06
    = 8.015

    Changing m to 10 would hopefully better this?

    WR = ( 15 / 25 ) * 8 + ( 10 / 25 ) * 8.06
    = 8.024

    The order of all ten items is still the same (just a slightly higher number) after changing m to 10.

    Am I doing anything wrong here??


Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •