SitePoint Sponsor

User Tag List

Results 1 to 10 of 10
  1. #1
    SitePoint Member
    Join Date
    Sep 2008
    Posts
    15
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    MySQL Fulltext and scores.

    Hello,
    I am trying to understand why my fulltext queries are returning these scores, they don't make sense to me.

    First off, here is my fulltext create
    Code:
    ALTER TABLE main ADD FULLTEXT(col1, col2, col3);
    In my table i have
    Code:
    main('Hello world / even more stuff', '', '')
    main('Hello world / even more stuff', 'testing', 'tester')
    main('Hello world / even more stuff', 'Hello world', 'Hello world')
    Now here is my query that gets the scores
    Code:
    SELECT
    	MATCH(col1, col2, col3) AGAINST('+"hello world"' IN BOOLEAN MODE) as blnScore,
    	MATCH(col1, col2, col3) AGAINST('hello world') as intScore
    
    FROM
    	main
    WHERE
    	MATCH(col1, col2, col3) AGAINST('hello world')
    ORDER BY blnScore DESC, intScore DESC

    You would think that the row with 'Hello world' in every column would have the highest score, since the index is on every column, but that is not the case.

    Code:
    main('Hello world / even more stuff', '', '') //Score - 6.48660516738892
    main('Hello world / even more stuff', 'testing', 'tester') //Score - 6.34852838516235
    main('Hello world / even more stuff', 'Hello world', 'Hello world') //Score - 3.5588550567627
    Any ideas of what the problem could be?

  2. #2
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,270
    Mentioned
    60 Post(s)
    Tagged
    3 Thread(s)
    how many total rows in the table?
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  3. #3
    SitePoint Member
    Join Date
    Sep 2008
    Posts
    15
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Those are the only three rows.

  4. #4
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,270
    Mentioned
    60 Post(s)
    Tagged
    3 Thread(s)
    load your table with 3000 rows, and try again

    the mysql manual specifically states that "For very small tables, word distribution does not adequately reflect their semantic value, and this model may sometimes produce bizarre results."
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  5. #5
    SitePoint Member
    Join Date
    Sep 2008
    Posts
    15
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Sorry, i meant in my test scenario(hello world) there was only three.
    In my actually database is is over 1m

  6. #6
    SitePoint Member
    Join Date
    Sep 2008
    Posts
    15
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    r937, do you have any more recommendations?

  7. #7
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,270
    Mentioned
    60 Post(s)
    Tagged
    3 Thread(s)
    other than you shouldn't put any stock in the actual number, no

    the "relevance" score is only supposed to return the most relevant rows first -- the number itself doesn't mean anything
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  8. #8
    Keeper of the SFL StarLion's Avatar
    Join Date
    Feb 2006
    Location
    Atlanta, GA, USA
    Posts
    3,748
    Mentioned
    72 Post(s)
    Tagged
    0 Thread(s)
    Also, the words "Hello" and "world" are in more than 50% of your rows (100%), and are therefore excluded from your search as per the Natural Language search rules... which makes your search somewhat irrelevant.

  9. #9
    From space with love silver trophy
    SpacePhoenix's Avatar
    Join Date
    May 2007
    Location
    Poole, UK
    Posts
    5,032
    Mentioned
    103 Post(s)
    Tagged
    0 Thread(s)
    What storage engine is the table using?
    Community Team Advisor
    Forum Guidelines: Posting FAQ Signatures FAQ Self Promotion FAQ
    Help the Mods: What's Fluff? Report Fluff/Spam to a Moderator

  10. #10
    Utopia, Inc. silver trophy
    ScallioXTX's Avatar
    Join Date
    Aug 2008
    Location
    The Netherlands
    Posts
    9,082
    Mentioned
    153 Post(s)
    Tagged
    2 Thread(s)
    Quote Originally Posted by SpacePhoenix View Post
    What storage engine is the table using?
    Since MyISAM is the only engine supporting full text, I'm guessing MyISAM
    Rémon - Hosting Advisor

    SitePoint forums will switch to Discourse soon! Make sure you're ready for it!

    Minimal Bookmarks Tree
    My Google Chrome extension: browsing bookmarks made easy


Tags for this Thread

Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •