  1. #1
    SitePoint Addict DA Master's Avatar
    Join Date
    Apr 2004
    Location
    /etc/php.ini
    Posts
    398

    Fulltext Searching

    Right, I'm using this query...

    SELECT * FROM srls_sites WHERE MATCH (site_name) AGAINST ('test');

    The site_name field has a FULLTEXT index. However, when I execute the query, no results are returned.

    I'm running queries through phpMyAdmin as the PHP is not finished yet.

    Please see the link below for a dump of the table contents...

    http://www.phplanet.co.uk/database.html

    And a screen dump of the table structure (sorry about size)...

    http://www.phplanet.co.uk/dump1.jpg

  2. #2
    SitePoint Addict silent's Avatar
    Join Date
    Jun 2004
    Location
    Roaming North America
    Posts
    220

    From the MySQL manual:
    ...
    Such a technique works best with large collections (in fact, it was carefully tuned this way). For very small tables, word distribution does not adequately reflect their semantic value, and this model may sometimes produce bizarre results. For example, although the word "MySQL" is present in every row of the articles table, a search for the word produces no results:

    mysql> SELECT * FROM articles
    -> WHERE MATCH (title,body) AGAINST ('MySQL');
    Empty set (0.00 sec)

    The search result is empty because the word "MySQL" is present in at least 50% of the rows. As such, it is effectively treated as a stopword. For large datasets, this is the most desirable behavior--a natural language query should not return every second row from a 1GB table. For small datasets, it may be less desirable.

    A word that matches half of the rows in a table is less likely to locate relevant documents. In fact, it will most likely find plenty of irrelevant documents. We all know this happens far too often when we are trying to find something on the Internet with a search engine. It is with this reasoning that rows containing the word are assigned a low semantic value for the particular dataset in which they occur. A given word may exceed the 50% threshold in one dataset but not another.
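
    Basically, you'll have to populate the table with more data before expecting your query to work properly: the word you search for has to appear in fewer than half of the rows, otherwise it is treated as a stopword and nothing comes back.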
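
    If you want to confirm that the 50% threshold is what you are hitting, something like the following may help. It is only a sketch: it assumes the table and column from the first post (srls_sites, site_name) and a MySQL version that supports boolean mode (4.0.1 or later). The LIKE count is a rough approximation, since LIKE matches substrings while FULLTEXT matches whole words.

    ```sql
    -- Roughly how many rows contain the word, compared with the table size?
    -- If rows_with_word is half of total_rows or more, the natural language
    -- search treats 'test' like a stopword and returns nothing.
    SELECT COUNT(*) AS total_rows,
           SUM(site_name LIKE '%test%') AS rows_with_word
    FROM srls_sites;

    -- Boolean mode does not apply the 50% threshold, so this can return
    -- matches even on a very small table.
    SELECT * FROM srls_sites
    WHERE MATCH (site_name) AGAINST ('test' IN BOOLEAN MODE);
    ```

    Keep in mind that boolean mode does not automatically sort rows by relevance, so on a larger table the plain natural language query is usually still the one you want.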

