  1. #1
    SitePoint Member
    Join Date
    Feb 2006
    Posts
    4

    FULLTEXT index and multiple languages

    Hello!

    I have created a script that downloads news from several news sources every day. I'm using a FULLTEXT index on the news text so I can find relevant news by searching for specific keywords.

    The problem is that my news sources are in different languages. Should I make one table per language so that each language gets its own FULLTEXT index? What is the best way to get good search results when handling text in different languages?

    I hope my question is clear and that some experienced MySQL gurus can help me answer it!

  2. #2
    longneck
    Join Date
    Feb 2004
    Location
    Tampa, FL (US)
    Posts
    9,854
    Upgrade to MySQL 5 and use utf8 for your character encoding. Make sure to really read up on it, as PHP does not yet easily support UTF-8 in all of its native functions.
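
To illustrate the kind of trouble byte-oriented string functions run into with UTF-8, here is a minimal sketch in Python (not PHP, and not taken from this thread). The sample word is just an assumption picked because it contains one non-ASCII letter.

```python
# A word with one non-ASCII letter (hypothetical sample data).
word = "søk"

# Number of characters versus number of bytes once encoded as UTF-8.
char_count = len(word)
byte_count = len(word.encode("utf-8"))

print(char_count, byte_count)
```

The two counts differ because "ø" takes two bytes in UTF-8, so any function that counts or cuts by bytes can miscount or split a character in half.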

  3. #3
    SitePoint Member
    Join Date
    Feb 2006
    Posts
    4
    I'm not thinking about the character encoding. I'm thinking about the word weight in the index. For example, a word counts as zero if it appears in 50% or more of the rows. With multiple languages I will mess this up, because the probability that a word appears in more than 50% of the rows will drop dramatically (different languages, different words). How can I best solve this?

    As far as I can see, the best solution must be separate tables for the different languages.
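
The concern above can be sketched with a toy model of the 50% rule. This is only an illustration under simplifying assumptions: it splits on whitespace, the sample sentences are invented, and the helper `words_over_threshold` is hypothetical, not MySQL's actual weighting code.

```python
from collections import Counter


def words_over_threshold(docs, threshold=0.5):
    """Return the words that appear in at least `threshold` of the docs."""
    doc_freq = Counter()
    for text in docs:
        # Count each word once per document.
        doc_freq.update(set(text.lower().split()))
    return {w for w, n in doc_freq.items() if n / len(docs) >= threshold}


# Invented sample rows, one list per language.
english = [
    "the market fell today",
    "the election is over",
    "the team won again",
    "the storm hit the coast",
]
norwegian = [
    "markedet falt i natt",
    "valget er over i dag",
    "laget vant i kveld",
]

common_en = words_over_threshold(english)
common_no = words_over_threshold(norwegian)
common_mixed = words_over_threshold(english + norwegian)

print(common_en, common_no, common_mixed)
```

In this toy data each language on its own has one word over the threshold, but once the rows are mixed the common word of the smaller language falls below 50% and would no longer be treated as common.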

  4. #4
    longneck
    Join Date
    Feb 2004
    Location
    Tampa, FL (US)
    Posts
    9,854
    Use IN BOOLEAN MODE, which ignores that limit.
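
A minimal sketch of what such a query could look like when built from a script. The table name `news`, the column name `body`, and both helper functions are assumptions for illustration; the `%s` placeholder is the parameter style used by the common Python MySQL drivers.

```python
def boolean_search_sql(table, column):
    """Build a parameterized MySQL full-text query that uses boolean mode."""
    return (
        f"SELECT * FROM `{table}` "
        f"WHERE MATCH (`{column}`) AGAINST (%s IN BOOLEAN MODE)"
    )


def require_all(words):
    """Prefix each word with '+' so boolean mode requires all of them."""
    return " ".join("+" + w for w in words)


# Hypothetical table and column names.
sql = boolean_search_sql("news", "body")
params = (require_all(["election", "results"]),)

print(sql)
print(params)
```

Note that boolean mode does not sort rows by relevance automatically, so add an ORDER BY if the order of the results matters.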

