  1. #1
    SitePoint Zealot
    Join Date
    Feb 2003
    Posts
    156

    matching "related topics"

    'Simple' Scenario: We have an article database. Now when displaying an article I would like to show the surfer a selection of related articles.

    I do have a simple solution implemented, but I am not too happy with it. Current implementation: for each article, the writer enters a number of keywords separated by spaces (limited to 50 chars; the field has an index in the db). When displaying an article, I explode() the keywords and run a series of LIKE '%keyword%' matches against all other articles' keywords field. Speed is OK, but the results could be better.

    How would I best go about that? What would be a better solution?
    Best would be a db-independent solution, or else something that works with MySQL 3.23 (saying that because of the fulltext search functionality in MySQL 4, which I can't use).
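
    For reference, the current approach amounts to roughly the sketch below. It is written in Python with SQLite rather than PHP and MySQL purely for illustration; the articles table, its id and keywords columns, and the function name related_by_like are assumptions, not the real schema.

```python
import sqlite3

def related_by_like(conn, article_id):
    """Ids of other articles whose keyword field contains any keyword of this article."""
    row = conn.execute(
        "SELECT keywords FROM articles WHERE id = ?", (article_id,)
    ).fetchone()
    if row is None:
        return []
    keywords = row[0].split()  # same idea as explode(' ', $keywords)
    if not keywords:
        return []
    # One LIKE clause per keyword, combined with OR.
    clauses = " OR ".join("keywords LIKE ?" for _ in keywords)
    params = ["%" + kw + "%" for kw in keywords]
    sql = "SELECT id FROM articles WHERE id <> ? AND (" + clauses + ") ORDER BY id"
    return [r[0] for r in conn.execute(sql, [article_id] + params)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE articles (id INTEGER PRIMARY KEY, keywords TEXT)")
conn.executemany("INSERT INTO articles VALUES (?, ?)", [
    (1, "php art"),
    (2, "php mysql"),
    (3, "shopping cart"),
    (4, "gardening"),
])
print(related_by_like(conn, 1))  # prints [2, 3]
```

    Article 3 is returned only because "art" is a substring of "cart", which is one way the results can be worse than expected. The query also gives no ranking, and a pattern with a leading % generally cannot use the index on the field.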

  2. #2
    + platinum's Avatar
    Join Date
    Jun 2001
    Location
    Adelaide, Australia
    Posts
    6,441
    I was trying to do something like this with a search feature - I used http://www.php.net/manual/en/function.metaphone.php - It's very effective sometimes, but crazy other times

  3. #3
    SitePoint Zealot
    Join Date
    Feb 2003
    Posts
    156
    Thanks for the answer, that's definitely an interesting function I did not know about.

    However, my question was geared in a totally different direction. I wanted to know whether there is a better way than matching each keyword I explode from the current article with a "like %keyword%" against all other keyword fields.
    This is not for a search function, but to match similar topics.

  4. #4
    + platinum's Avatar
    Join Date
    Jun 2001
    Location
    Adelaide, Australia
    Posts
    6,441
    That's probably the most accurate way I think.

    You could assign each topic a category - like fruit, vegetables, meat, dairy, etc

    And then just select the latest topics from each category? (Of course, it would be difficult if the articles don't relate to each other at all.)

  5. #5
    Sidewalking anode's Avatar
    Join Date
    Mar 2001
    Location
    Philadelphia, US
    Posts
    2,205
    Split the keywords off into their own table.
    Code:
    article id | keyword | weight (optional)
    The optional weight column would help with sorting; another option for sorting would be to try to get the best overlaps of keywords.
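
    A minimal sketch of that layout and of a "best overlap" query, using Python's sqlite3 module for illustration. The table name article_keywords, the index name, the default weight of 1.0 and the function related_articles are all assumptions made for the example. The query uses only a self-join and GROUP BY (no subqueries), which should keep it fairly portable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# One row per (article, keyword) pair; weight is optional and defaults to 1.0.
conn.execute("""
    CREATE TABLE article_keywords (
        article_id INTEGER NOT NULL,
        keyword    TEXT    NOT NULL,
        weight     REAL    NOT NULL DEFAULT 1.0,
        PRIMARY KEY (article_id, keyword)
    )
""")
conn.execute("CREATE INDEX idx_keyword ON article_keywords (keyword)")
conn.executemany(
    "INSERT INTO article_keywords (article_id, keyword) VALUES (?, ?)",
    [(1, "php"), (1, "mysql"), (1, "search"),
     (2, "php"), (2, "mysql"),
     (3, "php"),
     (4, "gardening")],
)

def related_articles(conn, article_id, limit=5):
    """Other articles sharing keywords with article_id, best overlap first."""
    sql = """
        SELECT other.article_id, COUNT(*) AS shared, SUM(other.weight) AS score
        FROM article_keywords AS cur
        JOIN article_keywords AS other
          ON other.keyword = cur.keyword
         AND other.article_id <> cur.article_id
        WHERE cur.article_id = ?
        GROUP BY other.article_id
        ORDER BY score DESC, shared DESC, other.article_id
        LIMIT ?
    """
    return conn.execute(sql, (article_id, limit)).fetchall()

print(related_articles(conn, 1))  # prints [(2, 2, 2.0), (3, 1, 1.0)]
```

    Because keywords are compared with = instead of LIKE '%...%', only whole keywords match and the index on keyword can be used.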

  6. #6
    SitePoint Zealot
    Join Date
    Feb 2003
    Posts
    156
    We do have categories; however, the current method is already a bit more accurate than just category matching.

    Quote Originally Posted by anode
    Split the keywords off into their own table.
    Code:
    article id | keyword | weight (optional)
    The optional weight column would help with sorting; another option for sorting would be to try to get the best overlaps of keywords.
    This looks like a nice idea. It will complicate things a bit, but the results may very well be worth it. However, letting the user enter the weight would make it complicated to use.
    I think assigning the weight according to the number of occurrences of each keyword might be an option (the more often it appears, the less weight it gets).
    I'd have to recalculate the weight for all entries once in a while...
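
    A rough sketch of that weighting in Python. The formula log(N / count) + 1 is only one possible way to make frequent keywords count less (it is borrowed from the inverse document frequency idea); the function names and sample data are made up for the example.

```python
import math
from collections import Counter

def keyword_weights(article_keywords):
    """Weight per keyword: the more articles use it, the lower the weight."""
    n_articles = len(article_keywords)
    doc_freq = Counter()
    for keywords in article_keywords.values():
        doc_freq.update(set(keywords))  # count each keyword once per article
    return {kw: math.log(n_articles / df) + 1.0 for kw, df in doc_freq.items()}

def related(article_keywords, article_id, weights):
    """Other articles ranked by the summed weight of the shared keywords."""
    current = set(article_keywords[article_id])
    scores = {}
    for other_id, keywords in article_keywords.items():
        if other_id == article_id:
            continue
        shared = current & set(keywords)
        if shared:
            scores[other_id] = sum(weights[kw] for kw in shared)
    # Highest score first; ties broken by article id.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))

articles = {
    1: ["php", "stemming"],
    2: ["php"],
    3: ["stemming"],
    4: ["php"],
}
weights = keyword_weights(articles)
print([article_id for article_id, _ in related(articles, 1, weights)])  # prints [3, 2, 4]
```

    Article 3 ranks first because "stemming" is rarer than "php". The weights depend on all articles, so they would need to be recomputed when articles are added or edited.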

    I like the idea. I think I'll try it if no one comes up with a better one.

  7. #7
    Sidewalking anode's Avatar
    Join Date
    Mar 2001
    Location
    Philadelphia, US
    Posts
    2,205
    Quote Originally Posted by R. U. Serious
    I think assigning the weight according to the number of occurrences of each keyword might be an option (the more often it appears, the less weight it gets).
    That's really clever; it would certainly group the more "specific" topics together.

