SitePoint Sponsor

User Tag List

Results 1 to 3 of 3
  1. #1
    SitePoint Addict
    Join Date
    Nov 2006
    Posts
    206
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    Determining most matches in multiple search

    I am wondering, if a search term was "hello there old friend" and each word was a searchword. Is there a way to find out which result contains the most of these words in one field. The following SQL as an example
    Code SQL:
    SELECT 
    DISTINCT `ContactID` 
    FROM 
    `contact` 
     
    WHERE 
          `Notes` LIKE '%hello%'  ||
          `Notes` LIKE '%there%'  ||
          `Notes` LIKE '%old%'  ||
          `Notes` LIKE '%friend%'  
     
    ORDER BY 
    CASE WHEN `Notes` LIKE '%hello there my old friend%' THEN 1 ELSE 0 END
    + CASE WHEN `Notes` LIKE '%hello%' THEN 0 ELSE 1 END
    + CASE WHEN `Notes` LIKE '%there%' THEN 0 ELSE 1 END
    + CASE WHEN `Notes` LIKE '%my%' THEN 0 ELSE 1 END
    + CASE WHEN `Notes` LIKE '%old%' THEN 0 ELSE 1 END
    + CASE WHEN `Notes` LIKE '%friend%' THEN 0 ELSE 1 END

    ? If the case when words would it be recommended to break apart the search term so that all permutations and combinations can be found such as
    case when `Notes` like hello there my old
    """"""""""""""""""""""""""""""" there my old friend
    """"""""""""""""""""""""""""""" hello there my
    """"""""""""""""""""""""""""""" there my old
    """"""""""""""""""""""""""""""" my old friend
    """"""""""""""""""""""""""""""" hello there
    """"""""""""""""""""""""""""""" there my
    """"""""""""""""""""""""""""""" my old
    """"""""""""""""""""""""""""""" old friend
    """"""""""""""""""""""""""""""" hello
    """"""""""""""""""""""""""""""" there
    """"""""""""""""""""""""""""""" my
    """"""""""""""""""""""""""""""" old
    """"""""""""""""""""""""""""""" friend

    Would such a technique be strenous on the mysql query, since a user could technically enter as many words as they would like; it would take longer I suppose to search a phrase of 10 words as oppose to 5 words. Which I guess would add 55 or 15 more lines of code respectively if my memory on statistics is correct

    Is there something better?
    Last edited by prince_mallow; Jan 20, 2011 at 11:01. Reason: Edit in code
    Please...Never describe anything to me using foo and bar.

  2. #2
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,341
    Mentioned
    63 Post(s)
    Tagged
    3 Thread(s)
    this technique would not be noticeably slower than just searching for the phrase, so go ahead and do it

    the performace is marred by even one LIKE with a leading wildcard, which requires a table scan

    once you're doing a table scan anyway, you might was well rip apart each row and analyze it however you want



    p.s. please use the standard sql OR keyword, instead of those nasty proprietary mysql double pipes
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  3. #3
    SitePoint Addict
    Join Date
    Nov 2006
    Posts
    206
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    I always love it when you respond rudy ;D I've replaced the pipes, and ripping the table apart like crazy now haha
    Please...Never describe anything to me using foo and bar.


Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •