  1. #1
    SitePoint Enthusiast
    Join Date
    Jan 2014
    Posts
    56
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    Check for multiple keywords in string

    Hi guys

    I hope this is possible.

    Basically, I have the following code (simplified to make it easier to explain) in my ASP page:


    ----------------
    <%
    description = "the little brown fox jumped over the lazy dog"
    %>

    <%=description%>
    ---------------


    This simply displays the "the little brown fox jumped over the lazy dog" sentence on my page. However, I would like to cross match this sentence with an array of words. So, I now have the following "wordmatch" value at the top of my page:


    --------------
    <%
    wordmatch = "brown, frog, fox, dog, bridge, moon"
    description = "the little brown fox jumped over the lazy dog"
    %>

    <%=description%>
    -------------

    How would I check the "description" value for the 6 words above and display any matches found on my page? I would then get just the following text displayed on my page:

    -------------
    brown, fox, dog
    ------------

    Any help would be fully appreciated

    Best regards

    Rod from the UK

  2. #2
    SitePoint Guru
    Join Date
    Jun 2007
    Posts
    690
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

  3. #3
    SitePoint Enthusiast
    Join Date
    Jan 2014
    Posts
    56
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Hi Weber123456

    Thank you so much for your response to my thread.

    The link you provided "almost" provided a solution to my problem. However, my description string "the little brown fox jumped over the lazy dog" does not contain any commas, so this script will not work. The script you provided uses

    strARR1 = "5,10,15,20,25,30"

    and compares it against...

    strARR2 = "5,6,7,8,9,10"

    It only works if the keywords in the "strARR1" string are separated by commas.

    If you have any ideas how to get around this, that would be fully appreciated.

    I look forward to hearing from you

    Best regards

    Rod from the UK

  4. #4
    SitePoint Enthusiast
    Join Date
    Jan 2014
    Posts
    56
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Hi Webber132456

    I think I have managed to adapt the aspkey sample code you provided and have "almost" got it to do exactly what I need.

    However, I am really struggling as I am unable to match multiple keywords. Here's my script:

    -------------------START------------------
    <%

    '------------------------------------------------------------------
    'create value lists to be compared
    '------------------------------------------------------------------


    strARR1 = "sportback 2.0 tdi s line manual satellite navigation bluetooth phone prep xenon headlamps dvd entertainment efficient dynamics electric heated leather seats"
    strARR2 = "manual|xenon headlamps|abs|heated|heated leather seats"

    Dim ARR1, ARR2


    '------------------------------------------------------------------
    'Convert lists into arrays
    '------------------------------------------------------------------


    ARR1 = split(strARR1," ")
    ARR2 = split(strARR2,"|")


    '------------------------------------------------------------------
    'Compare arrays and find values common to both
    '------------------------------------------------------------------


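    strOutput = ""
    for i = 0 to ubound(ARR1)
        for j = 0 to ubound(ARR2)
            if ARR1(i)=ARR2(j) then strOutput=strOutput & ARR1(i) & ","
        next
    next

    'Remove the trailing comma, but only if something matched
    '(Left with a length of -1 raises an error on an empty string)
    if Len(strOutput) > 0 then strOutput = Left(strOutput, Len(strOutput)-1)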


    '------------------------------------------------------------------
    'Show on screen
    '------------------------------------------------------------------

    %>

    <br><br>
    <b>Original string</b> <%=strARR1%><br><br>
    <b>Keyword list</b> <%=strARR2%><br><br>
    <b>Matching keywords</b> <%=strOutput%>

    --------------------------------------END--------------------------------

    This works almost perfectly. However, it only finds "manual" and "heated" as matches, when I would also expect to see "xenon headlamps" and "heated leather seats" in the list. It's as though the script will not allow for more than one word per keyword in the array.

    Have you any ideas how this can be fixed? Any help would be fully appreciated.

    I look forward to hearing from you

    Best regards

    Rod from the UK

  5. #5
    SitePoint Wizard siteguru's Avatar
    Join Date
    Oct 2002
    Location
    Scotland
    Posts
    3,629
    Mentioned
    2 Post(s)
    Tagged
    0 Thread(s)
    ARR1 = split(strARR1," ")

    Tell me - what do you think that line is doing? Once you understand what's happening, the reason why it is failing will be obvious.
    Ian Anderson
    www.siteguru.co.uk

  6. #6
    SitePoint Enthusiast
    Join Date
    Jan 2014
    Posts
    56
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Hi Ian

    Thanks for your response to my post.

    Of course, it's pretty obvious!

    Thanks for your help

    Best regards

    Rod from the UK

  7. #7
    SitePoint Enthusiast
    Join Date
    Jan 2014
    Posts
    56
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Hi Ian

    I think I have found a more efficient script which can be seen here:

    ---------------------------------START------------------------------
    <%
    ' initial array of words:
    dictwords = Array("navigation","manual","xenon headlamps","abs","heated","heated leather seats")



    ' convert it to a dictionary:
    Set dict = Server.CreateObject("Scripting.Dictionary")
    For w = 0 To UBound(dictwords)
    dict.add dictwords(w), 0
    Next
    %>


    <%
    ' process a description:
    description = "sportback 2.0 tdi s line manual satellite navigation bluetooth phone prep xenon headlamps dvd entertainment efficient dynamics electric heated leather seats"
    words = Split( description, " " ) ' but what about splitting on punctuation??
    %>

    Matches:

    <%

    For w = 0 To UBound(words)
    word = words(w)
    If dict.exists( word ) Then
    %>

    <%=word%>,
    <%
    dict.Remove( word )
    End If
    Next
    %>
    -----------------------------------END--------------------------------

    Like the last script, it does almost exactly what I need. However, I have the same problem as before: it won't detect multi-word phrases like "xenon headlamps" or "heated leather seats" in the list. I was able to fix this in the last script but have no idea how to sort out the same issue in this new script.

    Have you any ideas how I can rectify this?

    Again, thanks so much for your help on this.

    I look forward to hearing from you

    Best regards

    Rod from the UK

  8. #8
    SitePoint Wizard siteguru's Avatar
    Join Date
    Oct 2002
    Location
    Scotland
    Posts
    3,629
    Mentioned
    2 Post(s)
    Tagged
    0 Thread(s)
    The problem here is exactly the same: you are splitting on the space character, so you get an array of single words. If the elements in description were separated by another character (e.g. a comma), you could split on that character and get the desired result.

    description="sportback, 2.0 tdi, s line, manual, satellite, navigation, bluetooth, phone, prep, xenon headlamps, dvd, entertainment, efficient dynamics, electric, heated leather seats"
    words = Split( description, "," )

    For w = 0 To UBound(words)
    word = Trim(words(w)) 'Remove any extraneous spaces from the start/end of each element
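
    To see the difference between the two delimiters, here is a small sketch. The shortened sample strings and the variable names bySpace and byComma are made up for illustration; only the built-in Split, UBound and Trim functions are used.

    ```vbscript
    <%
    ' Sketch only: compare what Split returns for each delimiter.
    Dim bySpace, byComma

    bySpace = Split("xenon headlamps dvd", " ")   ' three single words
    byComma = Split("xenon headlamps, dvd", ",")  ' two elements, phrase kept whole

    Response.Write (UBound(bySpace) + 1) & " elements<br>"  ' writes "3 elements"
    Response.Write (UBound(byComma) + 1) & " elements<br>"  ' writes "2 elements"

    ' The second element still has its leading space until it is trimmed
    Response.Write "[" & byComma(1) & "]<br>"        ' writes "[ dvd]"
    Response.Write "[" & Trim(byComma(1)) & "]<br>"  ' writes "[dvd]"
    %>
    ```

    Because dict.Exists compares whole keys, "xenon headlamps" can only match when it arrives as a single array element, which is why the delimiter matters.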
    Ian Anderson
    www.siteguru.co.uk

  9. #9
    SitePoint Enthusiast
    Join Date
    Jan 2014
    Posts
    56
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Hi Ian

    Thank you so much for your reply.

    You're entirely right, and I think this is my main problem: the description will not be separated by another character, it will just be words. For example, my description would be:

    "sportback 2.0 tdi s line manual satellite navigation bluetooth phone prep xenon headlamps dvd entertainment efficient dynamics electric heated leather seats"

    I need to be able to detect both "manual" and "Bluetooth phone".

    Is it possible?

    I look forward to hearing from you

    Best regards

    Rod from the UK

  10. #10
    Programming Team
    Mittineague's Avatar
    Join Date
    Jul 2005
    Location
    West Springfield, Massachusetts
    Posts
    17,191
    Mentioned
    191 Post(s)
    Tagged
    2 Thread(s)
    I imagine there are even longer sequences you might be interested in as well?

    Without any delimiters, no programmatic solution will be easy.

    I would start by figuring out how I could "filter", e.g. only words 4 or more characters long.
    I would put together a list of other "exempt" words (long enough but of no interest).
    I would create (rather tedious and time consuming) a "dictionary" of sorts, holding words that might be part of a sequence and their position in it.
    Then it would more than likely take a long-running script to check strings for words, word pairs, word triplets, .......

    It might be easier to start from the other direction and determine exactly what words you're interested in first and then query for them?

    Either way this sounds like it will take considerable human effort, though the second approach would save on resource use.
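
    One way to read "start from the other direction" is to loop over the keyword phrases and look for each one inside the description with InStr, rather than splitting the description at all. What follows is a minimal sketch under that assumption, not a tested solution for real data: the keyword list is pieced together from the earlier posts, the variable names are made up, and padding both strings with spaces is just one simple way to stop a keyword matching inside a longer word. It does not handle punctuation.

    ```vbscript
    <%
    ' Sketch only: search the description for each keyword phrase,
    ' instead of splitting the description into single words.
    Dim description, keywords, matches, k, phrase

    description = "sportback 2.0 tdi s line manual satellite navigation bluetooth phone prep xenon headlamps dvd entertainment efficient dynamics electric heated leather seats"
    keywords = Array("manual", "xenon headlamps", "abs", "heated", "heated leather seats", "bluetooth phone")

    matches = ""
    For k = 0 To UBound(keywords)
        phrase = keywords(k)
        ' Pad both strings with spaces so a keyword only matches whole words.
        ' vbTextCompare makes the comparison case-insensitive.
        If InStr(1, " " & description & " ", " " & phrase & " ", vbTextCompare) > 0 Then
            If matches <> "" Then matches = matches & ", "
            matches = matches & phrase
        End If
    Next

    ' Writes: manual, xenon headlamps, heated, heated leather seats, bluetooth phone
    Response.Write matches
    %>
    ```

    This makes one InStr call per keyword, so the cost grows with the size of the keyword list rather than with the number of word pairs and triplets in the description.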

  11. #11
    SitePoint Enthusiast
    Join Date
    Jan 2014
    Posts
    56
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    I see.

    Thanks for your help on this.

    I think I'm going to have to tackle this from another angle.

    Best regards

    Rod from the UK

  12. #12
    SitePoint Wizard siteguru's Avatar
    Join Date
    Oct 2002
    Location
    Scotland
    Posts
    3,629
    Mentioned
    2 Post(s)
    Tagged
    0 Thread(s)
    Code:
    <%
    Randomize
    Function ArrayShuffle(arr)
        Dim index
        Dim newIndex
        Dim firstIndex
        Dim itemCount
        Dim tmpValue
        
        firstIndex = LBound(arr)
        itemCount = UBound(arr) - LBound(arr) + 1
        
        For index = UBound(arr) To LBound(arr) + 1 Step -1
            ' evaluate a random index from LBound to INDEX
            newIndex = firstIndex + Int(Rnd * itemCount)
            ' swap the two items
            tmpValue = arr(index)
            arr(index) = arr(newIndex)
            arr(newIndex) = tmpValue
            ' prepare for next iteration
            itemCount = itemCount - 1
        Next
        ArrayShuffle=arr
    End Function
    
    
    sDesc="electric mirrors, alloys, power assisted steering, quattro, spoiler, 8-way stereo speaker"
    Response.Write "<p>" & sDesc & "</p>"
    
    
    'Get the string into an array
    aDesc=Split(sDesc,",")
    For i=0 to Ubound(aDesc)
        aDesc(i)=Trim(aDesc(i)) 'remove extraneous spaces
        Response.Write "" & aDesc(i) & "" & "<br>" 'Demo only - delete this line
    Next
    
    
    aDesc=ArrayShuffle(aDesc) 'Shuffle the array
    
    
    'Below section is for demo only - delete
    Response.Write "<br>"
    For i=0 to Ubound(aDesc)
        Response.Write "" & aDesc(i) & "" & "<br>"
    Next
    
    
    sDesc=Join(aDesc,", ") 'Recreate the string
    Response.Write "<p>" & sDesc & "</p>"
    %>
    Ian Anderson
    www.siteguru.co.uk

  13. #13
    SitePoint Enthusiast
    Join Date
    Jan 2014
    Posts
    56
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Hi Ian

    Absolutely perfect - thank you so much!

    Best regards

    Rod from the UK

  14. #14
    SitePoint Wizard siteguru's Avatar
    Join Date
    Oct 2002
    Location
    Scotland
    Posts
    3,629
    Mentioned
    2 Post(s)
    Tagged
    0 Thread(s)
    That was actually meant to be posted in your other thread! lol
    Ian Anderson
    www.siteguru.co.uk

