  1. #1
    SitePoint Addict
    Join Date
    Jul 2004
    Location
    Brooklyn, NY
    Posts
    316

    What is the best method for replacing multiple keywords?

    I would like to write a function that replaces all instances of keyword X with the keyword Y in the specified string.
    This is not a problem, as I could simply use the replace() method.

    However, I have an array with all the keywords I need to replace along with their own associated counterpart.
    A -> B
    C -> D
    E -> F
    etc..

    The keyword-to-replacement pairs are stored in an array like so:
    keyword['original']='modified';

    The array is hard-coded in the JavaScript file.

    All in all there are approximately 200 keywords, so what is the best approach or architecture for this function?

    Some methods I came up with:
    • Should I split() the words and then compare all the words individually?
    • Should I run a replace() function for all the keywords in my array?
    • Is there a faster method?


    Like most things, efficiency is crucial.

  2. #2
    SitePoint Guru
    Join Date
    Sep 2006
    Posts
    731
    Quote (originally posted by knopix):
    Should I run a replace() function for all the keywords in my array?
    That would be the simplest algorithm:
    Code:
    // note: each keyword is used as a regular-expression pattern
    for( var r in keywords )
     if( keywords.hasOwnProperty(r) )
      myString=myString.replace(new RegExp(r,'ig'), keywords[r]);
    Is there a faster method?
    Like most things, efficiency is crucial
    I can think of a more efficient algorithm involving creating a replacement string just once, but it's still unlikely to be faster than the native code above.
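
    One possible reading of a single-pass approach is to join every keyword into one alternation pattern and let replace() call a function for each match. This is only a sketch under assumptions: the helper names (escapeRegExp, buildKeywordRegExp, replaceKeywords) are made up for illustration, the keywords are assumed to be plain words (so \b marks their edges), and matching is case-sensitive.

```javascript
// Hypothetical helper: escape characters that are special inside a RegExp.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Build one pattern from all own keys, e.g. /\b(?:looks|could|was)\b/g.
// Returns null when the map has no keys.
function buildKeywordRegExp(keywords) {
  var keys = [];
  for (var key in keywords) {
    if (Object.prototype.hasOwnProperty.call(keywords, key)) {
      keys.push(escapeRegExp(key));
    }
  }
  if (keys.length === 0) {
    return null;
  }
  // Longest first, so a shorter key cannot win over a longer one
  // that starts at the same position.
  keys.sort(function (a, b) { return b.length - a.length; });
  return new RegExp('\\b(?:' + keys.join('|') + ')\\b', 'g');
}

// Replace every keyword in one trip through the text.
function replaceKeywords(text, keywords) {
  var pattern = buildKeywordRegExp(keywords);
  if (pattern === null) {
    return text;
  }
  return text.replace(pattern, function (match) {
    // Every match is one of the keys, so the lookup always succeeds.
    return keywords[match];
  });
}

var sampleKeywords = { looks: 'seems', was: 'were', could: 'might' };
var sampleResult = replaceKeywords('It looks like it was possible.', sampleKeywords);
```

    If the keyword list never changes, the pattern could be built once and reused across calls. Whether this beats the simple loop would need to be measured on real data.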

  3. #3
    SitePoint Guru
    Join Date
    Apr 2006
    Posts
    802
    There is no truly efficient way to replace 200 different words in a string,
    but JavaScript would be even less efficient than usual if you used it
    to create 200 separate regular expressions and sieved the whole text through each one,
    200 times.

    I would go with your original idea, but instead of a split, use a single regular expression
    to exec through the string, word by word, just once.
    It would be easier if your array was an object, so that's how I coded the example:

    Code:
    var keywords= {looks:'seems',was:'were',could:'might',string:'thing'};
    var str= 'Your array looks like an object. If it was an object, you could run an exec on the string-';
    
    var M,S= '',ax=0,tem;
    var Rx= /\b(\w+)\b/g;
    while((M= Rx.exec(str))!= null){
    	tem= M[1];
    	// copy the text between the previous match and this one
    	S+= str.substring(ax,M.index);
    	// own properties only, so words like 'constructor' are left alone
    	S+= keywords.hasOwnProperty(tem)? keywords[tem]: tem;
    	ax= Rx.lastIndex;
    }
    // copy whatever follows the last match
    S+= str.substring(ax);
    // returned value:
    // Your array seems like an object. If it were an object, you might run an exec on the thing-

    This is equally good for short strings and long text files. One regular expression, one trip through the text.
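
    The same single trip can also be written with replace() and a callback, which does the bookkeeping of the exec loop (copying the text between matches) for you. This is a sketch rather than a drop-in replacement; the function name swapWords is made up for illustration.

```javascript
// Visit every word once and swap it if it is an own key of the map.
function swapWords(text, keywords) {
  return text.replace(/\b\w+\b/g, function (word) {
    // hasOwnProperty keeps inherited names such as 'constructor' untouched.
    return Object.prototype.hasOwnProperty.call(keywords, word)
      ? keywords[word]
      : word;
  });
}

var demoKeywords = { looks: 'seems', was: 'were', could: 'might', string: 'thing' };
var demoText = 'Your array looks like an object. If it was an object, you could run an exec on the string-';
var demoResult = swapWords(demoText, demoKeywords);
// demoResult: 'Your array seems like an object. If it were an object, you might run an exec on the thing-'
```

    The lookup is case-sensitive here, just like the loop above.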

    It's easy to find matches case-insensitively, if you need to,
    but unless your keyword list also contains capitalized and upper-case duplicates of every keyword,
    you will need to convert each replacement to match the case of the word it replaces.

    Code:
    var keywords= {looks:'seems',was:'were',could:'might',string:'thing'};
    var M,S= '',ax=0,tem,temL,temp;
    var str= 'Your array looks like an object. Could be, if it was, you could run an exec on the string-';
    
    var Rx= /\b(\w+)\b/ig;
    while((M= Rx.exec(str))!= null){
    	tem= M[1];
    	S+= str.substring(ax,M.index);
    
    	temL= tem.toLowerCase();
    	// own properties only; inherited ones like 'constructor' are not strings
    	temp= keywords.hasOwnProperty(temL)? keywords[temL]: null;
    	if(temp){
    		// ALL CAPS word: upper-case the whole replacement
    		if(tem.toUpperCase()==tem) temp= temp.toUpperCase();
    		// otherwise, any capital letter: capitalize the first letter
    		else if(temL!=tem) temp=temp.charAt(0).toUpperCase()+temp.substring(1);
    	}
    	S+= temp || tem;
    	ax= Rx.lastIndex;
    }
    S+= str.substring(ax);
    // returned value:
    // Your array seems like an object. Might be, if it were, you might run an exec on the thing-
    Last edited by mrhoo; Mar 29, 2008 at 01:58. Reason: case insensitive

