  1. #1
    SitePoint Member
    Join Date
    Feb 2010
    Posts
    7
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    JS Regular Expression question

    Hi everyone, I'm new to the site and looking for an answer to this:

    I want to match the word(s) that begin with a hashtag (#) in a given string. So far I have code like this:

    var mystring = "Going to the #museum and then to the #gym";
    var mypattern = /(#)([a-zA-Z0-9])*/;
    var myresult= mypattern.exec(mystring);

    myresult contains

    #museum,#,m

    but:

    1. I only want #museum
    2. It's not detecting #gym

    The main reason for all this is that I want to get all the text that is not a hashtag, so in this example I want to get
    "Going to the "
    "and then to the "
    as two separate strings. I thought regular expressions would be easier than searching the string for # by hand.

    Any help is appreciated!
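
For context on the result shown above: exec returns the full match followed by one entry per capturing group, and a repeated group keeps only its last iteration, which is why "#" and "m" follow "#museum". Here is a minimal sketch of that, assuming a variant without capturing groups is acceptable; the names withGroups, tagPattern and firstTag are made up for illustration.

```javascript
var mystring = "Going to the #museum and then to the #gym";

// Original pattern: two capturing groups, so exec returns three entries.
// [0] is the full match, [1] is "#", [2] is the last character the repeated group matched.
var withGroups = /(#)([a-zA-Z0-9])*/.exec(mystring);

// Hypothetical variant: no capturing groups, so only the full match comes back.
var tagPattern = /#[a-zA-Z0-9]+/;
var firstTag = tagPattern.exec(mystring);

console.log(withGroups.join(",")); // "#museum,#,m"
console.log(firstTag[0]);          // "#museum"
```

This only addresses the extra entries; without the g flag the pattern still stops at the first hashtag.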

  2. #2
    SitePoint Guru
    Join Date
    Apr 2006
    Posts
    802
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Try it with a global match instead of exec.

    Exec finds one match at a time, so you can do something to the matched text.
    If you are just collecting them, a match will get them all at once.

    var mystring = "Going to the #museum and then to the #gym";

    var mypattern = /(#[a-zA-Z0-9]+)/g;

    var myresult= mystring.match(mypattern);

    /* returned value: ['#museum','#gym'] */


    By the way, # is called the 'octothorp', from a word meaning eight-points.
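
To illustrate the difference described above, here is a small sketch comparing match with and without the g flag. The variable names allTags, firstOnly and none are only illustrative, and the `|| []` fallback is an assumption about how you would want to handle strings with no hashtags.

```javascript
var mystring = "Going to the #museum and then to the #gym";

// With g: every full match is returned, capture groups are not reported.
var allTags = mystring.match(/(#[a-zA-Z0-9]+)/g);

// Without g: same shape as exec, i.e. the first match plus its groups.
var firstOnly = mystring.match(/(#[a-zA-Z0-9]+)/);

// match returns null (not an empty array) when nothing matches,
// so a fallback avoids errors when reading .length later.
var none = "no tags here".match(/(#[a-zA-Z0-9]+)/g) || [];

console.log(allTags);      // ["#museum", "#gym"]
console.log(firstOnly[0]); // "#museum"
console.log(none.length);  // 0
```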

  3. #3
    SitePoint Member
    Join Date
    Feb 2010
    Posts
    7
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    It worked! Thank you mrhoo for the solution and the correction in English (I am from Peru)

  4. #4
    SitePoint Guru
    Join Date
    Apr 2006
    Posts
    802
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    You're welcome. Pleased to meet you.

    You can use the same pattern in an exec,
    but you must loop through the string.

    Code:
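    var mystring = "Going to the #museum and then to the #gym";
    var result = '', pat, rx = /(#\w+)/g;
    // with the g flag, each exec call resumes from rx.lastIndex
    while ((pat = rx.exec(mystring)) !== null) {
    	result += pat[1].substring(1) + ' ';   // drop the leading #
    }
    result = result.slice(0, -1);   // remove the trailing space
    
    /*  result: (String)  'museum gym' */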
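
The same exec loop is handy when you need more than the matched text. Below is a rough sketch, assuming you want each hashtag together with its position; findTags and sample are hypothetical names, not something from the thread.

```javascript
// Hypothetical helper: collects each hashtag (without the #) and where it starts.
function findTags(text) {
    // The g flag matters: without it exec restarts at index 0 every time
    // and the while loop below would never end.
    var rx = /#(\w+)/g;
    var found = [];
    var m;
    while ((m = rx.exec(text)) !== null) {
        found.push({ tag: m[1], index: m.index });
    }
    return found;
}

var sample = "Going to the #museum and then to the #gym";
var tags = findTags(sample);
console.log(tags.map(function (t) { return t.tag; })); // ["museum", "gym"]
```

Creating the regular expression inside the function means every call starts with a fresh lastIndex.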

  5. #5
    SitePoint Member
    Join Date
    Feb 2010
    Posts
    7
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    mrhoo:

    Is there a way (with regular expressions or otherwise) to transform:

    "Going to the #museum and then to the #gym";

    into an array that looks like:

    ["Going to the ","#museum"," and then to the ","#gym"]

    ?

  6. #6
    SitePoint Guru
    Join Date
    Apr 2006
    Posts
    802
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Sure- you want to return everything,
    keeping the flagged words separate but in source order.

    In regular expression syntax a pipe (|) means 'or':

    var rx=/([^#]+)|(#\w+)/g;

    ([^#]+) = match one or more characters that are not #
    (#\w+) = match a # followed by one or more word characters (a-zA-Z0-9 and _)

    Code:
    var s="Going to the #museum and then to the #gym";
    s.match(/([^#]+)|(#\w+)/g)
    /* returns (Array): ['Going to the ','#museum',' and then to the ','#gym'] */
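
Going back to the original goal of getting the text that is not a hashtag, the tokens returned above can be separated afterwards. This is just a sketch under the assumption that a token is a hashtag exactly when it starts with #; tokens, hashtags and plainText are illustrative names.

```javascript
var s = "Going to the #museum and then to the #gym";
var tokens = s.match(/([^#]+)|(#\w+)/g) || [];

// Separate the tokens by whether they start with #.
var hashtags = tokens.filter(function (t) { return t.charAt(0) === "#"; });
var plainText = tokens.filter(function (t) { return t.charAt(0) !== "#"; });

console.log(hashtags);  // ["#museum", "#gym"]
console.log(plainText); // ["Going to the ", " and then to the "]
```

One caveat: a # that is not followed by a word character matches neither alternative, so it is silently skipped.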

  7. #7
    Unobtrusively zen
    paul_wilkins
    Join Date
    Jan 2007
    Location
    Christchurch, New Zealand
    Posts
    14,684
    Mentioned
    99 Post(s)
    Tagged
    4 Thread(s)
    People often say that you need to learn about regular expressions, and you don't bother to because you don't know what you're missing.

    This thread is a perfect example of why you need to learn about regular expressions.

    Thank you mrhoo.

  8. #8
    om nom nom nom
    Stomme poes
    Join Date
    Aug 2007
    Location
    Netherlands
    Posts
    10,272
    Mentioned
    50 Post(s)
    Tagged
    2 Thread(s)
    By the way, # is called the 'octothorp', from a word meaning eight-points.
    Huh, I didn't know that either, but Engrish has lots of names for that thing (number, pound, hash, (cross)hatch). For programming I'm going to keep calling it hash(mark) or shebang, depending on where it is, so people know what I'm talking about.

  9. #9
    SitePoint Addict
    Join Date
    Nov 2008
    Location
    Thailand
    Posts
    278
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Came up with this match, which strips the spaces around each piece of text (if you want that).

    Can it be simplified?

    var s="Going to the #museum and then to the #gym";
    console.log (s.match(/(\w+ ?)+(?= #)|(#\w+)/g)); // 'Going to the','#museum','and then to the','#gym'

    Edit: forget it! It fails with "Going to the #museum and then to the #gym and then the pub" (the text after the last hashtag is dropped).


  10. #10
    SitePoint Member
    Join Date
    Feb 2010
    Posts
    7
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Thank you indeed! I always thought regular expressions were very helpful; I guess I need to get some tutorials and/or books. Any recommended links? I couldn't find any in the Links/Tutorial thread.

  11. #11
    SitePoint Guru
    Join Date
    Apr 2006
    Posts
    802
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    This page covers regular expressions in JavaScript:
    http://www.regular-expressions.info/javascript.html

    Also try to find the book 'Mastering Regular Expressions' by Jeffrey E. F. Friedl,
    published by O'Reilly. It's a keeper, and covers other programming languages as well as JavaScript.

    Regular expressions are extremely useful in server scripts.

  12. #12
    SitePoint Member
    Join Date
    Feb 2010
    Posts
    7
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Thank you mrhoo for the tip!

    I have another question revolving around the same problem, but since it's not related to regular expressions anymore, I will make another post (if I can't find the answer already in the forums).

  13. #13
    SitePoint Addict
    Join Date
    Nov 2008
    Location
    Thailand
    Posts
    278
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Just to fix the above.

    /(\w+ ?)+(?= #)?|(#\w+)/g
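
One thing worth double-checking in the pattern above: as far as I can tell, an optional lookahead such as (?= #)? can always match zero times, so the trailing spaces come back. A different approach, sketched here as an alternative rather than a fix to that pattern, is to split on the hashtags and trim each piece. The helper name splitOnTags is hypothetical.

```javascript
// Hypothetical helper: splits text into trimmed plain-text and hashtag pieces.
function splitOnTags(text) {
    // A capturing group in the separator makes split keep the hashtags in the result.
    return text.split(/(#\w+)/)
        .map(function (part) { return part.trim(); })
        .filter(function (part) { return part !== ""; });
}

var short = splitOnTags("Going to the #museum and then to the #gym");
var longer = splitOnTags("Going to the #museum and then to the #gym and then the pub");

console.log(short);  // ["Going to the", "#museum", "and then to the", "#gym"]
console.log(longer); // ["Going to the", "#museum", "and then to the", "#gym", "and then the pub"]
```

Because split keeps whatever sits between the separators, the text after the last hashtag is not lost.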

