  1. #1
    SitePoint Member
    Join Date
    Sep 2006
    Posts
    6

    parse links from html

    Hi, I need some help with a regex.

    For now I have:
    suburl = /<a[^>]+href\s*=\s*("|\')([^"|\']+)[^>]*>(.*)<\/a>/g.exec(HTML)

    I get suburl[2] = URL and suburl[3] = link name, but it doesn't work correctly in some situations.
    For instance, if I have "<a href="xxx">yyyy</a><a>dddddd</a>", the link name comes back as "yyyy</a><a>dddddd", with both yyyy and dddddd mixed together.

    I guess the problem is in the (.*)<\/a> part.

    Any suggestions?

  2. #2
    SitePoint Wizard bronze trophy
    Join Date
    Jul 2008
    Posts
    5,757
    Make the match ungreedy (lazy), so it stops at the first </a> instead of the last one:

    (.*?)
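To see the difference, here is a minimal sketch that runs the greedy and lazy versions against the sample string from post #1. The helper name `extractLinks` and the variable names are made up for illustration. The pattern is the one from the question with two changes: `(.*)` becomes `(.*?)`, and the `|` is dropped from the character classes, because inside `[...]` it is a literal pipe rather than "or".

```javascript
// Sample input from post #1.
const sample = '<a href="xxx">yyyy</a><a>dddddd</a>';

// Greedy: (.*) runs to the end of the input and backtracks to the LAST </a>.
const greedy = /<a[^>]+href\s*=\s*("|')([^"']+)[^>]*>(.*)<\/a>/.exec(sample);
console.log(greedy[3]); // yyyy</a><a>dddddd

// Hypothetical helper (not from the thread): collect every link in an HTML string.
function extractLinks(html) {
  // Lazy: (.*?) takes as little as possible, so it stops at the FIRST </a>.
  const linkPattern = /<a[^>]+href\s*=\s*("|')([^"']+)[^>]*>(.*?)<\/a>/g;
  const links = [];
  let match;
  // With the g flag, each exec() call resumes where the previous match ended.
  while ((match = linkPattern.exec(html)) !== null) {
    links.push({ url: match[2], name: match[3] });
  }
  return links;
}

console.log(extractLinks(sample)); // [ { url: 'xxx', name: 'yyyy' } ]
```

Note that `.` does not match line breaks by default, so a link whose text spans several lines would still be missed by this pattern.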

  3. #3
    SitePoint Member
    Join Date
    Sep 2006
    Posts
    6
    Cool, that's it.

    Thanks, man!

