Thread: parse links from html
-
Oct 25, 2008, 07:19 #1
parse links from html
Hi, I need some help with a regex.
For now I have:
suburl = /<a[^>]+href\s*=\s*("|\')([^"|\']+)[^>]*>(.*)<\/a>/g.exec(HTML)
I get suburl[2] = URL and suburl[3] = link name, but it doesn't work properly in some situations.
For instance, if I have "<a href="xxx">yyyy</a><a>dddddd</a>", the result is messed up: suburl[3] contains both yyyy and dddddd.
The problem is in the (.*)<\/a> part, I guess.
Any suggestions?
-
Oct 25, 2008, 09:15 #2
Make the quantifier ungreedy (lazy), so it stops at the first </a> instead of the last one:
(.*?)
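To see the difference, here is a minimal sketch that runs both versions against the sample string from the question (assuming the second anchor was meant to be closed with </a>). The character class is written as [^"'] here, since a | inside a class is a literal pipe rather than alternation. The extractLinks helper is a hypothetical name, added only to show how the g flag can be used with exec in a loop to collect every link rather than just the first one.

```javascript
// Sample input from the question, with both anchors closed (an assumption).
const html = '<a href="xxx">yyyy</a><a>dddddd</a>';

// Greedy: (.*) runs on to the LAST </a> in the string.
const greedy = /<a[^>]+href\s*=\s*("|')([^"']+)[^>]*>(.*)<\/a>/;
// Lazy: (.*?) stops at the FIRST </a> after the opening tag.
const lazy = /<a[^>]+href\s*=\s*("|')([^"']+)[^>]*>(.*?)<\/a>/;

const greedyText = greedy.exec(html)[3];
const lazyMatch = lazy.exec(html);
const lazyUrl = lazyMatch[2];
const lazyText = lazyMatch[3];

console.log(greedyText);        // yyyy</a><a>dddddd
console.log(lazyUrl, lazyText); // xxx yyyy

// Hypothetical helper: collect every link by calling exec repeatedly.
// With the g flag, each call resumes from where the previous match ended.
function extractLinks(source) {
  const re = /<a[^>]+href\s*=\s*("|')([^"']+)[^>]*>(.*?)<\/a>/g;
  const links = [];
  let m;
  while ((m = re.exec(source)) !== null) {
    links.push({ url: m[2], text: m[3] });
  }
  return links;
}
```

Note that . does not match line breaks by default, so link text that spans several lines will not match with this pattern.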
-
Oct 25, 2008, 11:45 #3
Cool, that's it.
Thanks, man!