Regex help please

Hello all,

I’m trying to get some information from a web page but I’m not very good at regular expressions.


<a href="javascript:func('1','56','123kUhygtY6','663330463','400','400')" title='Link'>Content</a>

I need to get the values in the JavaScript function that precedes Content (e.g. '1','56','123kUhygtY6','663330463','400','400').

Any help would be greatly appreciated and if you could explain it that would help me learn a lot.

[a-zA-Z0-9] is the same as \w
{1,} is the same as +

So you can shorten it like so:

/"javascript:func\(\'(\w+)\',\'(\w+)\',\'(\w+)\',\'(\w+)\'.+?Content/

It does exactly the same thing (well, technically \w also matches underscores whereas your version did not, but I guess you can live with that, right? :))
I replaced .{1,} at the end with .+? to make it non-greedy.
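
To make the greedy/non-greedy difference concrete, here is a small sketch. It assumes the page contains the same link twice; the variable names are just for illustration.

```php
<?php
// The anchor from the question, repeated twice so greedy and lazy matching differ.
$link = <<<'HTML'
<a href="javascript:func('1','56','123kUhygtY6','663330463','400','400')" title='Link'>Content</a>
HTML;
$html = $link . $link;

$greedy = '/"javascript:func\(\'(\w+)\',\'(\w+)\',\'(\w+)\',\'(\w+)\'.+Content/';
$lazy   = '/"javascript:func\(\'(\w+)\',\'(\w+)\',\'(\w+)\',\'(\w+)\'.+?Content/';

preg_match($greedy, $html, $g);
preg_match($lazy, $html, $l);

// The greedy .+ runs on to the last "Content"; the lazy .+? stops at the first one.
var_dump(strlen($g[0]) > strlen($l[0])); // bool(true)

// The four captured arguments are the same either way.
print_r(array_slice($l, 1));

// \w also accepts underscores, which [a-zA-Z0-9] does not.
var_dump(preg_match('/^\w+$/', 'a_b'));          // int(1)
var_dump(preg_match('/^[a-zA-Z0-9]+$/', 'a_b')); // int(0)
```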

The only problem I see is that regex will not work for

<a href="javascript:func('1','56','123\'kUhygtY6','663330463','400','400')" title='Link'>Content</a>

But I don’t know if that will ever happen. If not, I’d leave it as it is.
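
If it ever does happen, one possible way to cope is to let each argument contain backslash escapes. This is only a sketch: it assumes the values are escaped JavaScript-style with a backslash, and $arg is a helper name made up for this example.

```php
<?php
// Input where the third argument contains an escaped quote (\').
$str = <<<'HTML'
<a href="javascript:func('1','56','123\'kUhygtY6','663330463','400','400')" title='Link'>Content</a>
HTML;

// One single-quoted argument: any mix of characters that are neither a quote
// nor a backslash, or a backslash followed by any character (such as \').
$arg = <<<'RE'
'((?:[^'\\]|\\.)*)'
RE;

// Same overall shape as the shortened pattern: four arguments, then the rest up to Content.
$pattern = '/"javascript:func\(' . $arg . ',' . $arg . ',' . $arg . ',' . $arg . '.+?Content/';

if (preg_match($pattern, $str, $m)) {
    var_dump($m[3]);               // the raw value, backslash included
    var_dump(stripslashes($m[3])); // the same value with the escape removed
}
```

Unlike \w+, this accepts any character inside the quotes, so it is looser than the original pattern.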

Excellent! Thanks for all your help!


preg_match('/"javascript:func\\(([^\\)]+)\\)"/',$str,$matches);
var_dump($matches);
/*
array(2) {
  [0]=>
  string(65) ""javascript:func('1','56','123kUhygtY6','663330463','400','400')""
  [1]=>
  string(46) "'1','56','123kUhygtY6','663330463','400','400'"
}
*/

To break it down:

"javascript:func\( – Match "javascript:func( literally
([^\)]+) – Match everything except ) one or more times, and store it in backreference #1
\)" – Match )" literally
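
The same breakdown can also live inside the pattern itself. A minimal sketch, assuming PCRE's x (extended) modifier and a ~ delimiter so the comments can be written freely:

```php
<?php
$str = <<<'HTML'
<a href="javascript:func('1','56','123kUhygtY6','663330463','400','400')" title='Link'>Content</a>
HTML;

// With the x modifier, whitespace in the pattern is ignored and
// everything from # to the end of the line is a comment.
$pattern = <<<'RE'
~
    "javascript:func\(   # the literal text "javascript:func(
    ( [^)]+ )            # group 1: one or more characters that are not )
    \)"                  # a literal ) and the closing "
~x
RE;

preg_match($pattern, $str, $matches);
var_dump($matches[1]);
```

The comments must not contain the delimiter character, because PHP looks for the closing delimiter before PCRE sees the pattern.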

:)

Ah, I misread


preg_match('/<a[^>]+>([^>]+)<\\/a>/', $str, $matches);
var_dump($matches);
/*
array(2) {
  [0]=>
  string(98) "<a href="javascript:func('1','56','123kUhygtY6','663330463','400','400')" title='Link'>Content</a>"
  [1]=>
  string(7) "Content"
}
*/

:)

Thanks for your help. I’m actually trying to match on the word ‘Content’ like in the snippet I posted above.

Is there a way to get the individual values or would I use another function like explode to get those?
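
For what it's worth, explode is probably the simplest route. A rough sketch, assuming none of the values ever contains a comma (the pattern here is one possible way to anchor on Content, not the only one):

```php
<?php
$str = <<<'HTML'
<a href="javascript:func('1','56','123kUhygtY6','663330463','400','400')" title='Link'>Content</a>
HTML;

$args = array();

// Step 1: capture the whole argument list of the link whose text is "Content".
if (preg_match('/"javascript:func\(([^)]+)\)"[^>]*>Content</', $str, $m)) {
    // Step 2: split on commas, then strip the surrounding single quotes.
    foreach (explode(',', $m[1]) as $piece) {
        $args[] = trim($piece, "'");
    }
}

print_r($args);
```

If you only need the first four values, array_slice($args, 0, 4) will give you those.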

This works. It gets the first four arguments, which is what I want, but it’s kind of ugly…

/"javascript:func\(\'([a-zA-Z0-9]{1,})\',\'([a-zA-Z0-9]{1,})\',\'([a-zA-Z0-9]{1,})\',\'([a-zA-Z0-9]{1,})\'.{1,}Content/

lol

Sorry, I need to search for the word ‘Content’ and then retrieve the information that precedes it.


preg_match('/"javascript:func\\(([^\\)]+)\\).Content"/',$str,$matches);

?

Ok, I have this working:

/"javascript:func\(([^\)]+)\)\" title=\'Link\'>Content/

But what if title='Link' isn’t there? How can I match whether or not it is present?
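
Two possible approaches, sketched below: make the attribute optional with (?: ... )?, or skip whatever attributes follow with [^>]*. The sample inputs and variable names are made up for the example.

```php
<?php
$withTitle = <<<'HTML'
<a href="javascript:func('1','56','123kUhygtY6','663330463','400','400')" title='Link'>Content</a>
HTML;
$withoutTitle = <<<'HTML'
<a href="javascript:func('1','56','123kUhygtY6','663330463','400','400')">Content</a>
HTML;

// Option 1: wrap the attribute in a non-capturing group and make the group optional.
$optionalTitle = '/"javascript:func\(([^)]+)\)"(?: title=\'Link\')?>Content/';

// Option 2: skip any attributes that follow, up to the end of the opening tag.
$anyAttributes = '/"javascript:func\(([^)]+)\)"[^>]*>Content/';

foreach (array($withTitle, $withoutTitle) as $html) {
    var_dump(preg_match($optionalTitle, $html, $m1)); // int(1) for both inputs
    var_dump(preg_match($anyAttributes, $html, $m2)); // int(1) for both inputs
}
```

Option 2 is the more forgiving of the two, since it also copes with a different title or with extra attributes.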