JavaScript regex: making dot match newlines

My understanding of regular expressions is that

. (dot) matches any single character except a newline or another Unicode line terminator.

I am trying to grab the contents of a custom HTML tag that may have several spaces and newlines between it and its end tag.

I am currently using a regular expression like this:

<mytag id="1"[^>]*>(.*?)</mytag>

and then passing the ‘gim’ flags to make it global, case-insensitive, and multiline.

The problem is that when my opening and closing tags span several lines, I get an error. If the opening and closing tags are all on one line, the regular expression above works.

I have also tried RegExp.multiline = true, to no avail, although it is possible I was using it incorrectly.

How do I modify the expression, or which flag do I set in JavaScript, so that newlines are included in the wildcard match between the tags? I want to grab everything between the tags, including tabs, spaces, line feeds, carriage returns, etc.

Well, actually “m” has nothing to do with how dot matches. “m” means “let ^ match right after each newline and $ right before each newline”, not only at the start and end of the whole string. Illustration:


var test =
	"LINE 1\n" +
	"LINE 2\n" +
	"LINE 3";
alert(
	"Normal modus\n" +
	test.replace(/^L/g, "--$&"));
alert(
	"Multiline modus\n" +
	test.replace(/^L/gm, "--$&"));

What you’re looking for is Perl’s “s” modifier (= “dot matches all”). JavaScript did not support it until ES2018 added the “s” (dotAll) flag, so the workaround that runs in every engine is to use something other than dot to denote “anything”, for example the [\S\s] idiom, which means “not space or space”.


var html = "<tag> foo \n bar </tag>";

// this doesn't match
// because dot stops when it encounters
// a newline
var one = /<tag>.*<\/tag>/;
alert(html.match(one));

// this matches
// because [\S\s] includes
// newlines as well
var two = /<tag>[\S\s]*<\/tag>/;
alert(html.match(two));
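
To actually grab what sits between the tags, as the original question asks, the [\S\s] idiom can be wrapped in a lazy capture group. This is only a sketch: the extractContents helper and the sample string are made up for illustration, and the pattern simply follows the <mytag id="1" ...> shape from the question. Engines that implement ES2018 also accept an "s" flag that lets dot match line terminators, so the second pattern assumes a reasonably modern browser or Node.js.

```javascript
// Hypothetical helper, not from the thread: returns the text between
// <mytag id="1" ...> and </mytag>, or null when there is no match.
function extractContents(html) {
  // [\s\S] matches any character, including line terminators.
  // *? is lazy, so the match stops at the first closing tag.
  var re = /<mytag id="1"[^>]*>([\s\S]*?)<\/mytag>/i;
  var m = re.exec(html);
  return m ? m[1] : null;
}

var sample = '<mytag id="1" class="x">\n\tfirst line\r\n  second line\n</mytag> trailing';
var viaIdiom = extractContents(sample);

// Assumption: the engine supports the ES2018 "s" (dotAll) flag.
// Built with the RegExp constructor so that older engines fail at run
// time instead of refusing to parse the whole script.
var dotAll = new RegExp('<mytag id="1"[^>]*>(.*?)</mytag>', 'is');
var viaFlag = dotAll.exec(sample)[1];

console.log(JSON.stringify(viaIdiom));
console.log(viaIdiom === viaFlag);
```

The lazy quantifier matters once a page contains more than one such tag: a greedy [\s\S]* would run on to the last closing tag instead of the nearest one.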

I tried using the following code, implementing the [\S\s] modification, with no success. Can you tell me what’s causing the code not to match in this case?


<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<script language="javascript">
function runTest(){
	myHTML = document.myForm.myText.value
	var re = new RegExp('<myTag>[\\S\\s]*</myTag>',"gi");
	myArray = re.test(myHTML);
	alert(myArray);
	}
</script>
<html>
<head>
	<title>Test</title>
</head>

<body>
This is the test
<form name="myForm">
<textarea name="myText" id="myText" cols="20" rows="4">
<html>
<head>
</head>
<body>
This is text <myTag>This is working
</myTag> bla
</body>
</html>
</textarea></form>
<a href="#" onClick="runTest();">myLink</a>
</body>
</html>
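
One thing worth checking in code like this is how the backslashes are written, because a pattern passed to new RegExp(...) as a string goes through string-literal escaping before the regex engine sees it. The sketch below is illustrative only; the variable names and the sample text are invented, and it is not a diagnosis of the exact page above. It compares single backslashes, doubled backslashes, and a regex literal.

```javascript
var text = 'This is text <myTag>This is working\n</myTag> bla';

// Single backslashes: "\S" and "\s" are not string escapes, so the
// backslashes are dropped and the class becomes [Ss].
var single = new RegExp('<myTag>[\S\s]*</myTag>', 'i');

// Doubled backslashes survive string parsing and reach the regex engine.
var doubled = new RegExp('<myTag>[\\S\\s]*</myTag>', 'i');

// A regex literal needs no doubling; only the "/" has to be escaped.
var literal = /<myTag>[\S\s]*<\/myTag>/i;

console.log(single.source);         // shows the class that was actually compiled
console.log(single.test(text));
console.log(doubled.test(text));
console.log(literal.test(text));
```

The "g" flag is left out on purpose: test() on a global regex remembers lastIndex between calls, which can make a second call on the same object return a different result.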

I fiddled around with it and, with your help, found a resource that contained what I was looking for. I need to use (\n|.)* between the start and end tag literals, and it seems to work.
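
A caveat on (\n|.)*: dot also refuses to match \r (and the Unicode separators U+2028 and U+2029), so the alternation covers \n but not Windows-style \r\n line endings. The comparison below is a small sketch with invented sample strings; it assumes the text may come from a source that keeps \r\n, such as a file or server response.

```javascript
var unixText = '<myTag>line one\nline two</myTag>';
var windowsText = '<myTag>line one\r\nline two</myTag>';

// Alternation: a newline, or anything dot accepts.
var alternation = /<myTag>(\n|.)*<\/myTag>/;

// Character class: whitespace or non-whitespace, i.e. any character.
var charClass = /<myTag>[\s\S]*<\/myTag>/;

console.log(alternation.test(unixText));     // \n is covered by the alternation
console.log(alternation.test(windowsText));  // \r is matched by neither branch
console.log(charClass.test(windowsText));    // the class accepts \r as well
```

If the input is known to contain only \n, both forms behave the same; the character class is simply the safer default.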