You should probably manually specify the newline. This string will contain different EOL sequences, depending on the platform on which the code runs.
$txt = "Company: My Company\n"
    . 'Contact Name: John Smith';
Then I use the following functions to match the data I want until a new line/carriage return.
preg_match('/Company(.*?)\n/',$txt,$company);
print '<td>'.prep($company[1]).'</td><br>';
You boldly assume [preg_match()](http://php.net/preg_match) will succeed before probing the matches array. As noted above, variations in EOL terminators may be the issue here. Also, your regex could be improved, since you don’t seem to want the whitespace:
if (preg_match('/Company \W* (\w [^\r\n]*)/x', $txt, $company)) {
    print_r($company);
}
The [`\W*`](http://php.net/manual/en/regexp.reference.backslash.php) predefined character class looks for non-word characters zero or more times (so, it would include the colon and whitespace after “Company”). The `\w` character class is the opposite: it requires word characters (i.e., characters part of a Perl “word”) and is locale-sensitive. I explicitly test for this once to act as a boundary between non-word and word characters. From there, `[^\r\n]*` slurps everything that isn’t a newline character. I didn’t match the ending EOL, since you don’t seem to need it. With regexes, you should match only what you really need.
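As a quick check of the EOL point, the sketch below runs that pattern against the same record terminated with LF, CRLF, and a bare CR. The `$samples` array and its labels are made up for illustration; only the pattern comes from the answer.

```php
<?php
// Hypothetical inputs: the same record with three different line terminators.
$samples = [
    'LF'   => "Company: My Company\nContact Name: John Smith",
    'CRLF' => "Company: My Company\r\nContact Name: John Smith",
    'CR'   => "Company: My Company\rContact Name: John Smith",
];

$pattern = '/Company \W* (\w [^\r\n]*)/x';

$results = [];
foreach ($samples as $label => $txt) {
    // Only read $m after preg_match() reports a match.
    if (preg_match($pattern, $txt, $m)) {
        $results[$label] = $m[1];
    }
}

var_dump($results);
```

Because the capture stops at the first `\r` or `\n` and never tries to consume the terminator itself, all three inputs should yield the same captured value.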
Lastly, the [`/x`](http://php.net/manual/en/reference.pcre.pattern.modifiers.php) modifier makes the regex engine ignore unescaped whitespace in the pattern (outside character classes) and lets you insert comments easily. If you want to explicitly match whitespace characters, you’d have to escape them (including blanks, or ASCII 32). With the free-form syntax, it’s generally best to stick with single-quoted strings*. Using the more free-form syntax in regexes helps keep them readable; it’s a good idea to use `/x` often.
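To illustrate the free-form style, here is a minimal sketch of a commented pattern spread over several lines. The "Contact Name" label and the `$txt` value are assumptions based on the sample data, not the asker's real pattern. Note that the comments avoid `/` and `'`, since either one would end the pattern or the PHP string early.

```php
<?php
// Hypothetical input, mirroring the sample data.
$txt = "Contact Name: John Smith\n";

// Single-quoted, so PHP passes every backslash through to PCRE untouched.
$pattern = '/
    Contact\ Name     # literal label; the blank is escaped because of x
    \W*               # colon and any whitespace after the label
    ( \w [^\r\n]* )   # capture from the first word character to the end of the line
/x';

if (preg_match($pattern, $txt, $m)) {
    echo $m[1], "\n";
}
```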
A similar approach could be taken for your other regex. You might also consider putting your patterns in an array and looping through it, especially if you anticipate adding more.
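One possible shape for the array-and-loop idea is sketched below. The `$patterns` keys, the "Contact Name" pattern, and the `$fields` result array are all hypothetical names chosen for the example.

```php
<?php
$txt = "Company: My Company\n" . 'Contact Name: John Smith';

// Hypothetical field list: name => pattern. Add entries here as needed.
$patterns = [
    'company' => '/Company       \W* (\w [^\r\n]*)/x',
    'contact' => '/Contact\ Name \W* (\w [^\r\n]*)/x',
];

$fields = [];
foreach ($patterns as $name => $re) {
    // Store the capture only when the pattern matched; otherwise store null.
    $fields[$name] = preg_match($re, $txt, $m) ? $m[1] : null;
}

print_r($fields);
```

Keeping the patterns in one place means a new field is a one-line addition, and the success check on `preg_match()` is written only once.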
Also, nitpicking, but the arbitrary indent for the `print` statement here is odd, considering it’s not enclosed within a block.
- Compare the following regexes:
<?php
$data = 'foo bar';

/* \n is interpreted as an actual newline by the time
   libpcre compiles it. So, with the /x modifier in use,
   "\n" is just ignored. */
$re = "/foo \n\W* bar/x";

/* '\n' in PHP single-quoted strings is NOT an escape
   sequence, so libpcre will get '\n' and use it as an
   escape sequence. */
$re2 = '/foo \n\W* bar/x';

var_dump(preg_match($re, $data));  // int(1): the newline is skipped as whitespace
var_dump(preg_match($re2, $data)); // int(0): PCRE requires a newline after "foo"