preg_split syntax to keep from capturing whitespace delimiters

The following preg_split matches whole words and punctuation, but it is also matching the empty spaces between words. I don’t need the extra spaces. What is the syntax for capturing words and punctuation but not spaces?


preg_split('/(\\W+)/', $commentString, -1, PREG_SPLIT_DELIM_CAPTURE);

I’d probably use the /\s+/ regex to split the string on whitespace (\s denotes a whitespace character). At the moment, you’re splitting on every run of characters that aren’t letters, digits, or underscores (\W+), and the parentheses capture each of those runs, spaces included.

preg_split('/\\s+/', $commentString, -1, PREG_SPLIT_DELIM_CAPTURE);
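To make the difference concrete, here is a small sketch of what the two patterns return. The sample string is made up for illustration and is not from the original post.

```php
<?php
// Hypothetical sample input, chosen only to illustrate the behaviour.
$commentString = 'Hello, world.';

// Pattern from the question: split on runs of non-word characters and
// capture them. \W also matches spaces, so ", " comes back as one entry,
// and an empty string follows the final ".".
$captured = preg_split('/(\\W+)/', $commentString, -1, PREG_SPLIT_DELIM_CAPTURE);
print_r($captured); // ["Hello", ", ", "world", ".", ""]

// Whitespace-only split: the spaces disappear, but punctuation stays
// attached to the word before it. PREG_SPLIT_DELIM_CAPTURE has no effect
// here because the pattern contains no capturing group.
$byWhitespace = preg_split('/\\s+/', $commentString, -1, PREG_SPLIT_DELIM_CAPTURE);
print_r($byWhitespace); // ["Hello,", "world."]
```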

Sorry for the confusion. What I meant was that I want to keep the punctuation in the array, but as an entry of its own, not attached as a trailing character to the last word of its sentence. I stumbled across



preg_split("/(\\w+\\W+)/", $commentString, -1, PREG_SPLIT_DELIM_CAPTURE|PREG_SPLIT_NO_EMPTY);


and it works similarly to your more concise sample, but both leave the punctuation attached to the word that precedes it. I found a regex library at http://weblogtoolscollection.com/regex/regex.php?page=3, but I’m having difficulty understanding the syntax. I have tested about 10 different variations, all of which almost work.

What about the following regex?


preg_split('/\\s|(\\W+)/', $a, -1, PREG_SPLIT_DELIM_CAPTURE);
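One caveat with that pattern: \W also matches whitespace, so a comma followed by a space is still captured together as ", ". A possible variation, offered as a sketch and not a definitive answer, is to capture only characters that are neither word characters nor whitespace, and to add PREG_SPLIT_NO_EMPTY to drop the empty pieces. The sample string below is hypothetical.

```php
<?php
// Hypothetical sample input for illustration.
$a = 'Hello, world. Bye!';

// \s+           : whitespace is a delimiter that is NOT captured.
// ([^\w\s]+)    : runs of punctuation are delimiters that ARE captured.
// NO_EMPTY      : drops the empty strings left between adjacent delimiters
//                 and at the end of the input.
$tokens = preg_split(
    '/\\s+|([^\\w\\s]+)/',
    $a,
    -1,
    PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY
);
print_r($tokens); // ["Hello", ",", "world", ".", "Bye", "!"]
```

Note that consecutive punctuation such as "?!" would come back as a single entry with this pattern; dropping the + inside the group should give one entry per character instead.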