I am trying to understand how things like carriage returns work in a PHP/HTML form.
If a user goes into one of my web forms and hits < return >, does that create a hidden character?
And is there any way to replicate that carriage return in my code?
(All of this relates to a larger question, but I’m not sure where to begin, so I figured I would ask a basic question, although it may sound cryptic!!)
A carriage return character can be entered as “\r” - it is NOT the same as the “\n” new line character.
On some systems, sending a carriage return by itself will return to the start of the current line, while sending a new line character by itself will move down a line but keep the horizontal position the same.
Most systems assume that you actually want both, so they use one character (or the pair together) to mean both: Unix-like systems use “\n”, Windows uses “\r\n”, and older Mac systems used “\r”.
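To make the difference concrete, here is a minimal PHP sketch. The $input string is a made-up stand-in for a submitted form value that happens to mix all three line-ending styles; it is not taken from a real form. The sketch prints the byte values of the two characters and then normalizes every style to a single line feed.

```php
<?php
// Hypothetical sample standing in for a submitted textarea value;
// it deliberately mixes "\r\n", "\n" and "\r" line endings.
$input = "one\r\ntwo\nthree\rfour";

// Show the raw bytes: 0d is the carriage return, 0a is the line feed.
echo bin2hex("\r"), ' ', bin2hex("\n"), PHP_EOL;

// Normalize every line-ending style to a single line feed.
// "\r\n" is listed first so the pair is replaced as one unit.
$normalized = preg_replace('/\r\n|\r|\n/', "\n", $input);

// Split into lines now that only one style remains.
$lines = explode("\n", $normalized);
print_r($lines);
```

Normalizing first means the rest of the code only ever has to check for one character.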
So if I have PHP code that checks for a carriage return/new line and I hit < enter > then my code should detect it, but if I typed “\n” in a Text Area and hit < enter > then my code would treat the “\n” that I typed as a string?
That happens whether I use \n or \\n in my regex which seems strange, because the first one should check for the control character and the second one should check for a literal.
It seems they are treated interchangeably, but I’m not certain?!
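One way to check this for yourself is a small sketch like the following. The two variables are hypothetical stand-ins for what would arrive in $_POST: one where the user pressed Enter, and one where the user typed a backslash followed by the letter n. The helper name hasLineBreak is made up for this example.

```php
<?php
// Hypothetical stand-ins for submitted form values.
$pressedEnter = "first\r\nsecond"; // double quotes: real CR and LF characters
$typedSlashN  = 'first\nsecond';   // single quotes: a backslash and the letter n

// Returns true when the value contains a real CR or LF character.
function hasLineBreak(string $value): bool
{
    return preg_match('/[\r\n]/', $value) === 1;
}

var_dump(hasLineBreak($pressedEnter)); // bool(true)
var_dump(hasLineBreak($typedSlashN));  // bool(false)

// A real line feed is one character; the typed text is two.
var_dump(strlen("\n"));  // int(1)
var_dump(strlen('\n'));  // int(2)
```

So text typed as “\n” in a Text Area stays an ordinary two-character string; only pressing Enter produces the control characters.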
If you hit the tab key the browser will almost certainly take you to the next field (although that isn’t always where you might expect it to be). Spacebar will just generate a space character.
The \n is called an escape sequence. It allows us to use ordinary printable characters on our keyboard to represent an un-printable character.
Just as strings allow us this convenience, so too do regular expressions. For simplicity, let’s now look at just regexes.
/\n/ // matches a single character, the line feed character
/\\n/ // matches two characters, the backslash and the lowercase "n"
But now the complication. We need to give this regular expression to a regex library for processing. How do we pass along this regex? As a string!
"/\n/"
"/\\n/"
There are now two processing stages: first the string is evaluated, then the regex. When we have "/\n/" (a double-quoted string in PHP), the string interpretation converts the \n escape sequence into a literal line feed character, and that literal character is what gets sent to the regex engine. It’s now the regex engine’s turn to interpret, but there’s no escape sequence anymore. There’s just the literal line feed.
Whereas when we have "/\\n/", then the string interpretation converts the double backslash (an escaped backslash) into a single backslash, so the value that gets sent to the regex engine is /\n/. It’s now the regex engine’s turn to interpret, and it sees the \n as an escape sequence, so will match a line feed character.
This double layer of processing can get tricky, but you get used to it after a while.
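The two stages can be seen side by side in a short PHP sketch. The subject strings are invented for illustration. It also shows why "/\n/" and "/\\n/" appear interchangeable: both end up matching a line feed, just at different stages. To match a literal backslash followed by n, the regex engine has to receive two backslashes.

```php
<?php
$subject = "line one\nline two"; // contains one real line feed

// Double quotes: PHP turns \n into a line feed before the regex engine sees it.
$a = preg_match("/\n/", $subject);

// Double quotes, escaped backslash: the regex engine receives \n
// and interprets it itself as a line feed.
$b = preg_match("/\\n/", $subject);

// Single quotes: PHP leaves \n untouched, so the regex engine interprets it.
$c = preg_match('/\n/', $subject);

// To match a typed backslash followed by n, the engine must receive \\n.
// In a single-quoted PHP string that takes four backslashes.
$d = preg_match('/\\\\n/', 'typed \n here');
$e = preg_match('/\\\\n/', $subject);

var_dump($a, $b, $c, $d, $e); // int(1) int(1) int(1) int(1) int(0)
```

Writing patterns in single quotes removes most of the string-level processing, which leaves only the regex engine's rules to keep track of.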
Interesting description, Jeff. I think I followed you.
So, since this relates to my other thread on Email Header Injection, which of these would be better to use to replace a Newline character with a zero-length string?
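For reference, the replacement being asked about might look like the sketch below. The $from value is a made-up example of a malicious header value, and both approaches are shown only as illustrations, not as a complete defense against header injection; rejecting such input outright may be the safer choice.

```php
<?php
// Hypothetical user-supplied value destined for an email header.
$from = "attacker@example.com\r\nBcc: victim@example.com";

// Option 1: plain string replacement. Double quotes are needed here
// so that PHP produces the real CR and LF characters to search for.
$clean1 = str_replace(["\r", "\n"], '', $from);

// Option 2: regex replacement. Single quotes pass \r and \n through
// to the regex engine, which interprets them as CR and LF.
$clean2 = preg_replace('/[\r\n]+/', '', $from);

var_dump($clean1 === $clean2); // bool(true)
echo $clean1, PHP_EOL;
```

Both variants remove the carriage return as well as the line feed, since either character can end a header line.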