I’ve never read the book, so I don’t know whether it covers this, but I will explain the entire system to you. You probably already know a lot (or most) of this, but understanding all of it will give you an intuitive idea of what is going on and how to fix it. Bear with me if you can, since this should answer most of the questions that you might have.
HTTP and Request-Response Cycles
Everything that happens on a website is fundamentally a conversation between two entities: a user agent and a web server. The user agent is anything that wants to receive information. The web server provides the information that the user agent requests. In order to have this communication, they speak to each other in a common language called HTTP (HyperText Transfer Protocol).
That’s the generalization of the system. In this particular case, the user agent is a web browser such as Internet Explorer or Firefox being operated by a human. This might not always be the case… sometimes the user agent is a robot that acts without human interaction.
The user agent and web server exchange information in discrete request-response cycles. Each request made by a user agent results in exactly one response from the web server. Each request from the user agent includes (to simplify) a URL (Uniform Resource Locator) - essentially a “name” for the information that is being requested. The web server returns the information identified by this name in its response. All of these requests work in the same way. Beyond working out which server to contact, the web browser does not interpret the meaning of the URLs it requests - only the web server knows what they refer to.
Note that I haven’t said anything about what exactly is being exchanged in these request-response cycles. The information can be any kind of data like text, movies, music, or images.
Here’s what a typical HTTP conversation might look like in theoretical terms, in order from first communication to last:
- Client: Give me MyImage.jpg
- Server: Here you go:
...some data here...
- Client: Give me AnotherFile.txt
- Server: I can’t find any data by that name.
- Client: Give me MyPage.html
- Server: Here you go:
...some data here...
You’ll notice that all of the URLs in this sample communication are filenames. That’s because, by default, most web servers simply store a bunch of files in folders, and the URLs sent by the client refer to this folder structure. On your own web server, you’re probably saving (or uploading) files to a particular directory that you can then see by using your web browser. This works because the URLs that you type into your browser map directly to files on the server.
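Here is a rough sketch of that URL-to-file mapping, written in PHP only because that is the language the rest of this post deals with. Real web servers do far more than this; the function name serveStaticFile and the $docRoot parameter are made up for illustration, and the code assumes PHP 8 for str_starts_with.

```php
<?php
// Hypothetical sketch: map a requested URL path onto a file below a document root.
// serveStaticFile() and $docRoot are illustrative names, not part of any real server.
function serveStaticFile(string $docRoot, string $urlPath): array
{
    // Resolve both paths so that "../" tricks cannot escape the document root.
    $root = realpath($docRoot);
    $file = realpath($docRoot . '/' . ltrim($urlPath, '/'));

    if ($root === false || $file === false || !is_file($file)
        || !str_starts_with($file, $root . DIRECTORY_SEPARATOR)) {
        // Nothing by that name: the "I can't find any data" answer.
        return ['status' => 404, 'body' => "I can't find any data by that name."];
    }

    // The file exists: hand back its contents unchanged.
    return ['status' => 200, 'body' => file_get_contents($file)];
}
```

Calling serveStaticFile('/var/www', '/MyPage.html') would return the contents of /var/www/MyPage.html if that file exists, and the 404 answer otherwise (the /var/www path is just an example).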
HTML and Linking
The thing that makes the World Wide Web what it is is called HTML (HyperText Markup Language). HTML documents are data files stored on the server just like any other. The difference is that web browsers - a particular type of user agent operated by humans - can read these files and create visual structures called web pages from them.
Here’s what a simple request-response cycle looks like when a human is operating a web browser:
- Client: Give me /index.html
- Server: Here it is:
<html>
<head>
<title>My Page</title>
</head>
<body>
<h1>Hi!</h1>
</body>
</html>
- * Client parses the HTML and displays a web page to the user
At this point, the web browser reads this HTML document and displays a simple web page. I won’t go into the details of HTML here.
The thing that makes web pages so interesting is that they can link to each other. This is done by using the <a> tag as in:
<a href="myOtherPage.html">Click here!</a>
When the web browser turns this code into something visible, it displays the familiar blue hyperlink. But what happens, exactly, when the user clicks on this link?
The fundamental principle to understand here is that the contents of the href attribute refer to a URL to be used in another request to a web server. The link might even point to information stored on a different web server altogether. So, if the user clicked on the link above, their web browser would make a new request to the web server for the information identified by “myOtherPage.html”. In the same way, the HTML might call for images to be displayed (with the “img” tag). When the web browser encounters these tags, it makes additional requests to the web server for the images named by the URLs in their “src” attributes. When a group of related web pages are linked together in this way, we call them a website.
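To make the idea concrete, here is a small sketch that does by hand what a browser does when it scans a page for further URLs. It uses PHP’s bundled DOMDocument class (this assumes the DOM extension is enabled); the function name findReferencedUrls is my own invention.

```php
<?php
// Sketch: list the URLs a browser could request next for a given HTML document.
// findReferencedUrls() is a hypothetical helper name.
function findReferencedUrls(string $html): array
{
    $doc = new DOMDocument();

    // Collect parser warnings internally instead of printing them;
    // real-world pages are rarely perfectly valid.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $urls = ['links' => [], 'images' => []];

    // <a href="..."> : requested only when the user clicks.
    foreach ($doc->getElementsByTagName('a') as $a) {
        if ($a->hasAttribute('href')) {
            $urls['links'][] = $a->getAttribute('href');
        }
    }

    // <img src="..."> : requested automatically while the page loads.
    foreach ($doc->getElementsByTagName('img') as $img) {
        if ($img->hasAttribute('src')) {
            $urls['images'][] = $img->getAttribute('src');
        }
    }

    return $urls;
}
```

Every string this returns is just a URL for another request-response cycle; nothing in the HTML itself contains the linked page or the image data.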
Here’s a simple request-response cycle for a user browsing two web pages on a website:
- Client: Give me /index.html
- Server: Here it is:
...some HTML data with <img> and <a> tags...
- * Client parses the HTML data and discovers that it needs two images from the server
- Client: Give me /images/companyLogo.png
- Server: Here it is:
...some PNG image data...
- Client: Give me /images/ceoPicture.jpg
- Server: Here it is:
...some JPEG image data...
- * Client parses the HTML and images and displays the web page to the user
- * User clicks on the link in the page
- Client: Give me /myOtherPage.html
- …
PHP and Script Versus Output
Here’s where it gets tricky: what exactly is PHP?
PHP is a technology that allows the information received from the web server to be dynamic. In other words, a user agent requesting the same URL twice can get different information each time, or two user agents requesting the same URL can each get different information. You probably already know why this is useful since you purchased a book about PHP.
Here’s how this works, given the massive amount of background information leading up to this point:
PHP lives on the web server - it has nothing to do with the user agent. It wedges itself between the time that the web server receives a request for a URL and the time that it returns a response. When the user agent requests a URL which the web server determines should be dynamic, the web server runs a PHP script and returns its output in the response. Most web servers determine that any request for a file with a “.php” file extension should be handled by PHP, but this can be configured in a multitude of ways.
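As a very simplified illustration of that decision, the sketch below picks a handler from the file extension alone. Real servers make this choice through their configuration, not through code like this; chooseHandler and the two handler labels are hypothetical names.

```php
<?php
// Hypothetical sketch: decide whether a request is static or should go to PHP,
// based only on the file extension of the requested path.
function chooseHandler(string $urlPath): string
{
    // pathinfo() extracts the part after the last dot, e.g. "php" or "html".
    $extension = strtolower(pathinfo($urlPath, PATHINFO_EXTENSION));

    return $extension === 'php' ? 'php-engine' : 'static-file';
}
```

With this rule, chooseHandler('/index.php') yields 'php-engine' while chooseHandler('/index.html') yields 'static-file'.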
When PHP is told that it must handle a request, it opens the requested file (which would normally simply be read and sent back to the user agent without modification). After the file is open, PHP executes the contents, which basically means “runs the code in the file.” Once the code is executed, it outputs some data, usually HTML markup. PHP then sends this output to the user agent as the web server’s response. Note that the original contents of the file are never sent to the user agent - it only sees the output after execution.
Here’s how a request-response cycle for some information might look when PHP is involved:
- Client: Give me /index.php
- * The server notices that this file should be handled by PHP. Control is passed to the PHP engine.
- * The PHP engine reads the contents of this file, called a PHP script. The script is “executed”.
- * The output of the execution is now ready to be sent to the user agent. PHP shuts down.
- Server: Here you go:
...some data output by the script...
The key thing to note here is that the content of the PHP script file and the data actually received by the user agent are two different things.
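If you want to see that difference for yourself, here is a small sketch that stores a script in a temporary file, then compares the file’s contents with what the script prints when it runs. The helper name runScript is made up; ob_start and ob_get_clean are PHP’s standard output-buffering functions.

```php
<?php
// Sketch: run a script file and capture what it prints, i.e. what the user agent would receive.
// runScript() is a hypothetical helper.
function runScript(string $path): string
{
    ob_start();            // start collecting everything the script prints
    include $path;         // execute the script
    return ob_get_clean(); // stop collecting and return the captured output
}

// Write a throwaway script to a temporary file (the file name is arbitrary).
$script = tempnam(sys_get_temp_dir(), 'demo');
file_put_contents($script, "<?php echo 'Hello, World!';");

$source = file_get_contents($script); // what is stored on the server
$output = runScript($script);         // what the user agent receives
unlink($script);
```

Here $source still contains the opening tag and the echo statement, while $output contains only the text that the statement printed.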
A PHP script can output data in several different ways. The simplest is like this:
<?php
echo 'Hello, World!';
?>
The output of this script, in its entirety, as seen by the user agent is:
Hello, World!
Another way that a script can output content is by containing content outside of the “<?php” and “?>” tags. For example, look at this script:
<?php
echo 'Hello, World!';
?>
Goodbye, World!
<?php
echo 'Hello, again, World!';
?>
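The output of this script is shown below. Note that PHP drops the single line break that immediately follows a closing ?> tag, which is why the first two pieces end up on the same line:
Hello, World!Goodbye, World!
Hello, again, World!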
The significance of this second output method is that scripts which return HTML content can be written exactly like HTML pages normally are, but with PHP content sprinkled around. For example:
<html>
<head>
<title><?php echo 'My Title'; ?></title>
</head>
<body>
<?php echo '<p>My Body</p>'; ?>
</body>
</html>
This script outputs the following HTML content for the user agent:
<html>
<head>
<title>My Title</title>
</head>
<body>
<p>My Body</p>
</body>
</html>
Applying Knowledge to the Example
One more piece of information needs to be clear before we tackle your problem with this new understanding: the $_GET superglobal in PHP, as the book probably explains, holds data that the user agent has placed in the URL used to request the script. This gives the user agent a mechanism for sending data to the script, rather than the usual one-way flow from server to client. The data is attached after a “?” symbol, so the web server still knows which script is being requested even with the extra data attached.
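The split between “which script” and “what data” can be reproduced with two built-in functions, parse_url and parse_str. This is only a sketch of the idea; in a real request PHP fills in $_GET for you, and the leading slash in the example URL is my own addition.

```php
<?php
// Sketch: split a request URL into the script path and its query-string data.
$url = '/welcome1.php?name=Kevin';

$path  = parse_url($url, PHP_URL_PATH);  // the part before the "?"
$query = parse_url($url, PHP_URL_QUERY); // the part after the "?"

// parse_str() decodes "key=value&key2=value2" into an array.
// $params plays the role that $_GET plays inside a real request.
parse_str($query ?? '', $params);
```

After this runs, $path identifies the script and $params['name'] holds the value that was attached to the URL.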
So with all of this explained, the “welcome1.html” and “welcome1.php” files that you initially posted are designed to generate the following request-response cycle:
- Client: Give me welcome1.html
- Server: Here you go:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<title>Query String Link Example</title>
<meta http-equiv="content-type" content="text/html; charset=utf-8"/>
</head>
<body>
<p><a href="welcome1.php?name=Kevin">Hi, I’m Kevin!</a></p>
</body>
</html>
- * Client parses HTML content and displays the page with the hyperlink
- * User clicks on the hyperlink
- Client: Give me welcome1.php?name=Kevin
- * The server notices that this file, welcome1.php, should be handled by the PHP engine.
- * The PHP engine reads the file and begins executing the code.
- * The PHP script outputs all of the HTML code up to the first <?php tag.
- * The PHP script reads $_GET['name'] from the request URL and prints “Welcome to our web site, Kevin!” in the output.
- * The PHP script outputs all of the remaining HTML code after the ?> tag.
- * The PHP engine sends the output to the user agent and shuts down.
- Server: Here you go:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<title>Query String Link Example</title>
<meta http-equiv="content-type" content="text/html; charset=utf-8"/>
</head>
<body>
<p>
Welcome to our web site, Kevin!
</p>
</body>
</html>
So, when you view the source code of the page in the web browser, you’re seeing the response as generated by PHP.
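I don’t have the book’s welcome1.php in front of me, so the following is only my guess at what the dynamic part of such a script boils down to. The function name renderWelcome, the 'guest' fallback, and the escaping step are my own additions, not something taken from the book.

```php
<?php
// Hypothetical sketch of the dynamic part of a page like welcome1.php.
// The query data is passed in as a parameter so the function can be tried without a web server.
function renderWelcome(array $query): string
{
    // Fall back to a neutral name when the URL carries no usable "name" value.
    $name = is_string($query['name'] ?? null) ? $query['name'] : 'guest';

    // Escape the value so characters such as < or & are displayed as text
    // instead of being interpreted as HTML by the browser.
    $safe = htmlspecialchars($name, ENT_QUOTES, 'UTF-8');

    return "<p>Welcome to our web site, $safe!</p>";
}
```

Inside a real request you would call it as echo renderWelcome($_GET); and the browser would only ever see the resulting paragraph, never the function itself.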
Including Versus Linking
A term that you used - “include” - refers to something different that happens in the context of a PHP script. When a script uses the include (or require) statement, it tells the PHP engine to execute the PHP code in another file and merge that file’s output into this script’s output at that point. So, you might have two files like this on the web server:
file1.php
<?php
echo '1';
include 'file2.php';
echo '3';
file2.php
<?php
echo '2';
A request-response cycle for file1.php looks like this:
- Client: Give me file1.php
- * Web server starts the PHP engine and begins executing file1.php
- * The first “echo” statement is executed and added to the output.
- * The PHP engine encounters the “include” statement. It then pauses execution of file1.php, opens file2.php, and begins executing it.
- * The “echo” statement in file2.php is executed and added to the output.
- * The PHP engine reaches the end of file2.php. It returns to the point where it left off in file1.php and moves on to the next line.
- * The second “echo” statement in file1.php is encountered and added to the output.
- * The output is sent to the user agent and the PHP engine shuts down.
- Server: Here you go:
123
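You can reproduce this cycle without a web server. The sketch below writes the two files into a temporary directory (the directory name is arbitrary, and each file starts with an opening PHP tag so that it is executed as code), runs file1.php, and captures the combined output.

```php
<?php
// Sketch: recreate file1.php and file2.php in a temporary directory and run file1.php.
$dir = sys_get_temp_dir() . '/include_demo_' . uniqid();
mkdir($dir);

// Each file begins with an opening PHP tag so its contents are executed, not printed.
file_put_contents("$dir/file2.php", "<?php\necho '2';\n");
file_put_contents(
    "$dir/file1.php",
    "<?php\necho '1';\ninclude __DIR__ . '/file2.php';\necho '3';\n"
);

ob_start();               // collect the combined output
include "$dir/file1.php"; // file1.php pulls in file2.php part-way through
$output = ob_get_clean();

// Remove the temporary files again.
unlink("$dir/file1.php");
unlink("$dir/file2.php");
rmdir($dir);
```

The captured $output is a single string made up of the pieces printed by both files, in the order in which the engine reached them.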
So there are two different ways in which things can be “linked” together, and they are very different:
- The hyperlink is something that is found in the HTML content received by a web browser - it is interpreted entirely on the user’s computer. It specifies another resource that the browser will request from a web server when the user clicks on the text.
- The include is something that is found in a PHP script parsed by the PHP engine - it’s something that happens entirely on the web server. It specifies another PHP script that should be executed, with its output merged into the current output.
The End
Sorry about the extremely long post. If you read through it, though, you probably have a pretty solid grasp of the technology that underpins this entire system. Hopefully this helped to clear up some of the questions you had and resolved some terminology issues.
Of course, if you need a particular section distilled, feel free to ask.