Basic Structure of a Web Page

While this reference aims to provide a thorough breakdown of the various HTML elements and their respective attributes, you also need to understand how these items fit into the bigger picture. A web page is structured as follows.

The Doctype

The first item to appear in the source code of a web page is the doctype declaration. This provides the web browser (or other user agent) with information about the type of markup language in which the page is written, which may or may not affect the way the browser renders the content. It may look a little scary at first glance, but the good news is that most WYSIWYG web editors will create the doctype for you automatically after you’ve selected from a dialog the type of document you’re creating. If you aren’t using a WYSIWYG web editing package, you can refer to the list of doctypes contained in this reference and copy the one you want to use.

The doctype looks like this (as seen in the context of a very simple HTML 4.01 page without any content):

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html>
  <head>
    <title>Page title</title>
  </head>
  <body>
  </body>
</html>

In the example above, the doctype relates to HTML 4.01 Strict. In this reference, you’ll see examples of HTML 4.01 and also XHTML 1.0 and 1.1, identified as such. While many of the elements and attributes may have the same names, there are some distinct syntactic differences between the various versions of HTML and XHTML. You can find out more about this in the sections entitled “HTML Versus XHTML” and “HTML and XHTML Syntax”.

The Document Tree

A web page could be considered as a document tree that can contain any number of branches. There are rules as to what items each branch can contain (and these are detailed in each element’s reference in the “Contains” and “Contained by” sections). To understand the concept of a document tree, it’s useful to consider a simple web page with typical content features alongside its tree view, as shown in Figure 1.

Figure 1. The document tree of a simple web page

If we look at this comparison, we can see that the html element in fact contains two elements: head and body. The head has two subbranches: a meta element and a title. The body element contains a number of headings, paragraphs, and a blockquote.

Note that there’s some symmetry in the way the tags are opened and closed. For example, the paragraph that reads, “It has lots of lovely content …” contains three text nodes, the second of which is wrapped in an em element (for emphasis). The paragraph is closed after the content has ended, and before the next element in the tree begins (in this case, it’s a blockquote); placing the closing </p> after the blockquote would break the tree’s structure.
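
Since the figure itself isn't reproduced here, the following is a rough sketch of markup that would produce a tree like the one described. Only the paragraph beginning “It has lots of lovely content …” comes from the text above; the headings, the blockquote content, the meta values, and the choice of which word sits inside em are assumptions made for illustration.

```html
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html>
  <!-- First branch of the root: head, with a meta element and a title -->
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
    <title>A simple web page (placeholder title)</title>
  </head>
  <!-- Second branch of the root: body, with headings, paragraphs, a blockquote -->
  <body>
    <h1>A simple web page (placeholder heading)</h1>
    <!-- Three child nodes: text, an em element, then text again -->
    <p>It has lots of <em>lovely</em> content …</p>
    <!-- The paragraph above is closed before the blockquote opens -->
    <blockquote>
      <p>Placeholder quotation text.</p>
    </blockquote>
    <h2>A placeholder subheading</h2>
    <p>Another placeholder paragraph.</p>
  </body>
</html>
```

Every element here is opened and closed inside its parent, which is what keeps the tree intact. Moving the first closing </p> to after the closing </blockquote> would leave the paragraph and the blockquote overlapping instead of sitting side by side as siblings.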


The html Element

Immediately after the doctype comes the html element. This is the root element of the document tree, and everything that follows is a descendant of that root element.

If the root element exists within the context of a document that’s identified by its doctype as XHTML, then the html element also requires an xmlns (XML Namespace) attribute (this isn’t needed for HTML documents):

<html xmlns="http://www.w3.org/1999/xhtml">

Here’s an example of an XHTML transitional page:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <title>Page title</title>
  </head>
  <body>
  </body>
</html>

The html element breaks the document into two main sections: the head and the body.


The head Element

The head element contains metadata: information that describes the document itself, or associates it with related resources, such as scripts and style sheets.

The simple example below contains the compulsory title element, which represents the document’s title or name; essentially, it identifies what this document is. The content inside the title may be used to provide a heading that appears in the browser’s title bar, and as the default label when the user saves the page as a favorite. It’s also a very important piece of information in terms of providing a meaningful summary of the page for the search engines, which display the title content in the search results. Here’s the title in action:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <title>Page title</title>
  </head>
  <body>
  </body>
</html>

In addition to the title element, the head may also contain:

  • base

    defines base URLs for links or resources on the page, and target windows in which to open linked content

  • link

    refers to a resource of some kind, most often to a style sheet that provides instructions about how to style the various elements on the web page

  • meta

    provides additional information about the page; for example, which character encoding the page uses, a summary of the page’s content, instructions to search engines about whether or not to index content, and so on

  • object

    represents a generic, multipurpose container for a media object

  • script

    used either to embed or refer to an external script

  • style

    provides an area for defining embedded (page-specific) CSS styles

All of these elements are optional and can appear in any order within the head. Note that none of the elements listed here actually appear on the rendered page, but they are used to affect the content on the page, all of which is defined inside the body element.
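
To make the list above more concrete, here is a minimal sketch of a head that uses most of these elements in an HTML 4.01 Strict page. The example.com address, the file names styles/main.css and scripts/main.js, and all content values are hypothetical placeholders, not part of the original reference.

```html
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html>
  <head>
    <title>Page title</title>
    <!-- meta: character encoding, a summary, and a hint for search engines -->
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
    <meta name="description" content="A short summary of the page (placeholder).">
    <meta name="robots" content="noindex">
    <!-- base: relative URLs that follow resolve against this address -->
    <base href="http://www.example.com/">
    <!-- link: refers to an external style sheet -->
    <link rel="stylesheet" type="text/css" href="styles/main.css">
    <!-- script: refers to an external script -->
    <script type="text/javascript" src="scripts/main.js"></script>
    <!-- style: embedded, page-specific CSS -->
    <style type="text/css">
      body { font-family: sans-serif; }
    </style>
  </head>
  <body>
  </body>
</html>
```

One ordering detail is worth keeping in mind even though the elements are otherwise free to appear in any order: base only affects relative URLs that come after it, so in this sketch it is placed ahead of the link and script elements that rely on it.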


The body Element

The body element is where the bulk of the page is contained. Everything that you can see in the browser window (or viewport) is contained inside this element, including paragraphs, lists, links, images, tables, and more. The body element has some unique attributes of its own, all of which are now deprecated, but aside from that, there’s little to say about this element. How the page looks will depend entirely upon the content that you decide to fill it with; refer to the alphabetical listing of all HTML elements to ascertain what these contents might be.
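
As an illustration only, the sketch below fills a body with the kinds of content mentioned above: a paragraph, a list, a link, an image, and a table. The text, the example.com address, and the images/photo.png file name are hypothetical. It also assumes that presentation is handled with CSS in the head instead of the deprecated body attributes (such as bgcolor and text).

```html
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html>
  <head>
    <title>Page title</title>
    <!-- CSS stands in for the deprecated bgcolor and text attributes of body -->
    <style type="text/css">
      body { background-color: #ffffff; color: #333333; }
    </style>
  </head>
  <body>
    <h1>Placeholder heading</h1>
    <p>A paragraph containing <a href="http://www.example.com/">a link</a>.</p>
    <ul>
      <li>First list item</li>
      <li>Second list item</li>
    </ul>
    <!-- In HTML 4.01 Strict an image needs alt text and a block-level parent -->
    <p><img src="images/photo.png" alt="Placeholder image description"></p>
    <table>
      <tr>
        <th>Column heading</th>
      </tr>
      <tr>
        <td>Cell content</td>
      </tr>
    </table>
  </body>
</html>
```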
