A typical HTML file consists of the following parts:
- The document type declaration (information on which form of HTML is being used)
- Header (containing the title, for example)
- Body (displayed content, text with headings and references)
The basic framework:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<title>Site description</title>
</head>
<body>
</body>
</html>
Explanation:
The first line looks especially confusing for beginners. This somewhat complicated piece of information is a document type declaration. We will describe it in more detail later on.
The entire remaining content of the HTML file is enclosed between the <html> and </html> tags. The html element is therefore also called the root element of an HTML file. The opening <html> tag is followed by the opening tag for the header, <head>. The header information sits between this tag and the closing </head> tag. The most important piece of header information is the title, which is marked with the <title> and </title> tags. The body, delimited by the <body> and </body> tags, follows the header. The body holds the file’s actual content, that is, what the web browser displays.
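As a minimal sketch of how the framework looks once it is filled in, the following file adds a heading and a paragraph to the body. The title and the text are placeholders chosen for illustration, not part of the framework itself.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<!-- Header: the title typically appears in the browser's title bar, not in the page -->
<title>Site description</title>
</head>
<body>
<!-- Body: everything here is displayed by the browser -->
<h1>Welcome</h1>
<p>This paragraph is part of the displayed content.</p>
</body>
</html>
```

Saved with the file extension .html and opened in a browser, this should show the heading and the paragraph, while the title is taken from the header.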
Take Note
If you want to use multiple frames, the basic framework of files in which a frameset is defined looks different. We will discuss these complexities, but advise you to involve yourself with framesets only after becoming more familiar with the basics of HTML.
The fundamental framework of an XHTML file
If you wish to write correct XHTML, the fundamental framework looks quite similar. Only the beginning is slightly different.
The basic framework:
<?xml version="1.0" ?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>Site description</title>
</head>
<body>
</body>
</html>
Explanation:
In XHTML files, the relationship to XML should be declared even before the document type. The first line, with the question marks inside the angle brackets, serves this purpose. Write this line exactly as in the example. It is a so-called XML declaration.
A valid document type for XHTML files must be given with the document type declaration.
In the opening <html> tag, the XML namespace must be specified with an attribute named xmlns. Use the value shown in the example above.
These declarations mark the file as an XHTML file. The source text that follows them is technically just normal HTML, although you still have to pay attention to the differences between standard HTML and XHTML. However, you only need to concern yourself with these differences once you are familiar with standard HTML.
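To give a first impression of those differences, here is a hedged sketch of an XHTML file with some content. It illustrates three well-known XHTML rules: names are written in lowercase, every element is closed, and attribute values are always quoted. The file name contact.html and the texts are placeholders.

```html
<?xml version="1.0" ?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
   "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>Site description</title>
</head>
<body>
<!-- Element and attribute names are written in lowercase -->
<h1>Welcome</h1>
<!-- Every element is closed; empty elements such as br close themselves -->
<p>First line<br />second line</p>
<!-- Attribute values are always placed in quotation marks -->
<p><a href="contact.html">Contact</a></p>
</body>
</html>
```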
Document Type Declaration
HTML is only one of many markup languages, although it is the most prominent. HTML itself has a long history and has been developed through many different versions. The document type declaration states which markup language and which version you are using. Reading software, such as a web browser, can then orient itself based on this information.
The rules for HTML were formulated with the help of SGML, while the rules for XHTML were formulated with the help of XML. According to the rules of an SGML- or XML-based markup language, an HTML or XHTML file is valid only if it declares a specific document type and its source code adheres completely to that document type’s rules. Every document type comes with a document type definition (DTD). It defines which elements a document of that type may contain, which elements may be nested inside which other elements, which attributes belong to each element, whether an attribute is required or optional, and so on.
As an HTML novice, you might fail to see the point of all the attention given to declaring the document type. But it is exactly these document types that precisely define the rules of the various languages, and they have proved to be a major advance in software development. The concept of rule-abiding files that are independent of any particular software is only viable because of document types. Without official rules to fall back on, languages like HTML would quickly fragment into various dialects. The same is true of natural languages: without shared grammatical rules, a written language would eventually diverge in different directions and become unintelligible from one group to the next. Because software is much less intelligent than humans and needs much more exact information to understand what it receives, adhering to rules and standards is all the more important.
An example of a document type declaration:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html4/loose.dtd">
Explanation:
Write the document type declaration at the very beginning of the HTML file, before the opening <html> tag, with the capitalisation shown above. It starts with an opening angle bracket followed directly by an exclamation mark. The information DOCTYPE HTML PUBLIC follows. It means that the file refers to the publicly available HTML DTD. The information in quotation marks can be interpreted as follows:
W3C is the publisher of the DTD. DTD HTML 4.01 Transitional means that the file uses the “HTML” document type in version 4.01, in its transitional variant. EN indicates the language, in this case English. This refers to the language in which the elements and attributes are defined, not to the language of the file’s contents.
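The same breakdown can be laid out part by part. The sketch below repeats the declaration and annotates it in an HTML comment. The note on the leading hyphen is an addition based on the SGML convention for public identifiers and does not come from the text above.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
   "http://www.w3.org/TR/html4/loose.dtd">
<!--
  HTML                      the root element the document type applies to
  PUBLIC                    the DTD is publicly available
  -                         the publisher is not formally registered (SGML convention)
  W3C                       publisher of the DTD
  DTD                       the kind of resource being referenced
  HTML 4.01 Transitional    document type, version and variant
  EN                        language of the DTD itself (English)
  second quoted string      web address of the DTD file
-->
```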
The document type declaration then includes the web address of the document type definitions. This information is not mandatory. Simply entering:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
is also acceptable. Reading software can use the rules laid down in the document type definition to check the HTML file. However, most of today’s browsers already know the main document types internally, so fetching the definition isn’t necessary. Because browsers also have to cope with the worst imaginable mangling of the language, they are able to display even error-ridden HTML pages reasonably well. With XML documents, however, it is common for a parser to stop processing as soon as a rule is broken and to show only an error message. This already applies to XHTML pages that are served with the application/xhtml+xml MIME type.
The strict variant for HTML:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
"http://www.w3.org/TR/html4/strict.dtd">
Use this declaration if you do not wish to use certain elements and attributes that stem from earlier HTML standards and have since become replaceable. The nesting rules for HTML elements are naturally stricter in the strict variant, and more cleanly structured. For example, in this variant you cannot simply write text directly between the <body> and </body> tags. All content must be placed inside so-called block elements, such as headings, paragraphs, tables, etc.
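A short sketch of what this means in practice, with placeholder text: the body below contains only block elements, which is what the strict variant expects. The same sentence written directly between the body tags would not be valid here.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
   "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
<title>Strict example</title>
</head>
<body>
<!-- Text must sit inside block elements such as h1 or p, never directly in the body -->
<h1>Welcome</h1>
<p>Welcome to my site.</p>
</body>
</html>
```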
The transitional variant for HTML:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html4/loose.dtd">
You can use this declaration if you need some of the elements or attributes that are not allowed in the strict variant. The rules for nesting elements are somewhat more lenient in the transitional variant. You are allowed to place bare text, outside of any element, between the <body> and </body> tags. Moreover, this variant is necessary if you want to give links a target attribute, which is needed, for example, to address the frames of a frameset correctly.
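The following sketch shows both points in one file. The file name content.html and the frame name content are hypothetical placeholders; the target attribute only has a visible effect if a frame or window with that name exists.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
   "http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
<title>Transitional example</title>
</head>
<body>
Bare text directly inside the body is accepted in the transitional variant.
<!-- The target attribute is not part of the strict variant -->
<p><a href="content.html" target="content">Open in the frame named "content"</a></p>
</body>
</html>
```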
The frameset variant for HTML:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Frameset//EN"
"http://www.w3.org/TR/html4/frameset.dtd">
This entry is envisioned for special HTML files, in which framesets are defined.
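For orientation only, here is a hedged sketch of such a file. In place of a body it contains a frameset element that splits the window into two frames. The file names navigation.html and content.html, the frame names, and the column widths are assumptions made for this illustration.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Frameset//EN"
   "http://www.w3.org/TR/html4/frameset.dtd">
<html>
<head>
<title>Frameset example</title>
</head>
<!-- The frameset element takes the place of the body -->
<frameset cols="200,*">
  <frame src="navigation.html" name="navigation">
  <frame src="content.html" name="content">
  <!-- Fallback for software that cannot display frames -->
  <noframes>
    <body>
      <p>This page uses frames. <a href="content.html">Go to the content</a>.</p>
    </body>
  </noframes>
</frameset>
</html>
```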
Older document type declarations:
In certain cases it can be reasonable to refer to older HTML versions. However, this should only be done if the technical circumstances demand it. The following older declarations are available:
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
Use this document type if you wish to refer to HTML 2.0.
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
Use this document type to refer to HTML 3.2.
Some tips for using different document types
If the whole tangle of HTML and XHTML, HTML language variants, XML and document type declarations has left you somewhat bewildered, don’t worry. The clutter is the result of the language’s long and extensive development.
For your first forays into HTML, simply use the first fundamental framework exactly as presented above. Then learn how to work with additional elements and attributes, as well as style sheets. Once you are a little more familiar with the language, it will become much clearer which document type to choose.