Hypertext Markup Language: HTML
A fundamental component of the original conception of the World Wide Web
is that it should be a collaborative enterprise. The web was designed so
that everyone would be able to add content. To make this feasible, the
language for constructing web pages (HTML) was designed to be simple,
powerful, and easy for non-specialists to master.
In this chapter we give an introduction to the design
of static webpages using HyperText Markup Language (HTML),
Cascading Style Sheets (CSS), and eXtensible Markup Language (XML).
As you will see, these languages are quite easy to master
as they are each based on a few simple ideas.
We begin with HyperText Markup Language (HTML).
In this section we provide a sufficient introduction so that you
will be able to design static web pages using the most common
markup elements. For more information you can visit the
official HTML website at http://www.w3c.org or peruse
any of the many HTML texts.
HTML is built upon one fundamental idea: the layout of a
web page is expressed using HTML elements.
As we will see below, a few dozen basic elements
are used to express the basic layout of the page,
including line breaks, headings, images, tables,
and lists.
Outline
- Simple HTML elements
- HTML elements and attributes
- Style and class attributes
- Hyperlinks
- Images
- Headings
- Grouping Text
- Preformatted text
- Lists
- Tables
- rowspan and colspan in tables
- Comments
Simple HTML elements
Simple HTML elements have the form <TAGNAME>CONTENT</TAGNAME>, where TAGNAME is the name of the tag.
A complete list of the HTML 4.01 elements is available at the URL
of the official source for HTML, the World Wide Web Consortium:
http://www.w3.org/TR/html4/index/elements.html
There are 90 different standard tag names, but we will only discuss the core set of about two dozen tags shown in the figure below.
This list does not include "frames" or "forms". The former is a mechanism for combining two or more webpages into a new "framed" webpage. The latter is used in writing webpages that interact with a server, and we will cover "forms" in more detail when we discuss servlets.
A web page consists of an html element that contains a head and body element. The head element in turn contains a title element.
For example, the following text defines a simple "Hello World" webpage
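A minimal sketch of such a page is given here. The title text "My First Page" is an assumption made for illustration; the nesting of the html, head, title, and body elements and the "Hello World" body text follow the description above.

```html
<html>
  <head>
    <!-- The title appears in the browser's title bar, not in the page itself.
         "My First Page" is a made-up title. -->
    <title>My First Page</title>
  </head>
  <body>
    Hello World
  </body>
</html>
```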
EXERCISE: Creating your first web page.
Try playing around with the example in the "Live HTML Demo" above by changing the text in the "title" element to see how that changes the webpage. Likewise, change the text in the "body" element and see how that changes the webpage.
Next, use your favorite editor to create a file containing the "Hello World" web page shown above. Store your file as "text" with the name "first.html" on your disk. You can view your page by starting up a browser (e.g. Netscape, Internet Explorer, Amaya, Opera, etc.) and selecting "Open File" from the "File" menu. Select the "first.html" file that you just created and you should see a simple page with the words "Hello World" in black on a white or gray background.
Caution! This simple exercise has many pitfalls and it may take you a while to complete it. This is the kind of task that is best done with someone helping you in person, as the details vary from computer to computer. Some of the problems that may arise are:
- Some operating systems hide the extensions (.html, .txt, .jpg, .mov) that specify what type of data is in the file. These operating systems will also automatically add extensions (e.g. .txt) to files. So you may think you have a file named "first.html" when it is actually called "first.html.txt". Ouch!
- You may have difficulty saving the file as text. Many word processors store the document you create with a lot of extra information besides the characters that you have typed. For example, they might store the font you have used, the margins, the tab settings, etc. The "first.html" file needs to be stored in a simpler format which contains only the characters you typed. This is called "text format" and is usually listed as a choice in the "save as" window.
HTML elements and attributes
The most general form of an HTML tag is
<TAGNAME A1="V1" A2="V2" ... An="Vn">CONTENT</TAGNAME>
where A1,A2,...,An are the names of attributes that are allowed
for that tag, and V1,V2,...,Vn are values that those attributes
can accept. In general, attribute values should always be enclosed in
double quotes as this will simplify migration to XHTML, which is poised
to become the successor to HTML 4.0 as the next international standard.
For example, the tags h1,h2,h3,h4,h5,h6 create "headings" as you
might use for chapters, sections, subsections, paragraphs, etc. These tags do
not have any required attributes. The img tag, on the other hand, is used
to include an image in a webpage and it has two required attributes: "src", which
is the URL of the image to be displayed, and "alt", which is a textual description of
that image. Below is an example that uses the h1 tag and the img tag:
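One possible version of such an example is sketched here. The heading text, the image URL on www.example.com, the alt text, and the width of 200 pixels are all placeholders chosen for illustration.

```html
<h1>Morning Dew</h1>
<!-- src and alt are the required attributes; width (in pixels) is optional.
     The URL below is a placeholder, not a real image. -->
<img src="http://www.example.com/images/dew.jpg"
     alt="Drops of dew on a blade of grass"
     width="200" />
<br />
This text starts on a new line below the image.
```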
The src attribute of the img tag specifies the URL of the
image file to display, the alt attribute specifies a textual description
of the image (used when the image cannot be seen or displayed), and the width attribute specifies the width at which to
draw the picture (in pixels). In this case, the attributes are used to
provide information needed to properly display the element.
Note also that the br tag does not have a matching close tag.
Following the XHTML convention, this is indicated by including a forward slash (/) before the closing
angle bracket of the tag, as in <br />. There are only a few tags that do not have matching close tags...
EXERCISE: using heading and img tags.
Try modifying the h1/img example above by changing the URL of the image in the "src" attribute. For example replace the entire URL with the filename "dew.jpg" which refers to the image file dew.jpg in the folder containing this webpage. Also try adding some h2 or h3 elements and adding extra text before the "img" tag. Also, you can remove the "width" attribute from the "img" tag as it is optional -- what happens?
Style and class attributes
One of the most common problems encountered when writing HTML pages is that each tag has its own set of attributes, and one must know which attributes are allowed for which tags. The CSS language, discussed in detail in the next section, was developed partly in response to this problem. It provides a uniform method of specifying the "style" (e.g. color, font, border, etc.) of any HTML tag. For example, a CSS method for specifying that a webpage should have red letters on a black background is the following:
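One way to write this, assuming the style is attached directly to the body element of the page (the sentence inside the body is made up):

```html
<!-- color sets the text color; background sets the page background -->
<body style="color: red; background: black">
  Red letters on a black background.
</body>
```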
The style attribute always takes the following form:
style="PROPERTY1:VALUE1; PROPERTY2:VALUE2; ...; PROPERTYn:VALUEn"
The PROPERTYs are CSS properties (for example color, background,
font-size, and border), and the VALUEs are selected from
the possible options for that PROPERTY. This will become clearer as we
see more examples, but the key points to notice are:
- the style attribute is enclosed in double quotes
- the style attribute consists of a sequence of PROPERTY:VALUE pairs separated by semicolons (;)
- the property and value are themselves separated by a colon (:)
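These three points can be seen in a short sketch. The particular properties and values (blue text, 14-point font, thin black border) are arbitrary choices for illustration.

```html
<!-- Three PROPERTY:VALUE pairs, separated by semicolons,
     all inside one double-quoted style attribute -->
<p style="color: blue; font-size: 14pt; border: 1px solid black">
  This paragraph is drawn in blue 14-point text with a thin black border.
</p>
```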
Hyperlinks
The anchor element <a ...> is used to specify jumps from one part of a
webpage to another part of the same or a different webpage. These jumps are called hyperlinks.
To specify a hyperlink to a JUMPNAME which is inside a webpage whose address is URL,
you use a tag of the form <a href="URL#JUMPNAME">CONTENT</a>.
A link can also be a mailto link, of the form <a href="mailto:ADDRESS">CONTENT</a>,
which allows the user to send an email by clicking on the link.
Finally, a link can be to a non-text multimedia object (such as a sound or movie file). Clicking on the link will cause the browser to show the multimedia object if it has the necessary viewer.
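The three kinds of link might look as follows. Every address here (the www.example.com pages, the email address, and the sound file) is a placeholder, and the JUMPNAME "lists" is made up.

```html
<!-- Jump to the place named "lists" inside another page -->
<a href="http://www.example.com/tutorial.html#lists">Jump to the section on lists</a>
<br />

<!-- Inside tutorial.html, the target of that jump is marked with a named anchor -->
<a name="lists">Lists</a>
<br />

<!-- A mailto link: clicking it opens the user's mail program -->
<a href="mailto:someone@example.com">Send us an email</a>
<br />

<!-- A link to a multimedia object, here a sound file -->
<a href="http://www.example.com/sounds/greeting.wav">Listen to the greeting</a>
```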
Images
The image element has the form:
<img src="URL" alt="DESCRIPTION" width="WIDTH" height="HEIGHT" />
The img element has no matching close tag, which is indicated by the "/>"
at the end of the tag.
The src attribute is the only one needed to display the image, but you should
also include a description of the image for the vision-impaired in the
alt attribute (HTML 4.01 lists alt as required). This may even be mandatory if you want the page
to meet minimum Federal Accessibility Standards.
The width and height attributes are optional and they can be used to
rescale your image. Giving only the width will cause the
height to scale proportionately. Giving both height and width may result in a
picture that looks stretched or flattened. The units are pixels,
which are the smallest points that can be shown on the screen. Typical computer
screens are between 500 and 2000 pixels across (as of 2004).
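The effect of the width and height attributes can be tried with a sketch like this one. It assumes an image file named dew.jpg in the same folder as the page; the alt text and the pixel sizes are made up.

```html
<!-- Only width is given: the height scales in proportion -->
<img src="dew.jpg" alt="Drops of dew on a blade of grass" width="300" />

<!-- Both are given: the picture is forced to 300 by 100 pixels
     and may look flattened -->
<img src="dew.jpg" alt="Drops of dew on a blade of grass"
     width="300" height="100" />
```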
Headings
There are six levels of headings, from the largest, h1, to the smallest,
h6. Their general form is
<h1>CONTENT</h1>
with h1 replaced by h2, ..., h6 for the lower levels. As with other elements, the style attribute
can be used to specify the font size, background color, and text color.
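A short sketch of headings at three levels, the last one styled. The heading texts and the particular size and colors are arbitrary choices for illustration.

```html
<h1>Chapter title</h1>
<h2>Section title</h2>
<!-- font-size, color, and background set the font size,
     text color, and background color of this heading -->
<h3 style="font-size: 16pt; color: white; background: navy">
  A styled subsection title
</h3>
```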
Grouping Text
HTML offers several elements that can be used to structure your document into "spans of words," paragraphs, and divisions. When these elements are combined with CSS, they allow the web page designer to specify the style of different sections of the webpage. The span element is used to group together some part of a line (or lines)
of text. It has the form <span style="...">CONTENT</span>
and is used to apply a style attribute to a short inline segment of words and/or images.
The span element was introduced as a hook on which to attach CSS to small
segments of text.
The br element is used to insert line breaks into the page. It has the form <br />.
The p element is used to mark a paragraph. It has the form <p>CONTENT</p>,
where CONTENT may not include block-level elements such as p,
table, div, or h1.
Finally, the most general way to separate the content in a
page is to use the div element, which has the form <div>CONTENT</div>.
Note the difference between the p and div elements. The div
tag is similar to the p tag except that a div element
can contain a wider variety of tags. The p element should
only be used for paragraphs containing text and images.
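The four grouping elements can be combined as in the following sketch. The text, the gray background, and the red color are made up for illustration.

```html
<!-- A div may contain headings and paragraphs -->
<div style="background: #eeeeee">
  <h2>News</h2>
  <p>
    <!-- A span styles a few words; br starts a new line in the same paragraph -->
    This paragraph has one <span style="color: red">red phrase</span> in it.<br />
    This sentence starts on a new line of the same paragraph.
  </p>
  <p>A second paragraph inside the same division.</p>
</div>
```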
Preformatted text
Browsers, by default, will reformat any text that you provide so that it fits the page nicely. Thus, if you type a paragraph as one long line, the browser will generally add appropriate line breaks. Sometimes, however, one wants the browser to respect the formatting and not to insert any line breaks or remove any spaces or tabs. This effect is provided using the pre element, which has the form
<pre>
Pre
   Formatted
      Content
</pre>
The pre element typically contains text; HTML 4.01 does not allow img elements inside it.
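For instance, a small price list whose columns are lined up with spaces keeps its alignment inside a pre element. The items and prices are made up.

```html
<!-- Spaces and line breaks inside pre are shown exactly as typed -->
<pre>
Item        Price
----        -----
Apples       1.50
Oranges      2.25
</pre>
```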
Lists
HTML offers several different types of lists. We consider only two types here: ul and ol.
The ul element is used for "unordered (unnumbered) lists" and has the form:
<ul>
<li>Content ....</li>
<li>Content</li>
</ul>
A ul element must contain a sequence of li elements,
and each li element can contain any of the HTML elements that can appear
in the body. These lists are rendered with asterisks or bullets or some other
non-alphanumeric list item markers. CSS can be used to specify the type
of list item marker used, through the list-style-type property.
The ol element is used for "ordered lists" and has the same format:
<ol>
<li>Content ....</li>
<li>Content</li>
</ol>
Content in a list can contain any HTML elements you want,
including lists, links, images, and tables.
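The two list types can be nested, as this sketch suggests. The list items and the choice of square markers are made up for illustration.

```html
<!-- list-style-type selects the marker drawn in front of each item -->
<ul style="list-style-type: square">
  <li>Fruit
    <!-- An ordered list nested inside a list item -->
    <ol>
      <li>Apples</li>
      <li>Oranges</li>
    </ol>
  </li>
  <li>A link: <a href="http://www.w3.org">W3C</a></li>
</ul>
```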
Tables
Tables are a very useful formatting tool for web pages. They provide a mechanism for presenting tabular data and specifying how the table should appear on the page.
The general form of the table element is illustrated by the example below, where
- The table element can only contain tr elements.
- The tr elements correspond to the rows of the table and they in turn can only contain th or td elements.
- The th or td elements correspond to column headers (or table data, respectively) and they can contain anything:
text, links, images, lists, or tables!
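A minimal sketch of a table with this structure is given here. The names and room numbers are made up, and the attribute values (a border of 1, spacing of 2, padding of 5) are arbitrary.

```html
<table border="1" cellspacing="2" cellpadding="5">
  <!-- First row: column headers -->
  <tr>
    <th>Name</th>
    <th>Office</th>
  </tr>
  <!-- Remaining rows: table data -->
  <tr>
    <td>Alice</td>
    <td>Room 101</td>
  </tr>
  <tr>
    <td>Bob</td>
    <td>Room 102</td>
  </tr>
</table>
```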
The cellspacing attribute is used to
specify how much space should appear between the cells of each table.
The cellpadding attribute specifies how much space should appear
within a cell around the content of that cell.
The border attribute specifies a simple border around the table and each of its cells.
Tables can also have a width attribute, which specifies how wide the table should be.
Here is another example of using tables, which shows how to create a calendar using a table and how to use tables to lay out a page with text and pictures next to each other.
Note that the width="50%" attribute of the table makes the table
stretch halfway across the page.
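A fragment of a calendar laid out this way might look as follows. This is only a sketch: the weekday-only layout and the dates are made up.

```html
<!-- width="50%" makes the table half as wide as the page -->
<table border="1" width="50%">
  <tr><th>Mon</th><th>Tue</th><th>Wed</th><th>Thu</th><th>Fri</th></tr>
  <tr><td>1</td><td>2</td><td>3</td><td>4</td><td>5</td></tr>
  <tr><td>8</td><td>9</td><td>10</td><td>11</td><td>12</td></tr>
</table>
```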
rowspan and colspan
One final feature of HTML tables is the ability for the HTML designer to specify that
a given table cell (td or th) should occupy more than one row or column. This is done
by adding the rowspan or colspan attribute to the td or th.
For example, this can be useful
when creating a table where the first row stretches all the way across and contains a title.
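A sketch of such a table, with a title cell that spans all three columns. The title, days, and times are made up.

```html
<table border="1">
  <tr>
    <!-- One cell that is three columns wide -->
    <th colspan="3">Office Hours</th>
  </tr>
  <tr>
    <td>Mon</td><td>Wed</td><td>Fri</td>
  </tr>
  <tr>
    <td>9-10</td><td>1-2</td><td>3-4</td>
  </tr>
</table>
```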
When a cell extends down one or more rows, it occupies cells in some of the rows beneath it. Thus, the
cells listed in the corresponding tr elements fill only the non-occupied positions in those rows.
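This can be seen in a small sketch with made-up cell labels. The second tr lists only two cells, because its first column is already taken by the cell that extends down from the row above.

```html
<table border="1">
  <tr>
    <!-- A is two rows tall -->
    <td rowspan="2">A</td>
    <td>B</td>
    <td>C</td>
  </tr>
  <tr>
    <!-- Column 1 of this row is occupied by A, so D and E
         land in columns 2 and 3 -->
    <td>D</td>
    <td>E</td>
  </tr>
</table>
```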
Try to use rowspan/colspan to create the following table:
| 1 | 2 | 3 | 4 | 5 | |
| 6 | 7 | 8 | |||
| 9 | 10 | 11 | |||
| 12 | 13 | 14 | 15 | 16 | 17 |
Comments
Comments are pieces of text that are, for the most part, ignored by the browser. The only exception is in the <style> tag, which appears in the
head: there, the CSS style is enclosed in a comment so that older browsers
will ignore it. This was needed because older browsers would treat the CSS
specification as text and show it on the screen.
Comments are used to provide information to the person maintaining
the web page, not the person viewing it.
You add comments to an HTML page using the following syntax:
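A comment starts with the characters <!-- and ends with -->. The sketch below shows an ordinary comment, a comment that spans several lines, and a style element whose CSS is wrapped in a comment; the comment texts and the CSS rule are made up for illustration.

```html
<!-- This note is for whoever maintains the page; browsers do not display it -->
<p>This text is displayed.</p>

<!--
  A comment can also
  span several lines.
-->

<!-- Inside the head: CSS hidden from older browsers by a comment -->
<style type="text/css">
<!--
  body { color: red; background: black }
-->
</style>
```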
