HTML Authoring Guide

VERSION 980414

There is a lot of information on the World Wide Web about how to put your documents into the right format for the Web. This document lists some of these sources.

Introduction

HTML (Hypertext Markup Language) is fundamentally different from most text formatting. In most text formatting, the author describes how he wants each part of the document to look. In HTML, the author describes what purpose a given piece of text serves -- primary header, major header, minor header, subheader, numbered list, unnumbered list, normal text, emphasized text, and so on. This is called "generalized markup"; if you ever hear of anything called "SGML" (Standard Generalized Markup Language), it might be useful to remember that HTML is an application of SGML, that is, one specific markup language defined using SGML.
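
As a rough illustration of labeling text by purpose, here is a minimal sketch of an HTML fragment. The wording inside the tags is invented for the example, and matching "primary header" to H1 and "major header" to H2 is an assumption about what those terms refer to.

```html
<!-- Hypothetical content; only the labels (tags) matter here. -->
<h1>Primary header</h1>
<h2>Major header</h2>
<p>Normal text with <em>emphasized text</em> inside it.</p>

<!-- A numbered list -->
<ol>
  <li>First item</li>
  <li>Second item</li>
</ol>

<!-- An unnumbered list -->
<ul>
  <li>An item</li>
  <li>Another item</li>
</ul>
```

Nothing in the fragment says what font, size, or indentation to use; it only says what each piece of text is.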

Since the way the author writes the document is different, so is the way a reader reads the document. With a printed text document, the reader picks up a piece of paper and reads something that looks exactly the way the author laid it out. With an HTML document, the author labels text with its function, and the reader needs to interpret those functions. A reader does this by using a "client" program which takes the document and displays it on his screen, rendering each type of HTML label according to the way the reader has set up his program. The way the document looks when the reader sees it can vary tremendously, depending on what program he is reading the document with and the settings he has given that program.

This doesn't mean that the author has no idea how his text will look. There are standard settings: all header text is bold; major headers are much larger than normal text, and other levels of headers are progressively smaller; bold is bold, and strongly emphasized text is also bold; underlined text is underlined; italics are italic, and basic emphasized text, cites, and addresses are italicized as well; lists have their items indented; and so on. But the base font is whatever the reader chooses, and he can change the default settings; for example, if a reader finds italics too much of a nuisance to read, he can reset cites and addresses to be underlined instead. For this reason, functional labels such as emphasis, strong emphasis, and cites are generally preferred over direct control labels such as italics and bold: they let the reader choose the settings he finds most pleasant to read, while still communicating what type of text something is.
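
The difference between functional labels and direct control labels can be sketched as follows. This is only an illustration; the sentences, book title, and address are made up, and how each label is displayed depends on the reader's program and settings.

```html
<!-- Functional labels: say what the text is -->
<p>This point is <em>important</em>, and this one is <strong>critical</strong>.</p>
<p>See <cite>An Example Book Title</cite> for details.</p>
<address>Example Author, author@example.edu</address>

<!-- Direct control labels: say how the text should look -->
<p>This point is <i>important</i>, and this one is <b>critical</b>.</p>
```

With default settings the two paragraphs about "important" and "critical" will usually look the same, but only the first one lets a reader change how emphasis is shown without losing the fact that the text is emphasized.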

Differences from writing other documents

All of this makes writing documents in HTML very different from writing "normal" documents. The tools that one uses to write an HTML document are entirely different: instead of deciding how each part of the document looks, the author decides what type of text it is. Furthermore, instead of putting a lot of effort into making the document look exactly the right way with a single viewer (the printed page), an author needs to put a little effort into making the document look acceptable with several different viewers (the most common "client" programs readers will use to display the document).

Beneath all of the differences between HTML documents and "normal" word-processing documents, there are some important similarities. Word-processing documents use labels such as "start bold text here" and "end bold text here". Some word-processing programs (for example, WordPerfect) will display these labels; others (for example, Microsoft Word) hide this information from the author. However, these labels are there, hidden or not. In HTML, these labels are very explicit; they are simply ordinary text in between less-than and greater-than signs, such as this: <an HTML command here>.
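
For instance, the "start bold text here" and "end bold text here" labels described above might look like this in HTML. The sentence itself is just a placeholder.

```html
<!-- <b> means "start bold text here"; </b> means "end bold text here" -->
<p>This sentence has <b>two bold words</b> in the middle.</p>
```

The ending label is the same as the starting label with a slash added after the less-than sign.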

HTML will change

HTML is an evolving language. You don't need to worry much about this, since future HTML is designed to be "backwards-compatible": anything that is written in proper HTML now should remain "legal" in future versions of HTML. However, you should be aware that HTML will evolve and expand; tools that are not available now (tables and charts, for example) may be available in the future.

There is more information on HTML elsewhere:

Here are some places you can find information about writing HTML documents, basic CGI forms, software tools, and publishing at the University of Chicago:

HTML for Beginners -- start here if you're learning HTML
More advanced HTML -- miscellaneous HTML documentation
Reference Guides to HTML -- lists of tags and special characters
HTML Style -- how to design documents for the World Wide Web
CGI Forms -- HTML codes, forms theory, and sample programs
Collections of HTML Documentation -- other pages like this one
Software for HTML authoring -- HTML editors and translators
General WWW authoring -- assorted pages on WWW and HTML
New and non-standard HTML -- HTML 2.0, HTML 3.0, and Netscape
Miscellaneous other information -- assorted other useful pages
