A Guide to HTML Authoring

Or So how do I write a Web page then?






What's this "META" tag I keep hearing about?

The official HTML specification, RFC 1866, is particularly unhelpful on this tag, describing it only as "Associated meta-information".

One of the most useful attributes of the META tag should be HTTP-EQUIV. Whenever a web server serves a file, it first adds a few header lines, including, for example, a line telling the browser what sort of file (its MIME type) is coming. It is these header lines that tell a browser how to deal with the file it receives. (If, for example, a server hasn't been told to treat files with an htm extension, and not just an html extension, as HTML files, your browser will read them as text files and you'll see all the tags!) In principle, the HTTP-EQUIV attribute should either cause the server to add a header line or cause the browser to read the page as if a header line had been added. In practice I'm not sure whether it does either, and the specification (section 5.2.5) says that the method by which meta-information is extracted is "unspecified and not mandatory".
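To make the second reading concrete (the browser treating the tag as if a header line had been added), here is a minimal sketch in Python. It is only an illustration under my own assumptions, not how any particular browser or server does it: the class name MetaHeaderReader and the sample page are made up for the example.

```python
from html.parser import HTMLParser


class MetaHeaderReader(HTMLParser):
    """Collect META HTTP-EQUIV tags as if they were HTTP header lines.

    Hypothetical helper, for illustration only.
    """

    def __init__(self):
        super().__init__()
        self.headers = {}

    def handle_starttag(self, tag, attrs):
        # html.parser lower-cases tag and attribute names for us.
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("http-equiv")
        content = attributes.get("content")
        if name and content is not None:
            # Store under a header-style name, e.g. "Content-Type".
            self.headers[name.title()] = content


# Made-up sample page: one HTTP-EQUIV tag and one NAME tag.
page = """<HTML><HEAD>
<META HTTP-EQUIV="Content-Type" CONTENT="text/html">
<META NAME="keywords" CONTENT="ignored, because it is not HTTP-EQUIV">
</HEAD><BODY>Hello</BODY></HTML>"""

reader = MetaHeaderReader()
reader.feed(page)
for name, value in reader.headers.items():
    print(f"{name}: {value}")
```

Only the HTTP-EQUIV tag is picked up; the NAME="keywords" tag is left for search engines to deal with.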

Another use is offered by the various search engines. Both Alta Vista and Infoseek suggest using META tags to supply keywords and descriptions for searching, rather than relying on their automatic indexing. Thus you might include in the head section of your page:

<META NAME="description" CONTENT="The description you want">
<META NAME="keywords" CONTENT="The keywords you want searches to pick up">

The Netscape browser also recognises an HTTP-EQUIV function of REFRESH, which can be used to load a new page a particular time after loading the first page or (if the URL argument is a sound file) to play music once the page is loaded. If for example you have an introduction page at http://www1.url/intro.html and a second page at http://www2.url/page_2.html, you might include in the head section of the page at http://www1.url/intro.html a line:

<META HTTP-EQUIV="REFRESH" CONTENT="10; URL=http://www2.url/page_2.html">

This will send the Netscape browser to http://www2.url/page_2.html ten seconds after accessing http://www1.url/intro.html.
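The CONTENT value packs two things into one string: the delay in seconds and, optionally, the URL to load. As a rough sketch of how a program might pull them apart (the function name parse_refresh is my own invention, and real browsers may well be more forgiving about the format):

```python
def parse_refresh(content):
    """Split a REFRESH CONTENT value into (delay_seconds, url_or_None)."""
    delay_part, _, rest = content.partition(";")
    delay = int(delay_part.strip())
    rest = rest.strip()
    url = None
    # The URL part is optional; without it the same page is simply reloaded.
    if rest[:4].upper() == "URL=":
        url = rest[4:].strip() or None
    return delay, url


print(parse_refresh("10; URL=http://www2.url/page_2.html"))
print(parse_refresh("30"))
```

The first call gives a delay of 10 and the second page's URL; the second call gives a delay of 30 and no URL.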

For more information on the META tag see



Help! It doesn't work

The Virtual Library's WDVL Message Board may be able to answer HTML questions.



How do I check that it works?

There are several things you need to check:

Checking your HTML Tags

You can have your published pages validated against HTML specifications of various types, including those for Mozilla (Netscape) and MS Internet Explorer, by entering the URL of your page in the form at the WebTechs HTML Validation Service (formerly Hal's HTML validator, or HAL, by which name it is referred to on this page).

Hal's validator is excellent but it does produce some cryptic error messages. Fortunately, The Unofficial WebTechs HTML Validator FAQ explains them all.

A more general HTML checker is at Weblint.

Both Weblint and HAL will allow you to type particular lines of HTML markup into a form directly to check, so you don't need to have published the page first.

Another one, which runs on your own computer under Awk or Perl, is htmlchek. There are also links for obtaining Awk and Perl from here, but neither runs under Windows, although there are versions for DOS and the DOS version of Perl has some sort of Windows support. I couldn't be bothered. You can also do an Online htmlcheck, which may be more worthwhile.

One I have just found (for the more advanced user), which works under DOS, Win95 and Win NT amongst other platforms and lets you check your pages locally before uploading, is SP (nsgmls.exe), which is a "proper" SGML parser. If you want DTDs for it, I found a collection at the KGV Library.

Finally for online validating there's A Kinder, Gentler HTML Validator (KGV), which is easier to interpret but requires a DOCTYPE declaration as the first line if you're using any version of HTML other than Level 2.0. There's now also an Unofficial Kinder, Gentler HTML Validator FAQ. Beware that several of the HTML editors use invalid or unrecognisable DOCTYPE declarations. With Netscape extensions, the correct form of declaration should be:
<!DOCTYPE HTML PUBLIC "-//Netscape Comm. Corp.//DTD HTML//EN">
If you're using HTML Level 3.0, the form accepted by HAL (when last checked) is
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 3.0//EN">
although that is not the current correct declaration for HTML Level 3.0 (the IETF are no longer the sponsors of HTML Level 3.0). Actually HAL will accept several versions, which are listed on the relevant HAL page, but they don't include the two correct versions:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML//EN">
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3 1995-03-24//EN">
HTML Level 3.0 no longer exists and has been replaced by HTML Level 3.2. For this level, the correct declaration is
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
or, if you're also using Cascading Style Sheets (you'll know if you are),
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML Experimental 970421//EN">
For Level 4, use one of
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Draft//EN">
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Final//EN">
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0//EN">

Other than to use the KGV Validator, there appears to be no benefit to specifying a DOCTYPE and I have heard that some of the larger search engines do not recognise pages with DOCTYPE declarations.
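If you want to see which declaration (if any) one of your pages actually carries before submitting it, something along these lines would do. It is a small sketch in Python with names of my own choosing (DOCTYPE_PATTERN, public_identifier), and it only looks at the first non-blank line; it does not validate anything.

```python
import re

# Matches a declaration such as <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
DOCTYPE_PATTERN = re.compile(r'<!DOCTYPE\s+HTML\s+PUBLIC\s+"([^"]+)"', re.IGNORECASE)


def public_identifier(page_text):
    """Return the public identifier from a leading DOCTYPE line, else None."""
    stripped = page_text.lstrip()  # tolerate blank lines before the declaration
    first_line = stripped.splitlines()[0] if stripped else ""
    match = DOCTYPE_PATTERN.match(first_line)
    return match.group(1) if match else None


with_doctype = '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">\n<HTML><HEAD></HEAD></HTML>'
without_doctype = "<HTML><HEAD></HEAD></HTML>"

print(public_identifier(with_doctype))
print(public_identifier(without_doctype))
```

The first page reports its identifier, -//W3C//DTD HTML 3.2//EN, and the second reports None, which is the case KGV would treat as Level 2.0.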

An interesting checker is Doctor HTML's Web Page Analyser. It performs a number of checks, including spelling, HTML checking and link checking. I haven't yet used it thoroughly but the HTML checker seems only to check for unclosed tags (including those that are optional) and the link checker seems to take rather a long time. I don't regard this as a substitute for the other checkers above.

See also the HTML Validators section of A Short Introduction to Interesting WWW Links by Heikki Kantola.

Check out Yahoo's List of HTML Validation and HTML Checkers for more information (and for more software).

The new free version of HoTMetaL, HoTMetaL 2.0 Free, includes a validation command which will test your pages against HTML Level 2.0, HTML Level 3.0 and Level 2.0 with Netscape extensions. As mentioned below, though, HoTMetaL writes and validates code that HAL rejects as invalid (although current browsers can still read it). The link is to the download pages, but you can also get details of the product and the Pro version by following links from that page.

I used HAL (hence the logos on these pages) but was most disappointed when, having validated the pages to Netscape specifications using HoTMetaL 2.0 Free, they failed the HAL test. HAL is strict (and indeed rejects some Netscape codes such as BORDER as ambiguous - it requires BORDER=1 to distinguish from BORDER=0). With the exception of the Internet Search page (which uses a table and has a text alternative), the only tags I use which do not comply with HTML Level 2.0 are image attributes (BORDER, HEIGHT, WIDTH, HSPACE and VSPACE), the CLEAR=ALL attribute to the <BR> tag and the ALIGN=CENTER attribute which will all be ignored by browsers that don't support them. For readers in Europe, note that the HTML tags use American spelling and "CENTRE" won't work.

Checking your links

The other problem you will have is that of broken links. Even if all the links were correct when you wrote them, by the time your reader gets to them the URLs may well have changed or been dropped, and your site cannot be declared "under construction" for ever!

The easiest way to check for broken links is to use an automatic link checker, which will ensure that your URLs point to real pages on real sites.

Life is much easier in this respect if you own your own server and run Unix. I could only find one checker that would work for the Windows user who is using someone else's server (such as CompuServe) and that is Checker for MS-Windows. It only checks http:// URLs and ignores images, gopher:// and file:// URLs. It also cannot deal with internal or relative links which throw up error messages. You run it on your pages on your PC before you publish them. It will also tell you when the linked page last changed. What it won't do is check that the contents of that page are what you intended, so if the author has kindly left a page saying that his URL has moved, Checker will say that the link worked. Nevertheless I recommend it.
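The first step any such checker has to take is deciding which links it will test at all. The sketch below, in Python, mimics the behaviour described above (http:// links only; images, gopher://, file:// and relative links left alone). It is my own illustration, not Checker's actual code, and the class name HttpLinkCollector and the example.com addresses are made up.

```python
from html.parser import HTMLParser


class HttpLinkCollector(HTMLParser):
    """Sort A HREF targets into http:// URLs to check and everything else."""

    def __init__(self):
        super().__init__()
        self.to_check = []
        self.skipped = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return  # images and other tags are ignored
        href = dict(attrs).get("href")
        if not href:
            return  # e.g. <A NAME="..."> anchors have nothing to check
        if href.lower().startswith("http://"):
            self.to_check.append(href)
        else:
            self.skipped.append(href)  # relative, gopher://, file://, ...


# Made-up page fragment with one link of each kind.
page = (
    '<A HREF="http://www.example.com/">one</A> '
    '<A HREF="page2.htm">two</A> '
    '<A HREF="gopher://gopher.example.com/">three</A> '
    '<IMG SRC="http://www.example.com/logo.gif">'
)

collector = HttpLinkCollector()
collector.feed(page)
print("To check:", collector.to_check)
print("Skipped:", collector.skipped)
```

Each URL in to_check would then be requested in turn. A page that loads successfully counts as a working link, which is exactly why a "this page has moved" notice still passes.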

For those that can use them, you could also try:

NetMind Free Services has a service called URL-minder which sends you Email when your registered Web page(s) change. You fill in the URL you want to monitor in a form. It would take a while to cover all the links in your pages this way, when all you really need to know is if the URL changes. You can also provide this service to your own readers (but given my policy of continuous improvement registering these pages may prove frustrating). I think this is so good that there's a form below.

I should also mention here Peter Flynn's "http-by-mail" service. To use it, send a one-line message in the form GO [full URL preceded by http://] to the address webmail@www.ucc.ie and you will receive back both the source (HTML) and a UUENCODED copy of the page in ASCII format.

Checking your spelling

No matter how good your content, if the page looks horrid, or is littered with spelling errors, your readers will leave. No HTML validator will check your spelling, but WebSter's Dictionary will spell check (American spelling unfortunately) your pages once they're up. You fill in the URL in a form and wait a moment and then a list of misspelled words is returned. I got (for example) andrew, andrewh, co, com, dircon, and uk when I tested these pages, as well as the names of all the search engines on the search page.



How do other people find my creation?

One easy way is to use Submit It! To use their words, "Submit It! is a free service designed to make the process of submitting your URLs to a variety of WWW catalogs faster and easier. Register with over 15 different catalogs, but fill out just one form." With some catalogs you need to add more, but in general this seems true. Don't bother with the NIKOS web index, though; it's been discontinued.

Joe Thomas's wURLd Presence is another site that allows you to input your information into one form to register with sixteen search engines including Alta Vista, InfoSeek and Harvest Broker. Perhaps it is easier to use than Submit It!.

The Housernet Homepage Directory's goal is "to someday [sic] have a link to every personal homepage that exists". They've a way to go yet, but you can register your page from this link.

You can also submit your page's URL to the NCSA What's New Page (that page has details on how to submit your listing).

For more, check out Pete Page's How to Announce your New Web Site.



How do I know how many people have seen my page?

or How do I put a counter on my page?

Before putting a counter on your page, know that Internet purists do not approve. A CGI counter puts more load on servers while a gif-based counter consumes bandwidth.

If you run your own server, or have access to its logs, the recommendation is to use one of the many statistics tools available to analyse the access log of your web server. Even if you are not the webmaster of your server, your admin may give you read-only access to the log files. Also check out Tim Drozinski's FAQ on counting accesses without the need for CGI programs. Most of the techniques recommended there don't abuse the server, which is "a Good Thing" (Source: The World Wide Web Frequently Asked Questions).
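If you do get read access to the log, counting hits takes only a few lines. Here is a minimal sketch in Python, assuming the log is in the widely used Common Log Format (your server's format may differ, so check before relying on it). The names LOG_LINE and count_hits and the sample log lines are invented for the example.

```python
import re
from collections import Counter

# Assumed layout: host ident user [date] "METHOD path protocol" status bytes
LOG_LINE = re.compile(r'^\S+ \S+ \S+ \[[^\]]+\] "(\S+) (\S+)[^"]*" (\d{3}) \S+')


def count_hits(log_lines):
    """Count successful GET requests per path."""
    hits = Counter()
    for line in log_lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # skip lines that are not in the assumed format
        method, path, status = match.groups()
        if method == "GET" and status == "200":
            hits[path] += 1
    return hits


# Made-up log lines; a real log would be read from the server's log file.
sample_log = [
    'host1.example.com - - [13/Jul/1997:22:06:38 +0100] "GET /html2.htm HTTP/1.0" 200 20480',
    'host2.example.com - - [13/Jul/1997:22:07:01 +0100] "GET /html2.htm HTTP/1.0" 200 20480',
    'host2.example.com - - [13/Jul/1997:22:07:05 +0100] "GET /missing.htm HTTP/1.0" 404 210',
    'host3.example.com - - [13/Jul/1997:22:09:12 +0100] "GET /index.htm HTTP/1.0" 200 5120',
]

for path, count in count_hits(sample_log).most_common():
    print(count, path)
```

This counts only requests that succeeded (status 200), so the failed request for /missing.htm is left out. Because it reads a file the server writes anyway, it puts no extra load on the server when pages are viewed.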

Also check out Yahoo's Index of access counter software.

If you do not have CGI access or access to your server's logs, you can either use a gif-based counter like mine (see below) or there's a shareware audit system (covering up to ten pages) available at WebAudit, which I have now adopted and paid for (shock horror!) because it will monitor hits on several separate pages.

The htmlZine Statistical Counter Service (which charges USD1 per page per month) keeps quite good statistical logs on your pages including browser and referring page information, but only tracks pages loaded by graphical browsers (it monitors a small gif on your page).

There's also the Internet Audit Bureau's IAudit service which provides you with access statistics free, but only provides code for one page and when I tried it the server was overloaded.

You can get a gif-based web-counter for your pages (like the one on my home page) to let you know how many times they have been accessed. You can choose the number style you want.

Another gif-based counter is available at John Anthony Ruchak's site (courtesy of his ISP, Microserve Information Services). It is free and very customisable and I think you can have as many accounts as you like (I haven't tried). The only restriction is that Microserve ask for a link from every page with the counter. One piece of missing advice: if you do not pick a unique name, you'll get someone else's counts, and there's no indication (other than that the initial count is not 1) that you've picked a duplicate name.

For more see the Web Page Counter Links page at Surfing Links.




The URL-minder: Your Own Personal Web Robot!
Fast Form

This form is for users who already know what to do. For an explanation of what the URL-minder does for you, go to the main page.

This form registers you to receive Email when the selected URL changes.

Enter your Internet e-mail address:

Enter your Name (handles OK):


Enter or paste in a URL (http, ftp, or gopher):

Go to The URL-minder: Your Own Personal Web Robot!


This is Andrew's Web Resources - HTML Authoring Page. For further information, contact me:

by mail at:

Grinton
Aldenham Grove
Radlett
Herts WD7 7BW
England, UK

by Email at andrew at hougie dot co dot uk

or via the Feedback page.

Comments on these pages via the User Survey page are always welcome.


This page is © 1995-98 Andrew Hougie. The right of Andrew Hougie to be identified as the author of this work has been asserted by him in accordance with the Copyright Designs and Patents Act 1988.
Last revision: 22:06:38 on Sunday, 13 July, 2003
Page 11a of 16 - http://www.hougie.co.uk/html2.htm
Principal site (UK) - http://www.hougie.co.uk/html2.htm
Principal site (US) - http://www-us.hougie.co.uk/html2.htm
UK mirror - http://www-uk.hougie.co.uk/html2.htm


Validated by W3C HTML Validation Service to HTML Level 3.2 (Wilbur) on 27 July 1997.

