
HTML Writing Tips
and Search Engine Strategies

The statement in the buff coloured panels below was true at the time it was written, but as time has progressed and standards have moved on to XHTML it is no longer valid. I have left it here to show the logic that was used.

HTML tag notation... The rules say that tags can be upper case, lower case, or a mixture of both. In many cases it does not matter which is used, but in some instances there is a speed gain when typing, by saving a few shift-key changes, in not changing case unless forced to by the character concerned... I will explain this by examining my code for the page heading above:-

<H1>HTML Writing Tips <SMALL>and Search Engine Strategies</small></h1>

The first character is '<' and is upper case, so the 'H' that follows it is upper case as well. The next character is a '1', which is lower case, and the final character in this group is the '>', which is upper case anyway. This tag would have had the same number of case changes if it had been written <h1>; but if we examine the opening <SMALL> tag, all the characters are upper case, so there is no case change at all. The closing </small> tag has the same number of case changes whether the word small is upper or lower case, but I generally use lower case for closing tags so that I get consistency... with opening tags as UPPER CASE and closing tags as lower case. The closing </h1> tag also has the same number of case changes regardless of the case used.

!DOCTYPE Declaration... I am now convinced that this is essential, and it must conform to one or other of the Doctypes illustrated on the W3C page...

http://www.w3.org/QA/2002/04/valid-dtd-list.html
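For example, the XHTML 1.0 Transitional Doctype from that list (one of several valid choices; which you pick depends on the standard you are writing to) is placed at the very top of the page, before the <html> tag:-

```html
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
 "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
```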

Document Title <title></title>... This occurs between the <head></head> tags, and the words used need to be accurately descriptive and definite, because all search engines give considerable weight to this one piece of information. This can sometimes be quite difficult to achieve if you yourself are very familiar with what the site is about. A friend of mine, Tim Vaughan, has a landscaping business, 'Evergreen Landscaping', and part of this business is 'sprinkler repairs'. He has just started a website and one of his pages is about sprinkler repairs, but because he is knowledgeable about his subject he titled the page 'Sprinkler Repairs'. He asked my opinion of the page and I said that I thought 'garden sprinkler repairs' was more descriptive and definitive... If you put 'sprinkler' into a search engine you will get 541,000 possible replies... 'Garden sprinkler repairs' brings back only 8,210 hits, but they are all relevant. (I used Google for this example.)
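To put that into markup, the title for Tim's page would be written between the <head></head> tags like this (the exact wording shown is my suggestion, not necessarily what Tim finally used):-

```html
<head>
<title>Garden Sprinkler Repairs - Evergreen Landscaping</title>
</head>
```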

The META Description Tag is used by almost all search engines. There is a 200 character limit, but many search engines do not scan the whole 200 characters, so keep your description short, to the point and in plain prose, using one sentence or at most two. It should reflect elements in the document title and in the body text. One of the methods that I use is to write the description very early in the process of writing the page and copy and paste it into the page body. Then, as I write my text, I copy and paste fragments or phrases from this description where they are appropriate. The meta description tag for this page is displayed below:-

<meta name="description" content="A tutorial on the proper use of some META tags, various HTML Writing Tips, strategies used by Search Engines to rank your pages along with strategies that a website compiler may use to aid the visibility of his/her pages in search engine rankings." />

The whole tag is actually written on only one line, as are all my meta tags, including the keyword tag, which can be very long indeed. This is no longer essential as far as W3C is concerned, but many of the minor search engines will trip up on a carriage return character that occurs within the tag. Minor search engines may be minor, but they have their uses, and it is silly to bar your website from being indexed by them.

meta keywords are often missed out altogether. They are commonly just jotted down as general terms that relate to the website concerned, and in most cases very little thought is put into their generation. Using Tim's garden sprinkler repairs page as an example, his keywords were:-

<meta name="keywords" content="sprinklers,sprinkler repairs,extending sprinklers,valves,timers,sprinkler installation" />

I explained that his keywords were not quite definite enough: since he was not involved in factory sprinklers or car repairs, his keywords would get lost in millions of unspecific repair hits, or hits on auto valves. I thought his keywords should be...

<meta name="keywords" content="garden sprinklers,garden sprinkler repairs,extending garden sprinklers,garden sprinkler valves,garden sprinkler timers,garden sprinkler installation" />

Try to imagine what a person may be thinking when he (or she) is typing a search term into their favourite search engine.

Keywords should also have identical terms within the text of the document, and generally the more times a keyword is repeated within the text the greater the weight that it carries. There are exceptions to this, and if you overdo this feature the search engine's spider may consider it 'spamming' and may not index your site at all.

Comma delimiting of keywords... Now here is a huge problem that has developed! The delimiter between keywords is the humble comma; however, when we write English we naturally put a space after a comma, and this practice has crept into keyword writing:-

If you do this your META keywords will not work... The comma is the delimiter, so the leading spaces become part of the keyword, and anyone searching would have to include a leading space to get a match.
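As an illustration (the keywords here are invented for the example), the first tag below shows the faulty style; with the spaces present, the second keyword is actually ' sprinkler repairs', leading space included. The second tag is the correct form:-

```html
<!-- Wrong: the space after each comma becomes part of the next keyword -->
<meta name="keywords" content="garden sprinklers, sprinkler repairs, valves" />

<!-- Right: the comma alone does the delimiting -->
<meta name="keywords" content="garden sprinklers,sprinkler repairs,valves" />
```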

Some search engines are becoming smart to this, but it is still wrong. The inclusion of spaces also detracts from the 1,000 character limit that is placed on keyword fields. The problem is made worse because it has been propagated by some very poorly written 'Internet Tutorials', some of which are misleading or inaccurate on other similar topics. Recently, in an attempt to reduce spamming, much less emphasis has been attached to keywords by the major search engines... They no longer index the keywords themselves, but they do compare them to the body text and make judgments about their usage and context. For this reason it is still worth using keywords, but it places an even greater importance on the way the text is written, as you now have to satisfy the robot that your keyword was validly used in the page concerned. In any case there are thousands of minor search engines that still do index keywords, and as those robotic systems are themselves crawled by the major engines' spiders, they influence the rating of your webpage or website.

Alternative spellings or alternative terminology (US/UK English) and common misspellings can also be included with your keywords, but leave any misspellings or less important keywords to the end of the keyword list, because some search engine protocols will truncate your list to less than the 1,000 character limit.

The meta keywords used on this page are shown below...

<meta name="keywords" content="HTML Writing Tips,Search Engine Strategies,proper use of META tags,rank your pages,website compiler,search engine rankings,HTML Goodies,W3C,HTML tag notation,!DOCTYPE Declaration,Document Title,Evergreen Landscaping,Google,META Description Tag,200 character limit,minor search engines,META Keywords,favourite search engine,Comma delimiting of keywords,leading spaces,1000 character limit,Internet Tutorials,major search engines,Alternative spellings,US/UK English,page heading,less than perfect eyesight,Favicons,revision dates,G8MZY,Amateur Radio Call sign,resistor colour code,Handwritten code,HTML source code,WYSIWYG application,hand coded HTML,WYSIWYG software,different screen resolutions,Search Engine Strategies,crawling spiders,Spidering or crawling,deep crawling,search term,Clickthrough,Human compilation,Paid-for rating,sponsored links,search engine collusion,active agent,ALT="Text" attributes" />

The page heading should also mirror the words used in the 'document title', and the weight of emphasis that is used in page headings is noted and graded by most search engines. I use <h1></h1> as a standard for this purpose, but I also write my pages to give a high contrast and bold appearance to help those with less than perfect eyesight.

alt="text" attributes in img statements are often ignored or considered redundant. They do however, serve two major functions... One is as an indexable label that search engines can use to caption the image when it is displayed in image search facilities. In the second place some search engines enhance a page's rating if they are present and relevant. Although not an important point, I also find that they are a useful place to give credit to the provider of a photograph. In addition the 'title' attribute can serve as an extra place to get keyword repeats as well as providing 'pop-up' descriptions if the 'Firefox' browser is being used by your surfer.

I should explain a little more on alt="Text" attributes... They stem from the time when many fledgling browsers could not handle pictures at all. So the alt="Text" attribute was brought into the HTML standard. This was situated within the img tag used for images and did not affect those browsers that could actually handle the image display. The idea was that you could write a complete description of the content of the picture (where I used the word Text) which would be displayed instead of the picture itself in those browsers without graphics capability.

Text only browsers are incredibly rare these days, but the 'alt' attribute still exists. In the way most current browsers use it, it will show up as a 'tool tip' a couple of seconds after you hover your mouse pointer on the picture, so you can use it to give additional information about the picture. The text will also show within an image placement rectangle whilst the image itself is being loaded.

The text in the alt attribute is only on the screen for a short time, or while the mouse pointer is over the picture (depending on the browser). However, the alt text is recognised as text by a search engine's spider, so you can use it to amplify the strength of a keyword that already exists elsewhere on the page, or to add an extra keyword by including it within the alt text.

There is another reason to use alt="Text" attributes... Some search engines and many HTML validators require it to be present before the page is indexed. There are also validating systems used by some web space providers that will trip up if they are not used... If you do not wish to use the facility, you can satisfy the validators by using the attribute alt="&nbsp;", which may not show up at all, or at worst will give a character sized rectangle as a tool tip.
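Putting the above points together, a typical img tag might look like this (the filename, dimensions and wording are invented purely for illustration):-

```html
<img src="sprinkler-head.jpg" width="200" height="150"
 alt="Photograph of a pop-up garden sprinkler head, photo courtesy of Tim Vaughan"
 title="garden sprinkler repairs" />
```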

Favicons are relatively new and were originally specific to the 'Internet Explorer' browser, but this is changing, and by the time this page is widely read it will be a common feature in all major and some minor browsers. For safety you should create your icons on a 16 x 16 pixel grid, using the standard 16 colour windows palette.

I have a number of different icons for different sections of the website and I use a link of the form:-

<link rel="shortcut icon" href="../ico/html/favicon.ico">

This occurs between the <head></head> tags. The favicon.ico referred to is the one that I have generated and use for any pages that relate to HTML and the writing of it... If you add this page to your 'favourites' you will see it instead of whatever icon your browser uses as standard.

/ico/ is the directory in which the icons for the computing section are stored (in my particular case), and /html/ is a subdirectory that enables the 'html icon' to have the name 'favicon'.

Keeping track of revision dates may appear boring, but it is very useful and I have had many comments from surfers about it. I have recently added another feature to enable me to keep track of the standard of coding that I use to create or update a page. This consists of a tiny .gif image in the extreme bottom right hand corner of the page. The .gif has the legend G8MZY (my Amateur Radio Call sign) followed by an issue number. The standard resistor colour code (0 black, 1 brown, 2 red, 3 orange, 4 yellow, 5 green, 6 blue, 7 violet, 8 grey, 9 white) is used to code the last digit of the year and the issue number. The first of these .gifs was generated in 2003 and is issue one; thus the G8MZY is in orange for '3', and the .1 that denotes the issue number is in brown.

Handwritten code versus code generated by WYSIWYG application software... The choice is yours, but I have yet to see any HTML source code generated by a WYSIWYG application that comes close to the efficiency of well written hand coded HTML. Nor have I ever seen any software generated HTML code that was anything like as compact as hand written code. However, if you peruse the source code of my website you will find plenty of errors that I have not had time to correct.

Layout produced by WYSIWYG software often cannot accommodate different screen resolutions very well. I think this is mainly due to the software writers' previous knowledge of office suites and word processing packages, which are fairly rigid in page format. We should remember that the browser is the arbiter of layout on a screen, and we should write our code to allow the browser to use its ability, and the resources available to the viewer's computer, to the full, rather than trying to impose a style or layout rigidly.

There is another benefit in learning to code HTML by hand: the time taken to learn HTML can easily be close to the time required to learn how to use a software package, and with hand coding there are no limits imposed on layout, as there are with some WYSIWYG applications.

Search Engine Strategies... in utilising the information that the crawling spiders bring back.

Search Engine   | Majority Results | Results Provider | Paid Entries    | Directory      | Backup
----------------|------------------|------------------|-----------------|----------------|---------------
AllTheWeb       | Spider           | AllTheWeb        | Overture, Lycos | -              | -
AltaVista       | Spider           | AltaVista        | Overture        | LookSmart      | LookSmart
AOL             | Spider           | Google           | Overture        | Open Directory | ?
Ask Jeeves      | Spider           | Teoma            | Espotting       | Ask Jeeves     | Ask Jeeves
Google          | Spider           | Google           | Google          | Open Directory | Google
HotBot          | Clickthrough     | AllTheWeb        | N/A             | Open Directory | Inktomi
LookSmart       | Human            | LookSmart        | LookSmart       | Google         | Inktomi
Lycos           | Spider           | AllTheWeb        | Overture, Lycos | Open Directory | Open Directory
MSN             | Human            | LookSmart        | Overture        | -              | Inktomi
Overture        | Paid             | Overture         | Overture        | -              | Inktomi
Open Directory  | Human            | Open Directory   | N/A             | -              | -
Teoma           | Spider           | Teoma            | Google          | -              | -
Yahoo           | Spider           | Google           | Espotting       | Yahoo          | Yahoo

How Search Engines obtain their information... Various methods are used by different search engines; by being aware of these methods we can ensure that we get a 'fair' rating for our pages. There is little point in trying to cheat or gain popularity by artificial methods, as your pages will stand and fall on their merit and honesty, but we can get a head start on those website compilers that pay no attention to this aspect of page writing.

Spidering or crawling is the method used by most of the top ranked search engines, but there are other methods, as is shown in the table above, which was produced from information generated by Searchenginewatch.com at the time this page was originally written. Most of the high visibility search engines use more than one method; the second column shows the method that is the major influence for each search engine in the list. Some spiders only look at the index page of a website, but others, particularly Google and AllTheWeb, will crawl deep into a site and index information on sub pages. I do not know how deep this deep crawling goes, but most of my individual pages will come up in Google within about the top twenty hits, providing the search term is specific enough.

Clickthrough measures traffic to and from sites and uses this as its yardstick for rating.

Human compilation... Apart from it being done by a human editor instead of a robot, I do not know what criteria are assessed to generate the ranking.

A 'Paid-for' rating may be the best course in gaining visibility for a website that has not been compiled by hand, as 'sponsored links' are always among the first few results returned.

Collusion with other search engines may occur by business agreement, or a database may be hacked into by an 'active agent'. Whichever is the case, comparisons and data swapping do occur, and I think this only serves to increase the visibility of well written websites at the expense of those that are not so well written.

There is a link to Joe Burns' website, HTML Goodies, which is included because I have his 'HTML Goodies' book on my shelves and find that I am in agreement with much that he says (certainly about older HTML standards). By following this link you will find many other useful links, but I urge you to check anything you find on HTML writing against W3C standards, as indeed you should check everything that I have said on this page.

Written... 18/19/20/21 February 2003, Modified... 23 February 2003, Additions... 25 February 2003, New Domain... 04 November 2003, Corrected... 23 February 2004, Re-written and Upgraded... 17 July 2006.
This page has been validated by W3C; Javascript navigational elements were removed as per W3C Link Checker version 4.1 (c) 1999-2004 requirements.