
HTML Writing Tips
and Search Engine Strategies

The statement in the buff coloured panels below was true at the time it was written, but as time has progressed and standards have moved on to XHTML it is no longer valid. I have left it here to show the logic that was used.

HTML tag notation... The rules say that tags can be upper case, lower case, or a mixture of both. In many cases it does not matter which is used, but in some instances there is a speed gain when typing, by saving a few shift-key changes, in not changing case unless forced to by the character concerned... I will explain this by examining my code for the page heading above:-

<H1>HTML Writing Tips <SMALL>and Search Engine Strategies</small></h1>

The first character is '<' and is upper case, so the 'H' that follows it is upper case as well. The next character is a '1', which is lower case, and the final character in this group is the '>', which is upper case anyway. This tag would have had the same number of case changes if it had been written <h1>; but if we examine the opening <SMALL> tag, all the characters are upper case, so there is no case change at all. The closing </small> tag has the same number of case changes whether the word small is upper or lower case, but I generally use lower case for closing tags so that I get consistency... with opening tags as UPPER CASE and closing tags as lower case. The closing </h1> tag also has the same number of case changes regardless of the case used.

!DOCTYPE Declaration... I am now convinced that this is essential, and it must conform to one or other of the Doctypes illustrated on the W3C page...

http://www.w3.org/QA/2002/04/valid-dtd-list.html
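For example, the XHTML 1.0 Transitional Doctype from that list (one of several valid choices; which you pick depends on the standard you are writing to) is placed at the very top of the page, before the <html> tag:-

```html
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
 "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
```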

Document Title <title></title>... This occurs between the <head></head> tags, and the words used need to be accurately descriptive and definite, because all search engines give considerable weight to this one piece of information. This can sometimes be quite difficult to achieve if you yourself are very familiar with what the site is about. A friend of mine, Tim Vaughan, has a landscaping business, 'Evergreen Landscaping', and part of this business is 'sprinkler repairs'. He has just started a website and one of his pages is about sprinkler repairs, but because he is knowledgeable about his subject he titled the page 'Sprinkler Repairs'. He asked my opinion of the page and I said that I thought 'garden sprinkler repairs' was more descriptive and definitive... If you put 'sprinkler' into a search engine you will get 541,000 possible replies... 'Garden sprinkler repairs' brings back only 8,210 hits, but they are all relevant. (I used Google for this example.)
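To put that into markup, the title for Tim's page would be written between the <head></head> tags like this (the exact wording shown is my suggestion, not necessarily what Tim finally used):-

```html
<head>
<title>Garden Sprinkler Repairs - Evergreen Landscaping</title>
</head>
```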

The META Description Tag is used by almost all search engines. There is a 200 character limit, but many search engines do not scan the whole 200 characters, so keep your description short, to the point and in plain prose, using one sentence or at most two. It should reflect elements in the document title and in the body text. One of the methods that I use is to write the description very early in the process of writing the page and copy and paste it into the page body. Then, as I write my text, I copy and paste fragments or phrases from this description where they are appropriate. The meta description tag for this page is displayed below:-

<meta name="description" content="A tutorial on the proper use of some META tags, various HTML Writing Tips, strategies used by Search Engines to rank your pages along with strategies that a website compiler may use to aid the visibility of his/her pages in search engine rankings." />

The whole tag is actually written on only one line, as are all my meta tags, including the keyword tag, which can be very long indeed. This is no longer essential as far as W3C is concerned, but many of the minor search engines will trip up on a carriage return character that occurs within the tag. Minor search engines may be minor, but they have their uses, and it is silly to bar your website from being indexed by them.

meta keywords are often missed out altogether. They are commonly just jotted down as general terms that relate to the website concerned, and in most cases very little thought is put into their generation. Using Tim's garden sprinkler repairs page as an example, his keywords were:-

<meta name="keywords" content="sprinklers,sprinkler repairs,extending sprinklers,valves,timers,sprinkler installation" />

I explained that his keywords were not quite definite enough: since he was not involved in factory sprinklers or car repairs, his keywords would get lost in millions of unspecific repair hits, or hits on auto valves. I thought his keywords should be...

<meta name="keywords" content="garden sprinklers,garden sprinkler repairs,extending garden sprinklers,garden sprinkler valves,garden sprinkler timers,garden sprinkler installation" />

Try to imagine what a person may be thinking when he (or she) is typing a search term into their favourite search engine.

Keywords should also have identical terms within the text of the document, and generally the more times a keyword is repeated within the text the greater the weight that it carries. There are exceptions to this, and if you overdo this feature the search engine's spider may consider it 'spamming' and may not index your site at all.

Comma delimiting of keywords... Now here is a huge problem that has developed! The delimiter between keywords is the humble comma; however, when we write English we naturally put a space after a comma, and this practice has crept into keyword writing:-

If you do this your META keywords will not work... The comma is the delimiter, so the leading spaces become part of the keyword, and anyone searching would have to include a leading space to get a match.
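As an illustration (the keywords here are invented for the example), the first tag below shows the faulty style; with the spaces present, the second keyword is actually ' sprinkler repairs', leading space included. The second tag is the correct form:-

```html
<!-- Wrong: the space after each comma becomes part of the next keyword -->
<meta name="keywords" content="garden sprinklers, sprinkler repairs, valves" />

<!-- Right: the comma alone does the delimiting -->
<meta name="keywords" content="garden sprinklers,sprinkler repairs,valves" />
```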

Some search engines are becoming smart to this, but it is still wrong. The inclusion of spaces also detracts from the 1,000 character limit that is placed on keyword fields. The problem is made worse because it has been propagated by some very poorly written 'Internet Tutorials', some of which are misleading or inaccurate on other similar topics. Recently, in an attempt to reduce spamming, much less emphasis has been attached to keywords by the major search engines... They no longer index the keywords themselves, but they do compare them to the body text and make judgments about their usage and context. For this reason it is still worth using keywords, but it places an even greater importance on the way the text is written, as you now have to satisfy the robot that your keyword was validly used in the page concerned. In any case there are thousands of minor search engines that still do index keywords, and as those robotic systems are themselves crawled by the major engines' spiders, they influence the rating of your webpage or website.

Alternative spellings or alternative terminology (US/UK English) and common misspellings can also be included with your keywords, but leave any misspellings or less important keywords to the end of the keyword list, because some search engine protocols will truncate your list to less than the 1,000 character limit.

The meta keywords used on this page are shown below...

<meta name="keywords" content="HTML Writing Tips,Search Engine Strategies,proper use of META tags,rank your pages,website compiler,search engine rankings,HTML Goodies,W3C,HTML tag notation,!DOCTYPE Declaration,Document Title,Evergreen Landscaping,Google,META Description Tag,200 character limit,minor search engines,META Keywords,favourite search engine,Comma delimiting of keywords,leading spaces,1000 character limit,Internet Tutorials,major search engines,Alternative spellings,US/UK English,page heading,less than perfect eyesight,Favicons,revision dates,G8MZY,Amateur Radio Call sign,resistor colour code,Handwritten code,HTML source code,WYSIWYG application,hand coded HTML,WYSIWYG software,different screen resolutions,Search Engine Strategies,crawling spiders,Spidering or crawling,deep crawling,search term,Clickthrough,Human compilation,Paid-for rating,sponsored links,search engine collusion,active agent,ALT="Text" attributes" />

The page heading should also mirror the words used in the 'document title', and the weight of emphasis that is used in page headings is noted and graded by most search engines. I use <h1></h1> as a standard for this purpose, but I also write my pages to give a high contrast and bold appearance to help those with less than perfect eyesight.

alt="text" attributes in img statements are often ignored or considered redundant. They do however, serve two major functions... One is as an indexable label that search engines can use to caption the image when it is displayed in image search facilities. In the second place some search engines enhance a page's rating if they are present and relevant. Although not an important point, I also find that they are a useful place to give credit to the provider of a photograph. In addition the 'title' attribute can serve as an extra place to get keyword repeats as well as providing 'pop-up' descriptions if the 'Firefox' browser is being used by your surfer.

I should explain a little more on alt="Text" attributes... They stem from the time when many fledgling browsers could not handle pictures at all. So the alt="Text" attribute was brought into the HTML standard. This was situated within the img tag used for images and did not affect those browsers that could actually handle the image display. The idea was that you could write a complete description of the content of the picture (where I used the word Text) which would be displayed instead of the picture itself in those browsers without graphics capability.

Text only browsers are incredibly rare these days, but the 'alt' attribute still exists. In the way most current browsers use it, it will show up as a 'tool tip' a couple of seconds after you hover your mouse pointer on the picture, so you can use it to give additional information about the picture. The text will also show within an image placement rectangle whilst the image itself is being loaded.

The text in the alt attribute is only on the screen for a short time, or while the mouse pointer is over the picture (depending on the browser). However, the alt text is recognised as text by a search engine's spider, so you can use it to amplify the strength of a keyword that already exists elsewhere on the page, or to add an extra keyword by including it within the alt text.

There is another reason to use alt="Text" attributes... Some search engines and many HTML validators require it to be present before the page is indexed. There are also validating systems used by some web space providers that will trip up if they are not used... If you do not wish to use the facility, you can satisfy the validators by using the attribute alt="&nbsp;", which may not show up at all, or at worst will give a character sized rectangle as a tool tip.
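Putting the above points together, a typical img tag might look like this (the filename, dimensions and wording are invented purely for illustration):-

```html
<img src="sprinkler-head.jpg" width="200" height="150"
 alt="Photograph of a pop-up garden sprinkler head, photo courtesy of Tim Vaughan"
 title="garden sprinkler repairs" />
```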

Favicons are relatively new and were originally specific to the 'Internet Explorer' browser, but this is changing, and by the time this page is widely read it will be a common feature in all major and some minor browsers. For safety you should create your icons on a 16 x 16 pixel grid, using the standard 16 colour windows palette.

I have a number of different icons for different sections of the website and I use a link of the form:-

<link rel="shortcut icon" href="../ico/html/favicon.ico">

This occurs between the <head></head> tags. The favicon.ico referred to is the one that I have generated and use for any pages that relate to HTML and the writing of it... If you add this page to your 'favourites' you will see it instead of whatever icon your browser uses as standard.

/ico/ is the directory in which the icons for the computing section are stored (in my particular case), and /html/ is a subdirectory that enables the 'html icon' to have the name 'favicon'.

Keeping track of revision dates may appear boring, but it is very useful and I have had many comments from surfers about it. I have recently added another feature to enable me to keep track of the standard of coding that I use to create or update a page. This consists of a tiny .gif image in the extreme bottom right hand corner of the page. The .gif has the legend G8MZY (my Amateur Radio Call sign) followed by an issue number. The standard resistor colour code (0 black, 1 brown, 2 red, 3 orange, 4 yellow, 5 green, 6 blue, 7 violet, 8 grey, 9 white) is used to code the last digit of the year and the issue number. The first of these .gifs was generated in 2003 and is issue one; thus the G8MZY is in orange for '3', and the .1 that denotes the issue number is in brown.

Handwritten code versus code generated by WYSIWYG application software... The choice is yours, but I have yet to see any HTML source code generated by a WYSIWYG application that comes close to the efficiency of well written hand coded HTML. Nor have I ever seen any software generated HTML code that was anything like as compact as hand written code. However, if you peruse the source code of my website you will find plenty of errors that I have not had time to correct.

Layout produced by WYSIWYG software often cannot accommodate different screen resolutions very well. I think this is mainly due to the software writers' previous knowledge of office suites and word processing packages, which are fairly rigid in page format. We should remember that the browser is the arbiter of layout on a screen, and we should write our code to allow the browser to use its ability, and the resources available to the viewer's computer, to the full, rather than trying to impose a style or layout rigidly.

There is another benefit in learning to code HTML by hand: the time taken to learn HTML can easily be close to the time required to learn how to use a software package, and with hand coding there are no limits imposed on layout, as there are with some WYSIWYG applications.

Search Engine Strategies... in utilising the information that the crawling spiders bring back.

Search Engine   | Majority Results | Results Provider | Paid Entries    | Directory      | Backup
----------------|------------------|------------------|-----------------|----------------|---------------
AllTheWeb       | Spider           | AllTheWeb        | Overture, Lycos | -              | -
AltaVista       | Spider           | AltaVista        | Overture        | LookSmart      | LookSmart
AOL             | Spider           | Google           | Overture        | Open Directory | ?
Ask Jeeves      | Spider           | Teoma            | Espotting       | Ask Jeeves     | Ask Jeeves
Google          | Spider           | Google           | Google          | Open Directory | Google
HotBot          | Clickthrough     | AllTheWeb        | N/A             | Open Directory | Inktomi
LookSmart       | Human            | LookSmart        | LookSmart       | Google         | Inktomi
Lycos           | Spider           | AllTheWeb        | Overture, Lycos | Open Directory | Open Directory
MSN             | Human            | LookSmart        | Overture        | -              | Inktomi
Overture        | Paid             | Overture         | Overture        | -              | Inktomi
Open Directory  | Human            | Open Directory   | N/A             | -              | -
Teoma           | Spider           | Teoma            | Google          | -              | -
Yahoo           | Spider           | Google           | Espotting       | Yahoo          | Yahoo

How Search Engines obtain their information... Various methods are used by different search engines; by being aware of these methods we can ensure that we get a 'fair' rating for our pages. There is little point in trying to cheat or gain popularity by artificial methods, as your pages will stand and fall on their merit and honesty, but we can get a head start on those website compilers that pay no attention to this aspect of page writing.

Spidering or crawling is the method used by most of the top ranked search engines, but there are other methods, as is shown in the table above, which was produced from information generated by Searchenginewatch.com at the time this page was originally written. Most of the high visibility search engines use more than one method; the second column shows the method that is the major influence for each search engine in the list. Some spiders only look at the index page of a website, but others, particularly Google and AllTheWeb, will crawl deep into a site and index information on sub pages. I do not know how deep this deep crawling goes, but most of my individual pages will come up in Google within about the top twenty hits, providing the search term is specific enough.

Clickthrough measures traffic to and from sites and uses this as its yardstick for rating.

Human compilation... Apart from it being done by a human editor instead of a robot, I do not know what criteria are assessed to generate the ranking.

A 'Paid-for' rating may be the best course in gaining visibility for a website that has not been compiled by hand, as 'sponsored links' are always among the first few results returned.

Collusion with other search engines may occur by business agreement, or a database may be hacked into by an 'active agent'. Whichever is the case, comparisons and data swapping do occur, and I think this only serves to increase the visibility of well written websites at the expense of those that are not so well written.

There is a link to Joe Burns' website, HTML Goodies, which is included because I have his 'HTML Goodies' book on my shelves and find that I am in agreement with much that he says (certainly about older HTML standards). By following this link you will find many other useful links, but I urge you to check anything you find on HTML writing against W3C standards, as indeed you should check everything that I have said on this page.

Written... 18/19/20/21 February 2003, Modified... 23 February 2003, Additions... 25 February 2003, New Domain... 04 November 2003, Corrected... 23 February 2004, Re-written and Upgraded... 17 July 2006.
This page has been validated by W3C; Javascript navigational elements were removed as per W3C Link Checker version 4.1 (c) 1999-2004 requirements.