BBC News Online
Monday, 22 January, 2001, 11:55 GMT
The tricks that win clicks
You may not realise it, but people are fighting over you, desperately trying to get your attention, writes BBC News Online internet reporter Mark Ward

Webmasters are frantically trying all manner of tricks to make you visit their websites.

And working just as hard to stop them are the search engines, which would like to give you the webpages you really want rather than those someone else would like you to see.

Tricks of the trade:
Copying meta-tags
Hijacking pages
Making 'bridge' pages
'Cloaking' pages

There is a lot at stake here both for websites that want to be popular, and for search engines that want to produce accurate results.

The more unique visitors a website gets, the more it can charge advertisers. And with up to 80% of people finding a page via a search engine there are a lot of hits at stake.

Studies of web habits have shown that people tend to stick with the sites they know and come to trust. Most people are only just getting to grips with the web, so ensuring they see your site is more important now than ever. Impress them today and they could be coming back for years.

Ranking relevance

But this fight for eyeballs is getting dirty. Unscrupulous webmasters are resorting to underhand tricks to ensure that when they submit their site to a search engine it gets as high a ranking as possible.

"All the search engines describe submission as an arms race," said Danny Sullivan, editor of the Search Engine Watch website. "A lot of resources are poured into this, especially in competitive industries such as porn or casino sites that can make millions by being the top of the listings."

The tricks webmasters use vary by the search engine they are targeting. While most search engines gather information about the web in the same way, they treat the data they gather in very different ways.

Search engines such as Altavista, Google, Inktomi, Lycos and Excite regularly trawl the web with programs called crawlers to find out just what is on the billions of pages out there. Most crawl the entire web every couple of weeks.

Once it has the information, an engine will index the pages. Some search engines then analyse the index to strip out the pages they do not think anyone will ever look at. The finished index is the one the search engine consults when you submit a query.

Some webmasters try to manipulate both the crawling and the analysis to make pages look more popular than they actually are.

"In the past, it used to be that search engines ranked pages by what the page said it was about," said Mr Sullivan.

The crawler noted the title of a page, looked at the frequency of words on it, and accepted that as a good guide to its subject matter. Often this was enough. A page with a title of "British Cheese" and a regular mention of Cheddar, Wensleydale and Double Gloucester is probably about, well, cheese.
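To make the idea concrete, here is a minimal sketch of that kind of self-reported scoring. It is an illustration only, not any search engine's actual method: the function name `naive_score` and the weighting of a title match are assumptions made for this example.

```python
import re
from collections import Counter


def tokenize(text):
    """Split text into lowercase words, ignoring punctuation."""
    return re.findall(r"[a-z]+", text.lower())


def naive_score(title, body, query):
    """Score a page for a query using only what the page says about itself."""
    title_words = set(tokenize(title))
    counts = Counter(tokenize(body))
    score = 0.0
    for term in tokenize(query):
        # Assumed weighting: a title match is worth five body mentions.
        if term in title_words:
            score += 5.0
        # Each mention of the term in the body adds one point.
        score += counts[term]
    return score


honest = naive_score(
    "British Cheese",
    "Cheddar, Wensleydale and Double Gloucester are all cheese.",
    "cheese",
)
# A page that simply repeats the word scores far higher.
stuffed = naive_score("British Cheese", "cheese " * 50, "cheese")
print(honest, stuffed)
```

Because the score depends entirely on words the page author chose, repeating a popular word is enough to outrank an honest page.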

Truth and consequences

Sadly not all webmasters are so honest. Some, usually those looking after porn sites, put lots of irrelevant but popular words on a page in an attempt to fool the crawler.

Sometimes a huge list of words, including the likes of Britney or Princess Diana, appears on a page in text the same colour as the background. Although you cannot see these words, the crawler can.

Crawlers also consult the "meta-tags" associated with a webpage. These summarise what a page is all about and are invisible to normal humans, i.e. those who do not normally look at a page's source HTML.
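As a rough illustration of how a program can read meta-tags that a browser never displays, the sketch below uses Python's standard `html.parser` module. The sample page and the names `MetaTagReader` and `read_meta_tags` are made up for this example and do not describe any particular crawler.

```python
from html.parser import HTMLParser


class MetaTagReader(HTMLParser):
    """Collect the name/content pairs of a page's <meta> tags."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name")
        content = attributes.get("content")
        if name and content is not None:
            self.meta[name.lower()] = content


def read_meta_tags(html):
    """Return a dict of meta-tag names to their content strings."""
    reader = MetaTagReader()
    reader.feed(html)
    return reader.meta


# Hypothetical page: the meta-tags are in the source but not shown on screen.
SAMPLE_PAGE = """<html><head>
<title>British Cheese</title>
<meta name="description" content="A guide to British cheese">
<meta name="keywords" content="cheddar, wensleydale, double gloucester">
</head><body><p>Welcome.</p></body></html>"""

tags = read_meta_tags(SAMPLE_PAGE)
print(tags["keywords"])
```

Nothing in this process checks that the keywords match the visible page, which is why the tags are easy to abuse or copy.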

Again, unscrupulous webmasters abuse them and some even steal the meta-tags of their rivals so any search for a competitor turns up their site too. Courts in the UK and US have declared this practice illegal.

Also illegal, but widely used, is page hijacking. This involves copying a popular page and then submitting it from your own site. Search engines will list you as a source and point people at your link even though you stole the information.

Because crawlers can be manipulated, few search engines rely on the raw results they produce to be an accurate guide. Most clean them up to make them more accurate.

The amount of cleaning up and indexing is what sets the search engines apart.

The pages culled from the basic list are the computer-generated pages that hardly anyone is interested in and the mirror sites that duplicate others. "Only a small percentage of all the documents on the web actually contain anything interesting," said Henrik Hansen of Inktomi.

Popularity poll

Search engines such as Inktomi and Google analyse the index and rank pages on a particular subject by the number of other pages that link to them. They reason that people will probably link to pages they find useful. So if every page about cheese links to one particular site, it is a fair bet that the site is a good one.
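The simplest form of this popularity test is a count of inbound links. The sketch below shows only that bare count, under the assumption that the link data is already available as a mapping from each page to the pages it links to. The addresses are invented, and real engines use far more elaborate analysis than this.

```python
from collections import Counter


def rank_by_inbound_links(links):
    """Rank pages by how many distinct other pages link to them."""
    inbound = Counter()
    for source, targets in links.items():
        # A set ensures each linking page is counted once per target.
        for target in set(targets):
            if target != source:  # ignore pages linking to themselves
                inbound[target] += 1
    # Most-linked first; ties are broken alphabetically.
    return sorted(inbound.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical link data: page -> pages it links to.
links = {
    "a.example/cheese": ["cheeseboard.example", "b.example/cheese"],
    "b.example/cheese": ["cheeseboard.example"],
    "c.example/cheese": ["cheeseboard.example", "cheeseboard.example"],
}

ranking = rank_by_inbound_links(links)
print(ranking)
```

A count like this rewards any page that attracts links, whoever created them, so it can be inflated by anyone able to publish many linking pages.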

Now, even this process is starting to be manipulated. Some webmasters are creating pages that look nonsensical to people but have just the right frequency of words to fool an indexer about their subject.

Webmasters make thousands of these pages that differ enough to make the indexer think they are unique, and all point to the one that the webmaster wants you to see. Again, porn sites are big fans of this tactic.

Often these "bridge" pages are "cloaked" so you never actually see them - you just get bounced to another page which is the main page the webmaster really wants you to see.

These latter tactics fall into the grey area between aggressive self-promotion and scamming.

Search engines try to stop it when they can. "We are getting smarter all the time at catching people trying to circumvent the system," said Mr Hansen from Inktomi. The search for the perfect solution is still on.
