Black Holes in Cyberspace - The Invisible web
By the BBC's Annabel Colley and Matthew McDonnell
The internet is a fantastic search tool that has changed people's lives, but its vast potential remains untapped for most of us.
For every one billion "visible" web pages, there are another 500 billion hidden below. It is all legal, publicly available information - it is just a case of finding it.
The internet company Brightplanet (www.brightplanet.com) estimates that information in the invisible web is growing ten per cent faster than information on the commonly searchable web.
Brightplanet carried out the first big study into the invisible web, or the "deep web" as they prefer to call it (deepwebwhitepaper.pdf), and concluded that the information it contains is often highly relevant.
According to Brightplanet: "Information held in the deep web is up to two thousand times better quality than the information easily retrieved by the search engines from the 'surface web'."
So why can we not search this invisible web with the search engines?
The missing links
Search engines use programs called spiders that 'crawl' the web, skimming between web pages via text links.
As they go, they index any text and code they come across, but they miss out on a vast reservoir of valuable, interconnected databases and documents.
That is because a lot of the best quality information is held in subject databases and the search engines cannot get into these.
It may be because a password is needed to get at these databases.
Or it could be that you need to register to get at the information, which is often free of charge but inaccessible unless you have signed up.
It could also be that the search engine does not recognise code contained within the database. This means that using most search engines, you are likely to miss out on a whole host of information such as:
Chris Sherman is co-author, with Gary Price, of "The Invisible Web: Uncovering Information Search Engines Can't See"*.
Chris explains: "When an indexing spider comes across a database, it's as if it has run smack into the entrance of a massive library with securely bolted doors.
"Spiders can record the library's address, but can tell you nothing about the books, magazines or other documents it contains."
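The crawling behaviour described above can be illustrated with a minimal sketch. It assumes a toy, in-memory "web" in which every URL and page is invented for illustration; it is not how any particular search engine works. The spider follows text links only, so it records the address of the library page but never reaches the records that sit behind its search form.

```python
from collections import deque
from html.parser import HTMLParser

# A toy, in-memory "web" (all URLs and page contents are invented for illustration).
PAGES = {
    "/home": '<a href="/about">About</a> <a href="/library">Library</a>',
    "/about": "<p>About this site.</p>",
    # The library page offers only a search form, not links to its records.
    "/library": '<form action="/library/search"><input name="q"></form>',
    # These records exist, but only a form query would reveal them.
    "/library/record/1": "<p>Annual report, 1999.</p>",
    "/library/record/2": "<p>Court judgment, 2001.</p>",
}


class LinkExtractor(HTMLParser):
    """Collects the href of every <a> tag, as a link-following spider would."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(start):
    """Breadth-first crawl that follows text links only; returns visited URLs in order."""
    seen = [start]
    queue = deque([start])
    while queue:
        url = queue.popleft()
        parser = LinkExtractor()
        parser.feed(PAGES.get(url, ""))
        for link in parser.links:
            if link in PAGES and link not in seen:
                seen.append(link)
                queue.append(link)
    return seen


visible = crawl("/home")
hidden = sorted(set(PAGES) - set(visible))
print("Indexed:", visible)
print("Missed: ", hidden)
```

In this sketch the spider indexes the three linked pages and misses both records, because nothing links to them: they would only appear in response to a query typed into the form.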
Finding the hidden treasure
Paul Pedley is head of research at the Economist Intelligence Unit in London. He runs courses for the Association for Information Management - ASLIB: www.aslib.com/training
He teaches researchers how to retrieve information held in the invisible web and has written a book on the subject - "The Invisible Web"*. He has seen an unprecedented demand for his course this year.
Search engines trying to keep up
The general search engines, like Google, remain essential searching tools so long as you know that they will retrieve only a very small percentage of information on the internet.
Many of the search engines are aware of the problem and Google is among those introducing specialist tools that will search newsgroups, PDF documents and images.
But they have trouble keeping up. The internet grows at a rate of 7.3 million new pages a day - and the deep web, or invisible web, is the fastest growing area.
Information registered with search engines often takes up to six weeks to start appearing in searches.
There is a growing number of specialised invisible web tools which will take your query and run it on thousands of online databases.
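As a rough sketch of the idea, a tool of this kind passes one query to several databases and gathers the answers in one place. The database names, records and matching rule below are all invented for illustration; real tools query live databases and are far more elaborate.

```python
# Toy stand-ins for subject databases (names and records are invented for illustration).
DATABASES = {
    "law": ["Data Protection Act commentary", "Copyright case summaries"],
    "news": ["Copyright ruling reported", "Weather warning issued"],
    "photos": ["Harbour at dawn", "Copyright notice poster"],
}


def search_database(records, query):
    """Each database answers the query itself; here, a case-insensitive substring match."""
    q = query.lower()
    return [r for r in records if q in r.lower()]


def federated_search(query):
    """Send one query to every database and label each hit with its source."""
    results = []
    for name, records in DATABASES.items():
        for hit in search_database(records, query):
            results.append((name, hit))
    return results


hits = federated_search("copyright")
for source, title in hits:
    print(f"{source}: {title}")
```

The point of the design is that each database is asked directly, so records that no spider could reach by following links can still be returned.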
The right tool for the job
The future may be digital, but it seems that one solution to the invisible web is to return to relying on human searching behaviour rather than on computers.
As a searcher you should think hard about your behaviour and perhaps develop a new approach to online searching.
Use the right tool for the job. Do not use a general search engine when you know there is one that specialises in a specific subject, for example the law, newspaper articles or photographs.
Gary Price, one of the world's leading experts on the invisible web, explains why it pays to know your specialist sources.
"A good librarian would not start looking for a phone number by searching the Encyclopaedia Britannica," says Price.
"Both professional and casual searchers should at least be aware that they could be missing some information or wasting time finding what could be found more easily.
"This is very similar to a good reference librarian knowing the major reference tools in his or her collection."
Knowing your sources
So how do you get to know about new sources that may be hidden in the invisible web?
Librarians want you to use their expertise. Some of the more savvy ones are now re-marketing themselves as freelance information professionals or online information brokers.
But there are just as many free newsletters and mailing lists, sometimes compiled by information professionals, to keep you updated on internet sources that the search engines may not always find. Try one of these:
*The Invisible Web, by Paul Pedley
*The Invisible Web: Uncovering Information Search Engines Can't See, by Chris Sherman and Gary Price