|
INTRODUCTION
Searching on the Internet today can be compared to dragging a net across the surface
of the ocean. While a great deal may be caught in the net, there is still a wealth
of information that is deep and therefore missed. The reason is simple: most
of the Web's information is buried on dynamically generated sites, and standard search engines
never find it.
Traditional search engines create their indices by spidering or crawling surface
Web pages. To be discovered, the page must be static and linked to other pages.
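As a rough illustration of why link-following crawlers miss such content, the sketch below crawls a toy, in-memory "Web". The page names, the `STATIC_LINKS` graph, and the `CATALOG_DB` database are all hypothetical and stand in for real sites; this is a minimal sketch, not how any particular search engine works.

```python
from collections import deque

# Hypothetical toy "Web": each static page maps to the pages it links to.
STATIC_LINKS = {
    "home": ["about", "catalog"],
    "about": ["home"],
    "catalog": ["home"],
    "orphan": [],  # static, but no other page links to it
}

# Hypothetical database behind the catalog's search form. A result page
# exists only after someone submits a query.
CATALOG_DB = {"whales": "report on whales", "tides": "report on tides"}


def crawl(seed):
    """Breadth-first link following, the way a surface crawler builds an index."""
    seen = {seed}
    queue = deque([seed])
    while queue:
        page = queue.popleft()
        for target in STATIC_LINKS.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


def query_catalog(term):
    """A direct query: the only way to obtain a dynamically generated result."""
    return CATALOG_DB.get(term)


indexed = crawl("home")
print(sorted(indexed))          # ['about', 'catalog', 'home']
print(query_catalog("whales"))  # report on whales
```

The crawler reaches only pages connected by links. The unlinked page and the database records never enter the index, although a direct query retrieves a record immediately.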
Traditional search engines cannot "see" or retrieve content in the
deep Web: those pages do not exist until they are created dynamically as the
result of a specific search. Because traditional search engine crawlers cannot
probe beneath the surface, the deep Web has until now been hidden. The term
"deep Web" is also used for the technology that surfaces this hidden content, which
other search engines cannot easily detect. The deep Web is content that cannot
be indexed and searched by conventional search engines. For this reason the deep Web is also
called the invisible Web.
WHAT IS DEEP WEB?
The deep Web is the content that resides in searchable databases, the results
from which can only be discovered by a direct query. Without the directed query,
the database does not publish the result. When queried, deep Web sites post their
results as dynamic Web pages in real time. Though these dynamic pages have a unique
URL that allows them to be retrieved again later, they are not persistent.
The invisible Web consists of files, images and Web sites that, for a variety of reasons, cannot
be indexed by popular search engines. The deep Web is qualitatively different
from the surface Web. Deep Web sources store their content in searchable databases
that only produce results dynamically in response to a direct request. But a direct
query is a laborious, "one at a time" way to search. Deep Web search
technology automates the process by making dozens of direct queries simultaneously
using multiple threads.
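The idea of issuing many direct queries at once can be sketched with a thread pool. The `SOURCES` dictionary and the `direct_query` helper below are hypothetical stand-ins for real searchable databases. A real tool would send network requests, and the text does not describe its actual implementation.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for searchable databases; a real tool would send
# each query over the network instead of reading a local dictionary.
SOURCES = {
    "patents": {"sonar": ["patent 1", "patent 2"]},
    "journals": {"sonar": ["article A"]},
    "weather": {"storm": ["bulletin X"]},
}


def direct_query(source_name, term):
    """Send one direct query to one source and return (source, results)."""
    records = SOURCES[source_name]
    return source_name, records.get(term, [])


def search_all(term, max_workers=8):
    """Issue the same query to every source concurrently, one task per source."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(direct_query, name, term) for name in SOURCES]
        return dict(future.result() for future in futures)


results = search_all("sonar")
print(results)
```

Threads suit this workload because each query spends most of its time waiting on a remote database. The total wait is therefore closer to that of the slowest source than to the sum of all of them.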
The deep Web is made up of hundreds of thousands of publicly accessible databases
and is approximately 500 times bigger than the surface Web.
IMPORTANCE OF DEEP WEB
- Public information on the deep Web is currently 400 to 550 times larger than the
  commonly defined World Wide Web.
- The deep Web contains 7,500 terabytes of information compared to nineteen
  terabytes of information in the surface Web.
- The deep Web contains nearly 550 billion individual documents compared
  to the one billion of the surface Web.
- More than 200,000 deep Web sites presently exist.
- Sixty of the largest deep Web sites collectively contain about 750 terabytes
  of information, sufficient by themselves to exceed the size of the surface Web
  forty times.
- On average, deep Web sites receive fifty per cent greater monthly traffic than
  surface sites and are more highly linked to than surface sites; however, the
  typical (median) deep Web site is not well known to the Internet-searching public.
- The deep Web is the largest growing category of new information on the Internet.
- Deep Web sites tend to be narrower, with deeper content, than conventional
  surface sites.
- Total quality content of the deep Web is 1,000 to 2,000 times greater than that
  of the surface Web.
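The size figures above can be cross-checked with simple arithmetic. The short sketch below uses only the numbers quoted in this section; the variable names are illustrative.

```python
# Figures quoted in this section.
deep_tb, surface_tb = 7_500, 19                          # terabytes
deep_docs, surface_docs = 550_000_000_000, 1_000_000_000  # documents
top60_tb = 750                                           # sixty largest deep Web sites

size_ratio = deep_tb / surface_tb      # deep vs. surface, by volume
doc_ratio = deep_docs / surface_docs   # deep vs. surface, by document count
top60_ratio = top60_tb / surface_tb    # sixty largest sites vs. surface Web

print(f"by volume:     {size_ratio:.0f}x")    # 395x
print(f"by documents:  {doc_ratio:.0f}x")     # 550x
print(f"top 60 sites:  {top60_ratio:.1f}x")   # 39.5x
```

The two ratios, about 395 by volume and 550 by document count, bracket the quoted "400 to 550 times" range, and 750 terabytes is roughly forty times the nineteen terabytes attributed to the surface Web.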
|