Hello again Dinkytwist,
Thank you for asking me to work on this research project.
The Invisible Web, sometimes referred to as the Deep Web, consists of
information stored in searchable databases as well as other
non-standard file types on the web. This material is generally missed
by search engine spiders because they cannot or will not index this
information.
"BrightPlanet has estimated the size of the Invisible Web to be 500
times the size of the accessible web, which would make it
approximately 550 billion web pages."
http://www.library.usyd.edu.au/skills/isearch/invisweb.html
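As a quick sanity check of the arithmetic in the quote above (not part of the original study): if the Invisible Web is 500 times the size of the accessible web and totals roughly 550 billion pages, the implied surface web is about 1.1 billion pages, which is consistent with surface-web estimates from around 2000.

```python
# Sanity check of the BrightPlanet figures quoted above.
deep_web_pages = 550e9   # estimated Invisible Web size, in pages
ratio = 500              # claimed deep-to-surface ratio

# Implied surface web size: 550 billion / 500 = 1.1 billion pages
implied_surface = deep_web_pages / ratio
print(f"Implied surface web: {implied_surface / 1e9:.1f} billion pages")
# prints "Implied surface web: 1.1 billion pages"
```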
"The Invisible Web, sometimes referred to as Deep Web, is estimated to
be more than 500 times larger than what traditional search engines can
index."
Nielsen BuzzMetrics
http://www.nielsenbuzzmetrics.com/release.asp?id=41
Deep Web WHITE PAPER by MICHAEL K. BERGMAN
A highly informative study regarding the size of the invisible web.
Graphs, pie charts and detailed statistics are provided.
Download here:
http://www.brightplanet.com/pdf/deepwebwhitepaper.pdf
However, these numbers may be overestimated, as the following reports point out.
Exploring the Academic Invisible Web by Dr. Dirk Lewandowski
Download here:
http://www.ib.hu-berlin.de/~mayr/arbeiten/Lewandowski-Mayr_BC06.pdf
The Deep Web: Resource Discovery in the Library of Texas
By Kathleen R. Murray and William E. Moen
February 2004
"A quick search for information about the deep Web is certain to
retrieve the 2001 white paper reporting the results of the DeepPlanet
study that quantified and characterized the content in the surface
versus the deep Web (Bergman, 2001). The study estimates that the deep
Web is 500 times larger than the surface Web and contains higher
quality resources. The number of deep websites is estimated at 200,000
and characterized by rapid growth. Additionally, a full 95% of deep
Web content is publicly accessible, requiring no subscriptions or
licenses."
"While there is general agreement that the deep Web is home to a
wealth of high quality and primary source data, there is not agreement
as to its size relative to the surface Web. Sherman (2001) asserts
that the DeepPlanet study is flawed in its measurement techniques and
that the deep Web is actually 2 - 50 times larger than the surface
Web."
"Regardless of measurement approach, there is no doubt that the deep
Web is much larger than the surface Web trolled by the major search
engines and is believed to be growing rapidly (Bergman, 2001)."
Download here:
http://www.unt.edu/wmoen/publications/TLJarticle_deepweb_PrePrintFeb2004.pdf
Search Engine Sizes by Danny Sullivan, Editor-In-Chief
January 28, 2005
http://searchenginewatch.com/reports/article.php/2156481#current
***************************************************************************
Below you will find the most recent data regarding the size of the deep web.
Deep Web Research 2006
The January 2006 Issue of LLRX has a feature article written by Marcus
P. Zillman titled Deep Web Research 2006. The guide extensively
documents resources that include articles, books, websites,
presentations, search engines, and technology applications that
facilitate the challenging task of accessing information, published in
many formats, that encompass the hundreds of millions of pages
comprising the "deep web."
Surface web: 8 billion pages
Deep web: 900 billion pages
"The Deep Web covers somewhere in the vicinity of 900 billion pages
of information located through the World Wide Web in various files and
formats that the current search engines on the Internet either cannot
find or have difficulty accessing. The current search engines find
about 8 billion pages at the time of this writing."
Source:
Deep Web Research 2006 by Marcus P. Zillman
Published January 15, 2006
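It may be worth noting (my own comparison, not Zillman's) that his figures imply a much smaller deep-to-surface ratio than BrightPlanet's 500-times estimate quoted earlier, which fits with the reports above suggesting that figure is overstated:

```python
# Zillman (2006): ~900 billion deep-web pages vs. ~8 billion indexed pages.
deep = 900e9     # estimated deep web size, in pages
surface = 8e9    # pages found by search engines at the time

# 900 / 8 = 112.5, well below BrightPlanet's 500x figure
print(f"Implied deep-to-surface ratio: {deep / surface:.1f}x")
# prints "Implied deep-to-surface ratio: 112.5x"
```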
About Marcus P. Zillman
"Marcus P. Zillman, M.S., A.M.H.A., is Executive Director of the
Virtual Private Library and Founder/Creator of BotSpot(R). He is the
author of nine different Internet MiniGuides 2006, Internet Sources
Manual and eCurrent Awareness Resources 2006 Report. His Subject
Tracer(TM) Information Blogs (45 and constantly growing) are freely
available from the Virtual Private Library, which include the latest
resources on Deep Web Research and Bot Research."
LLRX January 2006 Issue Deep Web Research Featured Article
http://www.llrx.com/features/deepweb2006.htm
Search terms used:
Size of the dark OR invisible web
I hope the information provided is helpful!
Best regards,
Bobbie7

Clarification of Answer by bobbie7-ga on 06 Jul 2006 07:52 PDT
While not recent, the following might be helpful.
"July 12, 1999 - A new study of coverage of the indexable Web by
search engines states that, as of February 1999, only 42 percent of
the Web is indexed by the combined search engines. The study, by Steve
Lawrence and C. Lee Giles of NEC Research Institute, appeared in the
July 8, 1999, issue of Nature (pp. 107-109). Lawrence and Giles made
headlines a year ago with their study of overlap among search engines
that showed that each Web search engine indexed a fairly discrete
corner of the Web, with little overlap among them. In that study,
Lawrence and Giles reported that the combined coverage by all Web
search engines was about 60 percent of the Web."
http://www.infotoday.com/newsbreaks/nb0712-1.htm
I would suggest you post a new question without my name in the title
so that all the researchers can get a chance to answer your question.
Sincerely,
Bobbie7