Q: AI-Bots-Crawlers-Spiders-Trawlers-Ferrets-Fetchers ( Answered,   3 Comments )
Subject: AI-Bots-Crawlers-Spiders-Trawlers-Ferrets-Fetchers
Category: Computers
Asked by: khill_cooter-ga
List Price: $200.00
Posted: 26 Jun 2002 12:50 PDT
Expires: 26 Jul 2002 12:50 PDT
Question ID: 33696
I have a multi-part question that will need some rather specific

Question (1):  Please define in a non-technical manner (layman’s
terms) the following terms (hereinafter “fields”) as they relate to
the Internet and research:

AI (Artificial Intelligence):







Question (2):  What are the differences and similarities between the
above fields? (once again, non-technical, layman’s terms)

Question (3): Who are the top ten U.S. based individual experts in
each “field”? (Academic, Think Tank or Government based)

Question (4):  Who are the top ten companies in each “field”?   

Question (5): What are the top ten commercial product offerings which
combine in a turnkey business application all of the above data
mining/information retrieval fields with effective: (A) Filters, (B)
Knowledge Management categorization middleware, and (C) Non-technical
employee user friendliness.

Question (6)  

(A)	Please provide me with (as comprehensive as possible) contact
information for the people listed below:

(B)	As they relate to the fields defined in Question (1) above, please
notate next to each person’s name listed below what their specific
areas of expertise are:
Vasant Honavar
Pattie Maes
Keith S. Decker
Milind Tambe
Erik Selberg
Oren Zamir
Oren Etzioni
Richard Segal
Marvin Minsky
Charles Petrie
Sabit Kraus
Craig Knoblock
Steve Minton
Marcus P. Zillman

If you have any questions or need more specific examples, please let
me know.
Subject: Re: AI-Bots-Crawlers-Spiders-Trawlers-Ferrets-Fetchers
Answered By: aditya2k-ga on 26 Jun 2002 16:39 PDT
Hi khill_cooter,

   Good day. Wow! That is quite an impressive list of questions that
required some work to be done, but the price matched it. As someone
said - "Where there's a will, there's a way"

   Coming to the answer, I'm sure I can be of assistance since I am
associated with the above topics, and I shall define the terms in my
own words, rather than post a link.

Artificial Intelligence: the branch of computer science that deals
with making computers behave like humans; in other words, making
computers think. Today, the results a computer produces depend on what
the user sends to it (the input). The term was coined by John McCarthy
for the Dartmouth conference of 1956. AI includes game playing (e.g.,
human vs. computer chess games) and expert systems (e.g., helping
doctors diagnose a problem). A good site on AI:

Artificial Intelligence Repository (old Layout) (new Layout)

A bot is short for robot, a computer program that runs automatically.
Once the human configures the bot and runs it, no further human
intervention is required, except for troubleshooting or any such
problems. A spider (defined below) is a type of bot. Bots incorporate
a bit of AI.

A spider is a program that automatically fetches web pages. It finds
pages to visit either through user submission or by following links
from pages it has already fetched. This process, known as crawling,
continues until every page reachable through those links has been
visited. Another term for these programs is webcrawler. Search engines
like Google use a large number of spiders working in parallel (i.e.,
many spiders crawling different

A crawler is another name for a spider. I've confirmed this through a
number of internet sources.
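
To make the crawling loop concrete, here is a minimal sketch in Python. It is only an illustration, not how any particular search engine works: the tiny "web" below is a made-up dictionary (TOY_WEB) and toy_get_links is a hypothetical stand-in for actually downloading a page and reading its links.

```python
from collections import deque

# Hypothetical in-memory "web": each URL maps to the URLs it links to.
# A real spider would download each page and extract its links instead.
TOY_WEB = {
    "http://example.com/": ["http://example.com/a", "http://example.com/b"],
    "http://example.com/a": ["http://example.com/b", "http://example.com/c"],
    "http://example.com/b": ["http://example.com/"],
    "http://example.com/c": [],
}


def toy_get_links(url):
    """Stand-in for 'fetch the page and read its links'."""
    return TOY_WEB.get(url, [])


def crawl(seed_urls, get_links):
    """Visit every page reachable from the seeds, each page only once."""
    seeds = list(dict.fromkeys(seed_urls))  # drop repeated seeds, keep order
    visited = []                            # pages in the order they were fetched
    seen = set(seeds)                       # pages already queued, so loops end
    queue = deque(seeds)
    while queue:
        url = queue.popleft()
        visited.append(url)
        for link in get_links(url):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return visited


if __name__ == "__main__":
    for page in crawl(["http://example.com/"], toy_get_links):
        print(page)
```

The "seen" set is what stops the spider from going round in circles when two pages link to each other.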

A trawler is a program that sifts through large volumes of data (eg.,
other search engines, Newsgroups, or FTP archives) looking for
something of interest. Examples include which
searches search engines (also known as meta-searching). searches through newsgroup postings.

A ferret is nothing but a trawler. This is not a technical term. It is
a term devised by FerretSoft for its products. An example of which is
WebFerret, an offline search utility that allows you to formulate your
query offline then, when you connect, it searches the web until it has
collected the number of references you have specified. WebFerret
queries large web search engines to find sites matching your keywords.
It queries all configured search engines simultaneously and discards
duplicate results. URLs that are found can be visited immediately even
as WebFerret continues to run. New or updated search engines are added
automatically to WebFerret as they become available.
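
The idea of querying several engines at once and discarding duplicates can be sketched roughly as follows. This is not WebFerret's actual code, which I have not seen; the three "engines" are hypothetical functions invented for the example, and a real tool would send each query over the network.

```python
from concurrent.futures import ThreadPoolExecutor


# Hypothetical stand-ins for search engines: each takes a query and
# returns a list of result URLs. None of these are real services or APIs.
def engine_one(query):
    return ["http://a.example/" + query, "http://b.example/" + query]


def engine_two(query):
    return ["http://b.example/" + query, "http://c.example/" + query]


def engine_three(query):
    return ["http://c.example/" + query, "http://d.example/" + query]


def meta_search(query, engines, max_results):
    """Ask all engines at once, merge their results, drop duplicates."""
    if not engines:
        return []
    with ThreadPoolExecutor(max_workers=len(engines)) as pool:
        # map() runs the calls concurrently but returns them in engine order
        result_lists = list(pool.map(lambda engine: engine(query), engines))

    merged = []
    seen = set()
    for results in result_lists:
        for url in results:
            if url not in seen:      # a URL found by two engines is kept once
                seen.add(url)
                merged.append(url)
    return merged[:max_results]      # stop at the number of references asked for


if __name__ == "__main__":
    for url in meta_search("bots", [engine_one, engine_two, engine_three], 3):
        print(url)
```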

A fetcher is an informal term used for a program that fetches anything
from the internet. It could be fetching web pages for offline
viewing. It could be fetching e-mail from your e-mail server (e.g.,
MS Outlook Express). It could also be fetching files from other users
(p2p or peer-to-peer), an example of which is Kazaa or Napster.

Answer (2)

Artificial Intelligence is a technology, and not software like the
others. Bots are programs that can perform any automated function, not
necessarily related to searching. A spider and crawler are one and the
same thing. It visits pages and indexes them (stores details like the
title of the page, keywords, and a cache copy). A trawler can be
considered to be a spider, except that it doesn't visit and index web
pages. Instead, it crawls through data already indexed in any form. A
ferret is a commercial name used by a company for a trawler. A fetcher
fetches data from the internet. A fetcher returns data purely based on
what the user seeks.
In a nutshell, you can group the above as follows
 -Spiders or Crawlers
 -Trawlers or Ferrets
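
The difference between indexing (spider) and sifting already-indexed data (trawler) can be sketched like this. It is a simplified illustration under my own assumptions: the sample pages, the record layout, and the function names (index_page, build_index, trawl) are all invented for the example, and real search engines store far more than this.

```python
from html.parser import HTMLParser

# Hypothetical pages a spider might have fetched.
SAMPLE_PAGES = {
    "http://example.com/chess": (
        "<html><head><title>Computer Chess</title>"
        '<meta name="keywords" content="AI, chess, games"></head>'
        "<body>Human vs. computer.</body></html>"
    ),
    "http://example.com/bots": (
        "<html><head><title>About Bots</title>"
        '<meta name="keywords" content="bots, spiders"></head>'
        "<body>Programs that run automatically.</body></html>"
    ),
}


class PageSummary(HTMLParser):
    """Collects the <title> text and the keywords meta tag of one page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.keywords = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            attrs = dict(attrs)
            if (attrs.get("name") or "").lower() == "keywords":
                content = attrs.get("content") or ""
                self.keywords = [k.strip().lower()
                                 for k in content.split(",") if k.strip()]

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def index_page(url, html):
    """What a spider stores: title, keywords, and a cached copy."""
    parser = PageSummary()
    parser.feed(html)
    parser.close()
    return {"url": url, "title": parser.title.strip(),
            "keywords": parser.keywords, "cache": html}


def build_index(pages):
    return [index_page(url, html) for url, html in pages.items()]


def trawl(index, word):
    """What a trawler does: sift records that are already indexed."""
    word = word.lower()
    return [rec["url"] for rec in index
            if word in rec["keywords"] or word in rec["title"].lower()]


if __name__ == "__main__":
    print(trawl(build_index(SAMPLE_PAGES), "spiders"))
```

Note that trawl never visits a web page; it only looks through the records that index_page produced earlier.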

Answers (3,4 & 5)

There is no definitive system that ranks the top 10 people or
companies in the fields mentioned above. However, the following links
give information about people who have made a significant contribution
in this regard.

AI People
This is a directory of eminent people in the field of AI, which
includes all the fields mentioned above.

People in AI

People in AI

Pioneers in AI
Note: The original page has been removed. However, Google's cache
still has it.

The following search engines use the best spidering technology :




MSN Search

The top companies for turnkey business solutions in data mining are

IBM alphaWorks
Place where one can find the latest technologies from IBM Research.

Visual Numerics
The leading provider of visualization, mathematics, analysis and
network software solutions including PV-WAVE, JWAVE, IMSL and JNL.

Spotfire's modular suite of products delivers immediate value to
research scientists engaged in discovery. At the departmental level,
Spotfire products can help you extract greater value from investments
you've already made in data generation.

Visual Insights® ADVIZOR
Interactive data visualization software enabling faster business
decisions: linked ActiveX components and complete applications for
business intelligence and customer behavior analysis.

Insightful Corporation
Insightful Corporation is a leading supplier of software (S-Plus) and
services for statistical data mining, business analytics and
information retrieval enabling clients to gain intelligence from data.

Acxiom Corporation
Provides a comprehensive range of information services and products
that allow businesses to make informed marketing, merchandising, and
risk management decisions.

Partek Inc. - Pattern Recognition Software
Statistical and visual data analysis software. Widely used in life
sciences and engineering for gene expression (microarray) data
analysis, high throughput screening, and drug design including SAR and
ADME prediction.

Some public domain software is available as well:

WinMine Toolkit Home Page -
By David Chickering at Microsoft Research. The WinMine Toolkit is a
set of tools for Windows 2000/NT/XP that allow you to build
statistical models from data. The majority of the tools are
command-line executables that can be run in scripts.

CART - Salford Systems
A decision tree tool that automatically sifts large, complex
databases, searching for and isolating significant patterns and
relationships. Offers free limited capability demo for download,
product features, applications, user feedback, and associated books.

Machine Learning Library in C++
MLC++ is a standard C++ library for supervised machine learning, with
back-end and front-end tools for data mining tasks like Decision
Trees, and Clustering. Information on legal issues, mailing lists,
history, standards, platform support, and download instructions.

XmdvTool Home Page
A public-domain software package for the interactive visual
exploration of multivariate data sets. It is available on all UNIX
platforms which support X11R4 or higher. The current version of the
software (3.1) supports scatterplots, star glyphs, parallel
coordinates, and dimensional stacking.

AutoClass C
An unsupervised Bayesian classification system that seeks a maximum
posterior probability classification.

StatLib - XlispStat Archive
Environment for statistical computing and dynamic graphics based on
Lisp. Contains contributed code and submission instructions.

Solutions for bots, spiders, crawlers, trawlers, ferrets, fetchers :

Google Search Solutions
Hosted search options and a search hardware product.

A complete indexing and searching system for a small domain or
intranet. Source code provided under the GPL.

UNIX Search Engine for searching entire file systems.

Provides SQL-based relational full-text retrieval, dynamic publishing,
object management, and web-indexing software.

Inktomi Search Engine
Service provides searching in hosted clusters for specific domains and
web sites.

Finds information in a related web of pages. Collects and indexes
pages based on traversal of links or subdirectories. Create a
context-sensitive search by category by linking to relevant pages.

Search and personalization software optimised for multimedia.

Provides installed and hosted site search software for web sites and

Fluid Dynamics Search Engine
Written in Perl. Online-manageable with a web browser.
Provides free private-label hosted search and application services for
web sites.

Develops web-based job and resume search software for career,
newspaper and employer sites.

Indexes as many as a few million URLs and searches for words and
phrases. Uses wildcards and Boolean operators. SWSoft released ASPSeek
under the GNU GPL.

Answer (6)

The contact details of the individuals are listed below. Their area of
expertise can be determined from their contact details (e.g.,
Artificial Intelligence for Dr. Vasant Honavar)

Dr. Vasant Honavar
Artificial Intelligence Research Laboratory
Department of Computer Science
210 Atanasoff Hall
Iowa State University
Ames, Iowa 50011-1040
voice: (515) 294-1098
fax: (515) 294-0258
web :

Dr. Pattie Maes
MIT Media Laboratory
Room E15-305B
20 Ames Street
Cambridge, MA 02139
+1-617-253-7442 [Voice]
+1-617-253-6215 [Fax]
+1-617-258-6264 [Alt. Fax]
Areas of expertise : Artificial Intelligence, Human Computer
Interaction, Computer Supported Collaborative Work, Information
Filtering and Electronic Commerce

Dr. Keith S. Decker
Associate Professor
Dept. of Computer and Information Sciences
University of Delaware
77 E. Delaware Ave. (the AI/NLP GreenHouse)
Newark, DE 19716-2586
(302) 831-1959 (office)
(302) 831-4091 (fax)
Areas : Distributed AI and Multi-Agent Systems

Dr. Milind Tambe
Associate Professor,
University of Southern California
Computer Science Dept
Henry Salvatori Computer Center
232, Los Angeles,
CA 90089-0781,
Tel: 213-740-6447
Fax: 213-740-7285,
web :
Areas : Multi-agents, distributed AI, TEAMWORK in multi-agent systems,
Adjustable autonomy, multi-agent collaboration, agent modeling, plan
recognition, intelligent agents in synthetic environments, constraint
satisfaction, rule-based systems, production match

Dr. Erik Warren Selberg
Home: 4815 36th Ave. NE
Seattle, WA 98115
(206) 517-3039
(206) 915-1472 (cell + voicemail)
Areas : Search Services. Implemented MetaCrawler, one of the first
World Wide Web meta search services.

Dr. Oren Zamir
Unable to get contact information. But, if you contact Dr. Selberg or
Dr. Etzioni, they would assist you, as Dr. Zamir also worked on the
MetaCrawler project

Dr. Oren Etzioni
Associate Professor
University of Washington
Computer Science & Engineering
Box 352350
Seattle, WA 98195-2350
Office: 209 Sieg Hall
Phone: (206) 685-3035
Fax: (206) 543-2969
Areas : Artificial Intelligence and Information Retrieval, for making
the Web easier to navigate

Dr. Richard Segal
IBM Thomas J. Watson Research Center
PO Box 704, Room H2-K20
Yorktown Heights, NY 10598
Areas : Bots

Dr. Marvin Minsky
MIT Media Lab and MIT AI Lab
Toshiba Professor of Media Arts and Sciences
Professor of E.E. and C.S., M.I.T
Areas : AI, cognitive psychology, mathematics, computational
linguistics, robotics, and optics

Dr. Charles Petrie
Executive Director
Stanford Networking Research Center (SNRC)
Areas : Distributed process coordination

Sabit Kraus
No information available. Maybe you got the name misspelled.

Dr. Craig Knoblock
Information Sciences Institute
University of Southern California
4676 Admiralty Way
Marina del Rey, CA 90292
Email: knoblock @
Voice: (310) 448-8786
Fax: (310) 822-0751
Areas : Artificial Intelligence and Information Integration

Dr. Steven Minton
Information Sciences Institute
University of Southern California
4676 Admiralty Way
Marina del Rey, CA 90292
Voice: (310) 822-1511 x275
Fax: (310) 822-0751
Areas : Artificial Intelligence, especially machine learning,
planning, scheduling, constraint-based reasoning and program

Marcus Zillman
Executive Producer / Host
Author of the Internet MiniGuides
(941) 434-5113
Email Address:
Area : Bot technology

I've probably covered every aspect of your questions. Please don't
hesitate to ask for a clarification if you have one. Honestly, I must
admit that I've enjoyed answering this question, since it lies within
my domain of interest.

Have a good day


Request for Answer Clarification by khill_cooter-ga on 27 Jun 2002 08:39 PDT

Thanks for the extremely great answers thus far.  Your level of
conciseness is exactly what I am looking for on all except (3) and
(4).  Let me better define what I am looking for to achieve my end

Answer (3):  You did a good job of providing the names of “AI” people,
but I am also looking for experts in “spiders” and  “trawlers” as
well.    Based on your answer of question (2), the field “bots” would
not need to be researched under this question, but my partner feels it
should.  Are there experts in just “bots” or do they specialize in the
different fields?  The reasoning behind all the questions is my
partner and I are looking at starting a business that would directly
involve these areas and we will potentially be flying to meet with
some of these experts.  It is very critical for my time and money’s
sake, that I get your expert research opinion on the top people in
these fields.  Could you provide me with the same type of contact
information as you did on answer (6)?

Answer (4):  You have shown me some companies with “Spidering”
technologies, but once again I am also looking for the use of “AI” and
“Trawlers”.    Again, I have done research on my own, but I am looking
for your expert research opinion.

Also, many thanks to insideinfo-ga for your input on my questions.  I
wish to give you a good rating as well.

Many Thanks


Clarification of Answer by aditya2k-ga on 27 Jun 2002 09:57 PDT
Hi khill_cooter,

   Thanks for your words of praise. Since you are planning to fly out
to meet the experts, I'm going to provide the contact information of,
may I use the word 'geniuses', in this field. Bots cover a wide range
of topics, so it is not possible to specialize in bots as a whole.
People specialize in certain areas where bots are of assistance. As
mentioned in my answer, a spider is a bot. Trawlers, ferrets, and
fetchers are either extensions or modifications of spiders, so an
expert in spiders is definitely an expert in these other technologies.
The people listed above are eminent enough. However, I'll
provide some more.

   Also, thanks to insideinfo-ga for his input.

Dr. Jakob Nielsen
"The Guru of Web Page Usability" (New York Times)
Dr. Jakob Nielsen
Nielsen Norman Group
48921 Warm Springs Blvd.
Fremont, CA 94539-7767
Office: Luice Hwang, tel. (408) 720-8808

Sergey Brin,
Escondido Village #22D
Stanford, CA 94305
Phone : +1-415-497-0753
Fax +1-415-725-7411 or +1-415-725-2588
Sergey Brin is a co-founder of Google

Rajeev Motwani
Department of Computer Science
Room 474
Gates Computer Science Building 4B
Stanford University
Stanford, CA 94305-9045
Phones: 650-723-6045 (office), 650-725-4671 (fax)

Doug Young
Chief Technology Officer, AltaVista Internet
AltaVista Company

Keith Golden
Autonomy and Robotics Area
Computational Sciences Division
NASA Ames Research Center
vox: 650-604-3585
fax: 650-604-3594

Dr. Neal Lesh
MERL Cambridge Research
Research Scientist
Phone:  (617) 621-7583

Companies which specialize in AI,

Online Speech Engines:
Creator of the Ramona speech engine (chatbot), but actually has an
eclectic interest in a large number of different AI technologies. Note
that for Kurzweil, AI means "Accelerating Intelligence". This site is
definitely worth checking out!

AI Consulting Companies:

AI consulting, especially to government agencies

Rule Based Engines:
Blaze Advisor Rules Engine

Java based

An expanding domain-independent rule-based technology. Applications
include: Financial Services, Contract Management, Expert Systems, etc.

Free for non-commercial applications

Web Based Agents:

Extempo Systems
Natural language communication on the Internet

Automated response and advice to email; now acquired by Firepond

Their Kana Classify product provides automated response and advice to

Large Scale Knowledge Bases :

Directory of AI Companies

Some of the Trawler companies :

Google Groups

Moreover News Technologies


I hope I have clarified your request. If anything further is to be
clarified, please don't hesitate to ask.

Subject: Re: AI-Bots-Crawlers-Spiders-Trawlers-Ferrets-Fetchers
From: insideinfo-ga on 27 Jun 2002 05:16 PDT
I have found some info on Oren Zamir:

On this page:

Another of your notable experts.

Etzioni, an associate professor, worked with the PhD student Oren
Zamir and wrote of him:

Dr. Oren Zamir (1999, Openratings). Oren's dissertation, Clustering
Web Documents: A Phrase-Based Method for Grouping Search Engine
Results, investigated the use of a novel and fast clustering algorithm
to group the results of Web search engines into easily-browsed
clusters. The most distinctive aspect of the algorithm was its
treatment of documents as strings of words, represented by a suffix
tree, in contrast with the standard vector-based representation.
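
For readers who want a feel for phrase-based grouping, here is a much simplified sketch. It is not the algorithm from the dissertation: instead of a suffix tree it uses a plain dictionary of short word sequences, and it skips the step of merging overlapping groups. The sample snippets and the function names are made up for illustration.

```python
from collections import defaultdict

# Hypothetical search-result snippets.
SNIPPETS = [
    "web search engines index pages",
    "meta search engines combine results",
    "chess programs play games",
]


def phrases(text, max_words=3):
    """All runs of 1..max_words consecutive words in a snippet."""
    words = text.lower().split()
    found = set()
    for start in range(len(words)):
        for length in range(1, max_words + 1):
            if start + length <= len(words):
                found.add(" ".join(words[start:start + length]))
    return found


def phrase_clusters(snippets, min_docs=2, min_words=2):
    """Group snippet numbers by the multi-word phrases they share."""
    docs_by_phrase = defaultdict(set)
    for doc_id, text in enumerate(snippets):
        for phrase in phrases(text):
            docs_by_phrase[phrase].add(doc_id)
    # Keep only phrases that are long enough and shared by enough snippets.
    return {
        phrase: sorted(doc_ids)
        for phrase, doc_ids in docs_by_phrase.items()
        if len(doc_ids) >= min_docs and len(phrase.split()) >= min_words
    }


if __name__ == "__main__":
    print(phrase_clusters(SNIPPETS))
```

The point of the illustration is that word order matters: snippets are grouped because they share the phrase "search engines", not merely because they contain the same words somewhere.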

Oren Zamir received a PhD in 1999 and you can find his dissertation

In that document I found his US and Israel contact info:

U.S. address: CS Department, University of Washington, Box 352350,
Seattle, WA 98195-2350, USA
Israel address: 22 Hatana'im st. apt. 23, Ramat-Aviv, Tel-Aviv 69209,

His web page is no longer working and the physical address on campus
may no longer receive his mail. I would try the Israel address, or
contact his fellow students or professors and ask for current contacts.
He worked with Oren Etzioni several times as you can see here:

Good Luck
Subject: Re: AI-Bots-Crawlers-Spiders-Trawlers-Ferrets-Fetchers
From: insideinfo-ga on 27 Jun 2002 05:29 PDT
I was able to find some info on another notable researcher - Sarit
Kraus. You had that first name as Sabit, which threw off the answerer
aditya2k-ga. I assumed that the last name was right and was able to
find her. I know several people with the last name Kraus and could not
think of another way to spell that name. She has a home page at:

And her contact info is: 

Dept. of Computer Science Bar-Ilan University
Ramat Gan, 52900 Israel 

 Office: Room 305 Math Building 
Phone: (972) 3-531-8762 
Fax: (972) 3-535-3325 
Subject: Re: AI-Bots-Crawlers-Spiders-Trawlers-Ferrets-Fetchers
From: panos-ga on 23 Apr 2004 10:40 PDT
You might want to know that Oren Zamir is currently working for Google
in their New York offices.

