Q: Search engine tool for web site (Answered, 5 out of 5 stars, 1 Comment)
Question  
Subject: Search engine tool for web site
Category: Computers > Internet
Asked by: michael2-ga
List Price: $30.00
Posted: 25 Nov 2004 13:46 PST
Expires: 25 Dec 2004 13:46 PST
Question ID: 434062
Our corporate web site includes some ASP pages as well as the usual
HTML pages.  The ASP pages are News pages which are essentially just
text stored in a database.  The only reason for using ASP rather than
HTML is, I think, that we can automatically include links to the two
or three most recent News pages on our home page.

I'm looking for a web site search tool we can add to our home page
which will automatically index and search both the HTML pages and the
text on the ASP pages.  For ease of use it should have just a single
search field to search both.

If no such tool is available, what about an indexer, running in the
background, automatically adding the HTML text into the database as
well?  We'd then only have to search the database.

The site is fairly small (a few hundred pages). I'm not necessarily
looking for something free.

Michael2
Answer  
Subject: Re: Search engine tool for web site
Answered By: leapinglizard-ga on 25 Nov 2004 20:08 PST
Rated: 5 out of 5 stars
 
Dear Michael2,

Due to the prevalence of server-generated content, many search tools are
indeed equipped to index dynamic pages. The difficulty for a spider, as
you may well know, is to distinguish between the server-side scripting
code -- whether it's in PHP, ASP, or some other language -- and the page
that it generates for visitors to view. The right kind of search engine
in your situation comes with a spider that scans your site not in the
form of files but as a collection of web pages. Thus, any server-side
scripts it may come across are executed by the target web server to
obtain the front-end web page that should be indexed.
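
To make the distinction concrete, here is a minimal sketch in Python
of how such a spider sees a page. It is only an illustration under my
own assumptions, not the implementation of any particular product: the
names PageExtractor, extract, and fetch are hypothetical, and
www.example.com stands in for your site. The point is that the page
is requested over HTTP, so the server has already run any ASP code
and the spider receives only the finished HTML.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen


class PageExtractor(HTMLParser):
    """Collects visible text and outgoing links from served HTML."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.text_parts = []
        self.links = []
        self._skip_depth = 0  # greater than 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Resolve relative links such as "news.asp?id=7"
                self.links.append(urljoin(self.base_url, href))

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only text a visitor would actually read
        if self._skip_depth == 0 and data.strip():
            self.text_parts.append(data.strip())


def extract(html, base_url):
    """Return (visible_text, absolute_links) for one served page."""
    parser = PageExtractor(base_url)
    parser.feed(html)
    parser.close()
    return " ".join(parser.text_parts), parser.links


def fetch(url):
    """Request a page from the web server, as a browser would.

    Because the request goes through the server, a static .html page
    and a database-backed .asp page both come back as plain HTML.
    """
    with urlopen(url) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")
```

A crawl would start with something like
extract(fetch("http://www.example.com/"), "http://www.example.com/"),
store the returned text for indexing, and repeat for each returned
link that stays on the same site.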

I know of several purely commercial search tools, but I can't tell you
how much they cost because the publishers are cagey about pricing. This
is usually because in addition to selling you a software license, they
bundle it with the development services that are required to integrate
it with your website. In other cases, such as with the popular Atomz,
you aren't buying software at all but subscribing to a hosted service
that indexes your content for a monthly fee.


Atomz: Atomz Search
http://www.atomz.com/applications/search/


I think the superior solution, especially for a relatively small-scale
site, is to download a free search engine. Several high-quality solutions
are available at no cost along with adequate documentation. There will
still be some work involved to install the search engine, run the crawl,
and integrate the back end with your site's front end, but you'll be
in control of an open process instead of having it handled invisibly by
the software publisher for an exorbitant fee.

If your web site is hosted on a UNIX-compatible system, you'll be
able to use the bizarrely named but robustly developed ht://Dig search
engine. It is released under an open-source license that allows you to
reuse it in any application, including a commercial one, as long as you
do not impose additional restrictions on its use. ht://Dig integrates
well with HTML templates and is able to index dynamic pages.


ht://Dig: Home
http://www.htdig.org/


If you must use a non-UNIX platform or if you require a product with a
slightly less ridiculous name, you can turn to Swish-E. This free search
engine was developed under UNIX but also comes in a Windows version. Like
all good web-oriented search engines, it uses a true web spider to index
your site, so dynamic pages are parsed properly. The list of sample sites
that employ Swish-E is impressive evidence of its flexibility and ease
of application.


Swish-E: Home
http://swish-e.org/

Swish-E: Features
http://swish-e.org/current/docs/README.html#Key_features

Swish-E: Download
http://swish-e.org/Download/

Swish-E: Web Sites That Use SWISH-E
http://swish-e.org/sites.html


Perhaps the most popular single-site search engine is Namazu, developed
by Japanese programmers in a stereotypically clean and efficient style. It
integrates easily with web pages by means of a CGI script. Unfortunately,
it doesn't work properly with dynamic content because it constructs an
index based on your local file system and not on the pages served by
your site. Thus, I must regretfully advise you to ignore Namazu.


If your site is hosted on a UNIX platform and if you insist on
a commercial solution that includes technical support from the
publisher, you might be interested in WebGlimpse. $250 will buy you a
single-server, single-domain license for a company with fewer than 50
employees. Unlimited end users are allowed, of course. WebGlimpse does
handle dynamic pages correctly.


WebGlimpse: Features
http://webglimpse.net/index.php?dir=subfeatures&page=features.html

WebGlimpse: Support
http://webglimpse.net/index.php?dir=subsupport&page=techsupport.html

WebGlimpse: Purchase
https://iwhome.com/webglimpse/


I leave the final decision up to you, but my personal choice as a web
developer would be Swish-E. It is popular, available on many different
platforms, offers enough power for a small-scale web site, and is readily
integrated with any web page using a brief CGI script.

Swish-E: swish.cgi -- Example Perl script for searching with the SWISH-E
search engine.
http://swish-e.org/current/docs/swish.html
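
To show what a single search field does behind the scenes, here is a
minimal sketch in Python. It is not Swish-E's code or index format,
just a toy inverted index under my own assumptions: the function names
tokenize, build_index, and search are hypothetical, and the page text
is assumed to have been gathered by a spider beforehand. Since every
page is reduced to a URL and its text, static HTML pages and ASP News
pages end up in the same index and are searched together.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Split text into lowercase words, ignoring punctuation."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """Build an inverted index from a mapping of URL -> page text.

    The result maps each word to the set of URLs that contain it.
    """
    index = defaultdict(set)
    for url, text in pages.items():
        for word in tokenize(text):
            index[word].add(url)
    return index


def search(index, query):
    """Return the sorted URLs of pages containing every word in the query."""
    words = tokenize(query)
    if not words:
        return []
    hits = set(index.get(words[0], set()))
    for word in words[1:]:
        hits &= index.get(word, set())
    return sorted(hits)
```

For example, after indexing a static page whose text is "About our
company" and a News page whose text is "Company opens new office",
the query "company" matches both pages and the query "new office"
matches only the News page. A real engine adds ranking, stemming, and
an on-disk index, which is the work a tool like Swish-E does for you.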


I have enjoyed addressing this question on your behalf. Should you feel
that any part of my answer deserves correction or elaboration, please
let me know through a clarification request so that I have a chance to
fully meet your needs before you assign a rating.

Regards,

leapinglizard
michael2-ga rated this answer: 5 out of 5 stars and gave an additional tip of $10.00
Excellent work, LL. That was just what I needed.

Comments  
Subject: Re: Search engine tool for web site
From: leapinglizard-ga on 26 Nov 2004 04:37 PST
 
Thank you for the rating and the generous tip.

leapinglizard
