Dear Michael2,
Due to the prevalence of server-generated content, many search tools are
indeed equipped to index dynamic pages. The difficulty for a spider, as
you may well know, is to distinguish between the server-side scripting
code -- whether it's in PHP, ASP, or some other language -- and the page
that it generates for visitors to view. The right kind of search engine
in your situation comes with a spider that scans your site not in the
form of files but as a collection of web pages. Because such a spider
requests each page over HTTP, just as a browser would, the web server
executes any server-side scripts and returns the generated front-end
page, and it is this page that gets indexed.
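To make the idea concrete, here is a minimal sketch in Python of a spider that walks a site by following links rather than reading files. It is only an illustration, not the code used by any of the products mentioned in this answer. The function names (extract_links, http_fetch, crawl) and the page limit are my own assumptions.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag, urlparse
from urllib.request import urlopen


class LinkParser(HTMLParser):
    """Collect the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def extract_links(html, base_url):
    """Return absolute URLs, without fragments, for the links in html."""
    parser = LinkParser()
    parser.feed(html)
    links = []
    for href in parser.hrefs:
        absolute, _fragment = urldefrag(urljoin(base_url, href))
        links.append(absolute)
    return links


def http_fetch(url):
    """Request a page over HTTP, so the server runs any script behind it."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def crawl(start_url, fetch=http_fetch, max_pages=50):
    """Breadth-first crawl of one host; returns {url: generated HTML}."""
    host = urlparse(start_url).netloc
    queue = [start_url]
    pages = {}
    while queue and len(pages) < max_pages:
        url = queue.pop(0)
        if url in pages:
            continue
        html = fetch(url)  # what the visitor sees, not the script source
        pages[url] = html
        for link in extract_links(html, url):
            # Stay on the starting host and skip pages already fetched.
            if urlparse(link).netloc == host and link not in pages:
                queue.append(link)
    return pages
```

Calling crawl("http://www.example.com/") would return the generated HTML for each page reachable by links from the start page, which is the text a search engine would then index. A real spider would also honor robots.txt, handle HTTP errors, and throttle its requests. This sketch does none of that.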
I know of several purely commercial search tools, but I can't tell you
how much they cost because the publishers are cagey about pricing. This
is usually because in addition to selling you a software license, they
bundle it with the development services that are required to integrate
it with your website. In other cases, such as with the popular Atomz,
you aren't buying software at all but subscribing to a hosted service
that indexes your content for a monthly fee.
Atomz: Atomz Search
http://www.atomz.com/applications/search/
I think the superior solution, especially for a relatively small-scale
site, is to download a free search engine. Several high-quality solutions
are available at no cost along with adequate documentation. There will
still be some work involved to install the search engine, run the crawl,
and integrate the back end with your site's front end, but you'll be
in control of an open process instead of having it handled invisibly by
the software publisher for an exorbitant fee.
If your web site is hosted on a UNIX-compatible system, you'll be
able to use the bizarrely named but robustly developed ht://Dig search
engine. It is released under an open-source license that allows you to
reuse it in any application, including a commercial one, as long as you
do not impose additional restrictions on its use. ht://Dig integrates
well with HTML templates and is able to index dynamic pages.
ht://Dig: Home
http://www.htdig.org/
If you must use a non-UNIX platform or if you require a product with a
slightly less ridiculous name, you can turn to Swish-E. This free search
engine was developed under UNIX but also comes in a Windows version. Like
all good web-oriented search engines, it uses a true web spider to index
your site, so dynamic pages are parsed properly. The list of sample sites
that employ Swish-E is impressive evidence of its flexibility and ease
of application.
Swish-E: Home
http://swish-e.org/
Swish-E: Features
http://swish-e.org/current/docs/README.html#Key_features
Swish-E: Download
http://swish-e.org/Download/
Swish-E: Web Sites That Use SWISH-E
http://swish-e.org/sites.html
Perhaps the most popular single-site search engine is Namazu, developed
by Japanese programmers in a stereotypically clean and efficient style. It
integrates easily with web pages by means of a CGI script. Unfortunately,
it doesn't work properly with dynamic content because it constructs an
index based on your local file system and not on the pages served by
your site. Thus, I must regretfully advise you to ignore Namazu.
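A toy illustration in Python shows why indexing the file system fails for dynamic content. The PHP fragment, the function get_product_name, and the rendered page below are invented for the example, and the tokenizer is far cruder than a real indexer. The point is only that the words a visitor sees never appear in the script file itself.

```python
import re


def index_terms(text):
    """Return the set of lowercase words found outside of markup tags."""
    without_tags = re.sub(r"<[^>]*>", " ", text)
    return set(re.findall(r"[a-z]+", without_tags.lower()))


# What a file-system indexer reads: the script source stored on disk.
php_source = '<html><body><?php echo get_product_name($_GET["id"]); ?></body></html>'

# What a spider receives over HTTP: a page the server might generate.
rendered = "<html><body><h1>Blue Widget</h1></body></html>"

file_terms = index_terms(php_source)
served_terms = index_terms(rendered)
```

Here served_terms contains "widget", so a search for that word would find the page, while file_terms does not contain it because the product name exists only in the output of the script.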
If your site is hosted on a UNIX platform and if you insist on
a commercial solution that includes technical support from the
publisher, you might be interested in WebGlimpse. $250 will buy you a
single-server, single-domain license for a company with fewer than 50
employees. Unlimited end users are allowed, of course. WebGlimpse does
handle dynamic pages correctly.
WebGlimpse: Features
http://webglimpse.net/index.php?dir=subfeatures&page=features.html
WebGlimpse: Support
http://webglimpse.net/index.php?dir=subsupport&page=techsupport.html
WebGlimpse: Purchase
https://iwhome.com/webglimpse/
I leave the final decision up to you, but my personal choice as a web
developer would be Swish-E. It is popular, available on many different
platforms, offers enough power for a small-scale web site, and is readily
integrated with any web page using a brief CGI script.
Swish-E: swish.cgi -- Example Perl script for searching with the SWISH-E
search engine.
http://swish-e.org/current/docs/swish.html
I have enjoyed addressing this question on your behalf. Should you feel
that any part of my answer deserves correction or elaboration, please
let me know through a clarification request so that I have a chance to
fully meet your needs before you assign a rating.
Regards,
leapinglizard