Not exactly sure why this is happening, but a couple of folks have started pointing bots at my site. No, they're not search engine bots; they show up in the server logs with user agents like NetSpider, WebReaper, and a few others. I looked at ways of blocking crawlers, and I found that I can modify my server's .htaccess file to deny these agents access.
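Roughly, what I have now looks like the following (a minimal sketch using mod_setenvif; the agent strings are just the ones from my logs):

    # Tag requests whose User-Agent matches a known crawler,
    # then deny anything carrying that tag.
    SetEnvIfNoCase User-Agent "NetSpider" bad_bot
    SetEnvIfNoCase User-Agent "WebReaper" bad_bot
    Order Allow,Deny
    Allow from all
    Deny from env=bad_bot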
But this seems very easy to work around, since a crawler can simply change the user agent string it sends. What I'd really like is a way to gradually throttle traffic.
That is, once a client requests an unreasonable number of pages, we'd limit them to one page every 5 seconds or so. Any ideas on ways to do that with an Apache server? I've got full access to the server, so I can install modules, etc., if that helps; the sketch below is the kind of thing I've been considering. Many thanks.
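One idea I've run across but haven't tried is mod_evasive. If I understand its directives, it blocks an offender outright for a period once a request threshold is exceeded, rather than smoothly slowing it to one page per 5 seconds, so it's only an approximation of what I want. Something like this (the numbers are guesses on my part):

    <IfModule mod_evasive20.c>
        DOSHashTableSize   3097
        DOSPageCount       5    # requests for the same page...
        DOSPageInterval    1    # ...allowed per this many seconds
        DOSSiteCount       50   # requests site-wide...
        DOSSiteInterval    1    # ...allowed per this many seconds
        DOSBlockingPeriod  10   # then block the client for 10 seconds
    </IfModule>

Is that the right tool here, or is there something that does real per-client rate limiting?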