Clarification of Question by martinbsp-ga on 29 Oct 2004 06:48 PDT
This thread is becoming a bit convoluted now, but this will be my last post.
What I am trying to do is build a search form for a specific technical area,
in this case the aerospace industry, that will allow certain members of our
technical committees to query only specific websites at the same time. So
for example, we need to keep track of the types of seminars and conferences
that are being run on the topic of aerospace.
Now, I have already spent a long time figuring out the best websites to
monitor for this information. So, instead of searching the whole of
Yahoo! for aerospace events, I can narrow my search down to only those
domains I choose. (I've chosen the Yahoo! index because Google queries are
limited to 10 words, the MSN index is still in its experimental stage and,
finally, because, as I understand it, Yahoo! probably has the next best
index after Google.)
So far so good. As you will see from the attached file I have created a
searchbox that will allow me to restrict searches to those domains I choose.
Furthermore, I have been able to restrict the search still further to a
particular path within each individual website, e.g.:
site:imeche.org.uk path:events_research research
searches only within http://www.imeche.org.uk/events_research.asp/
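To illustrate how a restricted query of this kind is put together, here is a minimal Python sketch. The function name build_restricted_query is hypothetical and only for illustration; the site: and path: operators are the ones used in the example above.

```python
def build_restricted_query(domain, path, terms):
    """Compose a query limited to one domain and, optionally, one path.

    Hypothetical helper: it only concatenates the operators shown in the
    example above and does not contact any search engine.
    """
    parts = ["site:" + domain]
    if path:
        # Only add the path: operator when a path restriction is wanted.
        parts.append("path:" + path)
    parts.append(terms)
    return " ".join(parts)


print(build_restricted_query("imeche.org.uk", "events_research", "research"))
```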
Now, the problem I have is that Yahoo! omits documents from the ranked
list of search results that it provides when it considers them 'very
similar' to ones already shown. On the last page of search results, Yahoo!
offers the possibility to "repeat the search with the omitted results
included":
"In order to show you the most relevant results, we have omitted some
entries very similar to the ones already displayed. If you like, you can
repeat the search with the omitted results included."
This is problematic, because much of the time users of the search engine
will never bother to click the "you can repeat the search with the omitted
results included" option. This means that in many cases the user will not
find:
- the oldest, authentic, master version of a document
- the newest, most recent version of a document
- any variation of a document that is actually required and indexed by the
search engine
The above builds in the assumption that:
(a) Yahoo! has indexed that version
(b) It is ranked high enough to appear in the omitted results
(c) etc
So, I need to be able to build something into my current HTML form so that
when a user conducts a search, any "omitted results" are automatically
expanded and available in the search results from the outset, without the
user having to click the "you can repeat the search with the omitted
results included" option.
If it helps I have come across what appears to be an answer:
add "&dups=1" to the URL
This solution is posited very briefly at:
http://www.unitedheroes.net/blogs/jr/archives/p/1064
but the poster does not have time to go into detail as to how this can
be worked into existing code.
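For reference, this is roughly what appending that parameter might look like, sketched in Python. The endpoint http://search.yahoo.com/search and the query parameter name p are assumptions on my part and are not confirmed by the linked post; dups=1 is the parameter quoted above, and whether Yahoo! honours it is exactly what I am asking about. The function name build_search_url is hypothetical.

```python
from urllib.parse import urlencode

# Assumed endpoint and query parameter name; not confirmed by the linked post.
SEARCH_URL = "http://search.yahoo.com/search"
QUERY_PARAM = "p"


def build_search_url(query, include_omitted=True):
    """Return a search URL, optionally asking for the omitted results."""
    params = {QUERY_PARAM: query}
    if include_omitted:
        # dups=1 is the parameter suggested in the post linked above.
        params["dups"] = "1"
    # urlencode percent-escapes the query and joins the pairs with "&".
    return SEARCH_URL + "?" + urlencode(params)


print(build_search_url("site:imeche.org.uk path:events_research research"))
```

In an HTML form submitted with GET, the equivalent would presumably be a hidden field named dups with the value 1, so that the browser adds it to the query string on every search.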
Phew!
Hope this explains things a bit better.