Q: using exclusion terms in query but getting more pages back? ( Answered 4 out of 5 stars,   3 Comments )
Question  
Subject: using exclusion terms in query but getting more pages back?
Category: Computers > Internet
Asked by: wattle-ga
List Price: $25.00
Posted: 02 Nov 2002 22:31 PST
Expires: 02 Dec 2002 22:31 PST
Question ID: 97106
Hi there.  I use Google a lot for my research.  An intermittent
problem I have is when I use an exclusion term (e.g., -word) to refine
my query.
 Sometimes it actually seems to increase the number of pages I get
instead of reducing them. Can you let me know why?  Is it something
I'm doing wrong?  Any query-gurus out there experience something
similar?  I've put some examples below...
Query 1: information systems 
Google returns: 6,810,000

Query 2: information systems -imagine
Google returns more: 7,020,000

Query 3: information systems -barcelona 
Google returns less: 6,460,000

Query 4: information systems -database
Google returns less: 6,070,000

Query 5: information systems -baracuda 
Google returns an unusual amount less: 712,000

Query 6: information systems -restaurant
Google returns more again: 7,600,000
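
As a rough aid for reading the figures above, this short Python sketch computes how far each reported count sits from the Query 1 baseline. The counts are copied from the queries listed here; the function and variable names are illustrative only.

```python
# Reported result counts copied from the queries above.
BASELINE = 6_810_000  # Query 1: information systems

REPORTED = {
    "-imagine": 7_020_000,
    "-barcelona": 6_460_000,
    "-database": 6_070_000,
    "-baracuda": 712_000,
    "-restaurant": 7_600_000,
}


def relative_change(count, base):
    """Percentage change of `count` relative to `base`."""
    return (count - base) / base * 100.0


if __name__ == "__main__":
    for term, count in REPORTED.items():
        change = relative_change(count, BASELINE)
        print(f"{term:<12} {count:>10,} {change:+6.1f}%")
```

A positive percentage marks a query where adding an exclusion term raised the reported count, which is the puzzling case.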

I still think Google is the best search engine around and am
continuing to use it in preference to any other one.  But this has
confused me enough to want to ask.  Let me know when you have a
chance.  Is it something wrong with the way I am writing the query?

thanks in advance.

Clarification of Question by wattle-ga on 05 Nov 2002 07:27 PST
Hi again - I really hope someone can answer this question.  
I am a new Google Answers user.  I just re-read all the terms and
noticed that we are not supposed to ask proprietary details about
Google - hence this clarification.  Please just answer the question as
best you can without going into more detail about Google than you need
to.  Perhaps this problem is pervasive across all search
engines/information retrieval schemes?  Anyway, if you can get me an
answer, I would be most grateful.  As sublime1-ga noticed, it doesn't
seem to make a difference whether I use single query terms or phrases.
 I figure I can't be the first person to notice this....

thanks again.  wattle..
Answer  
Subject: Re: using exclusion terms in query but getting more pages back?
Answered By: tox-ga on 07 Nov 2002 18:25 PST
Rated:4 out of 5 stars
 
Hello there,

Thank you for your question.  I will explain the reasoning as clearly
as I can without running afoul of the Google disclaimer.

I can understand why this is puzzling: if a certain number of results
contain information systems, how can there logically be MORE results
once you add a restriction?  Let me start off by saying that there is
nothing wrong with the way you’re writing the query.  This situation
is more noticeable with Google than with other major search engines
because of differences in how they search and how they display
results.
There are two things you need to be aware of to understand this
"phenomenon".

First of all, the result numbers are approximations.  That means that
when two searches retrieve a similar number of results, the reported
numbers can come out in either order.  At the time of writing (and
this changes frequently), information systems alone returns 8,640,000
results, and all the other searches you listed return fewer results,
except for information systems -restaurant, which gives 8,730,000.
These numbers are not completely accurate, since the results are not
individually counted.  In truth, the unrestricted search matches more
pages, but in cases such as this one the approximation happened to
give a higher number for the restricted search.  In cases like this,
it is relatively safe to assume that the true result counts of the two
searches are quite similar.

Second, the results you can actually view are filtered.  If you count
the results that Google lets you view (by going to the very last page
of the search results), you will see that you have 997 results for the
first search and 998 results for the second.  This is determined by
the so-called “relevancy” of the results and by the search system
Google uses: Google deems about 997 pages worth viewing for the first
search and 998 for the second.  While one would expect more for the
first search, the relevancy ranking and filtering trim result sets
that were numerically similar to begin with, and produce numbers that
seem completely illogical.
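
To make the approximation point concrete, here is a minimal Python sketch. It is not how Google or any other engine actually computes its figures. It simply assumes that each reported count is the true count perturbed by independent Gaussian noise of a few percent (the `rel_error` parameter, the noise model, and the "true" counts are all made up for illustration) and measures how often the narrower query ends up with the larger reported number.

```python
import random


def noisy_estimate(true_count, rel_error, rng):
    # Hypothetical model: reported count = true count with Gaussian
    # relative noise. Real engines may estimate quite differently.
    return true_count * (1.0 + rng.gauss(0.0, rel_error))


def inversion_rate(broad, narrow, rel_error, trials=10_000, seed=0):
    """Fraction of trials where the narrower query is reported as larger."""
    rng = random.Random(seed)
    inversions = 0
    for _ in range(trials):
        narrow_reported = noisy_estimate(narrow, rel_error, rng)
        broad_reported = noisy_estimate(broad, rel_error, rng)
        if narrow_reported > broad_reported:
            inversions += 1
    return inversions / trials


if __name__ == "__main__":
    # Made-up true counts: excluding a rare word removes few pages,
    # so the two true totals are close and inversions are common.
    print(inversion_rate(8_640_000, 8_600_000, rel_error=0.05))
    # Excluding a common word removes many pages, so the totals are far
    # apart and inversions become very rare.
    print(inversion_rate(8_640_000, 6_000_000, rel_error=0.05))
```

Under these assumptions, two queries whose true totals are close swap order in a large share of trials, while queries whose totals differ widely almost never do. That matches the pattern in the question, where excluding a word that appears on few of the matching pages can show a higher figure.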

I hope this helps,
If you have any further questions, please do not hesitate to ask for
clarification.
Thank you

P.S. Your opinion of Google as the best search engine available is
shared by millions of other netizens around the world.  You seem to
have keen research and observation skills, as you’ve noticed something
that most others would have scrolled past.

Tox-ga

Request for Answer Clarification by wattle-ga on 07 Nov 2002 22:04 PST
Hi - thanks Tox for your answer.   I am currently snowed under so
can't give a proper response now.  I guess this was the most likely
answer - random error in the figures for number of pages returned -
but I just want to sit on it for a day or two before I accept it.  It
was interesting how you went to the last few pages of the response set
to actually check what was there.  I hadn't thought to do that.

OK - will get back in a couple of days' time.  Will give a rating then
too.  If anyone else out there has comments, feel free to add.

thanks again. Andrew.

Clarification of Answer by tox-ga on 09 Nov 2002 08:31 PST
Looking through the estimation algorithms in other search engines'
code and testing them led to the same conclusion: a basic estimation
error that occurs when two result counts are similar.
I hope this helps,
Tox-ga
wattle-ga rated this answer:4 out of 5 stars
Tox - thanks for your research and for responding to my clarification.
 They were both much appreciated.  I believe your answer is correct
too.  I have given you 4 rather than 5 points for two reasons:
1. The question took longer to be answered than I hoped.  This is
probably unreasonable on my part.  I am sure you guys get heaps of
questions and it might have just taken a while to get to it.  Anyway,
I was getting impatient and felt that the only way for it to get
answered quickly was to put more money on it - hence me increasing my
payment from $10 to $25.  Because you answered so quickly after this I
wondered whether I could have just sat patiently for a couple of days
more and saved myself $15 :-)  (my self-interest and lack of money
shows here...)
2.  I felt your answer could have been a little more detailed.  For
example, is there a "confidence interval" around the 'true' figure for
total pages returned (on Google or any other search engine)?  Does
this estimation problem vary in direct proportion to the size of the
number of pages returned?  Or does the estimation problem follow an
exponential (or other) distribution with the problem getting more than
proportionally worse as the number of pages increases?  Does it matter
if the queries are commonly run queries on the search engine or rare
ones?  How does the estimation of "relevant" pages that you referred
to work and how does this spew out such odd results?  etc etc....  I
don't know if you would be able to work out such details or if I would
need to pay more for that kind of stuff - I guess I was just hungry
for a detailed answer.  I hoped to get a little more for $25, but did
appreciate the answers you gave.
thanks.  

wattle..

Comments  
Subject: Re: using exclusion terms in query but getting more pages back?
From: sublime1-ga on 03 Nov 2002 22:55 PST
 
wattle...

I thought it might make a difference if you enclosed
the phrase in quotation marks, as in "information systems",
but I saw the same higher numbers when adding
the -modifier. This is a very intriguing question.
Subject: Re: using exclusion terms in query but getting more pages back?
From: mathee-ga on 08 Nov 2002 18:14 PST
 
This question made me quite curious, so I emailed several search
engine/portal developers to get their opinion on this, and tox is
correct.  The mix-up is in fact caused by the estimation: the reported
numbers came out in the wrong order because the result counts are so
similar.  That's one mystery solved...interesting question though.
Subject: Re: using exclusion terms in query but getting more pages back?
From: tehuti-ga on 03 Dec 2002 15:19 PST
 
Wattle,

I just want to explain a little about the dynamics of how questions
are answered. The researchers here are independent contractors.  We
pick up on questions as and when we want, and as and when we log in to
the site. We do not work to any daily or weekly norm, and we are not
assigned questions. If a question sits unanswered for a couple of
days, or maybe for longer, that could be because the price is not
right for the work that could be involved; because a researcher with
knowledge/interest in the subject has not yet seen it; or because the
researchers do not believe they can provide a satisfactory answer
(people don't like to get "no" as an answer!).  So yes, increasing
your price probably did increase the interest of researchers in your
question.  However, that is not to say that tox sat waiting for you to
put up the price.  Perhaps s/he only noticed your question on 7
November. Just wanted to clarify this to you and any other GA user who
might happen on this question.
