Q: Google Page Rank Technology (Answered, 4 out of 5 stars, 3 Comments)
Question  
Subject: Google Page Rank Technology
Category: Computers > Internet
Asked by: bgrorud-ga
List Price: $3.50
Posted: 12 Jul 2002 09:50 PDT
Expires: 11 Aug 2002 09:50 PDT
Question ID: 38940
Why, when I do a Google search for "History of the national football
league", is the first hit the nba.com homepage?  If you look at the
cache of that page, the only word from my search that appears on it is
"history"; the rest of the words in my search appear only in links
pointing to that page.  I would like to know why Google's PageRank
thinks this is the most relevant link.  It seems to me there may be a
flaw somewhere in PageRank (say it ain't so).

Thanks
Answer  
Subject: Re: Google Page Rank Technology
Answered By: missy-ga on 12 Jul 2002 15:40 PDT
Rated: 4 out of 5 stars
 
Hello bgrorud,

When one searches on [ History of the national football league ] with
Google, the first hit is the NFL, CBS.Sportsline.com page:

Google Search Results
http://www.google.com/search?q=History%20of%20the%20national%20football%20league&sourceid=mozilla-search&start=0&start=0&ie=utf-8&oe=utf-8


Searching on [ "History of the national football league" ]  (note the
quotes) produces the following results:

Google Search Results
http://www.google.com/search?hl=en&lr=&ie=UTF-8&oe=utf-8&q=%22History+of+the+national+football+league%22&btnG=Google+Search

The first hit is far more relevant than the Sportsline page - it's the
history of the Super Bowl.
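
As a side note, the only real difference between the two result URLs
above is the pair of double quotes, which show up URL-encoded as %22.
A minimal Python sketch of how such a URL can be built follows.  The
helper name build_search_url is made up for illustration, and the extra
parameters in the original links (sourceid, ie, oe and so on) are left
out:

```python
from urllib.parse import urlencode


def build_search_url(query, base="http://www.google.com/search"):
    """Return a search URL for `query` (hypothetical helper, not a Google API)."""
    # urlencode turns spaces into '+' and double quotes into '%22'.
    return base + "?" + urlencode({"q": query})


plain_url = build_search_url("History of the national football league")
phrase_url = build_search_url('"History of the national football league"')
print(plain_url)
print(phrase_url)
```

The second URL carries %22 at both ends of the query, which is what
tells the search engine to treat the words as one phrase.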

Google explains Page Rank Technology thusly:

"In essence, Google interprets a link from page A to page B as a vote,
by page A, for page B. But, Google looks at more than the sheer volume
of votes, or links a page receives; it also analyzes the page that
casts the vote. Votes cast by pages that are themselves "important"
weigh more heavily and help to make other pages "important."

Our Search: Google Technology
http://www.google.com/technology/index.html

The page goes on further to say that page ranking is determined not
only by the content of the page in question and the number of pages
linking to it, but also by the *content of the pages which link to it*.
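
To make the "voting" idea concrete, here is a toy sketch in Python of
the kind of iterative calculation usually described for PageRank.  It
is only an illustration: the page names are invented, the damping value
of 0.85 is a commonly cited figure rather than anything stated on
Google's page, and the sketch ignores page content entirely.  It should
not be read as Google's actual implementation.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Toy link-vote ranking. `links` maps a page to the pages it links to."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page starts each round with a small baseline score.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its current score among the pages it votes for.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its score evenly.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Invented example: "hub" is linked from three pages, "loner" from none.
toy_web = {
    "fan1": ["hub"],
    "fan2": ["hub"],
    "fan3": ["hub"],
    "hub": ["x"],
    "loner": ["y"],
    "x": [],
    "y": [],
}
ranks = pagerank(toy_web)
print(ranks["x"] > ranks["y"])
```

Here x and y each receive exactly one link, but x ends up with the
higher score because its single vote comes from hub, which is itself
heavily linked.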

If you ran your search as in the first example (no quotes), it's
entirely possible that at that time (you don't say when you ran this
search), the content of the pages linking to the NBA page included
several instances of the words in your query - not necessarily in the
order in which you entered them, either.

Searching with no quotes yields a search for all instances of the
words in your query, except for common words such as "an", "of", "or",
"the".  "Votes" would be cast for *each word individually* in your
query.

For more relevant results, you need to search on a more specific
query.  By using the second example (with quotes), the *entire phrase*
is searched for, not the individual words.  "Votes" are cast for the
entire phrase, not its individual words, yielding a much more relevant
result.

Google explains:

"Search for complete phrases by enclosing them in quotation marks.
Words enclosed in double quotes ("like this") will appear together in
all results exactly as you have entered them. Phrase searches are
especially useful when searching for famous sayings or proper names."

Phrase Searches
http://www.google.com/help/refinesearch.html

(The rest of the page explains how other "operators" can be used in
your searches to obtain the best search results.)
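
The difference between the two kinds of search can be sketched in
Python.  This is a simplified illustration under stated assumptions,
not Google's matching logic: the stop-word list is just the four words
named above, the two sample texts are invented, and the function names
are made up.

```python
import re

# Only the common words mentioned in the answer; a real list would be longer.
STOP_WORDS = {"an", "of", "or", "the"}


def tokenize(text):
    """Lower-case the text and split it into words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def matches_all_words(query, document):
    """Unquoted search: every non-common word must appear somewhere."""
    doc_words = set(tokenize(document))
    terms = [word for word in tokenize(query) if word not in STOP_WORDS]
    return all(term in doc_words for term in terms)


def matches_phrase(phrase, document):
    """Quoted search: the words must appear together, in order."""
    wanted = tokenize(phrase)
    words = tokenize(document)
    return any(
        words[i:i + len(wanted)] == wanted
        for i in range(len(words) - len(wanted) + 1)
    )


query = "History of the national football league"
scattered = ("The national basketball league has a long history, "
             "and football fans follow it too.")
exact = "A short history of the National Football League."

print(matches_all_words(query, scattered), matches_phrase(query, scattered))
print(matches_all_words(query, exact), matches_phrase(query, exact))
```

The first text contains all the individual words, so it matches the
unquoted search even though it is about something else; only the second
text matches the quoted phrase.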

The more specific and refined your searches, the more relevant your
results will be.

For more information and help with searching, try these pages:

Basic Help
http://www.google.com/help/basics.html

Google Special Search Features
http://www.google.com/help/features.html

Each of these pages will help you get the most out of your searches.

Hope this helps!

missy-ga

Clarification of Answer by missy-ga on 12 Jul 2002 20:08 PDT
Hi,

I noticed there had been activity on this question, so I had a look to
see what was posted.  I looked at cogpsych's comment, thought it odd
that he was getting the NBA result, so I executed the search again,
and took a screenshot of the result (it's full page and unedited, so
it will take some time to load):

http://blake.prohosting.com/~woozle/extras/NFL1.jpg

Then, wondering if something else might be odd, I cleared both memory
and disk cache in Mozilla, and executed the search again.  I took
another screenshot:

http://blake.prohosting.com/~woozle/extras/NFL2.jpg

!!!

OK, I will be the first one to admit that I have absolutely NO idea
why this has occurred.  I am currently confounded.  Let me go pick the
brains of my techie acquaintances (and the Google people!) to find out
precisely what's going on with this oddity.  I ask for a little
patience, please.

Confusedly yours,

missy-ga

Clarification of Answer by missy-ga on 13 Jul 2002 20:18 PDT
Hi bgrorud!

This has been quite an interesting puzzle to figure out!

First:  What I wrote previously about how page rank works - that's
still true.  To get the most relevant results for your search, use the
operators.  Otherwise, you're going to get pages ranked by the
individual words in the query, instead of by the information you're
actually looking for.  The more refined and specific the search, the
more relevant and useful your results will be.  It's all about
specificity!

Here's a bit more about how PageRank works:

Google's PageRank and how to make the most of it 
http://webworkshop.net/pagerank.html


I'm clarifying for you to explain how it is that the search results I
got were entirely different from the search results you and cogpsych
got.  I think (hope!) you'll find the answer interesting.  I was
certainly fascinated!

First, have a look at this screenshot:

http://blake.prohosting.com/~woozle/extras/googledance.jpg 

So how did I get that?

Well, after getting the odd results posted yesterday, I queried my
fellow Researchers.  Mother speculated, then Larre and Xemion
confirmed, that we got different results because we were hitting
different Google servers (Google balances the search load between
several servers).  Till filled in the missing bit - the "Google
Dance".

Go here:

Google Dance Machine
http://google-dance.miniunternehmen.de/ 

The Google Dance Machine allows you to query several Google servers
worldwide and compare search results for each.  I set the Google Dance
Machine to query (from the left) www:com, www2:com, www3:com (all US),
www:de (Germany), www:ch (Switzerland), www:at (Austria), www-va2:com
(Virginia 2), www-dc2:com (Washington DC 2), www:ca (Canada) and
www:lt (Lithuania), then searched on
[ The history of the national football league ] (no quotes).  As you
can see, three of the ten Google servers queried showed the results
you received.  The other seven show the results I received when I
answered your question yesterday.
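
The comparison the Google Dance Machine makes can be sketched in a few
lines of Python.  This is an illustration under assumptions, not how
the tool is actually written: the server names and hit labels below are
placeholders, and nothing is fetched from Google.  It simply groups
servers by the results they returned; when only one group is left,
every server agrees.

```python
from collections import defaultdict


def group_by_results(results_by_server):
    """Group server names by the (ordered) list of hits each one returned."""
    groups = defaultdict(list)
    for server, hits in results_by_server.items():
        groups[tuple(hits)].append(server)
    return dict(groups)


def dance_is_over(results_by_server):
    """True when every server returned the same results."""
    return len(group_by_results(results_by_server)) <= 1


# Placeholder data mirroring the 3-versus-7 split described above.
sample = {}
for i in range(1, 4):
    sample[f"server-{i}"] = ["NBA.com"]
for i in range(4, 11):
    sample[f"server-{i}"] = ["NFL page at CBS SportsLine"]

for hits, servers in group_by_results(sample).items():
    print(hits[0], "<-", len(servers), "servers")
print("Dance over?", dance_is_over(sample))
```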

Weird, huh?

Well, no, not really, now that my colleagues have reminded me that
more than a couple of servers handle the Google traffic.  So what's
the deal?

It's the Google Dance.  
 
James Kendall at SEO Today explains the Google Dance:

"The Google Dance

Google has at least four different indexes that are manipulated to
test different results when the Google Dance is on.  During the dance
you can get a glimpse of how your site ranks in their different
indexes.  If you do a search in the default Google index around the
time of an update, and then do the same search in one of the other
indexes, you will get different results.  For example, on February 22
you could have gone to http://www2.google.com/ to preview the new
Google index and compared the current listing (www.google.com) to next
month's (www2.google.com).

The time surrounding Google's updates is normally referred to as the
Google Dance because the databases are switched around and back a
couple of times before things become stable."

It's All About Google - SEO Today, February 26, 2002
http://www.seotoday.com/browse.php/category/articles/id/173/index.php

(Mr. Kendall's article also talks a little more about PageRank, a
little further down the page.)

When all the results match up on all the servers, the Dance is over.

Looks like Google's doing the Hustle, and that's why we got different
results.

Here is a little more about The Google Boogie:

Woz's "Google Pokey":

"(to the tune Hokey Pokey)
<start the music please>

They put some pages in,
They take some pages out,
They calculate the PageRank,
Then they shake it all about,

They wait until the full moon
Then they mix the servers up,

Thats what its all about."

The Update - What is it, exactly?
http://www.webmasterworld.com/forum3/3487.htm

Google Update -*Google Dance*
http://www.linktree.info/googleupdate.php


I hope this has cleared things up!  It was certainly fun to get to the
bottom of it!

missy "googlin' all about" -ga

(My colleagues Mother, Larre, Xemion and Till all have my grateful
thanks for helping me piece this puzzle together!)

Request for Answer Clarification by bgrorud-ga on 13 Jul 2002 21:03 PDT
I am actually not requesting a clarification; I just thought I would
give you a little background on this.  I actually couldn't care less
about the history of the National Football League.  The only reason I
did the query was that some coworkers were wondering if Google was
powering the AOL search engine yet, and that was the first query that
came to mind.  I tried it first at search.aol.com and then at Google
to see if the results were similar.  And then this whole thing started
with getting nba.com.  I appreciate the work you did, because I knew
more people than just me and my coworkers would think this incredibly
odd.  And even though it wasn't why I posted the question, the whole
thing about the Google Dance was cool.  I guess with NBA being listed
first on some searches, it is possible that, under rare circumstances,
Google puts too much weight on links pointing to a page and gives you
a poor hit.

Thanks
Brad

Clarification of Answer by missy-ga on 13 Jul 2002 21:12 PDT
Hi Brad!

I'm glad you found the Google Dance interesting!

With respect to the "poor hit", just remember that to get the most
relevant results, you need to "tell" Google exactly what you're
looking for.  The difference between quotes and no quotes is sometimes
quite vast!

Happy Googling!

Missy
bgrorud-ga rated this answer: 4 out of 5 stars
Put a lot more time and effort into this than I could have, even
though I wanted to.  Thanks

Comments  
Subject: Re: Google Page Rank Technology
From: cogpsych-ga on 12 Jul 2002 19:00 PDT
 
I ran bgrorud's search earlier today when I first saw the question,
plus I just ran it again using the first link provided by missy. The
first hit both times was NBA.com, just like bgrorud found. The CBS
Sportsline page is the second hit. It is a little bizarre...
Subject: Re: Google Page Rank Technology
From: cogpsych-ga on 12 Jul 2002 19:06 PDT
 
Quick follow-up... I think I just figured out the discrepancy. If you
do a search for history of the national football league (no
quotations) on www.google.ca (Canadian site) you get the NFL
CBS.Sportsline page as the first hit. If you replace the .ca with .com
in the address bar and refresh the page, NBA.com is the first hit. I
guess PageRank Technology works better in Canada :)
Subject: Re: Google Page Rank Technology
From: bgrorud-ga on 13 Jul 2002 21:14 PDT
 
Thanks to everyone who looked into this, it sure was fun.
Check out pictures of my 2-year-old son at www.grorud.org
Happy Googling...
