Subject:
What English letter do most commonly searched for keywords begin with.
Category: Computers > Algorithms
Asked by: gomvents-ga
List Price: $10.00
Posted: 07 Oct 2004 17:28 PDT
Expires: 06 Nov 2004 16:28 PST
Question ID: 411782
Which English letter do the most commonly searched-for keywords begin with? Not just the top 50, 100, 10000000, etc., but enough to fill about 40TB of storage. I'm looking to create a search engine database system that groups clusters by letter, i.e. a search for "cars" would pull from the C server. If, for example, words that begin with X, Y, and Z are not common, I may group them on one server. The idea behind the system is to distribute load. Thanks!
There is no answer at this time. |
|
Subject:
Re: What English letter do most commonly searched for keywords begin with.
From: silver777-ga on 07 Oct 2004 22:41 PDT |
Hi Gomvents, Interesting. Sounds like a G**gle trade secret to me. Most common word hits could only be collated from the experience of another search engine like the aforementioned. That ain't something I would willingly share with another, considering the data gathered. It might sound simple, but how about you start with the most frequently used letters, to determine your own number of common searches beginning with a given letter? That is, in order, ETOANIRSHDLCWUMFYGPBVKXQJZ. You might find that cars have fewer hits than elephants. Unless of course the elephants are hitting the cars. Phil
Subject:
Re: What English letter do most commonly searched for keywords begin with.
From: frde-ga on 08 Oct 2004 01:00 PDT |
Although that initially sounds like a good idea, it has drawbacks. Most searches will probably consist of two or more words, which means that one query will need to attack two or more servers. You would probably be better off replicating your database across a number of identical machines. Your idea of doing a 'transformation' is quite a good one, and is certainly easily viable for the first two letters, at least.
Subject:
Re: What English letter do most commonly searched for keywords begin with.
From: gomvents-ga on 08 Oct 2004 06:27 PDT |
I have two comments/questions for both of you... silver777-ga, where did you find the information about ETOANIRSHDLCWUMFYGPBVKXQJZ? "You would probably be better off replicating your database across a number of identical machines." We are talking about 40TB! That's too expensive a storage situation, plus I'm not sure if Linux can handle that much on one disk (it's RAID 5 and will look like one physical disk). "Most searches will probably consist of two or more words" A search for "Dog food" in my model would just go to the D server, whereas "Food for dogs" would go to the F server. I don't need perfect distribution, but I would like a good idea of it; for example, if words and phrases that begin with X, Y, or Z are very rare, I'd like to group them on one server, etc. Thanks!
Subject:
Re: What English letter do most commonly searched for keywords begin with.
From: frde-ga on 08 Oct 2004 07:47 PDT |
Like most people specifying a system, you lied. The underlying problem is distributing 40TB of data over a number of servers, i.e. splitting a large amount of data over separate machines.
Subject:
Re: What English letter do most commonly searched for keywords begin with.
From: gomvents-ga on 08 Oct 2004 10:11 PDT |
There is no lie... picture 16 to 20 servers. Some will have keywords beginning with one letter, and some will have keywords beginning with several letters; for example, a, b, or c vs. just words that begin with s. This should make sense now...
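The routing described here can be sketched as a small lookup table from first letter to server. This is only a minimal illustration: the names `EXAMPLE_GROUPS`, `build_routing_table`, and `route_query` are made up for the example, and the letter groupings are placeholders rather than anything measured from real search traffic.

```python
# Hypothetical letter groups, one string per server. The groupings are
# illustrative only; they are not based on measured search data.
EXAMPLE_GROUPS = ["abcd", "efgh", "ijklmnopqr", "s", "tuvw", "xyz"]


def build_routing_table(groups):
    """Map each letter to the index of the server group that holds it."""
    table = {}
    for server_id, letters in enumerate(groups):
        for letter in letters:
            table[letter] = server_id
    return table


def route_query(query, table, default_server=0):
    """Pick a server from the first character of the query.

    Queries that are empty, or that start with a character not in the
    table (digits, punctuation), fall back to default_server.
    """
    normalized = query.strip().lower()
    if not normalized:
        return default_server
    return table.get(normalized[0], default_server)


table = build_routing_table(EXAMPLE_GROUPS)
print(route_query("Dog food", table))       # server holding "d"
print(route_query("Food for dogs", table))  # server holding "f"
```

Because the table is plain data, regrouping letters later only means changing the group strings and moving the affected data.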
Subject:
Re: What English letter do most commonly searched for keywords begin with.
From: silver777-ga on 09 Oct 2004 07:26 PDT |
Hi again Gomvents, I memorised the list when I was in high school. I'm unsure as to its origin, but I believe it to be the correct sequence of the English alphabet according to the frequency of use of letters. I just thought it might help. That differs from what you probably need though. It is not a list of "first letters in a word most commonly used". Beyond that, the rest is rocket science to me. So, on the lower end of the scale of simplifying things, what if you just use a word count? If there are, say, 3 times as many words starting with "C" as there are starting with "D", could you apportion your search engines accordingly?
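The word-count idea can be sketched as a greedy assignment: take the heaviest letter first and give it to whichever server currently has the least load. The counts below are invented purely for illustration (with "c" at 3 times "d", as in the example above), and `assign_letters` is a hypothetical helper, not part of any existing system.

```python
import heapq


def assign_letters(counts, n_servers):
    """Greedily assign letters to servers, heaviest letter first.

    counts maps a letter to its estimated load (for example a word count).
    Returns a list of (total_load, letters) tuples, one per server.
    """
    # Min-heap of (current_load, server_id) so the lightest server pops first.
    heap = [(0, server_id) for server_id in range(n_servers)]
    heapq.heapify(heap)
    letters_on = [[] for _ in range(n_servers)]
    loads = [0] * n_servers

    # Sort by descending count, breaking ties alphabetically for determinism.
    for letter, count in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        load, server_id = heapq.heappop(heap)
        letters_on[server_id].append(letter)
        loads[server_id] = load + count
        heapq.heappush(heap, (loads[server_id], server_id))

    return list(zip(loads, letters_on))


# Made-up counts for illustration only: "c" has 3x the words of "d".
example_counts = {"c": 30, "d": 10, "x": 1, "y": 1, "z": 1}
for load, letters in assign_letters(example_counts, 2):
    print(load, letters)
```

With these made-up numbers, "c" gets a server to itself and the rarer letters share the other one. The result is only as good as the counts fed in.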
Subject:
Re: What English letter do most commonly searched for keywords begin with.
From: curious_-ga on 14 Oct 2004 21:38 PDT |
Why don't you just make a reasonable estimate of the distribution of letters to start off, and simply rearrange your partitioning of letters to servers as time goes on if your initial assumptions turn out to be false? From a design standpoint, it would make a lot more sense to design the system flexibly in the first place rather than putting a lot of stock in the notion that you will be able to obtain an accurate first-letter distribution and that such a distribution will remain constant over time (considering that a large percentage of web searches are for Britney Spears, that biases the B heavily... but what will be the name of the next pop star? You get the point). If you insist on dividing the work up without any experimental evidence, my guess is that you would obtain a more even split by hashing the entire search term using a hashing function that takes into consideration more than just the first letter. Look into Zipf's law... he did some groundbreaking work on word frequency which you may find interesting. However, I think a flexible design with the ability to rearrange the data storage over time will be the most efficient and accurate approach to solving the problem you describe.
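The hashing suggestion could look roughly like the sketch below. It assumes a fixed number of servers and uses SHA-256 only as a convenient stable hash; the function name `server_for_term` and the normalization rules (lower-casing, collapsing whitespace) are illustrative choices, not something specified in this thread.

```python
import hashlib


def server_for_term(term, n_servers):
    """Choose a server by hashing the whole normalized search term."""
    # Normalize so "Dog  Food" and "dog food" land on the same server.
    normalized = " ".join(term.lower().split())
    digest = hashlib.sha256(normalized.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an integer, then reduce
    # it to a server index.
    return int.from_bytes(digest[:8], "big") % n_servers


for term in ["dog food", "food for dogs", "britney spears"]:
    print(term, "->", server_for_term(term, 16))
```

Hashing the full term spreads popular phrases across servers regardless of their first letter, at the cost of the letter-to-server mapping no longer being human-readable. Note that with a plain modulo, changing the number of servers moves most terms to a different server.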