Hi teruca,
The idea of a single global store of knowledge has captivated people's
imagination for decades.
Ted Nelson founded Project Xanadu in 1960. He had a vision of a
world-wide public hypertext publishing and document storage system.
For the past 42 years Project Xanadu has worked to implement a network
of deep electronic documents with side-by-side inter-comparison.
Xanadu provides for frictionless re-use of copyrighted material, with
rights management. The Xanadu model envisages a massive global
repository of information with automatic version management, quoting
and annotation. Xanadu documents are deeply-interconnected and
systematically stored:
http://xanadu.com/
I will mention Ted Nelson's dream again later in this answer.
It is not currently possible to implement all of the things that you
mention in your question, although the technology does exist to do
some of those things - and some of them are being done quite well
already.
It is possible to index and store much of the information available on
the internet, and also some privately-held information. It is possible
to perform a rough automated translation of that information into
other languages. It is possible to provide access to that information
across networks such as the internet.
But it is not possible to store all of the information on a single
compact device. It is not possible to provide automatic translations
that are anywhere near as good as translations performed by a human.
And most privately-held information is likely to remain private.
I'll break your question down into three parts. If I've missed
anything, or not addressed a part of your question in the way that you
intended, please post a request for clarification.
PART ONE: Is it conceivable to imagine a system that can become the
nucleus of all the information that is available on the internet, and
all information in private databases?
According to the Online Computer Library Center, there are 8.4 million
unique websites, a growth of 18% since last year. Of these, three
million are public, two million are private, and a further three
million are "provisional" or "under construction":
http://wcp.oclc.org/ (click on "Size and Growth")
It is quite practical to access (or "crawl") and index these public
websites, and web search engines do just that.
According to Inktomi, their index comprises 500 million web pages:
The Inktomi Difference
http://www.inktomi.com/products/web_search/difference.html
In addition to HTML web pages, Google indexes a range of other
document types, including PDF files and Microsoft Office and Corel
document formats. According to the Google home page,
Google indexes and searches "2,073,418,204 web pages":
http://www.google.com/
The Google index also includes over 700 million Usenet newsgroup
postings and 330 million images.
"Google Offers Immediate Access to 3 Billion Web Documents"
http://www.google.com/press/pressrel/3billion.html
To access and index such a large number of documents, and to allow the
index to be queried, obviously requires massive communications
bandwidth and computing power. For example, Google uses over 6,000
Red Hat Linux servers:
Interview with Google's Sergey Brin
http://www.linuxgazette.com/issue59/correa.html
It's not practical to access and re-index every document every day. So
search engines crawl the web periodically - typically every month or
so. Sometimes it is possible to determine which documents are likely
to change frequently, in which case those documents can be indexed
more often. According to the link referenced above, Google refreshes
"millions of web pages every day".
In your question, you mention information held in private databases.
It is hard to envisage a legislative or technological change that
would result in private databases being made centrally available, so
it seems inevitable that much of the world's private data will remain
private.
Some private data is marketable, and this could be made centrally
available if a suitable charging structure is in place. This already
happens with commercial data (such as the stock exchange prices that
can be obtained from Google), and with subscription-based access to
private data sold by content aggregators.
In addition to indexing the web, search engines can also store copies
of it. For example, Google caches copies of the documents that it
indexes, and can serve those copies to anyone who requests them. For
documents in formats other than HTML, an HTML version of the document
can be served if the user requests it.
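Again as a sketch only (not Google's actual mechanism), the caching idea
amounts to keeping the raw bytes of every page the crawler fetches,
keyed by URL, and handing that stored copy back on request:

    import urllib.request

    cache = {}  # URL -> raw bytes of the page as last fetched

    def fetch_and_cache(url):
        """Fetch a page and keep a copy of it."""
        with urllib.request.urlopen(url, timeout=10) as response:
            cache[url] = response.read()
        return cache[url]

    def serve_cached(url):
        """Serve the stored copy, even if the live page has since vanished."""
        return cache.get(url)  # None if the URL was never crawled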
So, the answer to part one of your question is that state-of-the-art
search engines can already be thought of as a nucleus for much of the
information that is available on the internet, and even for some
information from private databases.
PART TWO: Is it conceivable to imagine translating this information
from one language to another?
Certainly, the answer is "yes", although translation technology is
fairly rudimentary at present. Machine translation can give a rough
idea of the general content of a document, but it is rarely good
enough to make the translated document fully usable.
I'm guessing from your question that you may be fluent in Spanish, and
I have translated your question into Spanish to demonstrate the
limitations of automatic language translation. There are many online
translation services, and I used Free Translation for this example:
http://www.freetranslation.com/
Here's how this service translated your question. You will probably
agree that something has been lost in the translation:
¿Dada toda la información que está disponible en
el internet más todo que es contiene en los bancos
privados de la memoria que apreciaría saber si es
concebible a la imagen un sistema, el método o el
artefacto que pueden llegan a ser el núcleo de toda
esa información, lo traducen en un idioma y lo
hace disponible a algún usuario dado que tiene
tal artefacto?
Automated machine translation is already provided from within the
search results of search engines such as AltaVista and Google.
So, the answer to part two of your question is that it is quite
feasible to translate the information from one language to another -
but current technology doesn't do the job very well.
PART THREE: Is it conceivable to imagine a method or device that can
make all the information available to any given user having such
device?
It would be uneconomical for every person to have a device that holds
all of the documents. If we assume that each of the three billion
documents mentioned above averages 10 kilobytes (kB), we would need
around 30 terabytes (TB) of storage on each device. At current prices
of about $1 per gigabyte (GB), the storage would cost $30,000 per
device. Also, the device would fill a small room!
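For what it's worth, here is that back-of-envelope estimate as a few
lines of Python; the three billion documents, the 10 kB average size and
the $1 per GB price are the assumptions stated above, not measured
figures:

    documents = 3_000_000_000     # roughly three billion indexed documents
    avg_size_kb = 10              # assumed average document size, in kilobytes
    price_per_gb = 1.0            # assumed storage price, in dollars per gigabyte

    total_gb = documents * avg_size_kb / 1_000_000   # kilobytes -> gigabytes
    total_tb = total_gb / 1_000                      # gigabytes -> terabytes
    cost = total_gb * price_per_gb

    print(f"about {total_tb:.0f} TB of storage, costing about ${cost:,.0f}")
    # prints: about 30 TB of storage, costing about $30,000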
Luckily, we don't need to replicate all of the documents on every
device; we just need to make all of the documents accessible from
every device. We already have the technology to do this, by using an
internet-connected computer or even an internet-enabled mobile phone.
So, the answer to part three of the question is that once we have the
information, we are certainly able to make it available to any user.
To sum up: state-of-the-art search engines can already:
- index much of the information on the internet
- store copies of that information
- translate that information into other languages
- make that information available to anyone who
has a computer and internet access
Search engines already provide much of the functionality that you ask
about in your question. The remaining problems are how to increase the
amount of information that can be gathered, and how to improve the
quality of translation.
Earlier in this answer, I mentioned Ted Nelson's Project Xanadu. The
story of Ted's long struggle to implement his dream is told here:
Gary Wolf, "The Curse of Xanadu". Wired Magazine (June 1995)
http://www.wired.com/wired/archive/3.06/xanadu.html
Gary Wolf's article is fairly harsh, almost chastising Nelson for
having a dream so ambitious and all-encompassing that it will probably
never be implemented. In a posting to the C2 wiki site, Peter Merel
wrote:
"Xanadu was a good idea, but if you can't adapt your ideas to
circumstances, you can't get them to go any place no matter how good
they are."
Peter Merel, "The Curse of Xanadu" (comment). Online posting to:
http://c2.com/cgi/wiki?TheCurseOfXanadu
Peter Merel's comment seems to sum up the current situation. For the
foreseeable future, most of the world's online information will be
held on the World Wide Web, and any successful global repository of
knowledge must be built on top of what the web already provides, and
take account of the way the web already works.
Additional links:
Search Engine Watch
http://searchenginewatch.com/
C2 Wiki site (discussions by programmers about software)
http://c2.com/cgi/wiki?FindPage
Google search strategy:
"how big is the www"
http://www.google.com/search?q=%22how+big+is+the+www%22
+www "million pages"
http://www.google.com/search?q=%2Bwww+%22million+pages%22
"translation service"
http://www.google.com/search?q=%22translation+service%22
"project xanadu"
http://www.google.com/search?q=%22project+xanadu%22
I hope you find this information useful. If I have missed any
information that you were seeking, please ask for clarification.
Regards,
eiffel-ga |