Q: Science (Answered, 3 out of 5 stars, 2 Comments)
Question  
Subject: Science
Category: Science > Technology
Asked by: teruca-ga
List Price: $150.00
Posted: 02 Jul 2002 08:05 PDT
Expires: 01 Aug 2002 08:05 PDT
Question ID: 35806
¿Given all the information that is available in internet plus all that
is containes in private memory banks I would like to know if it is
conceivable to image a system, method or device that can become the
nucleus of all that information, translate it into one language and
make it available to any given user having such device?
Answer  
Subject: Re: Science
Answered By: eiffel-ga on 02 Jul 2002 13:55 PDT
Rated:3 out of 5 stars
 
Hi teruca,

The idea of a single global store of knowledge has captivated people
for decades.

Ted Nelson founded Project Xanadu in 1960. He had a vision of a
world-wide public hypertext publishing and document storage system.
For the past 42 years Project Xanadu has worked to implement a network
of deep electronic documents with side-by-side inter-comparison.
Xanadu provides for frictionless re-use of copyrighted material, with
rights management. The Xanadu model envisages a massive global
repository of information with automatic version management, quoting
and annotation. Xanadu documents are deeply-interconnected and
systematically stored:
http://xanadu.com/

I will mention Ted Nelson's dream again later in this answer.

It is not currently possible to implement all of the things that you
mention in your question, although the technology does exist to do
some of those things - and some of them are being done quite well
already.

It is possible to index and store much of the information available on
the internet, and also some privately-held information. It is possible
to perform a rough automated translation of that information into
other languages. It is possible to provide access to that information
across networks such as the internet.

But it is not possible to store all of the information on a single
compact device. It is not possible to provide automatic translations
that are anywhere near as good as translations performed by a human.
And most privately-held information is likely to remain private.

I'll break your question down into three parts. If I've missed
anything, or not addressed a part of your question in the way that you
intended, please post a request for clarification.

PART ONE: Is it conceivable to imagine a system that can become the
nucleus of all the information that is available on the internet, and
all information in private databases?

According to the Online Computer Library Center there are 8.4 million
unique websites, a growth of 18% since last year. Of these, about
three million are public, two million are private, and a further three
million are "provisional" or "under construction":
http://wcp.oclc.org/ (click on "Size and Growth")

It is quite practical to access (or "crawl") and index these public
websites, and web search engines do just that.
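
To make "crawl and index" a little more concrete, here is a minimal
sketch of an inverted index, the basic data structure behind keyword
search. It is an illustration only and not how any particular search
engine works: the sample pages, URLs and function names are invented,
and a real crawler would fetch pages over HTTP rather than read them
from a dictionary.

```python
from collections import defaultdict

def build_index(pages):
    """Map each lowercase word to the set of page URLs containing it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index

def search(index, query):
    """Return the URLs that contain every word of the query (AND search)."""
    words = query.lower().split()
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results

# Hypothetical sample pages standing in for crawled documents.
PAGES = {
    "http://example.com/a": "global store of knowledge",
    "http://example.com/b": "hypertext store and document repository",
}
INDEX = build_index(PAGES)
```

With this sketch, search(INDEX, "global store") returns only the first
page, because it is the only one containing both words.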

According to Inktomi, their index comprises 500 million web pages:
The Inktomi Difference
http://www.inktomi.com/products/web_search/difference.html

In addition to HTML web pages, Google indexes a range of other
document types including PDF documents, Microsoft Office document
formats and Corel document formats. According to the Google home page,
Google indexes and searches "2,073,418,204 web pages":
://www.google.com/

The Google index also includes over 700 million Usenet newsgroup
postings and 330 million images.
"Google Offers Immediate Access to 3 Billion Web Documents"
://www.google.com/press/pressrel/3billion.html

To access and index such a large number of documents, and to allow the
index to be queried, obviously requires massive communications
bandwidth and computing power. For example, Google use over 6000
RedHat Linux servers:
Interview with Google's Sergey Brin
http://www.linuxgazette.com/issue59/correa.html

It's not practical to access and re-index every document every day. So
search engines crawl the web periodically - typically every month or
so. Sometimes it is possible to determine which documents are likely
to change frequently, in which case those documents can be indexed
more often. According to the link referenced above, Google refreshes
"millions of web pages every day".

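One simple way to revisit frequently-changing documents more often is
to adapt each page's revisit interval to whether it changed since the
last crawl. The sketch below is an assumption for illustration, not a
description of Google's or any other engine's scheduler: the halving
and doubling rule, the 1-day and 30-day bounds, and the function names
are all invented.

```python
import hashlib

MIN_DAYS = 1    # assumed lower bound on the revisit interval
MAX_DAYS = 30   # roughly the "every month or so" default crawl cycle

def fingerprint(content):
    """Return a hash of the page content, used to detect changes."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()

def next_interval(current_days, old_fingerprint, new_content):
    """Halve the interval if the page changed, otherwise double it."""
    changed = fingerprint(new_content) != old_fingerprint
    days = current_days / 2 if changed else current_days * 2
    # Keep the result within the assumed bounds.
    return max(MIN_DAYS, min(MAX_DAYS, days))
```

A page that keeps changing drifts towards a daily visit, while a page
that never changes settles at the monthly default.
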
In your question, you mention information held in private databases.
It is hard to envisage a legislative or technological change that
would result in private databases being made centrally available, so
it seems inevitable that much of the world's private data will remain
private.

Some private data is marketable, and this could be made centrally
available if a suitable charging structure is in place. This already
happens with commercial data (such as the stock exchange prices that
can be obtained from Google), and with subscription-based access to
private data sold by content aggregators.

In addition to indexing the web, search engines can also store copies
of it. For example, Google caches copies of the documents that it
indexes, and can serve those copies to anyone who requests them. For
documents in formats other than HTML, an HTML version of the document
can be served if the user requests it.

So, the answer to part one of your question is that state-of-the-art
search engines can already be thought of as a nucleus for much of the
information that is available on the internet, and even for some
information from private databases.

PART TWO: Is it conceivable to imagine translating this information
from one language to another?

Certainly, the answer is "yes", although translation technology is
fairly rudimentary at present. Machine translation can give a rough
idea of the general content of a document, but it is rarely good
enough to make the translated document fully usable.

I'm guessing from your question that you may be fluent in Spanish, and
I have translated your question into Spanish to demonstrate the
limitations of automatic language translation. There are many online
translation services, and I used Free Translation for this example:
http://www.freetranslation.com/

Here's how this service translated your question. You will probably
agree that something has been lost in the translation:

   ¿Dada toda la información que está disponible en
   el internet más todo que es contiene en los bancos
   privados de la memoria que apreciaría saber si es
   concebible a la imagen un sistema, el método o el
   artefacto que pueden llegan a ser el núcleo de toda
   esa información, lo traducen en un idioma y lo
   hace disponible a algún usuario dado que tiene
   tal artefacto? 

Automated machine translation is already provided from within the
search results of search engines such as AltaVista and Google.

So, the answer to part two of your question is that it is quite
feasible to translate the information from one language to another -
but current technology doesn't do the job very well.

PART THREE: Is it conceivable to imagine a method or device that can
make all the information available to any given user having such
device?

It would be uneconomical for every person to have a device that holds
all of the documents. If we assume that each of the three billion
documents mentioned above averages 10 kilobytes (kB), we would need
around 30 terabytes (TB) of storage on each device. At current prices
of about $1 per gigabyte (GB), the storage would cost $30,000 per
device. Also, the device would fill a small room!
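
The storage estimate above can be checked with a few lines of
arithmetic. The figures are the ones quoted in the paragraph (three
billion documents, an assumed 10 kB average, about $1 per GB), using
decimal units; the variable names are just for illustration.

```python
DOCUMENTS = 3_000_000_000    # "three billion documents" from the text
AVG_DOC_BYTES = 10 * 1000    # assumed average of 10 kB per document
PRICE_PER_GB = 1.00          # about $1 per gigabyte, as quoted above

total_bytes = DOCUMENTS * AVG_DOC_BYTES
total_tb = total_bytes / 1000**4   # terabytes (decimal)
total_gb = total_bytes / 1000**3   # gigabytes (decimal)
cost = total_gb * PRICE_PER_GB

print(f"{total_tb:.0f} TB, ${cost:,.0f} per device")
# prints: 30 TB, $30,000 per device
```

The estimate scales linearly: doubling either the document count or
the average document size doubles both the storage and the cost.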

Luckily, we don't need to replicate all of the documents on every
device; we just need to make all of the documents accessible from
every device. We already have the technology to do this, by using an
internet-connected computer or even an internet-enabled mobile phone.

So, the answer to part three of the question is that once we have the
information, we are certainly able to make it available to any user.

To sum up: state-of-the-art search engines can already:

- index much of the information on the internet

- store copies of that information

- translate that information into other languages

- make that information available to anyone who
  has a computer and internet access

Search engines already provide much of the functionality that you ask
about in your question. The remaining problems are how to increase the
amount of information that can be gathered, and how to improve the
quality of translation.

Earlier in this answer, I mentioned Ted Nelson's Project Xanadu. The
story of Ted's long struggle to implement his dream is told here:
Gary Wolf, "The Curse of Xanadu". Wired Magazine (June 1995)
http://www.wired.com/wired/archive/3.06/xanadu.html

Gary Wolf's article is fairly harsh, almost chastising Nelson for
having a dream so ambitious and all-encompassing that it will probably
never be implemented. In a posting to the C2 wiki site, Peter Merel
wrote:

"Xanadu was a good idea, but if you can't adapt your ideas to
circumstances, you can't get them to go any place no matter how good
they are."
Peter Merel, "The Curse of Xanadu" (comment). Online posting to:
http://c2.com/cgi/wiki?TheCurseOfXanadu

Peter Merel's comment seems to sum up the current situation. For the
foreseeable future, most of the world's online information will be
held on the world-wide-web, and any successful global repository of
knowledge must be built on top of what the web already provides, and
take account of the way the web already works.


Additional links:

Search Engine Watch
http://searchenginewatch.com/

C2 Wiki site (discussions by programmers about software)
http://c2.com/cgi/wiki?FindPage


Google search strategy:

"how big is the www"
://www.google.com/search?q=%22how+big+is+the+www%22

+www "million pages"
://www.google.com/search?q=%2Bwww+%22million+pages%22

"translation service"
://www.google.com/search?q=%22translation+service%22

"project xanadu"
://www.google.com/search?q=%22project+xanadu%22


I hope you find this information useful. If I have missed any
information that you were seeking, please ask for clarification.

Regards,
eiffel-ga

Request for Answer Clarification by teruca-ga on 04 Jul 2002 18:00 PDT
Hi eiffel-ga, thanks for your answer. The first part was very good for
me. The second part, about translation, is not what I had in mind,
simply because it was my mistake not to explain which type of
translation I meant. I am writing a novel where the main character
comes by the information or knowledge to build what I call a
"translator". This device will enable him not to search the web or
hack into private banks of memory for specific questions, but to be
plugged in, or "know" all that is available instantly. In the same way
that radio waves are present in the air and by using a radio you can
tune in, I imagined that, barring legal barriers and "languages" (I am
thinking of different computer languages), a direct connection of the
operator to this device, and of the device to computerized knowledge,
can make the operator KNOW the answer to his query. Is this too crazy?

Clarification of Answer by eiffel-ga on 05 Jul 2002 10:42 PDT
Hi teruca,

In your request for clarification, you say that you do not want a user
of your system to have to search the web or hack into private banks of
memory for specific questions. Instead, you want the user to be
plugged in and to "know" all that is available instantly.

Unfortunately, this is still in the realms of science fiction.

However, there is one real-world project that is working towards this
goal. The "Cyc" project aims to capture not information but knowledge
- as a universal database of all human knowledge, including both
"common sense" and "facts".

David Whitten maintains a "Frequently asked questions" page about the
Cyc project:
http://www.robotwisdom.com/ai/cycfaq.html

Cycorp has some commercial products based on Cyc, but they are in very
early stages of development. One is the knowledge base system itself:

The Cyc Knowledge Server:
http://www.cyc.com/products2.html

The other product is a question-and-answer system built on top of the
knowledge server:

CycAnswers:
https://answers.google.com/answers/main?cmd=threadview&id=35806#a

These products should give you a solid basis for incorporating such
technology into your novel.

Regards,
eiffel-ga

Clarification of Answer by eiffel-ga on 05 Jul 2002 14:37 PDT
The correct link for CycAnswers is:
http://www.cyc.com/products-cycanswers.html

teruca-ga rated this answer: 3 out of 5 stars

Comments  
Subject: Re: Science
From: lot-ga on 02 Jul 2002 16:12 PDT
 
A great answer by eiffel-ga.
Although it may become technically possible at some point in the
future, it won't happen in practice, because privacy laws, patents and
secrets will prevent that centralised source from being born. As
eiffel-ga has pointed out, some information will remain private.
However, I guess it could all be linked and protected for authorised
access only.

Besides, it would never happen: I have some info stashed away on some
CDs ;)

Subject: Re: Science
From: sahaja108-ga on 05 Jul 2002 21:41 PDT
 
If you obtain your self-realization, your yoga (literally 'union with
the Divine'), you then have access to the Divine knowledge, i.e. if
you meditate on a topic, the answer will be revealed (in the Divine's
time, of course). This is, in essence, what was practised by the
ancient yogis of India.
A modern equivalent is Sahaja Yoga:
It is known as 'vibrational awareness'. I use it all the time!
P.S. This 'self-realization' is free for the asking - contact your
local Sahaja Yoga group through:
http://www.sahajayoga.org
