Google Answers: Science of Information Retrieval

Q: Science of Information Retrieval ( Answered, 0 Comments )

Question

Subject: Science of Information Retrieval
Category: Science > Technology
Asked by: bill_6677-ga
List Price: $60.00

Posted: 27 Nov 2002 13:41 PST
Expires: 27 Dec 2002 13:41 PST
Question ID: 115634

What are the current top ten grand challenges in the science/technology
of information retrieval from massive data resources? Issues could be
related to library science and digital libraries.

Answer

Subject: Re: Science of Information Retrieval
Answered By: websearcher-ga on 27 Nov 2002 14:27 PST

Hi bill_6677:

What a fascinating question. I have worked in the information
retrieval (IR) field for over 7 years and never thought to list its
Top 10 challenges. Given my background, I tried to remain objective
and looked for the best published lists on the web.

The absolute best list I was able to find was at:

What Do People Want from Information Retrieval? (The Top 10 Research
Issues for Companies that Use and Sell IR Systems)
Author:
W. Bruce Croft
Center for Intelligent Information Retrieval
Computer Science Department
University of Massachusetts, Amherst
URL: http://www.dlib.org/dlib/november95/11croft.html

Although this list was compiled in 1995, I read through it and the
items are still relevant (and "grand") today. Progress has been made
in some of the areas listed, but none of them has been solved.

The ten challenges Dr. Croft identifies are given below, each with
the opening sentences of his explanation. You can read further detail
at the URL above.

1. Integrated Solutions: "The most important problem from the point
of view of companies using and selling text-based systems is
integration with other systems."

2. Distributed IR: "With the advent of the World-Wide Web and the
huge increase in the use of the Internet, there has been a
corresponding increase in demand for text retrieval systems that can
work in distributed, wide-area network environments."

3. Efficient, Flexible Indexing and Retrieval: "One of the most
frequently mentioned, and most highly rated, issues is efficiency.
Many different aspects of a system can have an impact on efficiency,
and metrics such as query response time and indexing speed are major
concerns of virtually every company involved with text-based systems."

4. "Magic": "One of the major causes of failures in IR systems is
vocabulary mismatch. This means that the information need is often
described using different words than are found in relevant documents.
Techniques that address this problem by automatic expansion of the
query are often regarded as a form of "magic" by users and are viewed
as highly desirable."

5. Interfaces and Browsing: "Effective interfaces for text-based
information systems are a high priority for users of these systems.
The interface is a major part of how a system is evaluated, and as the
retrieval and routing algorithms become more complex to improve recall
and precision, more stress is placed on the design of interfaces that
make the system easy to use and understandable."

6. Routing and Filtering: "Information routing, filtering and
clipping are all synonyms used to describe the process of identifying
relevant documents in streams of information such as news feeds."

7. Effective Retrieval: "The development of effective retrieval
techniques has been the core of IR research for more than 30 years."

8. Multimedia Retrieval: "Multimedia indexing and retrieval refers to
techniques being developed to access image, video and sound databases
without text descriptions."

9. Information Extraction: "Information extraction techniques,
primarily developed in the context of the Advanced Research Projects
Agency (ARPA) Message Understanding Conferences (MUCs), are designed
to identify database entities, attributes and relationships in full
text."

10. Relevance Feedback: "Relevance feedback is a process where users
identify relevant documents in an initial list of retrieved documents,
and the system then creates a new query based on those sample relevant
documents."
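
To make item 10 concrete, here is a minimal sketch of one classic way
to build the "new query": a Rocchio-style update. This technique is my
choice for illustration and is not named in Dr. Croft's text. The
function names and the alpha/beta weights are assumptions, and the
tokenizer is a plain whitespace split.

```python
from collections import Counter


def term_vector(text):
    """Bag-of-words term counts for lowercased, whitespace-split text."""
    return Counter(text.lower().split())


def rocchio(query, relevant_docs, alpha=1.0, beta=0.75):
    """Return new query weights: alpha * query + beta * centroid(relevant docs).

    alpha and beta are illustrative defaults (assumptions), not tuned values.
    """
    new_query = Counter()
    # Keep the user's original terms, scaled by alpha.
    for term, weight in term_vector(query).items():
        new_query[term] += alpha * weight
    # Add the average term weights of the documents the user marked relevant.
    for doc in relevant_docs:
        for term, weight in term_vector(doc).items():
            new_query[term] += beta * weight / len(relevant_docs)
    return dict(new_query)


if __name__ == "__main__":
    marked_relevant = ["jaguar cat speed", "cat habitat"]
    print(rocchio("jaguar speed", marked_relevant))
```

Terms that appear only in the marked documents ("cat", "habitat" in
the example) enter the new query with a lower weight than the terms
the user typed, so the original intent still dominates.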

This set of issues is still cited repeatedly in the IR literature,
right up to the present day, as the definitive list.
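
The vocabulary mismatch problem in item 4 ("Magic") can be shown with
a small sketch of automatic query expansion. The thesaurus below is a
hypothetical hand-built table and the function names are my own; a
real system would more likely derive related terms from a corpus or a
curated resource.

```python
# Hypothetical hand-built thesaurus, for illustration only.
SYNONYMS = {
    "car": ["automobile", "vehicle"],
    "film": ["movie"],
}


def expand_query(query, synonyms=SYNONYMS):
    """Append related terms for each query word, skipping duplicates."""
    terms = query.lower().split()
    expanded = list(terms)
    for term in terms:
        for related in synonyms.get(term, []):
            if related not in expanded:
                expanded.append(related)
    return expanded


def matches(query_terms, document):
    """True if the document shares at least one term with the query."""
    return bool(set(query_terms) & set(document.lower().split()))


if __name__ == "__main__":
    doc = "used automobile listings"
    print(matches("cheap car".split(), doc))         # no shared word
    print(matches(expand_query("cheap car"), doc))   # "automobile" now matches
```

The document is relevant to "cheap car" but shares no word with it;
only the expanded query retrieves it, which is the effect users
describe as "magic".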

Another set of grand challenges was set out at:

2002 ASEE Annual Conference, Montreal, Canada - Engineering Libraries
Division -
Technical Session 1641 - "Emerging Information Technologies"
Author:
Bill Mischo,
Engineering Librarian,
University of Illinois at Urbana-Champaign
URL: http://www.englib.cornell.edu/eld/conf/02/emerg_technologies.html

where he stated the following challenges:

* standard retrieval environment (Web) and interface/client (Web
Browser),
* standardized search/retrieval mechanisms (HTTP Post/Get, SQL,
Z39.50),
* standard language for describing and transforming content and
metadata (XML, XSLT, DC, DCQ, RDF, Schemas),
* standard transport mechanisms to connect heterogeneous content
(HTTP, SOAP, OAI).

While this list is much more "technical" in nature, it does add some
valuable insight to the debate.

I hope that this information has been of help to you.

If you need any clarification of the information I have provided, or
you need me to drill down in any specific area, please ask using the
Clarification feature and provide me with additional details as to
what you are looking for. As well, please allow me to provide you with
clarification(s) *before* you rate this answer.

Thank you.

websearcher-ga

Search Strategy:

"top 10" challenges "information retrieval"
http://www.google.com/search?hl=en&lr=&ie=ISO-8859-1&safe=off&q=%22top+10%22+challenges+%22information+retrieval%22

"grand challenges of information retrieval"
http://www.google.com/search?hl=en&lr=&ie=ISO-8859-1&safe=off&q=%22grand+challenges+of+information+retrieval%22

Request for Answer Clarification by bill_6677-ga on 03 Dec 2002 05:50 PST

The Mischo ref was not really that helpful, but
the Bruce Croft reference is great!!  Thank you.  But it is
a 1995 ref, and I can't help but wonder about more recent 
things ... especially in two areas: 
- "Distributed IR" esp. the use of multi-agent systems to
  locate and manage the searching of multiple disparate
  databases.  According to Croft in '95, "Research addressing
  these issues has begun to appear".  What is the best current
  multi-agent IR system(s)?  and
- "Interfaces and Browsing".  Croft refers to "much
  more work on interfaces will be appearing".  I'm interested
  in more current challenges re how the presentation of retrieved
  information affects the search process and the search performance.
  Has anyone done any recent work on the "I'll know it when I
  see it" effect?  Not that I'm expecting the search process
  to be yielding "bingo - the answer".  It's not a needle in
  a haystack.  It's more about revealing a story with characters 
  and plot lines. It's more like ...  "I have confidence that I 
  have seen enough, and have a complete enough picture of the 
  situation".  So how about interfaces supporting "confidence",
  and "situations" rather than "answers"?   Or, just more recent
  challenges re interfaces for IR.  :-)
Thks!!

Clarification of Answer by websearcher-ga on 03 Dec 2002 11:31 PST

Hi bill_6677:

Thanks for the clarification request. I'm glad you found the Croft
paper useful. :-)

As for the two additional areas you mention, it is difficult to define
the "best" approach to anything - but I will supply you with links to
some of the most recent research.

Multi-Agent IR systems
**********************

An Architectural Design for Multi-Agent Information Trading
P. McDermott and C O'Riordan (National University of Ireland, Galway)
http://www.it.nuigalway.ie/TR/rep02/NUIG-IT-101002.pdf

Eighteenth National Conference on Artificial Intelligence - Workshop
Program
http://www.aaai.org/Workshops/2002/wsparticipation-02.pdf

Research summary: Communication-sensitive decision making in
multi-agent, real-time environments
Marie desJardins, Karen Myers, David Morley, and Michael Wolverton
http://www.cs.umbc.edu/~mariedj/papers/ucav-ss01.ps


Interfaces and Browsing
***********************

Improving Display of Search Results in Information Retrieval Systems –
Users’ Study
Offer Drori (Hebrew University of Jerusalem)
http://shum.huji.ac.il/~offerd/papers/drori072001.pdf

User Interface Design
http://www.clis2.umd.edu/dlrg/filter/papers/speech/node2.html


Additional Search Strategy (on Google):

"Distributed IR" "multi-agent" 2001 OR 2002
"Distributed IR" "multi-agent" OR "multi agent"
"Information Retrieval" "multi-agent" 2001 OR 2002
confidence "Information Retrieval" interface OR interfaces
"confidence level" "Information Retrieval" interface OR interfaces
"user confidence" "Information Retrieval" interface OR interfaces
