Hi bill_6677:
What a fascinating question. I have worked in the IR field for over 7
years and never thought to list its Top 10 challenges. Given my
background, I tried to remain objective and looked for the best
lists available on the web.
The absolute best list I was able to find was at:
What Do People Want from Information Retrieval? (The Top 10 Research
Issues for Companies that Use and Sell IR Systems)
Author:
W. Bruce Croft
Center for Intelligent Information Retrieval
Computer Science Department
University of Massachusetts, Amherst
URL: http://www.dlib.org/dlib/november95/11croft.html
Although this list was compiled in 1995, I read through it and the
items are still relevant (and "grand") today. Progress has been made
in some of the areas listed, but none of them has been solved by any
stretch of the imagination.
The ten challenges Dr. Croft lists, along with the opening sentences
of each explanation, appear below. You can read further detail at the
URL above.
1. Integrated Solutions: "The most important problem from the point
of view of companies using and selling text-based systems is
integration with other systems."
2. Distributed IR: "With the advent of the World-Wide Web and the
huge increase in the use of the Internet, there has been a
corresponding increase in demand for text retrieval systems that can
work in distributed, wide-area network environments."
3. Efficient, Flexible Indexing and Retrieval: "One of the most
frequently mentioned, and most highly rated, issues is efficiency.
Many different aspects of a system can have an impact on efficiency,
and metrics such as query response time and indexing speed are major
concerns of virtually every company involved with text-based systems."
4. "Magic": "One of the major causes of failures in IR systems is
vocabulary mismatch. This means that the information need is often
described using different words than are found in relevant documents.
Techniques that address this problem by automatic expansion of the
query are often regarded as a form of "magic" by users and are viewed
as highly desirable."
5. Interfaces and Browsing: "Effective interfaces for text-based
information systems are a high priority for users of these systems.
The interface is a major part of how a system is evaluated, and as the
retrieval and routing algorithms become more complex to improve recall
and precision, more stress is placed on the design of interfaces that
make the system easy to use and understandable."
6. Routing and Filtering: "Information routing, filtering and
clipping are all synonyms used to describe the process of identifying
relevant documents in streams of information such as news feeds."
7. Effective Retrieval: "The development of effective retrieval
techniques has been the core of IR research for more than 30 years."
8. Multimedia Retrieval: "Multimedia indexing and retrieval refers to
techniques being developed to access image, video and sound databases
without text descriptions."
9. Information Extraction: "Information extraction techniques,
primarily developed in the context of the Advanced Research Projects
Agency (ARPA) Message Understanding Conferences (MUCs), are designed
to identify database entities, attributes and relationships in full
text."
10. Relevance Feedback: "Relevance feedback is a process where users
identify relevant documents in an initial list of retrieved documents,
and the system then creates a new query based on those sample relevant
documents."
This set of issues is referred to over and over again in the IR
literature - right up to the present day - as the ultimate list.
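To make items 4 ("Magic") and 10 (Relevance Feedback) a little more
concrete, here is a minimal sketch of one well-known way to build a
new query from documents a user has marked relevant: a simplified
Rocchio-style term reweighting. This is my own illustration, not
something taken from Dr. Croft's article. The function names, the
alpha/beta values, and the whitespace tokenizer are all assumptions
chosen for brevity, and there is no stopword removal or stemming.

```python
from collections import Counter


def term_vector(text):
    """Bag-of-words term frequencies for a lowercased, whitespace-split text."""
    return Counter(text.lower().split())


def rocchio_expand(query, relevant_docs, alpha=1.0, beta=0.75, top_k=3):
    """Return the query terms plus the top_k highest-weighted new terms.

    alpha weights the original query; beta weights the average term
    frequencies of the documents the user marked relevant. Both values
    are illustrative assumptions, not tuned settings.
    """
    q_vec = term_vector(query)
    weights = Counter()
    for term, tf in q_vec.items():
        weights[term] += alpha * tf
    for doc in relevant_docs:
        for term, tf in term_vector(doc).items():
            # Average each term's frequency over the relevant documents.
            weights[term] += beta * tf / len(relevant_docs)
    # Candidate expansion terms are those not already in the query.
    new_terms = [t for t in weights if t not in q_vec]
    # Highest weight first; ties broken alphabetically so output is stable.
    new_terms.sort(key=lambda t: (-weights[t], t))
    return list(q_vec) + new_terms[:top_k]
```

For example, if the query is "car repair" and the user marks
"automobile repair manual" and "automobile engine repair" as relevant,
calling rocchio_expand with top_k=2 returns
['car', 'repair', 'automobile', 'engine']. The added term "automobile"
is exactly the kind of vocabulary mismatch item 4 describes: the user
said "car" but the relevant documents say "automobile".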
Another set of grand challenges was set out at:
2002 ASEE Annual Conference, Montreal, Canada - Engineering Libraries
Division -
Technical Session 1641 - "Emerging Information Technologies"
Author:
Bill Mischo,
Engineering Librarian,
University of Illinois at Urbana-Champaign
URL: http://www.englib.cornell.edu/eld/conf/02/emerg_technologies.html
where he stated the following challenges:
* standard retrieval environment (Web) and interface/client (Web
Browser),
* standardized search/retrieval mechanisms (HTTP Post/Get, SQL,
Z39.50),
* standard language for describing and transforming content and
metadata (XML, XSLT, DC, DCQ, RDF, Schemas),
* standard transport mechanisms to connect heterogeneous content
(HTTP, SOAP, OAI).
While this list is much more "technical" in nature, it does add some
valuable insight to the debate.
I hope that this information has been of help to you.
If you need any clarification of the information I have provided, or
you need me to drill down in any specific area, please ask using the
Clarification feature and provide me with additional details as to
what you are looking for. Also, please allow me to provide you with
clarification(s) *before* you rate this answer.
Thank you.
websearcher-ga
Search Strategy:
"top 10" challenges "information retrieval"
://www.google.com/search?hl=en&lr=&ie=ISO-8859-1&safe=off&q=%22top+10%22+challenges+%22information+retrieval%22
"grand challenges of information retrieval"
://www.google.com/search?hl=en&lr=&ie=ISO-8859-1&safe=off&q=%22grand+challenges+of+information+retrieval%22