Dear 6ra3,
Information visualization, or infovis, is a broad multidisciplinary
field that never stops evolving. Although great strides have been made
in the presentation, manipulation, and interpretation of data, there are
few definitive results because the technology of infovis is changing
at too rapid a pace. As hardware improves, bringing larger storage
capacities, higher-resolution screens, and faster graphics processors,
the means of information visualization quickly become obsolete and
must be refreshed. At the same time, the data to be visualized is also
growing in scale due to the increasing reach of data collection devices
and the competitive demands of new marketplaces in which information is
the currency or the source of profit.
With the understanding that information visualization is constantly
undergoing development, the best we can do is to provide a snapshot
of current results. I have listed below a number of research projects
that deal with state-of-the-art visualization techniques for gaining
insight into raw data or for rearranging chaotic visual information into
meaningful structures. Most of the pages to which I link provide further
links to technical papers on their subject. The topics covered below
range from the static representation of mathematical structures to the
dynamic exploration of climatic models. At the end, I'll point you to a
web junction that leads to many more examples of information-visualization
research.
You may also like to read a set of slides written by Tamara Munzner, a
leading academic in the field, as part of her course materials. The slides
are accessible through the second link below, after a paragraph from
Prof. Munzner that I think captures the essence of infovis very nicely.
Computer-based information visualization, or "infovis", centers
around helping people explore or explain data by designing
interactive software that exploits the properties of the human
perceptual system. The central design challenge in infovis is
designing a cognitively useful spatial mapping of a dataset
that is not inherently spatial. There are many possible visual
encodings, only a fraction of which are helpful for a given
task. It draws on the intellectual history of several traditions,
including computer graphics, human-computer interaction, cognitive
psychology, semiotics, graphic design, statistical graphics,
cartography, and art. The synthesis of relevant ideas from these
fields with new methodologies and techniques made possible by
interactive computation is critical for helping people keep
pace with the torrents of data confronting them. One of the few
resources increasing faster than the speed of computer hardware
is the amount of data to be processed.
University of British Columbia: Tamara Munzner: CPSC 533C Overview
http://www.cs.ubc.ca/~tmm/courses/cpsc533c-03-spr/
University of British Columbia: Tamara Munzner: Information Visualization
Introduction [slides]
http://www.cs.ubc.ca/~tmm/courses/cpsc533c-03-spr/0108.intro/index.html
One of Prof. Munzner's research projects concerns the representation
of the massive graphs that result from network analysis. A graph with
hundreds of thousands of nodes is not easily drawn, and when drawn,
not easily assessed.
Many real-world domains can be represented as large node-link
graphs: backbone Internet routers connect with 70,000 other
hosts, mid-sized Web servers handle between 20,000 and 200,000
hyperlinked documents, and dictionaries contain millions of words
defined in terms of each other. Computational manipulation of such
large graphs is common, but previous tools for graph visualization
have been limited to datasets of a few thousand nodes. [...]
This thesis contains a detailed analysis of three specialized
systems for the interactive exploration of large graphs, relating
the intended tasks to the spatial layout and visual encoding
choices. We present two novel algorithms for specialized layout
and drawing that use quite different visual metaphors. The H3
system for visualizing the hyperlink structures of web sites
scales to datasets of over 100,000 nodes by using a carefully
chosen spanning tree as the layout backbone, 3D hyperbolic
geometry for a Focus+Context view, and provides a fluid
interactive experience through guaranteed frame rate drawing. The
Constellation system features a highly specialized 2D layout
intended to spatially encode domain-specific information for
computational linguists checking the plausibility of a large
semantic network created from dictionaries. The Planet Multicast
system for displaying the tunnel topology of the Internet's
multicast backbone provides a literal 3D geographic layout of
arcs on a globe to help MBone maintainers find misconfigured
long-distance tunnels.
Stanford: Tamara Munzner: Interactive Visualization of Large Graphs
and Networks
http://graphics.stanford.edu/papers/munzner_thesis/
A particular kind of graph is a tree, which organizes nodes below
a designated root node. The next two projects demonstrate that the
spatial characteristics of a complex tree can be effectively explored
by augmenting them with color coding or with navigational tools that
act as an interactive guide to the tree structure.
Treemap is a space-constrained visualization of hierarchical
structures. It is very effective in showing attributes of leaf
nodes using size and color coding. Treemap enables users to
compare nodes and sub-trees even at varying depth in the tree,
and helps them spot patterns and exceptions.
University of Maryland: Human-Computer Interaction Lab: Treemap
http://www.cs.umd.edu/hcil/treemap/
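
To make the space-filling idea concrete, here is a minimal sketch of a "slice-and-dice" treemap layout, in which each node's rectangle is divided among its children in proportion to their sizes and the split direction alternates with depth. The Node class and the slice_and_dice function are names I made up for illustration; they are not taken from the HCIL Treemap software, which may well use a different layout algorithm.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical tree node; `size` is only meaningful for leaves."""
    name: str
    size: float = 0.0
    children: list[Node] = field(default_factory=list)

    def total(self) -> float:
        """Size of a leaf, or the summed size of all leaves below this node."""
        if not self.children:
            return self.size
        return sum(child.total() for child in self.children)


def slice_and_dice(node, x, y, w, h, depth=0, out=None):
    """Return {leaf name: (x, y, w, h)} for the rectangle (x, y, w, h)."""
    if out is None:
        out = {}
    if not node.children:
        out[node.name] = (x, y, w, h)
        return out
    total = node.total()
    if total <= 0:
        return out  # nothing to draw for an empty subtree
    offset = 0.0
    for child in node.children:
        share = child.total() / total
        if depth % 2 == 0:
            # Even depth: lay children out left to right.
            slice_and_dice(child, x + offset * w, y, share * w, h, depth + 1, out)
        else:
            # Odd depth: lay children out top to bottom.
            slice_and_dice(child, x, y + offset * h, w, share * h, depth + 1, out)
        offset += share
    return out
```

Each leaf ends up with an area proportional to its size, which is what lets a viewer compare nodes at different depths; a color attribute could be attached to each rectangle in the same pass.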
SpaceTree is a novel tree browser that builds on the conventional
layout node link diagrams along a single preferred direction. It
adds dynamic rescaling of branches of the tree to best fit the
available screen space, optimized camera movement, and the use
of preview icons summarizing the topology of the branches that
cannot be expanded. In addition, it includes integrated search
and filter functions. This paper reflects on the evolution
of the design and highlights the principles that emerged from
it. A controlled experiment showed benefits for navigation tasks
to already previously visited nodes and estimation of overall
tree topology.
University of Maryland: Human-Computer Interaction Lab: SpaceTree:
a novel node-link tree browser
http://www.cs.umd.edu/hcil/spacetree/
There is much work dedicated to the problem of usefully representing
massive data sets containing millions of data points. Large computer
screens with millions of pixels have made it physically possible to
squeeze the data into a unified representation, but in order to make sense
of this vast visual display, researchers have devised new algorithms to
subdivide the data into semantically cohesive subsets.
Because of the ever increasing size of datasets, it gets harder
and harder to extract useful information out of them. Indeed
with the use of digital technology everywhere for data
collection or generation, typical datasets have grown by
orders of magnitude and some may be a million times or more
larger. These immense datasets become exploration-dominant
since even the experts who create or collect them don't know,
in detail, what's inside. It is important that the user of the
data is able to explore the data in an interactive fashion, no
matter how large the dataset is. This is because very responsive
communication between user and data allows the user to achieve a
new, higher state, perceptually and cognitively, of information
understanding. We're developing several techniques that enable
this responsive communication. [...]
One of these techniques is a fast clustering algorithm that
uses an initial binsort to scale the data to a more manageable
size. Given an arbitrary dataset, the algorithm finds a
user-specified number of clusters in the data. The emphasis isn't
so much on the accuracy of the positions of the clusters, but
rather on the time it takes to find them. This makes the emphasis
much different than for usual clustering and related (e.g.,
shape determination) algorithms, where accuracy is attained at
the expense of much longer computational times. Nevertheless, the
algorithm is reasonably good and satisfies the criterion of being
"good enough" by supplying enough detail for further exploration.
Georgia Tech: Data Visualization Group: Clustering for Exploratory
Visualization of Large Data
http://www.cc.gatech.edu/gvu/datavis/research/clustering.html
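
The Georgia Tech page does not spell out the algorithm, so the following is only my own sketch of the general idea: first collapse the points into coarse grid bins, then cluster the much smaller set of bin centroids. I have assumed 2D points, a fixed cell size, and a weighted k-means step with farthest-point seeding; all of these choices, and the function names, are mine rather than the researchers'.

```python
import math
from collections import defaultdict


def bin_points(points, cell):
    """Collapse 2D points into grid cells; return [(cx, cy, count)] per cell."""
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    for x, y in points:
        acc = sums[(math.floor(x / cell), math.floor(y / cell))]
        acc[0] += x
        acc[1] += y
        acc[2] += 1
    return [(sx / n, sy / n, n) for sx, sy, n in sums.values()]


def cluster_bins(bins, k, iterations=10):
    """Weighted k-means over bin centroids (an assumed stand-in technique)."""
    if not bins or k <= 0:
        return []
    k = min(k, len(bins))
    # Seed with the heaviest bin, then repeatedly add the farthest bin.
    centers = [max(bins, key=lambda b: b[2])[:2]]
    while len(centers) < k:
        far = max(bins, key=lambda b: min(math.dist(b[:2], c) for c in centers))
        centers.append(far[:2])
    for _ in range(iterations):
        acc = [[0.0, 0.0, 0] for _ in centers]
        for bx, by, n in bins:
            i = min(range(len(centers)),
                    key=lambda j: math.dist((bx, by), centers[j]))
            acc[i][0] += bx * n
            acc[i][1] += by * n
            acc[i][2] += n
        # Keep the old center if no bin was assigned to it this round.
        centers = [
            (sx / n, sy / n) if n else centers[i]
            for i, (sx, sy, n) in enumerate(acc)
        ]
    return centers
```

The cost of the clustering loop depends on the number of occupied bins rather than the number of raw points, which is the speed-for-accuracy trade that the quoted passage describes.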
Visualizing one million items on a 1600x1200 screen is a
challenge in terms of visualization, graphics, perception and
interaction. We have designed new techniques to achieve it for
treemaps and scatter plots. [...]
Whereas treemaps are space-filling visualization techniques where
areas never overlap, scatter plots cannot avoid overlap. Even
with hundreds of items, the distribution tends to be sparse
with areas of high density that are hard to see. Transparency is
useful when up to five items overlap, but with one million items,
hundreds of overlapping items are not rare. To solve that problem,
we synthesize an overlap attribute to show the item density. [...]
When the positions of the data items are preserved - i.e. when
only changing colors or stacking order - flipping between views
enables quick comparisons thanks to the retina persistency. [...]
Dynamic queries rely on interactively filtering and redisplaying
a data set through a continuous interaction. Current systems
use "range-sliders" to filter one attribute either changing the
smallest value, the largest, or sweeping a range of values between
the smallest and the largest. To achieve the redisplay speed
required for smooth interaction, we have designed a technique
that relies on hardware acceleration. The data set should be
loaded into main memory. When the user activates a slider to
perform the query dynamically, all the items are sent to the
GPU and stored in a display list. The Z coordinate is calculated
according to the attribute being filtered by the dynamic query
so, for example, if a film database is displayed and the user
wants to filter on the size of the film, the size is assigned to
the Z-axis. Each time the slider moves, a new near or far plane
value is computed and sent to the GPU and the list is redisplayed,
leaving the visibility computation to the hardware.
University of Maryland: Human-Computer Interaction Lab: Interactive
Information Visualization of a Million Items
http://www.cs.umd.edu/hcil/millionvis/
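
Two of the ideas in that excerpt can be sketched in a few lines. The first function computes an overlap attribute by counting how many items land on each pixel. The second class is a CPU stand-in for the depth-clipping trick: the HCIL system leaves the filtering to the graphics hardware via near and far planes, whereas this sketch, which is my own substitution, sorts the filtered attribute once and answers each slider move with a binary search. The names overlap_counts and RangeQuery are hypothetical.

```python
from bisect import bisect_left, bisect_right
from collections import Counter


def overlap_counts(pixels):
    """For each item, return how many items are drawn at the same pixel."""
    per_pixel = Counter(pixels)
    return [per_pixel[p] for p in pixels]


class RangeQuery:
    """Range filter on one attribute, analogous to near/far plane clipping."""

    def __init__(self, values):
        # Sort once up front, much as the display list is built only once.
        self.order = sorted(range(len(values)), key=values.__getitem__)
        self.sorted_values = [values[i] for i in self.order]

    def visible(self, near, far):
        """Indices of the items whose value lies in [near, far]."""
        lo = bisect_left(self.sorted_values, near)
        hi = bisect_right(self.sorted_values, far)
        return sorted(self.order[lo:hi])
```

For a film database filtered by running length, one would build RangeQuery from the list of lengths and call visible(near, far) each time the slider moves; only the two bounds change between calls, just as only the clipping planes change on the GPU.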
Computer simulations of natural phenomena have long been pursued on
a prodigious scale, but only recently has graphical hardware reached
the point where it can adequately represent some portion of the vast
number-crunching effort. Researchers at Georgia Tech have built a system
that allows meteorologists to visually navigate through a computational
model of a local weather system in search of significant patterns. Others
are developing more general tools to visually analyze the data that
flows from large simulations.
Researchers are working now to integrate a high-resolution weather
model, which can forecast conditions for areas as small as one
to four square kilometers. Researchers expect to complete the
project within two years.
"Once we have it all there, we will be able to show for the first
time these dynamic volumes of information in this visualization
system, basically as the data are received," Ribarsky says. "This
has not been done in 3D before in a time-dependent format."
Faust adds that the ability to look at storms in three dimensions
in real time will give researchers new insight into the 3D nature
of storm development, and that information will result in better
severe weather detection software.
Georgia Tech: Data Visualization Group: Seeing Three Dimensions in
Real Time
http://www.cc.gatech.edu/gvu/datavis/research/weather/weather.html
We have constructed a tightly coupled set of general methods
for monitoring, steering, and applying visual analysis to large
scale simulations. This work is based in part on a collaborative,
interdisciplinary process that teams application and computer
scientists to develop a powerful integrated approach. The
integrated design allows great flexibility in the development
and use of analysis tools. Underlying all this is a general data
organization for exploratory visualization/analysis. It supports
mappings between (user-defined) visual representations and the
original data. We have developed tools that take advantage of
this organization and allow selection of data using spatial or
other constraints. These selected data can be reclassified for
binding to new visual representations (or hidden from view). The
selection and binding is performed in an intuitive manner by
direct manipulation so that even users who are not graphics
experts can do it. The direct manipulation selection process and
the underlying data organization also allow us to apply powerful
and straightforward techniques for visual steering.
Georgia Tech: Data Visualization Group: Steering, Visualization,
and Analysis
http://www.cc.gatech.edu/gvu/datavis/research/atmospheric.html
Quite apart from the problem of rendering a graph in a readable fashion,
there is the problem of attaching labels that will inform viewers of the
meaning of each node or edge. The size and placement of such labels pose
considerable difficulties in the spatial interface. One project proposes
that circular, localized arrangements of labels are an effective solution.
The widespread use of information visualization is hampered by
the lack of effective labeling techniques. We propose "excentric
labeling", a new dynamic technique to label a neighborhood
of objects located around the cursor. This technique does not
intrude into the existing interaction, it is not computationally
intensive, and was easily applied to several visualization
applications. A pilot study indicated a strong speed benefit
for tasks that involve the rapid exploration of large numbers
of objects.
University of Maryland: Human-Computer Interaction Lab: Excentric Labeling
for Information Visualization
http://www.cs.umd.edu/hcil/excentric/
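
As a rough illustration of labeling a neighborhood around the cursor, the sketch below picks the objects that fall within a focus radius and sorts their labels into a left stack and a right stack, each ordered by vertical position. The two-stack arrangement, the max_labels cap, and the function name are assumptions of mine for the sake of a small example, not a transcription of the HCIL implementation.

```python
import math


def excentric_labels(items, cursor, radius, max_labels=20):
    """Return (left, right) label stacks for the items near the cursor.

    items: iterable of (name, x, y) tuples; cursor: (x, y).
    """
    cx, cy = cursor
    near = [
        (name, x, y) for name, x, y in items
        if math.hypot(x - cx, y - cy) <= radius
    ]
    # If the neighborhood is crowded, keep only the closest items.
    near.sort(key=lambda it: math.hypot(it[1] - cx, it[2] - cy))
    near = near[:max_labels]
    # Items left of the cursor go to the left stack, the rest to the right;
    # each stack is ordered by y so that connecting lines cross less often.
    left = sorted((it for it in near if it[1] < cx), key=lambda it: it[2])
    right = sorted((it for it in near if it[1] >= cx), key=lambda it: it[2])
    return [it[0] for it in left], [it[0] for it in right]
```

Because the work is a single pass over the items plus a sort of the few that fall inside the circle, it can be redone on every mouse move, which fits the claim that the technique is not computationally intensive.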
The web is a graph structure that attracts interest for its commercial
value and its rapidly growing topology. Novel methods that distort the
graph on a non-linear scale offer an appealing alternative to traditional
perspectives.
We visualize the structure of sections of the World Wide Web
by constructing graphical representations in 3D hyperbolic
space. The felicitous property that hyperbolic space has
"more room" than Euclidean space allows more information to
be seen amid less clutter, and motion by hyperbolic isometries
provides for mathematically elegant navigation. The 3D graphical
representations, available in the WebOOGL or VRML file formats,
contain link anchors which point to the original pages on the
Web itself. We use the Geomview/WebOOGL 3D Web browser as an
interface between the 3D representation and the actual documents
on the Web. The Web is just one example of a hierarchical tree
structure with links "back up the tree", i.e. a directed graph
which contains cycles. Our information visualization techniques
are appropriate for other types of directed graphs with cycles,
such as filesystems with symbolic links.
Stanford University: Tamara Munzner: Visualizing the Structure of the
World Wide Web in 3D Hyperbolic Space
http://graphics.stanford.edu/papers/webviz/
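
The "more room" remark can be checked numerically. In hyperbolic 3-space of curvature -1, a sphere of radius r has surface area 4*pi*sinh(r)^2, compared with 4*pi*r^2 in Euclidean space, so the room available for child nodes grows exponentially with distance from the focus. The small sketch below, with function names of my own choosing, simply compares the two formulas; it is not part of the paper's software.

```python
import math


def euclidean_sphere_area(r):
    """Surface area of a sphere of radius r in Euclidean 3-space."""
    return 4 * math.pi * r ** 2


def hyperbolic_sphere_area(r):
    """Surface area of a sphere of radius r in hyperbolic 3-space (K = -1)."""
    return 4 * math.pi * math.sinh(r) ** 2


def room_ratio(r):
    """How many times more surface the hyperbolic sphere offers at radius r."""
    return hyperbolic_sphere_area(r) / euclidean_sphere_area(r)
```

Near the origin the ratio is close to 1, so the focus region looks ordinary, while a few units out the hyperbolic sphere offers hundreds of times more surface on which to place nodes.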
Carnegie Mellon has three ongoing projects devoted to information
visualization. The first seeks to give users a convenient way to
manipulate data by associating it with a physical representation. Thus,
each data set is represented by a virtual reality that one can
explore and modify. The second is devoted to visualizing queries on
relational databases, making a departure from the conventional textual
representations. Finally, the SolarPlot tool combines the techniques of
data segmentation and circular depiction to ease pattern discovery in
large data sets.
Current static visualizations are limited in several important
ways:
* Users are not able to focus on different object sets in detail
while still keeping them in context with the environment.
* When the information space is dense, there will be a lot of
clutter and object occlusion.
* A data set may contain elements that have vastly different
values. Thus, some objects may be dwarfed when shown in the
scale used for the entire data set.
* Many visualizations only allow users to view the underlying
data, and do not provide tools for classifying sets of objects
and saving those classifications.
* It is difficult to compare quantities represented by graphical
objects which are not spatially contiguous.
The SDM paradigm deals with these difficulties by providing
object-centered selection, direct object manipulation through
the use of handles, and a "physics" of objects that supports
malleability and flexible control. Every object in a graphic set
correlates with a unique object in the data set. Each object
in a graphic set uses the same visual specifications.
For example, a data set of supply centers might be visualized as a
set of cylinders; where the materials-weight attribute is mapped
to the height of the cylinder, and the longitude and latitude
attributes are mapped to the x and y location of the cylinder.
Carnegie Mellon: SAGE Visualization Group: Selective Dynamic Manipulation
http://www-2.cs.cmu.edu/Groups/sage/sdm.html
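
The supply-center example amounts to a one-to-one mapping from data records to graphical objects that share a single visual specification. A minimal sketch, assuming records stored as dictionaries with keys I have named longitude, latitude, and materials_weight, and a Cylinder class that is likewise hypothetical:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cylinder:
    """Hypothetical graphical object: position on the ground plane plus height."""
    x: float
    y: float
    height: float


def to_cylinders(centers, max_height=10.0):
    """Map each record to exactly one cylinder using a shared specification.

    The heaviest record is drawn at max_height; the others scale linearly.
    """
    heaviest = max((c["materials_weight"] for c in centers), default=0.0)
    scale = max_height / heaviest if heaviest > 0 else 0.0
    return [
        Cylinder(
            x=c["longitude"],
            y=c["latitude"],
            height=c["materials_weight"] * scale,
        )
        for c in centers
    ]
```

With a single linear scale like this one, a center holding very little material is dwarfed by the largest, which is one of the limitations listed above that SDM is meant to relieve by letting the user select and rescale a subset of objects.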
Exploratory data analysis is an iterative process where high level
questions lead to specific queries whose answers are examined for
interesting patterns. These in turn suggest new questions. To
facilitate this kind of exploration, we would like to provide
the analyst rapid, incremental, and reversible operations giving
continuous visual feedback. However we also need the expressive
power to reorganize the data on the fly, to juxtapose objects
according to diverse criteria, and visualizations to show
relationships among properties of these different objects. In
short, we want both the ease of use of direct manipulation
systems and the power of database query systems. [...]
VQE is a Visual Query Environment for expressing queries involving
navigation among multiple objects, aggregating these objects,
and defining derived attributes for them.
Carnegie Mellon: SAGE Visualization Group: Visual Query Environments
http://www-2.cs.cmu.edu/Groups/sage/vqe.html
New technologies have made it much easier for us to collect
and disseminate information. However, the explosion of
information also means that we have to keep track of huge
amounts of information. This can sometimes be very difficult
and time consuming. The SolarPlot and Aggregate TreeMaps are
visualization interfaces that can help us better deal with certain
types of large data sets. They are both based on the concept of
data aggregation or data binning. Data aggregation or binning
simplifies large data sets by summarizing groups of data elements
and representing such groups with a single graphical symbol.
The SolarPlot is an interactive circular histogram. Data values
are plotted around the circumference of a circle. Interactive
control is provided so that the circle may be continuously
expanded or contracted. By interactively expanding and contracting
the solarPlot we can view the data at different levels of detail
(i.e. different levels of aggregation). Different aggregation
levels may reveal different patterns within the data set.
Carnegie Mellon: SAGE Visualization Group: Solarplot & Aggregate Tree Map
http://www-2.cs.cmu.edu/Groups/sage/solar.html
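
Here is one way the link between circle size and aggregation level might work, offered as a sketch under my own assumptions: the number of angular bins is whatever fits around the circumference at a fixed arc length per bin, so expanding the circle splits the data into finer groups. The function name and the bin_arc parameter are hypothetical and do not come from the SAGE group's software.

```python
import math


def solar_bins(values, lo, hi, radius, bin_arc=10.0):
    """Aggregate values in [lo, hi] into angular bins around a circle.

    Returns a list of (start_angle_in_radians, count), one entry per bin.
    The bin count grows with the radius, so a larger circle shows finer detail.
    """
    n_bins = max(1, int(2 * math.pi * radius // bin_arc))
    counts = [0] * n_bins
    span = hi - lo
    for v in values:
        if span <= 0 or not lo <= v <= hi:
            continue  # skip values outside the plotted range
        # Clamp so that v == hi falls into the last bin.
        i = min(int((v - lo) / span * n_bins), n_bins - 1)
        counts[i] += 1
    step = 2 * math.pi / n_bins
    return [(i * step, count) for i, count in enumerate(counts)]
```

Calling solar_bins with the same data at a small and a large radius yields a coarse and a fine histogram of the same totals, which mirrors the expand-and-contract interaction described in the excerpt.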
Descriptions of many more research projects, along with papers and other
technical materials, can be found by visiting the research centers linked
on the following page.
Information Visualization: Research
http://iv.homeunix.org/research.php
Experts in the field predict that advanced information-visualization
techniques will become increasingly widespread as
computers become ubiquitous and as software designers, assisted by
user-interface experts, begin to confront the limitations of existing
representational modes. Systems that previously needed no user interface
will now acquire one as they become more software-intensive, calling
for means of displaying and interpreting the resulting flow of data.
For instance, as computerized automobile control systems extend to every
corner of our cars, the data they feed back to the driver from their
many sensors must be marshaled in a sensible fashion. The fact that auto
manufacturers have not managed to do so yet -- the most advanced control
systems by Mercedes and BMW are equipped with the most notoriously
inscrutable user interfaces -- means that there is a patent need for
good information visualization that must be addressed in future.
Information visualization will become a daily task not just for
specialized information workers but for average consumers who increasingly
need a way to cope with the data surging at them from every corner of a
digitized world. As another example, televisions now cram a great deal of
information into the screen in a haphazard fashion that makes each data
stream more difficult to pick out. Fortunately, information-visualization
researchers have laid considerable groundwork that offers better solutions
to these problems. These solutions will inevitably migrate into the public
sphere to assist consumers in controlling and understanding their data
streams. And yet more data is coming, making it all the more vital to
pursue further research.
We believe that the future of user interfaces is in the direction
of larger, information-abundant displays. With such designs, the
worrisome flood of information can be turned into a productive
river of knowledge. Our experience during the past eight years
has been that visual query formulation and visual display of
results can be combined with the successful strategies of direct
manipulation. Human perceptual skills are quite remarkable
and largely underutilized in current information and computing
systems. Based on this insight, we developed dynamic queries,
starfield displays, treemaps, treebrowsers, zoomable user
interfaces, and a variety of widgets to present, search, browse,
filter, and compare rich information spaces.
There are many visual alternatives but the basic principle
for browsing and searching might be summarized as the Visual
Information Seeking Mantra: Overview first, zoom and filter,
then details-on-demand. In several projects we rediscovered this
principle and therefore wrote it down and highlighted it as a
continuing reminder. If we can design systems with effective
visual displays, direct manipulation interfaces, and dynamic
queries then users will be able to responsibly and confidently
take on even more ambitious tasks.
University of Maryland: Human-Computer Interaction Lab: Visualization
http://www.cs.umd.edu/hcil/research/visualization.shtml
In the future, information visualization systems will become
increasingly pervasive. Future processor speeds will no doubt
continue to advance according to Moore's Law, but the amount
of data to process will increase even faster. This explosion of
data comes from many sources:
* processors with the ability to log events have become interwoven
with the fabric of daily and business life;
* sensors that have become small, cheap, and networked;
* and the growing feasibility of simulation that allows the
gathering of data about virtual rather than real-world events.
Data collection isn't an end in itself, but a means to the end of
helping humans deal with the world. Computer-based visualization
lets humans wend their way through these mountains of data,
making decisions based on understanding.
Stanford University: Tamara Munzner: Introduction to IEEE Computer
Graphics and Applications Special Issue on Information Visualization
http://graphics.stanford.edu/~munzner/cga02/gei.html
Now for some book recommendations. The following is a collection of
academic articles on state-of-the-art topics in the field of information
visualization. It is often used in the classroom as a graduate textbook.
Amazon: Readings in Information Visualization : Using Vision to Think
http://www.amazon.com/exec/obidos/tg/detail/-/1558605339/002-9353419-4653610?v=glance
Here is another and somewhat less costly textbook often used in graduate
courses on information visualization.
Amazon: Information Visualization : Perception for Design
http://www.amazon.com/exec/obidos/ASIN/1558608192/qid=1113328930/sr=2-1/ref=pd_bbs_b_2_1/002-9353419-4653610
Some professors opine that the following will soon become the new
standard textbook.
Amazon: Information Visualization
http://www.amazon.com/exec/obidos/tg/detail/-/0201596261/qid=1113329528/sr=1-1/ref=sr_1_1/002-9353419-4653610?v=glance&s=books
Finally, here is a recognized classic that bears on effective visual
representation and interpretation in all media, from books to posters.
Amazon: The Visual Display of Quantitative Information
http://www.amazon.com/exec/obidos/ASIN/096139210X/o/qid%3D978727203/sr%3D2-1/002-9353419-4653610
It has been an interesting challenge to address this question on your
behalf. If you find fault with my answer, please inform me through a
Clarification Request so that I may fully meet your needs before you
assign a rating.
Regards,
leapinglizard