by Martin Wattenberg, Manager, Visual Communication Lab
16 May 2005
Data visualization, the art of using visual thinking to understand complex information, is a growing trend--but it also has an illustrious history. Some of the biggest scientific discoveries hinged on turning data into pictures. One famous example of visualization is the periodic table of elements: When Mendeleev published a grid-like arrangement of elements for a textbook in 1869, not only did he create a beautifully simple display of known data, but his diagram highlighted gaps in current knowledge that told scientists where to search for new elements.
Visualization has always involved a partnership between science and art. A wonderful example is the work of Santiago Ramon y Cajal in the late nineteenth century. Ramon y Cajal, a doctor who trained as an artist, used a new staining technique to examine brain tissue. He created a series of drawings that are so clear and precise they are still used today in introductory neuroscience courses. And at the time, they led him to argue that the brain was made of discrete neurons, a discovery that won him the Nobel Prize in 1906.
Although visualization is an old idea, two recent developments give the partnership between art and science a central place today. One is increasing computing power, and the other is the increasing amount of information available online. As computers have been used for visualization, we've seen an explosion of new techniques.
The first examples largely came from a field known as "scientific visualization"--the display of numerical data, often arising from physical observations or simulations. Diagrams of vector fields, of temperatures observed around the earth, or fluid flow around a boat's hull are all examples of this field. The authors of these programs are the spiritual descendants of Santiago Ramon y Cajal, trying to communicate facts about the physical world clearly and precisely.
Starting about two decades ago, computer scientists began systematically investigating how computers might be used to visualize data not tied directly to the real world, creating a new field termed "information visualization." These researchers invented some astonishingly creative techniques for displaying abstract data such as tree structures. Examples include the "hyperbolic tree," a method of displaying a tree as if it were embedded in a two-dimensional hyperbolic plane, and the "treemap," which fills a rectangle with nested tiles, each sized in proportion to the item it represents. Investigators in the field of information visualization are following the lead of Mendeleev, inventing completely new geometries for abstract data.
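To make the treemap idea concrete, here is a minimal sketch of the simplest layout, usually called "slice-and-dice": each level of the tree divides its rectangle among its children in proportion to their sizes, alternating between vertical and horizontal cuts. This is an illustration only, not the method used in any particular IBM tool; the `Node` class, the `slice_and_dice` function, and the sample data are hypothetical names chosen for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

Rect = Tuple[float, float, float, float]  # (x, y, width, height)


@dataclass
class Node:
    name: str
    size: float = 0.0  # only meaningful for leaves
    children: List["Node"] = field(default_factory=list)

    def total(self) -> float:
        """Size of a leaf, or the summed size of everything below a branch."""
        if not self.children:
            return self.size
        return sum(child.total() for child in self.children)


def slice_and_dice(node: Node, x: float, y: float, w: float, h: float,
                   depth: int = 0,
                   out: Optional[Dict[str, Rect]] = None) -> Dict[str, Rect]:
    """Map every leaf name under `node` to its rectangle inside (x, y, w, h)."""
    if out is None:
        out = {}
    if not node.children:
        out[node.name] = (x, y, w, h)
        return out
    total = node.total()
    offset = 0.0  # fraction of the parent rectangle already handed out
    for child in node.children:
        share = child.total() / total if total else 0.0
        if depth % 2 == 0:
            # Even depth: cut along the width, children sit side by side.
            slice_and_dice(child, x + offset * w, y, share * w, h, depth + 1, out)
        else:
            # Odd depth: cut along the height, children are stacked.
            slice_and_dice(child, x, y + offset * h, w, share * h, depth + 1, out)
        offset += share
    return out


# Hypothetical sample data: one leaf and one branch with two leaves.
tree = Node("root", children=[
    Node("A", size=6),
    Node("B", children=[Node("B1", size=3), Node("B2", size=1)]),
])

for name, rect in slice_and_dice(tree, 0, 0, 100, 50).items():
    print(name, rect)
```

Each leaf's tile area comes out proportional to its size, which is the property that lets a treemap show both hierarchy and quantity at once. Practical treemaps often use other tiling rules that avoid long, thin rectangles; the slice-and-dice rule is shown here only because it is the shortest to write down.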
At IBM, we're working hard on creating new visualizations to let people understand their data. We're interested in all kinds of visualization, but there are three key directions we are pursuing.
The first is visualizing text and conversations. Unlike the visualization of numerical data, this is a relatively new field, with lots of potential for new discoveries. Some examples of the data we work with include e-mail, large discussions, and online spaces such as wikis. Our "history flow" visualization of collaborative editing can be downloaded from alphaWorks. "History flow" is a good example of a second direction, as well: the visualization of the histories of complex objects. We've been adapting the techniques behind history flow to display the history of structures ranging from corporate org charts to the evolution of software libraries. A third, extremely promising area is what we call "social data exploration." In social data exploration, a visual display becomes the catalyst for conversation and for collective data mining and analysis.
If you have data you'd like to understand better--or would like to help your users understand better--what's the best way to get started? Along with the technologies on alphaWorks, there are also open-source libraries you can use, such as JUNG and Prefuse. If you have a large scientific data set, try the excellent OpenDX package from IBM.
About the researcher
Martin Wattenberg is a mathematician whose research interests include information visualization and its application to collaborative computing, journalism, bioinformatics, and art. Before joining IBM, Dr. Wattenberg was the Director of Research and Development at SmartMoney.com, where he designed internet-based financial software. His work at SmartMoney included the groundbreaking Map of the Market, which visualizes live data on hundreds of publicly traded companies.