A picture is worth a thousand words. The old adage rings truer today than ever, as the impact of quality data visualization reaches across industries and improves how we tell stories.
The Stanford Institute for Computational and Mathematical Engineering (ICME) recently hosted the Xtrapolate roundtable, a forum to discuss the latest technology trends and use cases in data visualization. ICME lecturer Dave Deriso gave a talk on interactive 2D and 3D visualization using open source web tools, and computer science professor Maneesh Agrawala gave a talk on theory for effectively communicating quantitative information visually. Here’s what they shared:
In this example, an interactive data visualization tool called crossfilter allows a user to access dynamically updating flight information. When a user selects one point, additional data – departure time, arrival delays, distance, etc. – also update.
Another example lets users explore a large, multi-dimensional dataset covering 27 years of Nasdaq 100 index data. The charts provide instant feedback on a user’s interaction. By clicking on a specific time period, you can get interesting answers to questions like, “Is Friday or Monday the unluckier day for investors?” or “Is spring a better time to invest than winter?”
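The coordinated filter-and-aggregate pattern behind tools like crossfilter can be sketched in a few lines of plain JavaScript. This is not the crossfilter library itself, and the flight records and field names below are invented for illustration; it only shows the core idea that a user-controlled filter re-drives every aggregate view:

```javascript
// Minimal sketch of crossfilter-style linked filtering: applying a filter on
// one dimension updates the aggregates shown in every other view.
// The records and field names below are invented for illustration.
const flights = [
  { day: 'Mon', delay: 12, distance: 300 },
  { day: 'Fri', delay: 45, distance: 1200 },
  { day: 'Fri', delay: 5,  distance: 450 },
  { day: 'Mon', delay: 30, distance: 800 },
];

// The user's current selection, expressed as a predicate.
let activeFilter = () => true;

// Every view re-aggregates from the filtered records, so all charts stay in sync.
function groupBy(records, key, value) {
  const out = {};
  for (const r of records.filter(activeFilter)) {
    out[r[key]] = (out[r[key]] || 0) + value(r);
  }
  return out;
}

// No filter: total delay per weekday across all flights.
console.log(groupBy(flights, 'day', r => r.delay)); // { Mon: 42, Fri: 50 }

// User brushes "distance > 500": the same view updates immediately.
activeFilter = r => r.distance > 500;
console.log(groupBy(flights, 'day', r => r.delay)); // { Fri: 45, Mon: 30 }
```

The real library adds indexing so that filtering stays fast on millions of records, but the user-facing behavior is the same: change the filter, and every linked chart recomputes.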
Ultimately, interactive visualizations allow users to have a conversation with the data – to ask questions and get immediate answers. “They take complex information and present it in a way that’s accessible to a general audience,” Deriso says.
Interactive data visualizations are living, breathing representations of data. Because they update dynamically, they let users extract the information most relevant to their specific interests.
“Interactive visualizations are engaging. When a user enjoys interactively exploring your data, they are more likely to understand and appreciate it,” says Deriso.
Previously, complex 3D visualizations required specialized applications and high-performance computers. Recent advances in browser-based 3D graphics, powered by WebGL, have made these sophisticated scientific visualizations available online, and web interfaces turn them into compelling interactive experiences.
In this example, a WebGL library called Three.js renders a sophisticated 3D MRI image of a brain with several layers of information overlaid. The library’s animation features let the brain smoothly morph between representations, from a standard brain to a flat, map-like view. A 2D plot on the right serves both as an interface for displaying the different datasets collected in the study and as a visual depiction of the experiment’s results.
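Three.js handles the rendering, but the morph itself boils down to interpolating between two matched vertex layouts (Three.js exposes this idea as morph targets). A minimal, browser-free sketch of that interpolation, with invented vertex data:

```javascript
// Morphing between two representations of the same mesh (e.g. a folded brain
// and a flat map view) is linear interpolation between matched vertex
// positions. The vertex arrays here are invented for illustration.
function morph(from, to, t) {
  // t = 0 gives the "from" shape, t = 1 gives the "to" shape.
  return from.map((v, i) => ({
    x: v.x + (to[i].x - v.x) * t,
    y: v.y + (to[i].y - v.y) * t,
    z: v.z + (to[i].z - v.z) * t,
  }));
}

const folded = [{ x: 0, y: 0, z: 2 }, { x: 1, y: 1, z: 3 }];
const flat   = [{ x: 0, y: 0, z: 0 }, { x: 4, y: 1, z: 0 }];

// Halfway through the animation:
console.log(morph(folded, flat, 0.5));
// [ { x: 0, y: 0, z: 1 }, { x: 2.5, y: 1, z: 1.5 } ]
```

Animating `t` from 0 to 1 over a few hundred frames produces the smooth transition described above; the GPU does the same per-vertex arithmetic in a shader.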
“This was a wonderful neuroscience experiment and the visualization made it quite popular among neuroscientists,” explains Deriso. “You used to only see images of the brain printed in the research article, but now you could share the entire dataset, view it, and interact with it all from one link.”
Deriso also discussed how these tools could be ported to immersive virtual reality platforms, such as Google Cardboard and Oculus Rift. If you happen to have a Google Cardboard handy, here is an example of a visualization where you are standing in the middle of a room while an algorithm computes a simulation of birds flocking around you. Deriso believes interactive web-based 2D, 3D, and VR will be the future of data visualization.
Well-designed visuals allow readers to easily understand a lot of quantitative data at once. Unfortunately, poorly designed visuals are everywhere: in reports, magazines, books, on TV, and on the internet. Using computer vision and machine learning, techniques that attempt to duplicate human vision by electronically perceiving and understanding an image, Agrawala’s research focuses on letting machines access the wealth of information locked inside charts and graphs, so that, ultimately, visuals can be redesigned to be more accurate and easier to understand.
Well-designed visuals make it easy for readers to extract important information quickly. The National Institutes of Health (NIH) pie chart below was created to show the percentage of budget devoted to research for various diseases in 2005. A reader can easily see from this pie chart that AIDS research takes up the largest percentage of the 2005 budget. But it isn’t as easy to determine whether NIH dedicated more budget to diabetes or Alzheimer’s, because the sizes of those pie slices are very similar. In this case, while the pie chart does communicate some data, it isn’t designed to let an audience understand all of it effectively.
Agrawala and his team are using low-level image-processing operations, including edge, corner, pixel intensity, and color detection, as well as computer vision techniques, to extract information from images. They use these artificial systems to enable machines to recover the “structure” of a visual – that is, the type of chart (bar, pie, scatter plot, etc.), the data, and the graphical marks. Once the structure has been recovered, it’s possible to manipulate and redesign the visual. In the NIH example, access to the structure of the visual made it possible to convert the pie chart into a bar chart. With this change in representation, it’s easier to see that NIH dedicated more budget to Alzheimer’s than to diabetes.
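A toy version of the extraction step gives a feel for the problem: given a binarized image of a bar chart, bar lengths can be recovered by scanning pixel columns. Real systems also classify the chart type and locate axes, labels, and marks; the pixel grid below is invented and deliberately tiny:

```javascript
// Toy chart-structure recovery: given a binary image of a bar chart
// (1 = ink, 0 = background), recover each bar's height by counting filled
// pixels per column. Real pipelines use edge/corner/color detection and
// machine learning; this grid is invented for illustration.
const image = [
  [0, 0, 0, 1],
  [0, 1, 0, 1],
  [1, 1, 0, 1],
  [1, 1, 1, 1],
];

function barHeights(img) {
  const heights = [];
  for (let col = 0; col < img[0].length; col++) {
    let h = 0;
    for (let row = 0; row < img.length; row++) {
      if (img[row][col] === 1) h++;
    }
    heights.push(h);
  }
  return heights;
}

console.log(barHeights(image)); // [2, 3, 1, 4]
```

Once the data values are recovered like this, redrawing them in a different representation, such as turning a pie chart into a bar chart, is just a matter of re-rendering the numbers.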
Beyond redesigning visuals for optimal data representation, Agrawala brings visuals to life with graphical overlays: elements layered onto a visual to aid the perceptual and cognitive work of reading a chart. In the chart below, Agrawala adds grid lines that make it easier to read off the lengths of the bars by letting the reader trace back to the axes. It’s also possible to add an overlay that highlights a particular part of a visual, or one that displays summary statistics such as the mean and median, so the reader can more easily extract specific, usable information.
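Computing the values for a summary-statistic overlay is straightforward once the chart’s underlying data has been recovered. A brief sketch, with invented budget figures:

```javascript
// Once a chart's data has been recovered, overlay values such as the mean and
// median are simple to compute and draw as reference lines.
// The budget figures below are invented for illustration.
function mean(xs) {
  return xs.reduce((a, b) => a + b, 0) / xs.length;
}

function median(xs) {
  const s = [...xs].sort((a, b) => a - b);
  const m = Math.floor(s.length / 2);
  return s.length % 2 ? s[m] : (s[m - 1] + s[m]) / 2;
}

const barValues = [2900, 1100, 650, 1060, 730];
console.log(mean(barValues));   // 1288
console.log(median(barValues)); // 1060
```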
Today, visuals are almost always embedded in text documents. Agrawala and his team are thinking about how to better integrate the reading of text and charts, so that readers aren’t forced to process them separately. The example below is a bar chart from The Guardian; the bar circled in green is referenced in the text. Ideally, a reader could isolate a passage of text and the corresponding portion of the visual at once, letting the two inform one another.