Data is everywhere, and well-designed data graphics can be both beautiful and meaningful. As visualizations take center stage in a data-centric world, researchers and developers spend a great deal of time understanding and creating better visualizations. But they spend just as much time understanding how tools can help programmers and designers create visualizations faster, more effectively, and more enjoyably.
Let’s talk about some of the work in that area, and what the future looks like.
As any visualization practitioner will tell you, turning a dataset from raw stuff in a file to a final result in a picture is far from a single-track, linear path. Rather, there is constant iteration over competing designs, tweaking them while weighing their pros and cons. The visualization research community has recognized the importance of keeping track of this process.
Consider the “vanishing consultant problem.” In many situations, a small number of people hold a large portion of the project “in their heads.” When they move on, they take much of the project with them, whether they want to or not. Recent visualization systems, like VisTrails, recognize the importance of this institutional memory, and try to make it easy for the knowledge to exist out in the world. This data, automatically tracked, can then be used for searching similar work across different users, or it can help determine how different visualizations and designs were created and how they differ from each other. Crucially, these systems exist so that the mental effort of the user stays in the problem domain (in this case, creating great visualizations), rather than in bookkeeping.
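To make the idea of automatically tracked history a little more concrete, here is a minimal sketch in Python. It is not how VisTrails is implemented; the `Version` and `History` classes, the user names and the parameter names are all hypothetical, chosen only to illustrate one plausible approach: store each change as a node pointing to its parent, then replay the chain to rebuild or compare designs.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Version:
    """One node in a hypothetical version tree: a change plus a link to its parent."""
    id: int
    parent: Optional[int]
    user: str
    change: dict  # parameters set by this step, e.g. {"chart": "bar"}


@dataclass
class History:
    versions: dict = field(default_factory=dict)

    def record(self, parent, user, change):
        """Append a new version; earlier versions are never overwritten."""
        vid = len(self.versions) + 1
        self.versions[vid] = Version(vid, parent, user, dict(change))
        return vid

    def materialize(self, vid):
        """Replay changes from the root down to `vid` to rebuild a design's full state."""
        chain = []
        while vid is not None:
            version = self.versions[vid]
            chain.append(version)
            vid = version.parent
        state = {}
        for version in reversed(chain):
            state.update(version.change)
        return state

    def diff(self, a, b):
        """Report the parameters whose values differ between two designs."""
        state_a, state_b = self.materialize(a), self.materialize(b)
        return {key: (state_a.get(key), state_b.get(key))
                for key in sorted(set(state_a) | set(state_b))
                if state_a.get(key) != state_b.get(key)}


# Two users branch off the same starting design (all values are made up).
history = History()
base = history.record(None, "ana", {"data": "sales.csv", "chart": "bar"})
v_color = history.record(base, "ana", {"palette": "blues"})
v_line = history.record(base, "ben", {"chart": "line"})

print(history.diff(v_color, v_line))
# {'chart': ('bar', 'line'), 'palette': ('blues', None)}
```

Because every step is recorded as it happens, questions like "how do these two designs differ?" can be answered from the log rather than from someone's memory.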
At roughly the same time that VisTrails came out, Martin Wattenberg and Fernanda Viégas developed Many Eyes, a groundbreaking project and website where users can share visualizations, datasets and stories.
Visual.ly itself highlights this type of effort to increase access to data and visualizations for non-expert programmers. Concurrently, the last few years have seen an explosion in the quality and availability of personalized recommendation tools. The best-known example is Netflix, whose movie rating prediction system drives a large portion of the traffic on the website; social networking sites like Facebook and Twitter suggest new friends to add and users to follow.
This type of data-driven interaction will only become more prevalent. One of the tenets of good visualization practice is to “iterate early, and iterate often.” The quicker new alternatives can be tried, explored and compared, the faster we can arrive at the right result.
One of the more exciting prospects for visualization tools is to provide Netflix-style recommendations for every option in the system. Expect, for example, your favorite visualization tools to suggest color schemes based on how you set the type on your infographic: “Futura users tend to pick these schemes.” In addition, websites such as crowdlabs (“VisTrails on the web”), RPubs, GitHub (especially GitHub gists and Mike Bostock’s blocks) and JSFiddle have shown how socialization, when done correctly, is very empowering for users: it is almost impossibly easy to create and share simple ideas expressed via code.
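As a rough illustration of the “Futura users tend to pick these schemes” idea, the sketch below counts which color schemes co-occur with each typeface in a log of past designs and suggests the most frequent ones. This is a deliberately simple co-occurrence count, not the kind of model a production recommender like Netflix's uses, and the log entries, scheme names and function names are all made up for the example.

```python
from collections import Counter, defaultdict

# Hypothetical usage log: (typeface, color scheme) pairs from past designs.
past_designs = [
    ("Futura", "teal-orange"),
    ("Futura", "teal-orange"),
    ("Futura", "grayscale"),
    ("Garamond", "sepia"),
    ("Garamond", "grayscale"),
    ("Garamond", "sepia"),
]


def build_index(designs):
    """Count how often each color scheme was used with each typeface."""
    index = defaultdict(Counter)
    for typeface, scheme in designs:
        index[typeface][scheme] += 1
    return index


def suggest_schemes(index, typeface, k=2):
    """Return up to k schemes most often paired with `typeface`, most frequent first."""
    counts = index.get(typeface)
    if not counts:
        return []  # no history for this typeface, so no suggestion
    return [scheme for scheme, _ in counts.most_common(k)]


index = build_index(past_designs)
print(suggest_schemes(index, "Futura"))
# ['teal-orange', 'grayscale']
```

A real tool would need far more data and would likely weigh other signals too, but even this much shows how choices made by one group of users can become defaults offered to the next.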
Without extrapolating too much, one can foresee a future where the design, iteration and deployment of data graphics are done in much the same way that we share code snippets and repositories today. Our backends will mine the data users generate as they interact with our systems and websites, automatically learn from these interactions, and use what they learn to drive better visualization design. In this future, attribution and tracking of sources and data are automated and transparent; since authoring tools are integrated with publishing on the web, in-place analytics will be the norm. Our tools will then produce reports such as “of the three types of plot you chose, the bar chart resulted in 20% longer engagement.”
Infographic and visualization design is very much a subjective experience, and tools won’t ever replace careful, thoughtful design. But the more they automate the boring, repetitive, and yet invaluable bookkeeping of our design workflows, the more time we can spend understanding and exploring the design space. And we will be better designers and data analysts for it.
This is an exciting time to work in data visualization.
Carlos Scheidegger works on visualization and data analysis at AT&T Research, in the Information Visualization department. He develops algorithms to better understand large data through visualizations, and better systems to make the work of data analysts simpler, faster, more effective and more fun.