Data Visualization Quick Start Guide

Anyone with a computer can make a data visualization nowadays. The proliferation of graphing tools and online resources continues to make the process ever more accessible. But having a camera in one’s phone does not a photographer make. Confusing and inefficient graphics are everywhere. And in the Information Age, having visualization savvy has become, to quote Dona Wong, “almost as indispensable as good writing”.

As both an art and a science, data visualization requires a rare combination of skills to master. But you don’t have to be a statistician to create beautiful graphics. I believe that with a sharp mind and a good eye (and a little guidance), anyone can be a data visualist.

This guide is intended for those who are new to data visualization or have some experience with it and want to hone their skills. I hope you find it useful. Please pay close attention, as there will be an exam at the end of the article.

Defining Your Goals

The very first thing you’ll want to do when making a data visualization is decide on its purpose. Why are you taking the time to create this thing? What are the primary goals you’re trying to achieve? Answering these questions out of the gate is essential to keeping your work focused. Without the necessary constraint of a concrete goal, your analysis and design can easily tumble down a hundred different rabbit holes.

Defining Your Audience

The next essential step is clarifying who your visualization is intended for. Answering the questions below will help further narrow your focus.

  • What does your audience need from the visualization? Is there a specific point you need to make? What will add value for the reader?
  • What is their visual literacy level? Some audiences are more adept at reading charts than others. Are you designing your visual for a bunch of data geeks? If so, you may opt for an approach that’s more exploratory and detail-oriented rather than curated and simplified.
  • How much time do they have? Will they only have a few minutes to skim an overview or a half hour to comb through a deep dive? In what context will they be viewing the visualization?

Finding the Story

Great visualizations always have a clear, deliberate message. Any dataset can tell several different stories depending on which numbers you use and how you frame them. It’s up to you to uncover the strongest story — this is about finding the signal in the noise.

Sometimes you’ll be looking for a trend or an outlier in your data; something you can emphasize. Other times you’ll have to decide on the data series that best illustrates your point (like using market share instead of total revenue).
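To see how the choice of data series changes the story, here’s a minimal Python sketch. The revenue and market figures are invented for illustration, and the function name is my own.

```python
def market_share(company_revenue, market_revenue):
    """Return the company's share of the market, in percent, for each period."""
    return [c * 100 / m for c, m in zip(company_revenue, market_revenue)]


# Hypothetical figures for three consecutive years, in millions.
company = [100, 120, 150]
market = [1000, 1500, 2500]

shares = market_share(company, market)
print(shares)  # [10.0, 8.0, 6.0]
```

Plotted as total revenue, these numbers show steady growth. Plotted as market share, the same numbers show a company losing ground. Which one you chart depends on the point you’re making.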

Most datasets will contain information that is irrelevant to the story you’re telling. Simplifying your data and boiling it down to the essentials is critical for delivering an unclouded message to your reader.

In some cases, you may want to crunch the numbers to emphasize your point, like changing absolute values to percentage change. Be careful though; I’d recommend brushing up on your statistics chops before doing any advanced numerical adjustments.
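As a small example of that kind of adjustment, here’s a rough Python sketch that turns absolute values into percentage change from the first value. The function name and the revenue figures are made up for the example.

```python
def percent_change_from_baseline(values):
    """Return each value's percentage change relative to the first value."""
    if not values:
        return []
    baseline = values[0]
    if baseline == 0:
        raise ValueError("baseline must be non-zero to compute a percentage change")
    return [(v - baseline) * 100 / baseline for v in values]


# Hypothetical yearly revenue, in millions.
revenue = [200, 220, 250, 300]
changes = percent_change_from_baseline(revenue)
print(changes)  # [0.0, 10.0, 25.0, 50.0]
```

Note that the result depends entirely on which value you pick as the baseline, which is one reason to be careful with this sort of transformation.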

Exploration and Experimentation

The human brain is notoriously bad at discerning patterns when presented with a bunch of numbers in a spreadsheet. When I can’t immediately pick up on the story in a dataset or I’m not sure where to start with my visualization, I’ll do a little something the experts call “exploratory data analysis”.

For me, this involves loading the data into a simple software tool like Excel or Tableau and just seeing what the numbers look like when plotted in different ways. Don’t be afraid to slice and dice — isolate certain data points, compare fields with simple graphs, or even experiment with other data sources for context.
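If you’d rather slice and dice in code, the same idea can be sketched in a few lines of Python. This is only an illustration: the records and field names below are hypothetical, and it uses nothing beyond the standard library.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical sales records; the field names are made up for illustration.
rows = [
    {"region": "North", "quarter": "Q1", "sales": 120},
    {"region": "North", "quarter": "Q2", "sales": 150},
    {"region": "South", "quarter": "Q1", "sales": 90},
    {"region": "South", "quarter": "Q2", "sales": 60},
]


def average_by(rows, group_field, value_field):
    """Group rows by one field and average another field within each group."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_field]].append(row[value_field])
    return {key: mean(values) for key, values in groups.items()}


# Slice the same data two different ways and compare.
by_region = average_by(rows, "region", "sales")
by_quarter = average_by(rows, "quarter", "sales")

# Crude text bars are enough for a first look at the shape of the data.
for label, value in by_region.items():
    print(f"{label:<6} {'#' * int(value // 10)} {value}")
```

In this made-up data, the averages by quarter are identical while the averages by region differ sharply, so the regional comparison is the one worth sketching further.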

Based on these experiments, I’ll usually get out an old-fashioned pen and paper and sketch some initial ideas. The goal is to start thinking of your data in terms of visuals rather than just numbers.

Choosing a Chart

Choosing a chart type that fits your data is critical to making an effective visual. Different charts are good for displaying different types of information. For example, a line chart is best used for displaying a trend while a bar chart is useful for comparing discrete values.

Before you even think of getting fancy, start with the tried and trusted. Classics like line and bar charts will serve your needs in many cases. With a little basic design work, these can look just as professional as anything else.

If you’re feeling ambitious, you can give some of the less common charts a try. These include scatterplots, area graphs, bubble charts, heat maps and tree maps among others. For a full list take a look at The Data Visualization Catalogue. A word to the wise though: if you can’t articulate the reason behind using a more advanced chart type, you probably shouldn’t be using it.

The Right Tool for the Job

The Internet has seen an explosion of data visualization tools in recent years. With so many options on the market, it’s hard to know which one will be the best fit for your needs. I recommend exploring a few different options before deciding on one. The good people at datavisualization.ch have compiled a comprehensive list of pretty much every visualization tool out there. Here are some of my favorites to get you started.

Excel is usually the first program people use when playing around with visualization. It’s a great place to start and with a little tweaking, can produce passable static graphics.

Adobe Illustrator is a great go-to if you’re creating a static visual. It has a robust collection of basic charting tools and really shines when it comes to customizing your design.

Tableau is an application that allows you to create a range of impressive visualizations with relatively little effort. It can be great for exploratory analysis as well.

D3 is a relatively new kid on the block that excels at building interactive visualizations. It’s got a bit of a learning curve but if you’re comfortable with coding it’s well worth your time.

Mapbox is a fantastically powerful and accessible mapping program. It’s one of the most customizable services on the market and also provides some tools and synergies with other programs for advanced map visualizations.

R is a more advanced tool: a programming language and environment used for serious statistical computing and exploratory data analysis, useful for crunching complex datasets.

Processing is another advanced computational tool that’s used to create some stunning works of art in addition to visualizations.

Making it Beautiful

It would be foolish for me (or anyone) to think they can write a definitive guide for designing effective visuals. Data visualization is an art as well as a science, after all. What I can do, however, is give you some tips that I’ve found helpful in my experience.

Emphasize important points. This is where storytelling meets design. Guide your reader through the graphic by giving visual emphasis to the data points that tell the story best. One good technique is to use a highlight color on the takeaway data while using shades of grey for data of secondary importance. This is what designers call establishing a “visual hierarchy”.
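Here’s one way the highlight-and-grey technique might look in Python. Treat it as a sketch under a few assumptions: the helper names and hex colors are my own choices, the data is hypothetical, and the plotting function needs matplotlib installed.

```python
def highlight_colors(labels, takeaway, highlight="#d62728", muted="#b0b0b0"):
    """Return one color per label: the highlight for the takeaway, grey for the rest."""
    return [highlight if label == takeaway else muted for label in labels]


def plot_highlighted_bars(labels, values, takeaway):
    """Draw a bar chart with a single emphasized bar (requires matplotlib)."""
    # Imported here so the color helper above has no third-party dependencies.
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.bar(labels, values, color=highlight_colors(labels, takeaway))
    ax.set_title(f"{takeaway} stands out")
    return fig


# Hypothetical product names; "C" is the data point that tells the story.
labels = ["A", "B", "C", "D"]
colors = highlight_colors(labels, takeaway="C")
print(colors)
```

Calling plot_highlighted_bars(labels, [4, 7, 12, 5], "C") would draw three grey bars and one red one, so the reader’s eye lands on the takeaway first.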

Use color sparingly. Color is one of the most powerful tools in the designer’s toolkit. But a little goes a long way. Limiting yourself to a single highlight color is usually the safest bet. If you absolutely must have multiple colors, it’s best to go with different shades of the same color.

Mind the gap. White space is another useful technique for making a layout lighter and more accessible. A cluttered visual can be suffocating — give your design some breathing room by spacing the elements out. To quote my old colleague Pete Santilli, “Smart use of white space can lend elegance to an otherwise dull display”.

Data-ink ratio. Take a lesson from the father of modern data visualization, Edward Tufte, and apply the data-ink ratio: the share of a graphic’s ink that actually presents data. The rule suggests that the fewer non-essential figures, text, lines and graphics on your page, the better. When designing, constantly ask yourself which elements are unnecessary and which could be more efficient.

Label and annotate. Explicit labelling is absolutely essential for an effective, easy-to-read visualization. Use labels for anything in your graphic that isn’t obvious, like chart titles, descriptions, axes and legends. It may also be useful to add annotations or explanatory text to emphasize the narrative. Be careful though; too many words can overwhelm your design. If you’re unsure, ask yourself if what you’re including adds value for the reader. Remember the data-ink ratio.

Interactive or Static?

Deciding whether to design your visual as interactive or static will be largely dependent on the goals you’ve defined and your particular skill set.

Static visualizations, sometimes called “infographics”, are generally straightforward and easily shareable in any medium, physical or digital. However, their fixed nature limits the amount of information you can include in them.

Interactive visuals can provide more “information density” by concealing data beneath layers of interactivity. They have the potential to yield a more exploratory or narrative experience, but will often do so at the expense of simplicity.

In Closing

Before I leave you, a word of friendly advice. While I do have a little experience designing visualizations, by no means do I claim to be an expert. So if you’re serious about trying your hand at data visualization, I would strongly recommend picking up some books from the real pros (included below). And if you think I left something critical out or you have any questions, let me know in the comments! Good luck in your visualization endeavors and thanks for reading.

Great Books You Should Definitely Read

Helpful Resources

Data Blogs