2: Getting Started
At its core, visualization starts with data. Scholarship in the digital age will increasingly call on researchers to reach a comfort level with databases and database management systems. Fortunately, new tools and techniques are emerging that make dealing with data easier and more interoperable than ever.
For one thing, if you rely on public data sources for information (say, historical economic statistics by country), you might find the tabular information in the PDFs you download, or the HTML you copy and paste, to be a terrible mess. Columns may be merged together, and entities like place names may be written in different formats (Nev., Nevada or NV, for example). Tools like Stanford’s Wrangler and Google Refine let you interactively track down those glitches, fix them and output clean, usable data.
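To make the place-name problem concrete, here is a minimal sketch in Python of the kind of cleanup those tools perform. It is not how Wrangler or Google Refine work internally; the lookup table CANONICAL and the function normalize_place are hypothetical names invented for this illustration, and a real project would need a far longer table.

```python
import re

# Hypothetical lookup table: each known variant (lowercased, punctuation
# removed) maps to a single canonical spelling.
CANONICAL = {
    "nev": "Nevada",
    "nv": "Nevada",
    "nevada": "Nevada",
    "calif": "California",
    "ca": "California",
    "california": "California",
}


def normalize_place(raw: str) -> str:
    """Return the canonical name for a place, or the trimmed input if unknown."""
    # Lowercase and drop everything except letters, so "Nev." becomes "nev".
    key = re.sub(r"[^a-z]", "", raw.lower())
    return CANONICAL.get(key, raw.strip())


rows = ["Nev.", "Nevada", "NV", " nevada ", "Calif.", "Utah"]
cleaned = [normalize_place(r) for r in rows]
print(cleaned)
```

Values the table does not recognize (here, "Utah") pass through unchanged apart from trimmed whitespace, so nothing is silently rewritten and the leftovers can be reviewed by hand.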
For getting your information into a database, Google offers not only online spreadsheets but also the increasingly powerful Google Fusion Tables, which lets you upload, edit, visualize and share data files of up to 100 megabytes in size. Whereas a desktop application like Microsoft Excel might become unresponsive after opening a spreadsheet of ten thousand rows, Fusion Tables lets you manage enormous files relatively painlessly and, moreover, lets you join them with related data. Want to create a heatmap of occurrences of what you’re studying on a world map? With a few clicks you’re there. Plus, you can export your data set in any of the predominant file formats that you might need to create visualizations with other tools.
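The idea of joining a data set with related data can be sketched in a few lines of Python. This is only an illustration of the concept, not the Fusion Tables interface: the records, the rough coordinates and the function join_on_country are all made up for the example.

```python
# Hypothetical data: counts of whatever you are studying, by country.
occurrences = [
    {"country": "France", "count": 12},
    {"country": "Japan", "count": 7},
    {"country": "Atlantis", "count": 1},  # no matching location below
]

# Hypothetical related table: rough latitude/longitude for each country.
locations = {
    "France": (46.2, 2.2),
    "Japan": (36.2, 138.3),
}


def join_on_country(rows, coords):
    """Attach coordinates to each row; rows with no match are dropped."""
    joined = []
    for row in rows:
        latlon = coords.get(row["country"])
        if latlon is None:
            continue  # behaves like an inner join: unmatched rows are skipped
        joined.append({**row, "lat": latlon[0], "lon": latlon[1]})
    return joined


heatmap_points = join_on_country(occurrences, locations)
for point in heatmap_points:
    print(point)
```

Each joined row now carries both a count and a position, which is the minimum a mapping tool needs to draw a heatmap. Dropping unmatched rows is one possible choice; you might instead keep them and flag the missing location for review.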
One of the keys to good visualization is understanding what your immediate goals are. Are you visualizing data to understand what’s in it, or are you trying to communicate meaning to others?