This tutorial was written by Katherine Walden, Digital Liberal Arts Specialist at Grinnell College.
This tutorial was reviewed by Sarah Purcell (L.F. Parker Professor of History) and Gina Donovan (Instructional Technologist) at Grinnell College, and edited by Papa Ampim-Darko, a student research assistant at Grinnell College.
Network Analysis-Part I (Palladio) is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
Digital scholars can use a wide range of software programs and digital tools for network analysis and network visualization. In this tutorial, we will use a combination of Palladio (a web-based GUI), NetworkX (a Python package), and Gephi (a GUI application) to explore various aspects of network analysis and visualization.
Network Analysis using Palladio
Developed through a National Endowment for the Humanities Implementation grant, Palladio was released by Stanford University in June 2016. Designed for historians and other digital humanities scholars, Palladio is a web-based application that allows users to analyze data that includes time and network features and to display that data via maps, timelines, other types of visualizations, and gallery exhibits. Because it runs in the browser, Palladio works well for quick visualizations, but it offers only limited interactive export options.
1-We will be working with sample networked data sets based on the Oxford Dictionary of National Biography and the Six Degrees of Francis Bacon project. These data sets include a list of names and relationships for early seventeenth-century Quakers.
2-Navigate to http://vivero.sites.grinnell.edu/files/ in a web browser.
3-Save the quakers_nodelist and quakers_edgelist CSV files to your Desktop.
4-Open these files in Microsoft Excel to explore the data structure.
What types of historical figures are represented?
How are they described in the nodelist data?
What additional questions do you have about the individuals who will be represented as nodes?
How is the edgelist data structured?
Based on a preliminary scan of the nodelist and edgelist CSV data, what types of networks do you think this data might illuminate?
Are there gaps, silences, or alternative networks that are not accounted for in the data?
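For readers who would rather inspect the files programmatically than in Excel, the sketch below reads a nodelist and an edgelist with Python's built-in csv module. The inline rows are made-up placeholders, not records from the Quaker data, and the column names (Name, Historical Significance, Gender, Source, Target) are assumed from the field names this tutorial refers to in later steps. To run it against the downloaded files, replace the io.StringIO objects with open(...) calls on your saved copies.

```python
import csv
import io

# Placeholder rows standing in for the downloaded files (NOT real records).
# Column names are assumed from the fields this tutorial refers to.
NODELIST_CSV = '''Name,Historical Significance,Gender
Person A,"maker of clocks, watches, and barometers",male
Person B,Quaker minister,female
Person C,Quaker preacher,male
'''

EDGELIST_CSV = '''Source,Target
Person A,Person B
Person B,Person C
'''


def read_rows(handle):
    """Return the rows of a CSV file as a list of dicts keyed by column name."""
    return list(csv.DictReader(handle))


nodes = read_rows(io.StringIO(NODELIST_CSV))
edges = read_rows(io.StringIO(EDGELIST_CSV))

print("Nodelist columns:", list(nodes[0].keys()))
print("Edgelist columns:", list(edges[0].keys()))
print("Rows:", len(nodes), "nodes,", len(edges), "edges")

# The quoted role contains commas but is still parsed as a single field.
print(nodes[0]["Historical Significance"])
```

Listing the column names and a few rows this way gives a quick answer to "How is the edgelist data structured?" before any visualization is built.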
Loading data into Palladio
5-Open the Palladio home page in Chrome or Firefox. Click on the Start icon to open a new project.
6-Drag and drop the nodelist CSV into the Load csv window.
7-Click the Load icon to load the data into Palladio.
8-Click on “Untitled” to relabel the table as Node_List or another descriptive name. You may notice Palladio has a red dot next to the Historical Significance field. Click on the red dot to open the Edit dimension pop-up window.
9-Palladio requires you to verify special or unexpected characters in the data fields. In this case, the flagged character is a comma in the “maker of clocks, watches, and barometers” role. Click the Verify special characters icon, then click Done.
10-Back on the Data screen, you can see the error has been resolved, and Palladio has also automatically described each of your data fields as text, date, or number. Before visualizing this data, we also want to load the edgelist data.
11-Click on the Name field to open the Edit dimension window again.
12-Under Extension, click the Add a new table icon.
13-Drag and drop the edgelist CSV file into the Add new table area. Click the Load icon.
14-Palladio returns to the Edit dimension window and tells you that, out of the 119 names included in the nodelist table, 78 also appear in the edgelist table.
15-Click the Done icon to add the new table to your Palladio project.
16-The new Untitled table will appear in your Data tab. Rename the new table Edge_List or another descriptive name. Click on Provide a title to this project to rename the project Network_Tutorial or another descriptive name.
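The match count Palladio reports in step 14 is, in effect, an overlap between the names in the nodelist and the names in the edgelist. The sketch below illustrates that idea as a set intersection on placeholder data. It assumes the edgelist has Source and Target columns and that names match on exact string equality. Which edgelist column or columns Palladio actually compares against is not stated in this tutorial, so checking both is an assumption.

```python
def count_matches(node_names, edge_pairs):
    """Count how many node names also occur as a Source or Target in the edges."""
    edge_names = set()
    for source, target in edge_pairs:
        edge_names.add(source)
        edge_names.add(target)
    unique_nodes = set(node_names)
    matched = unique_nodes & edge_names
    return len(matched), len(unique_nodes)


# Placeholder data (not taken from the Quaker files).
node_names = ["Person A", "Person B", "Person C", "Person D"]
edge_pairs = [("Person A", "Person B"), ("Person B", "Person C")]

matched, total = count_matches(node_names, edge_pairs)
print(f"{matched} of {total} names in the nodelist also appear in the edgelist")
```

Names that fail to match are worth a second look: they may be isolated individuals with no recorded relationships, or they may be spelled differently in the two files.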
Analyzing and Visualizing Data in Palladio
17-Palladio does offer mapping functionality, but our data does not contain spatial information. Click on the Graph tab to start building a network visualization.
18-Select Source as the Source dimension and Target as the Target dimension, and Palladio will generate a network graph.
19-What do you notice about the networks generated in Palladio? How would you describe these networks using the terminology outlined in Chapter 6 of Exploring Big Historical Data? What surprises you about the networks, or what additional questions do you have?
20-Check the box next to Size nodes and select Number of Edge_List for According to. This sizes the nodes according to how often they appear in the edge_list data. Select Number of Node_List for According to instead, and the nodes are sized according to how often they appear in the node_list data.
21-How do these graphs differ? What do they highlight or emphasize about the data? What questions do you have?
22-We can choose other Source and Target fields to alter the network visualization. For example, choosing Gender as Source and Name as Target reveals the gender ratio of the names represented in the data.
23-Choosing Gender as Source and Historical Significance as Target highlights the gendering of historical roles in the data set.
24-Palladio offers additional facet, timeline, and timespan analysis via the buttons on the bottom of the page. Feel free to explore these additional functions on your own.
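The graph operations in steps 18 through 23 can be approximated in plain Python, which may help clarify what the visualizations are counting. This is a minimal sketch using only the standard library, with placeholder rows rather than the Quaker data. The field names (Source, Target, Name, Gender, Historical Significance) follow the ones used in this tutorial, and the sizing rule (the number of edgelist rows a name appears in) is an interpretation of the Size nodes option, not Palladio's documented implementation.

```python
from collections import Counter, defaultdict

# Placeholder data (not taken from the Quaker files).
edges = [
    ("Person A", "Person B"),
    ("Person A", "Person C"),
    ("Person B", "Person C"),
    ("Person C", "Person D"),
]
nodes = [
    {"Name": "Person A", "Gender": "male", "Historical Significance": "Quaker preacher"},
    {"Name": "Person B", "Gender": "female", "Historical Significance": "Quaker minister"},
    {"Name": "Person C", "Gender": "male", "Historical Significance": "Quaker preacher"},
    {"Name": "Person D", "Gender": "female", "Historical Significance": "Quaker minister"},
]


def build_adjacency(edge_pairs):
    """Map each name to the set of names it shares an edge with (undirected)."""
    adjacency = defaultdict(set)
    for source, target in edge_pairs:
        adjacency[source].add(target)
        adjacency[target].add(source)
    return adjacency


def edge_counts(edge_pairs):
    """Count how many edge rows each name appears in, as Source or Target."""
    counts = Counter()
    for source, target in edge_pairs:
        counts[source] += 1
        counts[target] += 1
    return counts


adjacency = build_adjacency(edges)
sizes = edge_counts(edges)

# Rough analogue of choosing Gender as Source and Name as Target.
gender_counts = Counter(node["Gender"] for node in nodes)

# Rough analogue of choosing Gender as Source and Historical Significance as Target.
roles_by_gender = Counter(
    (node["Gender"], node["Historical Significance"]) for node in nodes
)

print("Neighbors of Person A:", sorted(adjacency["Person A"]))
print("Edge counts:", dict(sizes))
print("Gender counts:", dict(gender_counts))
print("Role counts by gender:", dict(roles_by_gender))
```

A dedicated package such as NetworkX provides these operations and many more; the standard library is used here only so the example runs without installing anything.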