The Digital Humanities and Word Clouds

Ever since I joined the Initiative on Neuroscience and Law, I’ve had a growing interest in big data analysis. With so much information being digitized (criminal records, government documents, historical archives), researchers can engage with old resources in new ways and ask questions on scales previously unimaginable. Though I’m not too vocal about it here (yet), right now I’m working to apply what I’ve learned at the Initiative to the Library of Congress’ “Chronicling America” archives. This crossing of fields, for those who are curious, is called the “Digital Humanities.” (If you’d like to know more, I suggest checking out the historian Dan Cohen’s blog. Fred Gibbs also has a helpful introduction to historical data analysis.)

I won’t reveal any of my graphics here (I’m saving them for a future post), but here’s an example of the Digital Humanities that everyone’s familiar with: word clouds. Technically, these were possible before the digitization of famous works, but it was the kind of work that required teaching assistants. I put the following together in a few minutes using Project Gutenberg and Wordle.

This is Sinclair Lewis’ Main Street (1920):

Main Street Sinclair Lewis Word Cloud

Lewis’ Babbitt (1922):

Babbitt Sinclair Lewis Word Cloud

Thomas Paine’s entire collected writings:

Complete Writings Thomas Paine Word Cloud

My personal diaries (May 2008 to May 2012):

Preston Journals Word Cloud
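
For anyone curious about what sits underneath a cloud like these, here is a minimal sketch in Python of the counting step. It is not how Wordle works internally (I don't know its internals); the function name `word_frequencies`, the tiny stopword list, and the sample sentence are all made up for illustration. A word cloud tool would then scale each word's size by its count.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list; real tools use far longer ones.
STOPWORDS = {"the", "and", "a", "of", "to", "in", "it", "was", "is", "that"}

def word_frequencies(text, top_n=10):
    """Return the top_n most common non-stopword words as (word, count) pairs."""
    # Lowercase the text and pull out runs of letters and apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(w for w in words if w not in STOPWORDS)
    return counts.most_common(top_n)

# Made-up sample sentence, not a quotation from any of the books above.
sample = "The town was small, and the town was proud of the street."
print(word_frequencies(sample, top_n=3))
```

To try it on a whole book, you could read a plain-text file saved from Project Gutenberg into `text` and pass it in the same way.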

Now, even though all of this big data talk is just an excuse for me to post word clouds, I see in each of these one thing: Opportunity. Imagine doing this same work with thousands of books and newspapers. Imagine tracking keywords across time to measure the political trends in a community (or state or country). We can! We are!
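
As a rough sketch of what tracking a keyword across time might look like, the snippet below counts how often a word appears per 1,000 words in each year. It assumes the documents have already been reduced to (year, text) pairs; the function name `keyword_rate_by_year` and the sample headlines are invented for illustration and do not come from any real archive.

```python
import re
from collections import defaultdict

def keyword_rate_by_year(documents, keyword):
    """documents: iterable of (year, text) pairs.

    Returns {year: occurrences of keyword per 1,000 words}, sorted by year.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for year, text in documents:
        words = re.findall(r"[a-z']+", text.lower())
        totals[year] += len(words)
        hits[year] += words.count(keyword.lower())
    # Normalize by total words so years with more text are not overweighted.
    return {y: 1000 * hits[y] / totals[y] for y in sorted(totals) if totals[y]}

# Made-up sample data for illustration only.
docs = [
    (1900, "tariff debate continues as tariff bill stalls"),
    (1900, "local fair opens"),
    (1910, "suffrage rally draws crowd"),
]
rates = keyword_rate_by_year(docs, "tariff")
print(rates)
```

Normalizing by word count matters because an archive rarely holds the same amount of text for every year; raw counts would mostly track how much was digitized.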

Every researcher ought to be salivating.
