Principal Investigator: Matt Erlin, Washington University in St. Louis
This project employs the techniques of probabilistic topic modeling to test a set of longstanding assumptions about the periodization of German literary history. Scholars have applied a fairly consistent set of period designations to categorize German literature written during the span of roughly one hundred years between 1750 and 1850: “Enlightenment and Sensibility,” “Storm and Stress,” “Weimar Classicism,” “Romanticism,” “Biedermeier,” “Young Germany,” and “Realism.” Applying the MALLET topic modeling toolkit to a data set of 154 novels written between 1731 and 1864, the project has been evaluating whether these novels do in fact cluster together in ways that support the scholarly consensus, or whether there might be hidden thematic structures in these works that point to new ways of thinking about their “proximity” to one another. This analysis has the potential to shed light on a range of research questions related to literary history, especially with regard to understanding those features of texts that might cause us to classify them together. Is similarity, for example, best grasped in terms of similar themes (as a topic modeling approach would suggest), or should distinctions of style and structure play an equally significant role? To what extent do variables such as genre (the novel) or gender (male versus female authorship) generate patterns of similarity that challenge traditional ways of thinking about classification? A final component of the project involves an attempt to find compelling ways to represent visually the idea of proximity among texts as measured across multiple variables. Network diagrams would seem to offer a particularly promising model for such visualizations.
The MALLET interface was designed to simplify the use of MALLET, a software package for topic modeling. It replaces running the program from the command line, offers choices for customizing topic modeling runs, and can queue more than one run at a time. Users have individual accounts and can keep multiple topic modeling runs organized by run date. After a run finishes, the interface lets the user view the results, see which text passages fit into which topics, and see which words from those topics are present in the individual text chunks.
The Enlightenment Novel
The Enlightenment Novel project seeks to rethink how this literary period is defined through text mining and other forms of "distant reading." Using topic modeling, we compared literary texts published between 1750 and 1850 with a collection of philosophical Enlightenment texts (running the model both with and without the philosophical texts in the corpus) in order to identify Enlightenment topics and the literary texts that participate strongly in those topics. We calculated the Euclidean "distance" between the texts and then used those calculations in Gephi to create a network diagram of works and their connections to the identified Enlightenment topics. The topics that were deemed to be "Enlightenment" topics concerned different aspects of Enlightenment thinking.
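The distance step can be illustrated with a minimal sketch. It assumes each text is represented by a vector of topic proportions (the kind of per-document output a topic model produces); the text names, the numbers, and the `max_distance` cutoff are invented for illustration and are not the project's data or settings.

```python
import math
from itertools import combinations


def euclidean_distance(p, q):
    """Euclidean distance between two equal-length topic-proportion vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def distance_edges(doc_topics, max_distance):
    """Return (text_a, text_b, distance) for every pair within max_distance.

    The cutoff is a hypothetical parameter: it only keeps the diagram readable
    by dropping pairs of texts that are far apart.
    """
    edges = []
    for (name_a, vec_a), (name_b, vec_b) in combinations(sorted(doc_topics.items()), 2):
        d = euclidean_distance(vec_a, vec_b)
        if d <= max_distance:
            edges.append((name_a, name_b, d))
    return edges


# Invented topic proportions for three placeholder texts (three topics each).
doc_topics = {
    "novel_a": [0.5, 0.5, 0.0],
    "novel_b": [0.5, 0.2, 0.3],
    "treatise_c": [0.0, 0.1, 0.9],
}

edges = distance_edges(doc_topics, max_distance=0.5)
for source, target, d in edges:
    print(f"{source},{target},{d:.3f}")
```

Each resulting row names two texts and the distance between them, which is the general shape of an edge table that a network tool such as Gephi can import. Because a smaller distance means greater similarity, the value would likely need to be inverted before being used as an edge weight.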
The Literary Lists project investigates the composition of lists in literature from the eighteenth to the early twentieth century. Using both an English-language and a German-language corpus, we extracted each list, the text it came from, the author, and the number of items in the list (as well as the complete sentences in which the list appears) into a CSV file. From there, we calculated an average list length for each individual text as well as for each author represented in the corpus. We are also interested in the content of the lists and have developed a system for categorizing them. Of particular interest are lists of human attributes, which may reveal certain Enlightenment ideals present in the texts.
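The averaging step might look like the following sketch. The column names (`author`, `text`, `item_count`) and the sample rows are hypothetical placeholders; the project's actual CSV layout is not described here.

```python
import csv
import io
from collections import defaultdict

# Invented sample rows standing in for the project's CSV file.
SAMPLE_CSV = """author,text,item_count
Author A,Text 1,3
Author A,Text 1,5
Author A,Text 2,7
Author B,Text 3,2
"""


def average_list_length(rows, key):
    """Average the item_count column, grouped by the given column name."""
    totals = defaultdict(lambda: [0, 0])  # group -> [sum of items, number of lists]
    for row in rows:
        bucket = totals[row[key]]
        bucket[0] += int(row["item_count"])
        bucket[1] += 1
    return {name: total / count for name, (total, count) in totals.items()}


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
by_text = average_list_length(rows, "text")
by_author = average_list_length(rows, "author")

print(by_text)
print(by_author)
```

In this sketch the author average is taken over all of an author's lists, so a text with many lists counts for more than a text with few. Averaging the per-text averages instead would be an equally defensible choice and gives different numbers.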
Enlightenment Ngram Viewer
The Enlightenment Ngram Viewer is designed to search novels by Goethe and others for ngrams found in philosophical texts by Fichte, Hegel, Herder, Kant, and Leibniz. Its primary purpose is to ascertain whether authors are recycling bits of text from philosophers. More broadly, it is an attempt to determine whether the Enlightenment novel has features that can be measured quantitatively.
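One simple way to look for this kind of reuse is to collect the word ngrams of each text and intersect the two sets. The sketch below illustrates that idea only and is not the viewer's actual implementation; the function names, the tokenization rule, the choice of n = 3, and both sample sentences are assumptions made for the example.

```python
import re


def word_ngrams(text, n):
    """Return the set of word n-grams in text, lowercased, punctuation ignored."""
    tokens = re.findall(r"\w+", text.lower())
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def shared_ngrams(novel_text, philosophy_text, n=3):
    """N-grams that occur in both texts (an exact-match comparison)."""
    return word_ngrams(novel_text, n) & word_ngrams(philosophy_text, n)


# Invented sentences standing in for a philosophical text and a novel.
philosophy = "Have the courage to use your own understanding."
novel = "She told him to use your own understanding, and nothing else."

shared = shared_ngrams(novel, philosophy, n=3)
for gram in sorted(shared):
    print(" ".join(gram))
```

Exact matching of this kind misses paraphrase and inflected variants, which matter in German, so lemmatizing the tokens before building the ngrams would be a natural refinement.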