[This blog originally appeared on Big Data Republic in 2013. Unfortunately, all the content has been taken offline]
Note: Jane Austen fans can find a variety of merch here: https://www.zazzle.com/store/totally_jane_austen
Sir Walter Scott contrasted his style of writing with that of Jane Austen: "The big Bow-Wow strain I can do myself like any now going; but the exquisite touch which renders ordinary commonplace things and characters interesting from the truth of the description and the sentiment is denied to me." While he characterized his work as large, Jane Austen called her own small, a "little bit (two inches wide) of ivory on which I work with so fine a brush."
Seeing themselves as such strong contrasts to each other, they likely would have been very surprised to be coupled together as "the literary equivalent of Homo erectus, or, if you prefer, Adam and Eve." Using computational power to analyze 3,592 works published between 1780 and 1900, Matthew Jockers concluded that Walter Scott and Jane Austen were the two primary influences, in terms of style and theme, on all the novelists who came after them.
Those are the types of discoveries that Jockers expounds upon in his newly published book, Macroanalysis: Digital Methods and Literary History.
Systematic textual analysis has a history that goes back much further than computers. The first concordance, according to The Word Crunchers, dates back about 800 years. It was a most labor-intensive project, requiring the work of 500 friars. A Chaucer concordance took 50 years before it was ready for publication in 1927. Computers entered the picture as early as 1951, when "I.B.M. helped create an automated concordance." Those were the days of punch card programming, so “indexing all of Aquinas took a million man-hours.” It was completed only in 1974. Ten years later, though, computers could analyze texts effortlessly, as depicted in the reports of a novelist’s favorite word in David Lodge’s novel Small World.
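To give a sense of how little effort the same job takes today, here is a minimal Python sketch of a concordance (each word listed with the words around it) and a "favorite word" counter. It is only an illustration, not the method used by any of the projects mentioned above; the function names, the crude tokenizer, and the context width are all assumptions made for the example.

```python
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into rough word tokens.

    Assumption: a "word" is any run of letters and apostrophes.
    """
    return re.findall(r"[a-z']+", text.lower())


def build_concordance(text, width=3):
    """Map each word to a list of (left context, right context) pairs.

    `width` is the number of neighboring words kept on each side.
    """
    tokens = tokenize(text)
    concordance = defaultdict(list)
    for i, word in enumerate(tokens):
        left = " ".join(tokens[max(0, i - width):i])
        right = " ".join(tokens[i + 1:i + 1 + width])
        concordance[word].append((left, right))
    return concordance


def favorite_word(text, stopwords=frozenset()):
    """Return the most frequent word and its count, skipping any stopwords."""
    counts = Counter(t for t in tokenize(text) if t not in stopwords)
    return counts.most_common(1)[0]


# The opening sentence of Pride and Prejudice serves as a tiny sample text.
opening = (
    "It is a truth universally acknowledged, that a single man in "
    "possession of a good fortune, must be in want of a wife."
)

print(build_concordance(opening)["truth"])
print(favorite_word(opening))
```

On a single sentence the "favorite word" is unsurprisingly "a"; passing a set of common words as `stopwords` is one simple way to get past that when running the same functions over a whole novel.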
The proliferation of digitized books, courtesy of Google Books, is what now makes it possible for computers to process huge volumes of text from thousands of works. Matthew Jockers, along with Franco Moretti, founded the Stanford Literary Lab in 2010. The research is done in groups, along the lines of scientific investigations, with the help of computers.
Data-diggers are
gunning to debunk old claims based on "anecdotal" evidence and answer
once-impossible questions about the evolution of ideas, language, and culture.
Critics, meanwhile, worry that these stat-happy quants take the human out of
the humanities. Novels aren't commodities like bags of flour, they warn.
Cranking words from deeply specific texts like grist through a mill is a recipe
for lousy research, they say—and a potential disaster for the profession.
It’s not just a matter of traditionalists feeling threatened by computer power. Algorithms that depend on Google Books for metadata tags may reach wrong conclusions. Geoffrey Nunberg, a linguist, is quoted as declaring Google’s tags "a mess," not to be relied on. Aside from questions of accuracy, there is that of relevance. Researchers have to ask themselves: "What does this tell me that we can't already do?"
I had the same question when I read the article on Jockers. Aside from tracing the trail that Austen set for the novel, it presents as a revelation the finding that the novels of George Eliot "more closely resemble the patterns of male writers." Is it altogether surprising that the author of "Silly Novels by Lady Novelists," who deliberately adopted a masculine pseudonym, broke the mold conceived for female writers? That’s something that any student of Victorian literature should already know.
What this form of research could do that traditional studies do not is unearth the roads not taken by the literary canon. In a New Scientist article on Jockers’ work, Nicholas Dames, chair of the department of English and comparative literature at Columbia University, is quoted as seeing the value of this type of research in bringing to light the full body of fiction "rather than the small percentage of canonical texts that are usually taken as exemplary." That opens up the consideration of the canon in a larger context, which can lead to questioning the marked trail of influence. But that will only work if the Google Books data proves comprehensive and reliable enough to accurately represent the literature of the time.