
The view from the ‘Cultural Observatory’: Trillions of needles, billions of haystacks.

By ANNE EISENBERG [New York Times] – To cope with the information explosion, [Princeton University Professor David] Blei and other researchers write algorithms so that computers can sift through millions of works and find their common themes by sorting related words into categories. It’s a field called probabilistic topic modeling.

Other research tools identify shifts in language over time that could signal important cultural, scientific or historical changes. At Harvard, Erez Lieberman Aiden and Jean-Baptiste Michel, who jointly lead a group there called the Cultural Observatory, will soon inaugurate a browser that searches for such language changes in a large online repository of scientific papers known as arXiv (pronounced like “archive”).

Users will be able to type in one or two words at the site, called Bookworm-arXiv, and immediately see a graph showing the ups and downs of the phrase’s use in the archive, Dr. Michel said. (A test version is at arxiv.culturomics.org.) Users can then click on the graph and drill down to read the original papers in which the terms appear, tracing ideas back toward their roots, or to spots where scientific ideas spread from one field to another.
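As a rough illustration of the kind of graph described above, the sketch below computes, for each year, the share of papers that mention a phrase. It is not the Bookworm-arXiv implementation: the function name, the `(year, text)` input format, and the toy corpus are all assumptions made for the example, and matching is plain case-insensitive substring search.

```python
from collections import Counter


def phrase_frequency_by_year(papers, phrase):
    """Return {year: fraction of that year's papers containing the phrase}.

    `papers` is an iterable of (year, text) pairs; this input format is an
    assumption for the sketch, not the real archive's schema.
    """
    totals = Counter()  # papers seen per year
    hits = Counter()    # papers per year that mention the phrase
    needle = phrase.lower()
    for year, text in papers:
        totals[year] += 1
        if needle in text.lower():
            hits[year] += 1
    # Normalize by yearly volume so growth of the archive does not look like a trend.
    return {year: hits[year] / totals[year] for year in sorted(totals)}


# Hypothetical toy corpus, invented for illustration only.
papers = [
    (2009, "We study dark energy models."),
    (2009, "A note on graph colouring."),
    (2010, "Dark energy and topic models."),
    (2010, "Dark energy constraints from supernovae."),
]
trend = phrase_frequency_by_year(papers, "dark energy")
```

Plotting `trend` against the year would give the rise-and-fall curve the article describes; keeping the matching paper texts alongside the counts would support the drill-down step.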

The new analytical techniques won’t replace the close reading and interpretation of text that is the province of scholars, said Anthony T. Grafton, a history professor at Princeton and a former president of the American Historical Association.

“But these tools have enormous implications,” he said, in their ability to reveal unexpected patterns and associations in the historical record. “These are tools that can pick up big changes,” he said. “You can’t do this by using older, conventional means of reading books and taking notes.”

Continued at The New York Times | More Chronicle & Notices.
