Jeffrey R. Tharsen     康森傑

Research Computing Center, University Lead Computational Scientist for the Digital Humanities

Digital Humanities Forum, co-Chair (2016-17)

Dept. of East Asian Languages & Civilizations (Ph.D. 2015)

The University of Chicago

Teaching and Research

My work is informed by a wide variety of languages and literary traditions, ancient classics and cutting-edge technologies. Over the last 25 years I've been primarily engaged in the study of traditional Chinese literature and linguistics, intertextuality and philology. Particularly fascinated by the ways Chinese literary artistry in poetry and prose utilized various forms of phonetic patterning, I spent most of the last decade developing new methodologies and building digital systems to reveal the intricate ways acoustics, metrics and semantics were employed in concert in premodern Chinese texts. As a practitioner of the digital and computational arts, in recent years I've been focusing on creating new methods and toolkits for a number of languages and traditions to help us better understand the classic works produced by some of our most ancient civilizations, how their contents and uses transformed over time, and how we can best employ these new methods to bring the lessons from the texts, their histories, and the cultures that produced them to modern audiences.

About two decades ago I was struck by the following question: How can we hear what Shakespeare’s plays sounded like in his time, or Sappho’s verses, or the songs of ancient Sumer? As they were written in phonetic scripts, modern historical linguists have largely been able to reconstruct the sounds of these works. For written Chinese, which has always utilized a logographic as opposed to a phonetic script, dramatic shifts in pronunciation down through the millennia have largely obscured the original sounds of the language; much premodern Chinese rhyming poetry no longer rhymes, and many of the intricate phonetic patterns in classical prose have been completely lost. In an attempt to create a method which can approximate the original sounds of these works in their entirety, I developed a digital phonological toolkit which permits the efficient, broad comparative analysis of the complete sound patterns in any Chinese text, modern, premodern or ancient. Initial results have provided new evidence for intricate phonetic patterning in many of the great classics of the Chinese canon: the elegant euphony, prosody and literary artistry of ancient masterworks of poetry and prose. (My 2015 dissertation on this topic, entitled "Chinese Euphonics: Phonetic Patterns, Phonorhetoric and Literary Artistry in Early Chinese Narrative Texts", is available for download.) This new method also allows for the efficient use and comparative evaluation of phonological systems and can be a useful tool for teachers, learners and scholars of modern and classical languages, phonology and poetics.
Thanks to modern database technologies and new types of digital texts and digital tools, we can now efficiently muster and deploy a wide variety of lexical and textual resources with just a few clicks of the mouse: The Digital Etymological Dictionary of Old Chinese mentioned above is an online toolkit with which any user can instantly perform comparative phonological analyses of virtually every graph in any user-provided Chinese text.

My research into the early development of Chinese literature and poetics utilizes a mix of modern linguistic strategies and traditional methodologies, mainly centered on philology, historiography, intellectual history, paleography, epigraphy and classical literary studies. However, making use of over three decades of training in information technologies and lessons learned from my previous career as a software engineer in the private sector, over the past five years I've also been teaching and working in the nascent field of "digital humanities", focusing on digital philology and developing new types of systems for textual analytics, new data architectures, and computational methodologies and frameworks for linguistic and textual analysis in a number of languages. In my position as university lead computational scientist for the digital humanities, I serve both as mentor to students interested in pairing humanities research methodologies with computational techniques, and as director of several teams of students and scholars, working to bring individualized research projects with digital and/or computational components to fruition and provide new insights into established disciplines. In 2014 I founded the "Digital Sinology and Digital East Asia" (DSDEA) Workshop at the University of Chicago, with the main goal of helping students, faculty and scholars of East Asian languages begin to familiarize themselves with the wide range of recently developed digital resources, toolkits and methods. Specifically targeting resources most beneficial for teachers, learners and scholars of Chinese, Japanese and Korean, the workshop focused on pairing traditional research methodologies with cutting-edge pedagogies and technology in the classroom.
In my workshops and course lectures, we regularly cover advanced computational techniques and their applications in specific humanities disciplines, such as computational linguistics, network analysis, machine learning, text mining, tokenization, automated text parsing, and various markup and advanced data-analysis strategies.

Selected Recent Conference Papers, Lectures and Workshops

Workshop: “Text Analysis and Visualization Strategies for Digital Humanists”, Research Computing Center, University of Chicago (February 2017)

Workshop: “Computational Research Methods: Digital Lexicography, Digital Phonology, Digital Philology”, Center for Spatial and Textual Analysis, Stanford University (February 2017)

“Designing Next-Generation Computational Systems for Philological, Phonorhetorical and Literary Analysis”, DH Asia 2017 Lecture Series, Center for Spatial and Textual Analysis, Stanford University (January 2017)

“The Visual Text Explorer: A Visual Textual Analytical Framework for Everyone”, Chicago Colloquium on Digital Humanities & Computer Science, University of Illinois at Chicago (November 2016)

Workshop: “Hands-on OCR Training for Historians”, Department of History and the Visual Resources Center, University of Chicago (December 2016)

Lecture: “Optical Character Recognition (OCR) Strategies for Historians and Historiography”, Department of History, University of Chicago (November 2016)

Workshop: “Data Visualization and Interactivity”, Computation Institute, University of Chicago (November 2016)

Lecture: “Premodern Chinese Digital Phonology and Philology”, Department of Comparative Literature, University of Chicago (November 2016)

Workshop: “Introduction to the Digital Humanities”, University of Chicago Research Computing Center (January, July, October 2016) Workshop Materials (zip)

Lecture: “Digital History and Historiography”, University of Chicago Department of History (October 2016)

“Understanding the Databases of Premodern China: Harnessing the Potential of Textual Corpora as Digital Data Sources”, Digital Research in East Asian Studies: Corpora, Methods, and Challenges Conference, Universiteit Leiden (July 2016) PPT

Workshop: “Data Visualization Strategies and Digital Cartography: D3”, University of Chicago Research Computing Center (April 2016) Workshop Materials (zip)

“From Digital Lexicography to Digital Philology”, Association for Asian Studies Annual Conference, Seattle (April 2016) [Panel: “Digital Humanities Approaches to Chinese Culture, Part 1: Tools and Methods for Textual and Historical Analysis”, organizers J. Tharsen and P. Vierthaler] PPT

Lecture: "Introduction to HPC, MPI, Curated Datasets and R for Text Mining", University of Chicago Master of Arts Program in the Humanities (March 2016) Handout (zip)

Workshop: "Digital Humanities for Historians", University of Chicago Department of History (February 2016)

“Digitizing the Jingdian shiwen《經典釋文》: Deriving a Lexical Database from Ancient Glosses”, Chicago Colloquium on Digital Humanities & Computer Science, University of Chicago (November 2015) Poster

“Digital Sinology and the Future of Philology”, Lives and Afterlives: The Future of Asian Studies -- Third Annual Trans-Asia Graduate Student Conference at the University of Wisconsin-Madison (March 2015) PPT

“The Poetry in the Prose: Comparative Analyses of Phonetic Structures and Prosody in Selected Western Zhou Bronze Inscriptions, the Earliest Chapters of the Classic of Documents and Speeches from the Zuo Commentary to the Spring and Autumn Annals”, Stanford-Berkeley Premodern Chinese Humanities Graduate Student Conference, Stanford University (April 2014) PPT

“《古漢語詞源字典》數據庫與西周青銅器銘文古聲韻系統研究簡介” [“A Brief Introduction to the Databases of the Digital Etymological Dictionary of Old Chinese and the Ancient Sound Systems of Western Zhou Bronze Inscriptions”], Center for the Study of Excavated Documents and Ancient Philology 出土文献与古文字研究中心, Fudan University 复旦大学 (May 2013) PPT

Selected Publications

The Visual Text Explorer, user-customizable interactive digital visual interface designed for analytics of words and phrases in user-provided sources, allowing for simultaneous close reading and multidimensional data analytics, 2016 [Currently in beta at]

Chinese Euphonics: Phonetic Patterns, Phonorhetoric and Literary Artistry in Early Chinese Narrative Texts, Ph.D. dissertation, University of Chicago, 2015 (publication from De Gruyter forthcoming)

Intertext, customizable digital toolkit designed to assist with the identification and philological analysis of words and phrases in series of files, 2014-16 [Currently in beta at]

“Talking Shop: Digital Resources for Sinologists 1.0”, with Holger H. G. Schneider, Dissertation Reviews, Published May 27, 2014; accessible online at

FulWiki 富布維基, a custom MediaWiki platform developed for the J. William Fulbright Program, Institute of International Education and the U.S. Department of State, 2013. Accessible online at

The Digital Etymological Dictionary of Old Chinese《古漢語詞源字典》, computer metadictionary and series of database tools designed to assist with the analysis of phonetic structures in Chinese texts, 2010-2016. Accessible online at

“Poetic Diplomacy: The Practice of fu shi 賦詩 in Parallel Passages from the Zuo zhuan《左傳》 and Guo yu《國語》”, M.A. Thesis presented to the University of Chicago, Dept. of East Asian Languages & Civilizations, 2012

“The Paleography, Rhetorical Structure and Content of the Shanghai Museum Chu Bamboo Manuscript ‘San de’〈參德〉”, M.A. Thesis presented to the University of Chicago, Dept. of East Asian Languages & Civilizations, 2012

“The ‘Offerings’ Chapter of the Wen xuan《文選‧祭文》”, M.A. Translation presented to the University of Chicago, Dept. of East Asian Languages & Civilizations, 2011

“Sport and the ‘Competitive Spirit’ in Ancient China”〈中國古代體育運動與競爭之關係〉, 2006