Unsupervised Decomposition of a Document into Authorial Components

Citation:

Moshe Koppel, Navot Akiva, Idan Dershowitz, and Nachum Dershowitz. 2011. “Unsupervised Decomposition of a Document into Authorial Components.” Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics, Pp. 1356–1364.

Abstract:

We propose a novel unsupervised method for separating out distinct authorial components of a document. In particular, we show that, given a book artificially “munged” from two thematically similar biblical books, we can separate out the two constituent books almost perfectly. This allows us to automatically recapitulate many conclusions reached by Bible scholars over centuries of research. One of the key elements of our method is exploitation of differences in synonym choice by different authors.
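The idea of using synonym choice as an authorial signal can be illustrated with a small sketch. It is not the authors' algorithm: the synonym sets, the function names (`synonym_features`, `cosine`, `two_way_split`), and the seed-based two-way assignment are all assumptions made for illustration, and English words stand in for the biblical text. Each chunk is reduced to a binary vector recording which synonyms it uses, the two least similar chunks are taken as seeds, and every chunk is assigned to the seed it resembles more.

```python
from itertools import combinations

# Hypothetical synonym sets; a real study would derive these from the corpus.
SYNSETS = [("big", "large"), ("begin", "start"), ("quick", "fast")]


def synonym_features(text):
    """Binary vector: one slot per synonym, 1 if the chunk uses that word."""
    words = set(text.lower().split())
    return [1 if w in words else 0 for synset in SYNSETS for w in synset]


def cosine(u, v):
    """Cosine similarity of two vectors; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sum(a * a for a in u) ** 0.5
    norm_v = sum(b * b for b in v) ** 0.5
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def two_way_split(chunks):
    """Label each chunk 0 or 1 by its similarity to two seed chunks.

    The seeds are the pair of chunks with the lowest cosine similarity,
    an assumed stand-in for a proper clustering step.
    """
    vecs = [synonym_features(c) for c in chunks]
    seed_a, seed_b = min(
        combinations(range(len(vecs)), 2),
        key=lambda pair: cosine(vecs[pair[0]], vecs[pair[1]]),
    )
    return [
        0 if cosine(v, vecs[seed_a]) >= cosine(v, vecs[seed_b]) else 1
        for v in vecs
    ]


chunks = [
    "the big dog will begin a quick run",
    "a large cat will start a fast climb",
    "big waves begin quick",
    "large hills start fast",
]
labels = two_way_split(chunks)
```

In this toy input, chunks that share the same synonym preferences receive the same label. Real text would need much larger synonym sets and a more robust clustering step than the seed assignment assumed here.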
Last updated on 09/20/2017