Computing Science Seminar by Ani Nenkova

This is a past event

Identifying Elements of Text Quality Automatically

Accurate methods for quantifying elements of text quality would be useful in search, computer-assisted writing and text generation technologies. However, progress in this area of research is constrained by the lack of suitable datasets for development and evaluation. Here I introduce a corpus of science journalism articles which fulfills the glaring need for realistic data for text quality applications. The corpus consists of science journalism pieces, categorized in three levels of writing quality. I will describe how we identified, guided by the judgments of renowned writers, samples of extraordinarily well-written pieces and how these were expanded to a larger set of typical journalistic writing. I introduce methods for determining two elements of text quality, sentence specificity and syntax-based local coherence, that distinguish amazing from typical writing. I also present results on novel automatic models for elements of text quality specific to the science journalism domain, including the presence of creative language and visual elements in the text.

Bio:

Ani Nenkova is an Assistant Professor of Computer and Information Science at the University of Pennsylvania. Her main areas of research are automatic text summarisation, affect recognition and text quality. She obtained her PhD degree in Computer Science from Columbia University in 2006. She then spent a year and a half as a postdoctoral fellow at Stanford University before joining Penn in Fall 2007. Ani was a member of the editorial board of Computational Linguistics (2009-2011) and has served as an area chair/senior program committee member for ACL, NAACL, AAAI and IJCAI. Her research is sponsored by NSF, NIH and Google.

Speaker
Prof Ani Nenkova
Hosted by
Advaith Siddharthan
Venue
MT203