Projects

The variety of research pursued in the NLG group is reflected in our major projects of recent years, which can be explored via the links below.

Current Projects

  • SASSY
  • CURIOS
  • Digital Conservation
  • WikiRivers
  • MinkApp
Affecting People



This funding from the Engineering and Physical Sciences Research Council under its "platform grant" scheme provides general support for the activities of the NLG group, focussing on three strands in particular:

  • experimental studies of how readers are affected by language
  • modelling of language users, particularly with regard to "affective" aspects
  • examination of how best to construct more general NLG systems, in terms of internal structures and processes

The grant has supported a number of workshops, collaboration with research work elsewhere, further development of the SimpleNLG software, and several "mini-projects" in the area of NLG.


People


What the work is about

Limitations of Traditional NLG

In the real world, texts vary enormously both in their communicative purpose, and in the abilities and preferences of the people who read them. Much previous research in NLG has assumed that the purpose of generated texts is simply to communicate factual information to a user [17]. There has been little attention to other aims, such as persuading people [16], teaching people [9,25], helping people make decisions [18], [6], and entertaining people [19]. While texts with these other aims usually do communicate information, they do so in order to affect the reader at a deeper level, and this has an impact on how the information should be communicated (the central task of NLG). Even where the main goal is to inform, the other ways in which the language affects the reader may have an important effect on the achievement of that goal.

Traditional NLG tackles a single type of generic goal (factual information) for a general user (or one of a small number of user types). The focus needs to be broadened to a variety of types of goals for specific users. Although NLG research has begun to explore the issues of reader variability (eg [23], [1]), including user modelling (see [24] for a good review), this is at an early stage, and tends to concentrate on broad decisions about content rather than fine-grained linguistic form, the focus of our proposed work.

Our own projects have begun to address these issues. User groups have included children with linguistic difficulties (STANDUP), adults with limited literacy (SkillSum), general members of the public (STOP, ILEX [12]), and professional doctors and engineers (SumTime, [6]), sometimes with individual customisation (STOP, SkillSum). The texts have been informative (SumTime), persuasive (STOP, SkillSum), humorous (STANDUP), and entertaining (NECA).

Strategic Vision

NLG has enormous potential to achieve benefits in the real world, especially given the growing importance of eCommerce, eHealth and eGovernment, but current NLG applications exist only in niche areas. We believe that there are two main reasons for this:

  1. Firstly, many real applications challenge the assumptions of traditional NLG highlighted above (single, generic goal; general user). We would like to push forward the scientific understanding of how the attributes of an individual reader (and the reading process for them) influence the effect that particular linguistic choices have on them. This will then result in an ability to build systems which, from a model of the reader, can intelligently select linguistic forms in order to achieve increasingly ambitious effects. Hence our goal is to learn better how to affect people with natural language.
  2. Secondly, NLG can be somewhat inward-looking. As our current projects (PolicyGrid, BabyTalk) show, NLG adds value to other computational solutions and often cannot be viewed as a stand-alone technology. We would like to lead in the emergence of NLG from its small corner, as it contributes to wider research initiatives and is increasingly exploited commercially. This requires us to make use of the methodologies and knowledge of other disciplines, within and outside Computer Science, to a much greater extent than hitherto. Hence there is a need for strategic alliances with a variety of researchers and disciplines.

To address the problems highlighted above, we see the following scientific themes as especially relevant:

  1. Psychology and Reader Experiments. We need to understand the relevance to NLG of attention, perception and memory. Particularly relevant are results about human reading [15] and how humans align their language use in order to reach their hearers effectively [2]. Although we are already at the forefront of measuring the effects of NLG texts on real users (e.g. testing reading time, or task completion), collaboration with psychologists will enable us to broaden and deepen this strand, looking at more fine-grained measures of reader behaviour (eg using eye-tracking) and assessments of a wider range of effects (such as emotional impact). In general, NLG can offer psychologists the opportunity to further formalise and test their theories in more realistic settings. In return, results from psychology can inform our user and context models, as well as providing evidence about the effects of language alternatives in controlled settings.
  2. User Modelling and Affective Computing. Affective computing is computing that relates to, arises from, or deliberately influences emotions or other non-strictly rational aspects of humans [13]. So far, however, work in "affective NLG" has aimed mainly to produce text that portrays the emotions of the writer, rather than considering how linguistic factors can affect the emotions of the reader. Work in affective computing may provide useful ways of formalising theories of emotion [10], modelling affective state, and measuring effects on this state. In general, affective results may be easiest to monitor and achieve in multimodal communication systems, and this may require us to work with areas such as machine vision.
  3. NLG Architectures. The above issues (non-informative texts, reader variation) expose deficiencies in current NLG practices. Complex effects often involve a number of very different aspects of the text (e.g. sentence structuring, choice of vocabulary), interacting in non-trivial ways, and independent of the core factual content. Also, many effects arise from purely surface phenomena (eg text length, choice of words, word co-occurrences), and yet pipeline NLG architectures [17] discover surface effects only after all central decisions have been made. Abstract stylistic goals may have to be balanced against basic communicative tasks [21]; the COGENT project addresses some of these issues. There are a number of approaches to these problems: intelligent backtracking [4], 'overgeneration' architectures [5], and stochastic search [7], but such methods go beyond most current NLG architectures [8] and are still relatively untested on realistic examples.
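
As an illustration of the 'overgeneration' style of architecture mentioned above, the following sketch enumerates every combination of phrasing choices for a simple message and then ranks the candidates with a purely surface-level scorer. The message, slot fillers, and scoring weights are all invented for illustration; a real system would use a statistical language model or a reader model as the ranker.

```python
import itertools

# Toy message: wind speed rising to 25 knots. Each slot offers several
# realisations; overgeneration enumerates all combinations, then a
# surface-level scorer ranks them.
SLOTS = [
    ["The wind", "Wind"],
    ["will increase", "will rise", "is expected to rise"],
    ["to 25 knots", "to around 25 knots"],
    ["by evening.", "later today."],
]

def overgenerate(slots):
    """Enumerate every combination of slot fillers."""
    return [" ".join(choice) for choice in itertools.product(*slots)]

def score(text, preferred=("rise", "evening")):
    """Surface scorer: shorter texts and preferred words win.
    (The weights here are arbitrary, for illustration only.)"""
    s = -len(text.split())                      # brevity
    s += 2 * sum(w in text for w in preferred)  # lexical preference
    return s

candidates = overgenerate(SLOTS)
best = max(candidates, key=score)
print(len(candidates), "candidates; best:", best)
```

Note that the scorer sees only the finished surface string, which is exactly what a pipeline architecture cannot do before its central decisions are made.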

Benefits

This research can be expected to have large benefits for both science and technology. From a scientific perspective, it will lead to theoretical results about some very poorly understood aspects of language. From an engineering point of view, it will establish practical methodologies for NLG development and evaluation. From a technological perspective, our work could lead to systems that help people in numerous ways, e.g. encouraging people to change their behaviour (cf. STOP, SkillSum), teaching children and other learners (cf. STANDUP), assisting specialists to understand complex data (cf. SumTime, BabyTalk). NLG research is on the cusp of a movement from simple informative software to more general, powerful and varied communication systems. Key to this development is a better understanding of how to affect people with natural language.


Bibliography

  1. Cawsey, A., Jones, R.B., and Pearson, J., "The Evaluation of a Personalised Information System for Patients with Cancer". User Modeling and User-Adapted Interaction, vol 10, no 1, 2000.
  2. Garrod, S. and Pickering, M., "Why is conversation so easy?". Trends in Cognitive Sciences, 8(1), pp 8-11, 2004.
  3. Hovy, E.H., "Pragmatics and Natural Language Generation". Artificial Intelligence, 43(2), pp 153-198, 1990.
  4. Kamal, H. and Mellish, C., "An ATMS Approach to Systemic Sentence Generation". Procs of the Third International Conference on Natural Language Generation (INLG-04), New Forest, UK, pp 80-89, 2004.
  5. Langkilde, I. and Knight, K., "Generation that Exploits Corpus-based Statistical Knowledge". Procs of COLING/ACL, 1998.
  6. Law, A., Freer, Y., Hunter, J., Logie, R., McIntosh, N. and Quinn, J., "A Comparison of Graphical and Textual Presentations of Time Series Data to Support Medical Decision Making in the Neonatal Intensive Care Unit". Jnl of Clinical Monitoring and Computing, to appear (2005).
  7. Manurung, H., Ritchie, G., and Thompson, H., "A flexible integrated architecture for generating poetic texts". Procs of the Fourth Symposium on Natural Language Processing (SNLP 2000), Chiang Mai, Thailand, May 2000.
  8. Mellish, C. and Evans, R., "Implementation Architectures for Natural Language Generation". Natural Language Engineering, 10(3/4): pp 261-282, 2004.
  9. Moore, J., Porayska-Pomsta, K., Varges, S. and Zinn, C., "Generating Tutorial Feedback with Affect". Procs of the Seventeenth International Florida Artificial Intelligence Research Symposium Conference (FLAIRS), AAAI Press, 2004.
  10. Oatley, K. and Jenkins, J., Understanding Emotions, Blackwell, 1996.
  11. Oberlander, J. and Gill, A., "Individual differences and implicit language: Personality, parts-of-speech and pervasiveness". In Procs of the 26th Annual Conference of the Cognitive Science Society, pp 1035-1040, Chicago, August 5-7, 2004.
  12. O'Donnell, M., Knott, A., Mellish, C. and Oberlander, J., "ILEX: The Architecture of a Dynamic Hypertext Generation System". Natural Language Engineering, 7: pp 225-250, 2001.
  13. Picard, R. W., Affective Computing. MIT Press, 1997.
  14. Piwek, P., "An Annotated Bibliography of Affective Natural Language Generation". Version 1.3 available from http://www.itri.brighton.ac.uk/~Paul.Piwek/topic-papers.html
  15. Rayner, K. and Pollatsek, A., The Psychology of Reading, Lawrence Erlbaum Associates, 1995.
  16. Reed, C. and Norman, T.J. (eds), Argumentation Machines: New Frontiers in Argumentation and Computation. Dordrecht: Kluwer, 2004.
  17. Reiter, E. and Dale, R., Building Natural Language Generation Systems. Cambridge: CUP, 2000.
  18. Reiter, E., Sripada, S., Hunter, J., Yu, J. and Davy, I., "Choosing Words in Computer-Generated Weather Forecasts". Artificial Intelligence, 167(1-2), pp 137-169, 2005.
  19. Ritchie, G., "Current directions in computational humour". Artificial Intelligence Review, 16(2), pp 119-135, 2001.
  20. de Rosis, F. and Grasso, F., "Affective Natural Language Generation". In A. Paiva (ed.), Affective Interactions, Springer LNAI 1814, 2000.
  21. van Deemter, K., "Is Optimality-Theoretic Semantics Relevant for NLP?". Jnl of Semantics, 21(3), 2004.
  22. Walker, M., Cahn, J. and Whittaker, S., "Improvising linguistic style: social and affective bases for agent personality". Procs of the 1st International Conference on Autonomous Agents, Marina del Rey, USA, pp 96-105, 1997.
  23. Walker, M., Whittaker, S., Stent, A., Maloor, P., Moore, J., Johnston, M., Vasireddy, G. "Generation and Evaluation of User Tailored Responses in Multimodal Dialogue". Cognitive Science, 28(5), pp 811-840, 2003.
  24. Zukerman, I. and Litman, D. "Natural Language Processing and User Modeling: Synergies and Limitations". User Modeling and User-Adapted Interaction, 11(1-2), pp 129 - 158, 2001.
  25. Zinn, C., Moore, J. and Core, M., "Multimodal Intelligent Information Presentation". O. Stock and M. Zancanaro (eds.), Text, Speech and Language Technology, Vol. 27, pages 227-254, Kluwer Academic Publishers, 2005 (in press).
BabyTalk-Family

When a newborn baby is admitted to a neonatal intensive care unit (NICU), parents are frequently overwhelmed by the experience. The neonatal environment in which their baby is looked after can cause feelings of worry, confusion, and helplessness. Parents would often like more information about what is happening to their baby: the baby's current weight, oxygen levels, milk feeding quantities, and so on. This information, coupled with understanding, enables parents to adapt and cope with the situation. It also helps them take on their parental role and get involved with the care of their child.

To help supply parents with this sort of information, we are developing a computer system - known as BabyTalk-Family - that can automatically generate easy-to-understand reports on the medical condition of babies in neonatal care. These reports are updated every 24 hours and made available online to the infant's parents, providing a simple summary of their child's progress.

We are currently working with parents and clinical staff to help improve this system, which will be trialled in collaboration with the Simpson Centre for Reproductive Health neonatal unit at the Edinburgh Royal Infirmary.

Contact: Ehud Reiter


Media


People

University of Aberdeen

NHS Lothian

Digital Economy Hub

The University of Aberdeen has a long-standing tradition of cross-disciplinary research across national and international rural arenas. In the past 10 years, research income in the rural domain totalled £12 million (£8.5m active).

This platform of rural research is matched by an equally vibrant and successful programme of ICT research.

Major on-going activities include the International Technology Alliance in Network & Information Sciences (2006-2016), the PolicyGrid eSocial Science Research Node (2006-2012), the Platform Grant - Affecting People with Natural Language (2007-2011) and EC Broadband for All (2004-2009).

Research is based around four interconnecting themes: Accessibility & Mobilities, Healthcare, Enterprise & Culture, and Natural Resource Conservation.

dot.rural applies digital technologies, including intelligent agents, natural language generation, knowledge graphs, the semantic web and linked data, in the above four themes.

Project Homepage: Digital Economy Hub: Rural Digital Economy

Contact: Pete Edwards

Empirical Effects of Vague Language

We have been carrying out experiments with human subjects investigating the processing of vague quantifiers in referring expressions, eg, 'few', 'many'.

Participants are presented with stimuli on screen in the form of squares containing arrays of dots, and are instructed to select one of the squares with reference to how many dots it contains. The experiments show that, under some circumstances, people make their selection faster when the referring expression uses a vague quantifier than when it uses a crisp alternative. The experiments also show that, under some circumstances, this response time advantage can be achieved by using crisp verbal quantifiers like 'fewest', 'most', ie, that the response time advantage might not be due to vagueness per se, but to the verbal format.
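
A minimal sketch of how such results might inform an NLG choice between verbal and numeric quantifiers in referring expressions. The selection rule below (prefer a verbal superlative when it uniquely identifies the target square, otherwise fall back to the exact numeral) is purely illustrative and is not the decision procedure used in the experiments.

```python
def refer_to(counts, target):
    """Choose a referring expression for square `target` (an index into
    `counts`, the number of dots in each square). Illustrative rule only:
    use a verbal superlative when it picks out the target uniquely,
    otherwise use the crisp numeral."""
    n = counts[target]
    if counts.count(max(counts)) == 1 and n == max(counts):
        return "the square with the most dots"
    if counts.count(min(counts)) == 1 and n == min(counts):
        return "the square with the fewest dots"
    return f"the square with {n} dots"

print(refer_to([12, 7, 19], 2))  # superlative applies
print(refer_to([12, 7, 19], 0))  # neither max nor min: use the numeral
```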

The results have implications for NLG systems that must choose between different forms of linguistic referring expressions for conveying numerical information to human readers.

This work is supported by the EPSRC Platform Grant.

How Was School Today?

Supporting Narrative for Non-Speaking Children

Being able to tell stories about ourselves is a central part of the human experience and of social interaction. Most people do this naturally, for example while chatting with family members over the dinner table. But telling stories about oneself can be a real struggle for people with complex communication needs (CCN); they find it very difficult to create and articulate such stories. People with CCN (ie individuals with severe physical and communication impairments and possibly varying degrees of intellectual disability, eg due to cerebral palsy) rely on computer-generated synthetic speech. Speech generating devices are currently limited to short, pre-stored utterances or tedious preparation of text files which are output, word for word, via a speech synthesiser. Restrictions in speed and vocabulary can be a frustrating experience and are an impediment to spontaneous social conversation.

This project is a follow-on to the feasibility study "How was School Today...?", which explored whether we could help children with CCN create stories about what they did in a day. We developed a computer tool which produces a draft story based on knowledge of the user's planned daily activities (eg from a diary) and automatically acquired sensor data, together with an editing and narration tool which lets the user edit the story into something which is his or hers and not just a computer output.

Project Homepage: "How was School Today...?"

Contact: Ehud Reiter

Semantic Grid for Rural Policy Development and Appraisal (PolicyGrid)

PolicyGrid is a research Node of the National Centre for e-Social Science (NCeSS). NCeSS is funded by the Economic and Social Research Council (ESRC) to investigate how innovative and powerful computer-based infrastructure and tools developed over the past five years under the UK e-Science programme can benefit the social science research community. PolicyGrid involves a collaboration between computer scientists and social scientists at the University of Aberdeen, the Macaulay Institute (Aberdeen) and elsewhere in the UK.

The project aims to support policy-related research activities within social science by developing appropriate Grid middleware tools which meet the requirements of social science practitioners. The vision of the Semantic Grid is central to the PolicyGrid research agenda.

The first stage of PolicyGrid developed novel interfaces using NLG to allow researchers to interact with a digital repository. The project is now extending this work to produce a general “NLG service” working on semantic web data whose behaviour can be influenced by “policies” incorporating user preferences and imposed constraints from the environment and context of use.

Contact: Pete Edwards

Common Ground and Granularity of Referring Expressions

Dr Kees van Deemter is collaborating with Dr Raquel Fernandez (Amsterdam) and Dale Barr (Glasgow), with funding from the EURO-XPRAG: ESF Research Networking Programme.

EURO-XPRAG main website

What If?

We have a wealth of data about numerous aspects of our activities, yet these data are only of use when we know what they are, agree upon what they mean, and understand how they relate to each other. Semantic descriptions of data, the means by which we can achieve these aims, are widely used to help exploit data in industry, academia and at home. One way of providing such meaning or semantics for data is through "ontologies", yet these ontologies can be hard to build, especially for the very people who are experts in the fields whose knowledge is being captured but who are not experienced in the specialised craft of modelling.

In the "what if...?" project we look at the problems of creating ontologies using the Web Ontology Language (OWL). With OWL logical forms, computers can deduce knowledge that is only implied within the statements made by the modeller. So any statement made by a modeller can have a dramatic effect on what is implied. These implications can be both "good" and "bad" in terms of the aims of the modeller. Consequently, a modeller is always asking themselves "what if...?" questions as they model a field of interest. Such a question might be "what happens if I say that a planet must be orbiting a star?" or "what happens if I add in this date/time ontology?".

The aim of the "what if...?" project is to build a dialogue system allowing a person building an ontology to ask such questions and get meaningful answers. This requires getting the computer to determine what the consequences of a change in the ontology would be and getting it to present these consequences in a meaningful way. To do a good job, the system will have to understand something about what the person is trying to do and what sorts of results will be most interesting to them. For this, we need to understand more about how ontologists model a domain and interact with tools; be able to model the dialogues between a human and the authoring system; achieve responsive automated reasoning that can provide the dialogue system with the information it needs to create that dialogue.
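
The flavour of a "what if...?" query can be sketched with a toy reasoner over subclass axioms: compute the entailments of the ontology with and without a candidate axiom, and report the difference. Real OWL reasoning is far richer than this transitive closure, and the class names below are invented; this is only a sketch of the idea.

```python
def closure(axioms):
    """Transitive closure of subclass axioms: the set of all entailed
    (sub, super) pairs."""
    entailed = set(axioms)
    changed = True
    while changed:
        changed = False
        for (a, b) in list(entailed):
            for (c, d) in list(entailed):
                if b == c and (a, d) not in entailed:
                    entailed.add((a, d))
                    changed = True
    return entailed

# A tiny ontology: Planet SubClassOf OrbitingBody SubClassOf CelestialObject.
ontology = {("Planet", "OrbitingBody"), ("OrbitingBody", "CelestialObject")}

def what_if(ontology, new_axiom):
    """Report the entailments that the new axiom would add."""
    return closure(ontology | {new_axiom}) - closure(ontology)

for sub, sup in sorted(what_if(ontology, ("Exoplanet", "Planet"))):
    print(f"{sub} SubClassOf {sup}")
```

The point of the difference operation is exactly the modeller's question: what new consequences, good or bad, does this one statement bring with it?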

Contact: Jeff Z. Pan

The WhatIf project is supported by the Engineering and Physical Sciences Research Council (EPSRC) from 2012 to 2015 through grants EP/J014354/1 and EP/J014176/1.



Key Research Areas

There are four main research areas:

  • Understanding the process of ontology authoring
  • Natural dialogue systems and controlled natural languages
  • Incremental ontology reasoning
  • Reasoning enabled test-driven ontology authoring

Who We Are

University of Aberdeen

  • Chris Mellish
  • Jeff Z. Pan
  • Artemis Parvizi
  • Yuan Ren
  • Kees van Deemter

University of Manchester

  • Caroline Jay
  • Robert Stevens
  • Markel Vigo

Advisors

  • Richard Power, Open University
  • Mike Uschold, Semantic Arts Inc.
  • Peter Winstanley, Scottish Government

Documents, Presentations & Publications

Documents will be posted here in due course.


RefNet

RefNet is an EPSRC research network advancing collaboration between research communities that have tended to work separately, namely computer scientists, linguists and psychologists. The phenomenon on which the network focusses is reference.

Reference is the process of making sure that a user/receiver can identify an entity - for example a person, thing, place, or an event. Reference can be considered the "anchor" of communication. As such it is crucial for communication between people, and for many practical applications: from robotics and gaming to embodied agents, satellite navigation, and multimodal interfaces. Through the study of reference, RefNet will build a base of interdisciplinary skills and resources for research on communication.
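
One well-known computational account of reference is the incremental algorithm for generating referring expressions, after Dale and Reiter. The sketch below, with an invented toy domain, adds properties of the target in a fixed preference order while they rule out at least one distractor:

```python
def incremental_re(target, distractors, preferred_attrs):
    """Sketch of the classic incremental algorithm for referring
    expression generation. `target` and `distractors` are attribute
    dictionaries; `preferred_attrs` fixes the order in which attributes
    are considered. Returns a distinguishing description, or None."""
    description = {}
    remaining = list(distractors)
    for attr in preferred_attrs:
        value = target[attr]
        ruled_out = [d for d in remaining if d.get(attr) != value]
        if ruled_out:  # the attribute does some discriminating work
            description[attr] = value
            remaining = [d for d in remaining if d.get(attr) == value]
        if not remaining:
            break
    return description if not remaining else None

target = {"type": "cup", "colour": "red", "size": "large"}
distractors = [
    {"type": "cup", "colour": "blue", "size": "large"},
    {"type": "bowl", "colour": "red", "size": "small"},
]
print(incremental_re(target, distractors, ["type", "colour", "size"]))
```

Here "type" rules out the bowl and "colour" rules out the blue cup, so the description corresponds to "the red cup"; "size" is never needed.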

Project Homepage: RefNet

Contact: Kees van Deemter


RefNet's objectives are:

  1. To promote high-quality interdisciplinary research, and research resources relating to reference, particularly involving computational linguistics and psycholinguistics.
  2. To find ways to improve practical applications in which reference plays a role.
  3. To build skills for the interdisciplinary study of language and communication.

To do this, RefNet organizes activities whose goals are networking, skywriting, consultation, training, and showcasing of research.

Previous Projects

Atlas

Textual Descriptions Access to Geo-referenced Statistical Data

Summary

A lot of data available to the public is geo-referenced. For example, census data is often aggregated over different levels of geographic regions, such as counties and wards. Currently such data is presented to the public using thematic maps, such as the ones published by National Statistics showing data from the 2001 Census.

Although such visual presentations of geo-referenced data work well for sighted users, they are inaccessible to visually impaired users. In particular, visually impaired users find it hard to perceive important trends and patterns in the underlying data, which sighted users manage so effortlessly using the visual maps. There are a number of emerging technologies to improve the accessibility of map data to visually impaired users, such as haptic maps and sonic maps.

In this project we apply Natural Language Generation (NLG) technology to automatically produce textual summaries of map data highlighting 'important' content extracted from the underlying spatial data. We hope that visually impaired users can use existing screen readers to listen to these textual summaries before exploring the data sets in detail using other access methods. We believe that textual summaries of spatial data could be useful to sighted users as well because multi-modal presentations (visual maps + textual summaries) often work better.
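
A minimal sketch of the kind of data-to-text step involved: turning a table of region values into a one-sentence overview highlighting the extremes. The region names, figures, and phrasing are invented for illustration and do not reflect the Atlas.txt system itself.

```python
def summarise_regions(values):
    """values: mapping from region name to a numeric value (e.g. a rate
    per region). Returns a one-sentence overview highlighting the
    extremes, in the spirit of a screen-reader-friendly map summary."""
    hi = max(values, key=values.get)
    lo = min(values, key=values.get)
    mean = sum(values.values()) / len(values)
    return (f"{hi} has the highest value ({values[hi]:.1f}) and "
            f"{lo} the lowest ({values[lo]:.1f}); "
            f"the average across {len(values)} regions is {mean:.1f}.")

rates = {"Aberdeen City": 3.2, "Aberdeenshire": 2.1, "Moray": 4.5}
print(summarise_regions(rates))
```

A screen reader can speak this sentence directly, whereas the equivalent thematic map offers a visually impaired user nothing.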

Objectives

  1. To develop NLG techniques for generating textual summaries of spatial data.
  2. To evaluate the utility of the textual summaries with visually impaired users in collaboration with Grampian Society for the Blind.
  3. To evaluate the utility of the combination of textual summaries and visual maps in collaboration with the HCI Lab, University of Maryland.

People

  1. Yaji Sripada
  2. Kavita Thomas

Publications

  1. Kavita E Thomas and Somayajulu Sripada (2010) Atlas.txt: Exploring Linguistic Grounding Techniques for Communicating Spatial Information to Blind Users, Universal Access in the Information Society. [ONLINE] DOI: 10.1007/s10209-010-0217-5 pdf
  2. Kavita E Thomas and Somayajulu Sripada (2008) What's in a message? Interpreting Geo-referenced Data for the Visually-impaired, Proceedings of the Int. Conference on NLG. pdf
  3. Kavita E Thomas, Livia Sumegi, Leo Ferres and Somayajulu Sripada (2008) Enabling Access to Geo-referenced Information: Atlas.txt, Proceedings of the Cross-disciplinary Conference on Web Accessibility. pdf
  4. Kavita E Thomas and Somayajulu Sripada (2007) Atlas.txt: Linking Geo-referenced Data to Text for NLG, Proceedings of the ENLG07 Workshop. pdf

Background

This project is part of our ongoing work on developing technology for automatically producing textual summaries of numerical data. Our work on summarising time series data as part of the SumTime project has led to the development of SumTime-Mousam, an NLG system that was deployed in industry to generate marine weather forecasts (for the offshore oil industry) from numerical weather prediction (NWP) data. As part of RoadSafe, we are currently extending this technology to generate weather forecasts for winter road maintenance applications. We are also working on summarising scuba dive computer data in the ScubaText project and clinical data from neonatal intensive care units in the BabyTalk project.

Grampian Society for the Blind

Grampian Society for the Blind is a charity providing advice and support to people with visual impairments in the North-East of Scotland. In the current project we work closely with their members to understand their requirements and to evaluate our technology.

Funded by EPSRC

BabyTalk

BabyTalk is investigating ways of summarising and presenting patient information to medical professionals and family members. Our focus is on data in the Neonatal Intensive Care Unit.

This involves the use of Intelligent Signal Processing to analyse and interpret the available information about the patient, and Natural Language Generation techniques to generate coherent, readable summaries of this information in English.

Our ultimate aim is to use this technology to provide decision support to medical professionals, who base treatment on large amounts of information. Summaries will also help to keep family members informed about the condition of their baby.

NEONATE

The NEONATE project has three major objectives:

  • to investigate, on a systematic basis, a comprehensive range of actions taken in the Neonatal Intensive Care Unit
  • to identify the terms used to describe patient state by staff at different levels and types of expertise
  • to use the results of these investigations to implement and evaluate computerised aids designed to support clinical decision making

Making more data available to decision makers does not necessarily of itself lead to improved care. This has been demonstrated in the neonatal intensive care unit where providing nurses and junior doctors with detailed trends of physiological information does not lead to improved patient outcomes. Our earlier studies (COGNATE project) have shown that a major reason for this finding is that the staff caring for the infants observe them closely and frequently to obtain more information than just the data shown on the monitors.

Presenting Ontologies in Natural Language

Chris Mellish and Xiantang Sun, supported by EPSRC grant GR/S62932.

  • 2004 Project poster
  • 2005 Project poster
  • Mellish, C. and Sun, X., "Natural Language Directed Inference in the Presentation of Ontologies", Procs of the 10th European Workshop on Natural Language Generation, Aberdeen, 2005. PDF version
  • Mellish, C. and Sun, X., "The Semantic Web as a Linguistic Resource: Opportunities for Natural Language Generation". Presented at the Twenty-sixth SGAI International Conference on Innovative Techniques and Applications of Artificial Intelligence, Cambridge, 2005. Also in Knowledge Based Systems Vol 19, pp298-303, 2006. PDF version
  • Pan, J. and Mellish, C., "Supporting Semi-Automatic Semantic Annotation of Multimedia Resources". Presented at the special session on "Semantics in Multimedia Analysis and Natural Language Processing" at the 3rd IFIP Conference on Artificial Intelligence Applications & Innovations (AIAI), Athens, 2006 PDF version
  • Mellish, C. and Pan, J., "Finding Subsumers for Natural Language Presentation". Presented at the DL2006 International Workshop on Description Logics, Windermere, England, 2006. PDF version
  • Sun, X. and Mellish, C., "Domain Independent Sentence Generation from RDF Representations for the Semantic Web". Presented at the ECAI06 Combined Workshop on Language-Enabled Educational Technology and Development and Evaluation of Robust Spoken Dialogue Systems, Riva del Garda, Italy, 2006. PDF version
  • Sun, X. and Mellish, C., "An Experiment on `free' Generation from Single RDF Triples". Presented at the European Workshop on Natural Language Generation, Dagstuhl, Germany, 2007. PDF version
  • Mellish, C. and Pan, J., "Natural Language Directed Inference from Ontologies". Artificial Intelligence 172(10): 1285-1315 (2008). PDF version
  • Prolog code for generating subsumers of ontology concepts for natural language presentation

Related papers:

  • Hielkema, F., Edwards, P. and Mellish, C., "Flexible Natural Language Access to Community-Driven Metadata". Submitted for publication, 2007. PDF version
ROADSAFE

RoadSafe was a collaborative project between the Computing Science Department at the University of Aberdeen and Aerospace & Marine International. It aimed to combine Aerospace & Marine International's expertise in weather forecasting with the department's expertise in building real-world Natural Language Generation systems.

The RoadSafe project:

  • used Knowledge Acquisition techniques to understand how humans write textual instructions for road maintenance vehicle routing
  • produced a system capable of automatically evaluating a region's geographical data, combined with the weather forecast for tens of thousands of points in that region, to provide textual routing and de-icer spread rate instructions
  • utilised Aerospace & Marine International's expert forecasters in order to post-edit generated advisory texts and therefore improve the performance of the system

The main objective of the project was to use the advisory texts produced by RoadSafe as a guide to local councils for grit and salting applications during the winter.
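
To illustrate the shape of the task (not RoadSafe's actual rules), the sketch below reduces the point forecasts along a route to the coldest value and maps it to an advisory. The temperature thresholds and wording are invented; operational spread-rate guidance is considerably more complex.

```python
def route_advice(point_forecasts, light=0.0, heavy=-2.0):
    """point_forecasts: minimum road-surface temperatures (deg C)
    forecast at points along a route. Returns an advisory based on the
    coldest point. Thresholds are illustrative, not operational values."""
    coldest = min(point_forecasts)
    if coldest > light:
        return "No treatment required."
    if coldest > heavy:
        return f"Light salting advised (minimum {coldest:.1f} C)."
    return f"Heavy salting advised (minimum {coldest:.1f} C)."

print(route_advice([1.2, 0.4, -1.1, 0.9]))  # one sub-zero point on the route
print(route_advice([2.0, 3.1]))             # route stays warm
```

Summarising over the coldest point reflects the safety-critical nature of the advice: a single icy stretch is enough to warrant treatment of the route.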

People

External Collaborator

  • Ian Davy, Aerospace & Marine International

Publications

Publicity

Demos

SCUBATEXT: Generating Textual Reports of Scuba Dive Computer Data

SCUBA divers carry out decompression stops while ascending to the surface to allow their bodies to get rid of unwanted nitrogen naturally. Divers can also be decompressed in decompression chambers to remove excess nitrogen. Over the years, dive tables have been used to provide guideline information about required decompression times during the ascent of a dive and about required rest times between two successive dives. When used faithfully, these tables help in planning safe dives and avoiding 'the bends'.

One of the modern items of diving gear is a dive computer: a sports gadget worn on the diver's wrist (it looks more like a wristwatch than a computer) that continually monitors their dives. A dive computer continuously records data about the dive, such as depth and ambient temperature. It can also generate a dive table on the fly and compare the recorded data against the table data to inform divers about required decompression stops. Dive computers therefore help ensure that divers stay continually informed and can dive safely.

Dive computers record dive logs which contain time series of dive depth and tissue saturation. These data sets can be useful to:

  • clinicians - to diagnose decompression illness
  • diving instructors - to evaluate learners' dives and to provide feedback
  • dive supervisors - to monitor dives

In this project we develop techniques to produce textual (English) reports of dive data recorded by dive computers. The computer-generated report will contain the following information:

  • Issues across multiple dive profiles such as:
    • rapid ascent incidents
    • necessary and unnecessary stops
  • Unsafe dive profiles with special patterns, such as:
    • square
    • saw-tooth
    • reverse
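
As a sketch of the kind of pattern detection involved, the following flags rapid-ascent incidents in a depth time series by comparing the per-sample ascent rate against a limit. The sampling interval, rate limit, and dive profile are illustrative only.

```python
def rapid_ascents(depths, interval_s=10, limit_m_per_min=10.0):
    """depths: depth samples in metres, taken interval_s seconds apart.
    Returns the indices of samples where the ascent rate exceeds
    limit_m_per_min (the limit here is illustrative, not a certified
    safety figure)."""
    per_sample_limit = limit_m_per_min * interval_s / 60.0
    incidents = []
    for i in range(1, len(depths)):
        ascent = depths[i - 1] - depths[i]  # positive = getting shallower
        if ascent > per_sample_limit:
            incidents.append(i)
    return incidents

# 10 s sampling; the limit works out at about 1.67 m per sample.
profile = [30.0, 28.0, 25.5, 25.0, 22.0, 21.5]
print(rapid_ascents(profile))
```

Each flagged index would become a candidate "rapid ascent incident" message in the textual report.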
SkillSum

SkillSum developed an automatic assessment and reporting tool for adult basic skills (literacy and numeracy). The tool was a web-based system that allowed new entrant students at a college to take a basic skills assessment as part of their normal enrolment process.

When the test was completed, the tool produced a report for the user describing his or her skill level and whether this was adequate for the course about to be taken, and suggesting actions he or she could take to improve basic skills.

STOP

Aims:

  • to develop a computer system for generating tailored letters to help people stop smoking
  • to research knowledge acquisition (KA) techniques to acquire text-planning and sentence-planning rules from domain experts
  • to evaluate the clinical effectiveness of the computer generated letters in a general practice setting
  • to evaluate the cost effectiveness of this brief smoking cessation intervention

The results of our clinical trial suggested that while sending smokers a letter could help a small but useful number of people quit, the tailored letters were no more effective in this regard than the non-tailored letters. The tailored letters may have been slightly more effective with heavy smokers and others who found it especially difficult to quit, but the evidence for this is not conclusive.

SumTime - Generating Summaries of Time Series Data

Project Summary

Currently there are many visualisation tools for time-series data, but techniques for producing textual descriptions of time-series data are much less developed. Some systems have been developed in the natural-language generation (NLG) community for tasks such as producing weather reports from weather simulations, or summaries of stock market fluctuations, but such systems have not used advanced time-series analysis techniques.

Our goal is to develop better technology for producing summaries of time-series data by integrating leading-edge time-series and NLG technology.
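
A toy illustration of the time-series-to-text idea (far simpler than SumTime-Mousam itself): compare the first and last points of a wind-speed series and verbalise the change. The change threshold and phrase templates are invented for illustration.

```python
def wind_phrase(series, threshold=5):
    """series: list of (hour, wind_speed_knots) pairs in time order.
    Produces a short forecast-style wind description; the threshold
    and phrasing are simplified, illustrative choices."""
    (_, v0), (tn, vn) = series[0], series[-1]
    if abs(vn - v0) < threshold:
        return f"{v0}-{v0 + 4} knots throughout."
    verb = "increasing" if vn > v0 else "easing"
    return f"{v0}-{v0 + 4} knots, {verb} {vn}-{vn + 4} by {tn:02d}00."

print(wind_phrase([(6, 10), (12, 14), (18, 22)]))
```

A real system would segment the series into trends first, rather than looking only at the endpoints, so that intermediate peaks and lulls are reported too.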

SumTime Parallel Corpus

SumTime-Meteo : A parallel corpus of weather data and their corresponding human written forecast texts

Demo

SumTime-Mousam Demo - Generates only Wind Descriptions

IGR

Final Report (IGR) to EPSRC about SumTime

Publications

Links to publications

Project Team

Collaborators

Related Links


Funded by EPSRC

TUNA - Towards a UNified Algorithm for the Generation of Referring Expressions