Simplifying complex data with the power of language
Helping global industries better understand ‘big data’ with Natural Language Generation technology.
As organisations continue to use an increasing amount of ‘big data’, industries across the globe must identify the best way to capture, process and understand this information. Although computers and machines can produce and store a huge quantity of digital data, the volumes of information are increasing so quickly that it’s impossible for human beings to make full use of the data available.
One of the main challenges of data management is that once a data set is generated by a system, an expert within the organisation is still required to analyse the data and explain the results. Visual and numerical reports are useful ways of interpreting data sets, as are written reports in plain English, but both visual and written summaries can be very time-consuming to produce.
Natural Language Generation technology
As a solution to these challenges, University of Aberdeen academics Professor Ehud Reiter and Dr Yaji Sripada, pioneers in the science of natural language generation, joined forces with entrepreneurs Ian Davy and John Perry to found the spin-out company Data2Text. The company was set up to further develop their ideas and research on Natural Language Generation (NLG) technology for use in real-world scenarios.
A form of artificial intelligence, NLG technology can make it quicker and easier for organisations to read and understand large amounts of data, by converting complex information into simple, summarised text. The written narratives are produced in a matter of seconds and are designed to be read as if they were written by a human subject expert.
Commercialising research
“Our goal was to build ‘articulate machines’ which communicate with people in the same way that other people do. We combine artificial intelligence technology, which analyses the data and identifies the most important information, with computational linguistics technology to transform the key information into sentences.”
Research projects included automatically generating narrative weather forecasts from raw weather prediction data, summarising gas turbine sensor data for maintenance engineers, and summarising complex and diverse physiological time series data (such as heart rate and blood pressure) to help monitor babies in neonatal intensive care units.
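The two-stage idea described above, analysing the data to pick out the most important information and then turning that information into sentences, can be illustrated with a very small sketch. This is not the method used by Data2Text, Arria or SimpleNLG: the function names, the sample readings and the 2-degree trend threshold are all hypothetical, and simple fixed templates stand in for the computational linguistics that a real NLG system would use.

```python
from statistics import mean

# Hypothetical sample data: temperature predictions at three-hour intervals.
SAMPLE_READINGS = [
    {"hour": 6, "temp_c": 8.0},
    {"hour": 9, "temp_c": 11.0},
    {"hour": 12, "temp_c": 15.0},
    {"hour": 15, "temp_c": 16.0},
    {"hour": 18, "temp_c": 12.0},
]


def select_key_facts(readings):
    """Data analysis stage: reduce raw readings to a few key facts."""
    temps = [r["temp_c"] for r in readings]
    hottest = max(readings, key=lambda r: r["temp_c"])
    change = temps[-1] - temps[0]
    # Assumed threshold: a change of more than 2 degrees counts as a trend.
    if change > 2:
        trend = "rise"
    elif change < -2:
        trend = "fall"
    else:
        trend = "stay steady"
    return {
        "trend": trend,
        "peak": hottest["temp_c"],
        "peak_hour": hottest["hour"],
        "average": round(mean(temps), 1),
    }


def realise(facts):
    """Language stage: turn the selected facts into sentences via templates."""
    sentences = [
        f"Temperatures will {facts['trend']} over the period.",
        f"The peak of {facts['peak']:.1f}°C is expected around "
        f"{facts['peak_hour']:02d}:00.",
        f"The average is {facts['average']:.1f}°C.",
    ]
    return " ".join(sentences)


if __name__ == "__main__":
    print(realise(select_key_facts(SAMPLE_READINGS)))
```

For the sample readings, this prints "Temperatures will rise over the period. The peak of 16.0°C is expected around 15:00. The average is 12.4°C." Keeping the analysis and the wording in separate functions mirrors the split in the quote: the first stage decides what is worth saying, and the second decides how to say it.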
In 2009, the NLG technology was released as an open-source Java library called SimpleNLG, which is now maintained by a global network of volunteers and has been further developed to generate narratives in eight European languages and Mandarin Chinese. Institutions and academic research groups have used the SimpleNLG ‘realisation engine’ to generate weather forecasts, provide dietary advice, and create interactive narratives. It has also been used commercially, by an internet travel company to generate narrative descriptions of hotels and by a radio and television broadcaster for sports reporting.
Following many years of research and development, Data2Text was acquired in 2013 by commercialisation specialist Arria, to become Arria NLG.
Since then, Arria NLG has developed more than 40 core patented technologies and tools, and its Arria NLG Engine is described as the world’s most advanced natural language generation engine. The technology continues to gain new features and has been used in several major industries, including financial services, life sciences, healthcare, energy, business, meteorology, and professional services.
Impact
- Professor Reiter and Dr Sripada are among the world’s foremost authorities in the field of NLG and have played a key role in the development of NLG architecture
- Arria NLG is now a world leader in the field of Natural Language Generation, employs more than 100 people and has offices in Aberdeen, London, New Zealand, Australia, and the USA
- The NLG technology improves efficiency, facilitates decision making and saves time and costs for major global industries and new markets such as business intelligence, financial reporting, and automated journalism
- The open-source SimpleNLG package allows institutions to generate narratives in languages other than English
- Arria’s new NLG Studio toolkit allows organisations to develop their own natural language generation systems