Data science methods and large datasets are fundamental to much of our research, and we use both to address research questions across a broad range of fields.
This page describes more specific uses of advanced data science techniques that are research areas in their own right.
- Machine learning and artificial intelligence
Machine learning is a rapidly evolving field that uses large datasets to discover previously unknown underlying relationships in data, and to generalise those relationships to new data. It is a major subfield of artificial intelligence, which seeks to replicate certain functions of human intelligence using computers.
Some of us make heavy use of Large Language Models (LLMs) as integral components of our modelling and data analysis work.
We are very active in the use of data-driven modelling and machine learning for applications in biological sciences. Some examples are:
- Predicting the length of hospital stays
- Ecology
- Psychiatry
- Critical care medicine
- Attributing individuals to source populations using genetic markers to train classifiers, in ecology, anthropology, infectious diseases, and forensics
- Physico-chemical properties of biochar
- Alzheimer’s disease
- Epilepsy
- Bacterial responses to stress
- EEG signals
- State identification of complex systems
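
As an illustration of the source-attribution example in the list above, here is a minimal sketch of a nearest-source classifier. It assumes each individual is described by a tuple of allele numbers at a fixed set of loci and assigns it to the source whose reference genotypes lie closest in Hamming distance. The function names and the toy data are hypothetical, and this is a simplified illustration of distance-based attribution, not the published method of any of the studies mentioned on this page.

```python
def hamming(a, b):
    """Number of loci at which two equal-length genotypes differ."""
    if len(a) != len(b):
        raise ValueError("genotypes must have the same number of loci")
    return sum(x != y for x, y in zip(a, b))


def attribute_source(genotype, reference):
    """Return (source, distance) for the source closest to `genotype`.

    `reference` maps a source name to a list of genotypes of known origin.
    The distance to a source is the smallest Hamming distance to any of its
    reference genotypes. Ties go to the source listed first.
    """
    best_source, best_distance = None, None
    for source, genotypes in reference.items():
        d = min(hamming(genotype, g) for g in genotypes)
        if best_distance is None or d < best_distance:
            best_source, best_distance = source, d
    return best_source, best_distance


# Hypothetical toy data: allele numbers at four loci per isolate.
reference = {
    "chicken": [(1, 2, 3, 4), (1, 2, 3, 5)],
    "cattle": [(7, 8, 9, 4), (7, 8, 3, 4)],
}
print(attribute_source((1, 2, 9, 4), reference))  # ('chicken', 1)
```

In practice a classifier of this kind would be trained and validated on many isolates of known origin, and would report attribution probabilities rather than a single best guess.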

We are also interested in combining ideas from nonlinear dynamics and machine learning to model and classify real-world systems. Examples of our research combining these two fields, and some other research topics where we use machine learning, include:
- Dynamics of supervised learning
- Partially-known systems of differential equations
- Astronomical data analysis
- Spectral imaging of celestial objects
- Remote sensing for archaeology
Top: Source attribution of Campylobacter using the Minimal Multilocus Distance method. Each chart shows the percentage of Campylobacter from a particular source that was attributed to each of five sources. From Pérez-Reche et al. (2020).
Bottom: Hypothetical spectral imaging of asteroid Ryugu enhanced by machine learning. Image: Charles Wang.
Staff contacts: Marco Thiel, Francisco Pérez-Reche, Andrew Angel, Murilo Baptista, Sandip George, Alessandro Moura, Charles Wang.
- Data mining and feature engineering
Data mining is a collection of approaches used to extract useful information from large datasets. A related technique is feature engineering, a pre-processing step for methods such as machine learning, in which raw data is transformed into quantities that are more informative for the particular dataset.
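
To make the idea of feature engineering concrete, the following minimal sketch turns a raw one-dimensional signal into a small table of summary features, one row per window. The function name `window_features`, the choice of statistics, and the sample values are all assumptions made for illustration; which features are useful depends on the dataset.

```python
import statistics


def window_features(signal, window):
    """Summarise each non-overlapping window of `signal` by a few statistics."""
    features = []
    # Step through the signal one full window at a time; a trailing
    # partial window is ignored.
    for start in range(0, len(signal) - window + 1, window):
        chunk = signal[start:start + window]
        features.append({
            "mean": statistics.fmean(chunk),
            "std": statistics.pstdev(chunk),  # population standard deviation
            "range": max(chunk) - min(chunk),
        })
    return features


# Hypothetical raw measurements.
raw = [1.0, 3.0, 2.0, 2.0, 10.0, 14.0, 12.0, 12.0]
for row in window_features(raw, window=4):
    print(row)
```

Each row could then be passed to a downstream method such as a classifier in place of the raw samples.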
We are generally interested in how data-driven approaches can be used to analyse complex systems, uncover patterns, and inform decision-making in domains such as healthcare, neuroscience, and the energy sector.
In studies of complex systems such as brain networks, we are interested in reverse engineering the system's internal connections from observations of its collective behaviour, using indirect measurements such as EEG or MRI.
Right: Rat intracranial EEG showing brain schematic and sensor locations, and recordings for wakefulness, REM sleep, and non-REM sleep. From González et al. (2022).
Staff contacts: Nicolas Rubido, Marco Thiel, Andrew Angel, Murilo Baptista, Francisco Pérez-Reche.
- Signal processing and information theory
Signal processing is an extremely broad field applicable to many areas of research that involve the analysis, manipulation, and creation of signals, the means by which data is transmitted. It has several areas of overlap with time series analysis, which we use extensively to study nonlinear systems.
Signals contain information (in the mathematical sense), and information theory is concerned with the classification, storage, and quantification of information, data, signals, and related concepts.
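
One common way to quantify the information in a signal is its Shannon entropy, H = -Σ p_i log2(p_i), where p_i is the fraction of samples falling in bin i. The sketch below estimates it with equal-width binning. The function name and the binning scheme are assumptions made for illustration, and other entropy measures used in time series analysis are defined differently.

```python
import math
from collections import Counter


def shannon_entropy(signal, bins):
    """Shannon entropy (in bits) of `signal` after equal-width binning."""
    lo, hi = min(signal), max(signal)
    if hi == lo:
        return 0.0  # a constant signal occupies one bin: zero entropy
    width = (hi - lo) / bins
    # Map each sample to a bin index; the maximum value goes in the last bin.
    symbols = [min(int((x - lo) / width), bins - 1) for x in signal]
    counts = Counter(symbols)
    n = len(signal)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


print(shannon_entropy([0, 1, 0, 1], bins=2))  # 1.0
print(shannon_entropy([5, 5, 5, 5], bins=2))  # 0.0
```

A signal that spreads evenly over the bins gives the maximum value, log2(bins), while a signal concentrated in few bins gives a value close to zero.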
Right: Entropy calculated from morning and evening ECG time series showing differences between people who showed upcoming depressive transitions (red) and people who did not (blue). From George et al. (2023).
Staff contacts: Murilo Baptista, Sandip George, Claudiu Giuraniuc, Alessandro Moura, Francisco Pérez-Reche, Nicolas Rubido, Marco Thiel, Ekkehard Ullner, Roland Young.
Header banner image credit: University of Aberdeen Image Asset Bank.