Dr Vinny Davies
- Lecturer (Statistics)
Biography
Dr Davies is a lecturer in Statistics in the School of Mathematics and Statistics, specialising in analytical methods for digital twins. He completed his Ph.D. in 2016 within the School of Mathematics and Statistics, where he focused on variable selection models for selecting antigenic sites in virus evolution. He then completed several postdoctoral research positions in both the schools of Statistics and Computing Science, as well as spending time as a Biostatistician at the University of Leeds. He returned to the School of Mathematics and Statistics in 2021, specialising in research and teaching on the interface between Statistics, Machine Learning, and AI.
Dr Davies' research is primarily focused on analytical methods for digital twins, with applications across a wide variety of areas including biodiversity, manufacturing, and metabolomics. In 2021 Dr Davies won the Royal Statistical Society's Mardia prize to run a series of workshops on analytics for environmental digital twins. He has active collaborations with NatWest Group, CENSIS, and Apollo Tyres around different applications of digital twins. In addition to his work on digital twins, Dr Davies has a strong focus on computational metabolomics and leads a collaborative project with Zoetis on health-related quality of life in animals.
If you are interested in doing a Ph.D., please take a look at the Additional Information section or email me directly.
Grants
Energy-related Emissions Analytics for Sustainable Finance - NatWest / EPSRC IAA - £375,730
Collaborative PhD Studentship, University of Glasgow - ~£75,000
Reinvigorating Research Grant, University of Glasgow - £28,631
Generating Deep Fake Left Ventricle Images: a Step Towards Personalised Heart Treatments - EPSRC SofTMech Collaboration Grant - £13,189
Royal Statistical Society Mardia Prize (PI) - £7,000
Edinburgh Mathematical Institute Summer Student Grant - £900
Supervision
Current PhD Students
- Ross McBride - Tackling Scheduling and Uncertainty in Mass Spectrometry Fragmentation Strategies
- Coats, Aaron - Bayesian variable selection for genetic and genomic studies
- Davison, Emily - Designing advanced statistical inference methods for learning the parameters of a mathematical biodiversity model
- Ren, Hongjin - Gaussian Process Emulation for Mathematical Models of the Heart
- Terzis, Nikolaos - Using statistics and machine learning to create a new metabolomics fragmentation spectra resolver
Past MRes Students
- Cara MacBride - A Comparative Analysis of Machine Learning Methods and Spatial Statistical Methods for Areal Unit Scottish Property Price Data
Teaching
This year I am teaching Python and GLMs, as well as supervising a number of undergraduate and master's projects. Previously I have taught on the Large Scale Computing course (neural networks in TensorFlow).
Additional information
I am looking for potential PhD students across a range of subjects; currently available projects are listed below. Please contact me if you wish to discuss these or any other projects further.
Estimating false discovery rates in metabolite identification using generative AI
Supervised jointly with Andrew Elliott and Justin J.J. van der Hooft (Wageningen University)
Metabolomics is the field of study that aims to map all molecules that are part of an organism, which can help us understand its metabolism and how it can be affected by disease, stress, age, or other factors. During metabolomics experiments, mass spectra of the metabolites are collected and then annotated by comparison against spectral databases such as METLIN (Smith et al., 2005) or GNPS (Wang et al., 2016). Generally, however, these spectral databases do not contain the mass spectra of a large proportion of metabolites, so the best-matching spectrum from the database is not always the correct identification. Matches can be scored using cosine similarity, or more advanced methods such as Spec2Vec (Huber et al., 2021), but these scores do not provide any statement about the statistical accuracy of the match. Creating decoy spectral libraries, specifically a large database of fake spectra, is one potential way of estimating False Discovery Rates (FDRs), allowing us to quantify the probability of a spectrum match being correct (Scheubert et al., 2017). However, these methods are not widely used, suggesting there is significant scope to improve their performance and ease of use. In this project, we will use the code framework from our recently developed Virtual Metabolomics Mass Spectrometer (ViMMS) (Wandy et al., 2019, 2022) to systematically evaluate existing methods and identify possible improvements. We will then explore how we can use generative AI, e.g., Generative Adversarial Networks or Variational Autoencoders, to train a deep neural network that can create more realistic decoy spectra, and thus improve our estimation of FDRs.
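To give a rough feel for the two ideas mentioned above, the sketch below scores spectra with cosine similarity on binned intensity vectors and then computes a simple target-decoy FDR estimate at a score threshold. It is a minimal illustration only, not the method of any of the cited papers and not part of ViMMS: the function names (`bin_spectrum`, `cosine_similarity`, `estimate_fdr`), the bin width, and the m/z range are all assumptions made for this example, and the FDR estimate assumes a decoy library of the same size as the real library.

```python
import numpy as np


def bin_spectrum(peaks, bin_width=1.0, max_mz=1000.0):
    """Turn a list of (m/z, intensity) peaks into a fixed-length vector.

    bin_width and max_mz are illustrative defaults, not recommended values.
    """
    n_bins = int(max_mz / bin_width)
    vec = np.zeros(n_bins)
    for mz, intensity in peaks:
        idx = int(mz / bin_width)
        if 0 <= idx < n_bins:
            vec[idx] += intensity
    return vec


def cosine_similarity(a, b):
    """Cosine of the angle between two binned spectra (0.0 if either is empty)."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0


def estimate_fdr(target_scores, decoy_scores, threshold):
    """Simple target-decoy estimate: decoy hits divided by target hits.

    Both counts use matches scoring at or above the threshold. Assumes the
    decoy library is the same size as the target library.
    """
    n_target = int(np.sum(np.asarray(target_scores) >= threshold))
    n_decoy = int(np.sum(np.asarray(decoy_scores) >= threshold))
    if n_target == 0:
        return 0.0
    return min(1.0, n_decoy / n_target)


# Hypothetical query and library spectra as (m/z, intensity) pairs.
query = bin_spectrum([(100.0, 1.0), (200.0, 0.5)])
library_hit = bin_spectrum([(100.2, 0.9), (200.4, 0.6)])
score = cosine_similarity(query, library_hit)

# Hypothetical best-match scores against the real and decoy libraries.
fdr = estimate_fdr([0.9, 0.8, 0.7, 0.4], [0.75, 0.3, 0.2, 0.1], threshold=0.7)
```

The estimate is only as good as the decoys: if the fake spectra are easy to tell apart from real ones, few of them will score highly and the FDR will be underestimated, which is the motivation for generating more realistic decoys.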