Machine learning in spectroscopy

Wynne Mosquito S&E - shortlisted 2018/19

Keywords – chemistry, spectroscopy, machine learning, entomology, disease vectors

Project Summary - We will combine expertise in chemistry, spectroscopy, entomology, and computing science to apply state-of-the-art machine-learning techniques to the determination of traits in insects and the design of novel molecules for attracting or repelling insects.

In the battle against the spread of diseases such as malaria and Zika, it is critically important to monitor the distribution of ages, species, and other traits in populations of the vector species that transmit disease. As a key example, malaria can only be transmitted by mosquitoes older than 10 days. Control efforts should therefore focus on reducing the fraction of older mosquitoes, which requires a reliable way to determine mosquito age. The best current methods for doing so are either highly inaccurate or expensive.

In preliminary work on mosquitoes, we have demonstrated that the mid-infrared spectrum contains sufficient information to determine age and species when analysed using a simple neural network. In this project, a much more complete and robust analysis will be developed by applying supervised machine learning to more extensive spectral data sets. We will use dimensionality-reduction techniques to gain greater insight into which spectral features are most important, explore different forms of data, and generate synthetic data to improve robustness. Additionally, we will add near-infrared spectral data to allow the machine-learning algorithms to discover additional correlations. The experiments will be carried out on mosquitoes reared in Glasgow and at the Ifakara Health Institute in Tanzania, as well as on ticks from Scotland.
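As an illustration of what generating synthetic data could look like in practice, the minimal Python sketch below expands a labelled set of spectra with perturbed copies. The three perturbations (intensity scaling, baseline offset, Gaussian noise), their default magnitudes, and the function names `augment_spectrum` and `augment_dataset` are assumptions made for this example; they are not the project's actual method.

```python
import random


def augment_spectrum(absorbances, rng, noise_sd=0.005, max_offset=0.02, max_scale=0.05):
    """Return one synthetic variant of a spectrum (a list of absorbance values).

    The perturbations and their default magnitudes are illustrative
    assumptions, not values taken from the project.
    """
    scale = 1.0 + rng.uniform(-max_scale, max_scale)  # overall intensity change
    offset = rng.uniform(-max_offset, max_offset)     # constant baseline shift
    # Independent Gaussian noise is added to every spectral point.
    return [scale * a + offset + rng.gauss(0.0, noise_sd) for a in absorbances]


def augment_dataset(spectra, labels, copies=3, seed=0):
    """Keep each original spectrum and add `copies` perturbed variants of it."""
    rng = random.Random(seed)  # seeded so that the augmentation is reproducible
    out_spectra, out_labels = [], []
    for spectrum, label in zip(spectra, labels):
        out_spectra.append(list(spectrum))
        out_labels.append(label)
        for _ in range(copies):
            out_spectra.append(augment_spectrum(spectrum, rng))
            out_labels.append(label)
    return out_spectra, out_labels


if __name__ == "__main__":
    # Two made-up five-point "spectra" with hypothetical age labels.
    spectra = [[0.10, 0.35, 0.80, 0.40, 0.12], [0.08, 0.50, 0.60, 0.55, 0.10]]
    labels = ["young", "old"]
    X, y = augment_dataset(spectra, labels, copies=3, seed=42)
    print(len(X), len(y))
```

Every synthetic spectrum inherits the label of the spectrum it was derived from, so the enlarged set can be passed to any supervised classifier without further changes.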

The initial work, applying machine-learning tools in a fairly standard way, will give the student a firm foundation and prepare them for exciting advanced work on graph-convolutional autoencoders that produce a data-driven continuous representation of molecules. We already have a machine-learning model trained on a database of 500,000 SMILES representations of molecules from PubChem. Preliminary work has shown that the attractiveness of a molecule to mosquitoes can be quantified semi-automatically and in a highly parallel fashion, which will be exploited to find novel molecules that repel or attract insects. This is a potentially disruptive technology with wide applicability to molecular design.
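For readers unfamiliar with SMILES, the following Python sketch shows one common way of turning SMILES strings into fixed-length integer sequences that a machine-learning model can take as input. It is only an illustration: the token pattern, the padding convention, and the function names `tokenize`, `build_vocab`, and `encode` are assumptions for this example, and the existing model may encode its input quite differently.

```python
import re

# Assumed token pattern: bracket atoms such as [NH4+], the two-letter
# halogens Br and Cl, two-digit ring closures such as %12, and otherwise
# single characters (atoms, bonds, branches, ring-closure digits).
_TOKEN = re.compile(r"\[[^\]]+\]|Br|Cl|%\d{2}|.")


def tokenize(smiles):
    """Split a SMILES string into a list of tokens."""
    return _TOKEN.findall(smiles)


def build_vocab(smiles_list):
    """Map every token seen in the data to a positive integer id."""
    tokens = sorted({tok for s in smiles_list for tok in tokenize(s)})
    return {tok: i + 1 for i, tok in enumerate(tokens)}  # id 0 is kept for padding


def encode(smiles, vocab, length):
    """Encode a SMILES string as token ids, right-padded with zeros."""
    ids = [vocab[tok] for tok in tokenize(smiles)]
    if len(ids) > length:
        raise ValueError("SMILES string is longer than the requested length")
    return ids + [0] * (length - len(ids))


if __name__ == "__main__":
    data = ["CCO", "CCl"]  # ethanol and chloroethane
    vocab = build_vocab(data)
    print(vocab)
    print(encode("CCO", vocab, length=5))
```

Treating `Cl` and `Br` as single tokens keeps a chlorine atom from being read as a carbon followed by a stray character, which is a frequent pitfall of purely character-level encodings.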

Project Team - The project will be led by Prof Klaas Wynne in the School of Chemistry and co-supervised by Prof Roderick Murray-Smith in the School of Computing Science and Dr Francesco Baldini in the Institute of Biodiversity, Animal Health and Comparative Medicine. The student will work across the three groups and will be based primarily in Chemistry. The supervisors will hold regular meetings with the student to review the project’s progress and to provide support as required so that the anticipated project goals are met on time. The student will have access to the facilities available in all three research groups and will benefit from the highly active research culture of an interdisciplinary team.

Person Specification

This studentship is open to candidates of any nationality – UK, EU or International.

Applicants should demonstrate the following:

Applicants should have a good degree in a relevant science discipline (e.g., physical chemistry, chemical physics, computing science), be highly motivated, and have excellent English communication skills. The successful candidate will need to be enthusiastic about acquiring new skills and have an interest in spectroscopy and programming. Research experience, laboratory skills, experience with infrared spectroscopy, and familiarity with programming in Python will be considered an advantage.

In the first instance, prospective applicants should contact Prof Klaas Wynne to discuss their eligibility. Applications may be submitted up until the deadline of 12 noon, Friday 12 January 2018.

The following documentation will be required from applicants if they are invited to submit a full application:

  • LKAS Interdisciplinary Scholarships Application Form
  • Two references in support of your application. (References provided for an application for admission to Glasgow for PhD study may also be submitted here; they do not need to be tailored to this process.)
  • Degree transcripts in English (Undergraduate and Masters, if relevant)
  • Candidates whose first language is not English must show evidence of appropriate competence in English in the form of an IELTS certificate or similar.