Closed-Loop Data Science
Data science and machine learning are rapidly developing powerful tools to extract insights from large-scale, complex datasets. There is much focus on the algorithms that power inference, but significantly less on how humans are involved in the data science process. Data science often follows a pattern of query; fit model; predict; plot results; repeat. This is clumsy and makes it difficult for end users to explore, comprehend and make judgements. A more modern approach would bring humans into a tightly coupled closed loop with the inference process, interactively exploring the beliefs compatible with the data and directing the inference process to mine the seams of knowledge within. This PhD will apply these ideas to probabilistic models, where representation and communication of uncertainty in the results of an inference problem are of particular importance. The studentship will produce interactive, animated representations of probabilistic models. The project will focus on:
Establishing key interaction and animation “primitives” that can be linked to (potentially high-dimensional) Bayesian probabilistic models to represent their underlying uncertainty more effectively than static displays. These will be analogues of techniques such as error bars, but ones that exploit active perception via closed-loop control of displays.
Developing techniques to augment sample-based inference algorithms (e.g. Markov chain Monte Carlo) so that they dynamically sample and cache results in response to user inputs, closing the loop between exploratory data visualisation and inference. The aim is to accelerate inference, in close to real time, in the regions that matter to the user's (implicit) queries.
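As a rough illustration of the second point, the sketch below shows one way a sampler could cache its draws and top them up in response to a user's view. It is a minimal sketch under stated assumptions, not the project's method: the target `log_posterior`, the class `InteractiveSampler` and its `query` method are all hypothetical names, and the model is a one-dimensional standard normal chosen only to keep the example short.

```python
import math
import random


def log_posterior(theta):
    # Hypothetical target: a standard normal log-density, up to a constant.
    return -0.5 * theta * theta


def metropolis(log_p, start, n_samples, step, rng):
    """Random-walk Metropolis; returns n_samples (correlated) draws."""
    theta, lp = start, log_p(start)
    samples = []
    for _ in range(n_samples):
        proposal = theta + rng.gauss(0.0, step)
        lp_new = log_p(proposal)
        # Accept with probability min(1, p(proposal) / p(theta)).
        if rng.random() < math.exp(min(0.0, lp_new - lp)):
            theta, lp = proposal, lp_new
        samples.append(theta)
    return samples


class InteractiveSampler:
    """Caches draws and extends the chain until the viewed region is covered."""

    def __init__(self, log_p, seed=0, step=1.0):
        self.log_p = log_p
        self.rng = random.Random(seed)
        self.step = step
        self.cache = []  # every draw made so far, reused across queries

    def query(self, low, high, min_draws=200, batch=500, max_batches=50):
        """Return cached draws in [low, high], sampling more only if needed."""
        in_view = [s for s in self.cache if low <= s <= high]
        batches = 0
        while len(in_view) < min_draws and batches < max_batches:
            # Continue from the last cached draw; on the first call, start at
            # the centre of the view (a simplification: no burn-in is used).
            start = self.cache[-1] if self.cache else 0.5 * (low + high)
            new = metropolis(self.log_p, start, batch, self.step, self.rng)
            self.cache.extend(new)
            in_view.extend(s for s in new if low <= s <= high)
            batches += 1
        return in_view


sampler = InteractiveSampler(log_posterior, seed=1)
draws = sampler.query(-1.0, 1.0)  # e.g. the user zooms in on [-1, 1]
print(len(draws), "cached draws available in the viewed region")
```

Note that this sketch only continues an ordinary chain until enough cached draws fall inside the viewed region. It does not steer sampling towards that region, which would require importance weights or a similar correction to keep the estimates valid.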
The PhD will develop interaction techniques that facilitate active perception of uncertain data for a variety of data types (e.g. temporal, spatio-temporal, high-dimensional vector space models), and efficient strategies for accelerating inference so that results relevant to the user's current exploration are available in real time. The successful candidate will have a strong interest/background in visualisation, human-computer interaction and/or Bayesian probabilistic modelling.
Machine learning techniques are widely used to address many recommendation scenarios, such as suggesting a movie to watch on a service like Netflix or recommending a point-of-interest to visit in a city, often by learning from historical user data. However, recommendation systems can be influenced by what users have already been recommended and thereafter viewed/visited, rather than by what these systems might have found to be relevant of their own accord. For instance, Netflix might start to recommend movies that are already popular because of its previous recommendations.
Such an effect can be described as a filter bubble or as closed-loop feedback, and has typically been countered by introducing novel or serendipitous recommendations into the suggestions. However, the alternative use of approaches originating from closed-loop theory, such as intermittent control, has not been systematically investigated within recommender systems.
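The feedback effect described above can be illustrated with a toy simulation. This is a minimal sketch under stated assumptions rather than a model of any real recommender: the function `top_item_share`, the click probability and the item counts are all hypothetical, every item is equally appealing by construction, and the `explore` parameter stands in for the "novel or serendipitous" recommendations mentioned above (it is not an implementation of intermittent control).

```python
import random


def top_item_share(n_items=20, n_rounds=2000, explore=0.0, seed=0):
    """Share of all clicks held by the most-clicked item after n_rounds."""
    rng = random.Random(seed)
    clicks = [1] * n_items  # every item starts with one pseudo-click
    for _ in range(n_rounds):
        if rng.random() < explore:
            # Serendipitous recommendation: pick any item uniformly at random.
            item = rng.randrange(n_items)
        else:
            # Popularity-driven recommendation: pick the most-clicked item
            # so far (ties go to the lowest index).
            item = max(range(n_items), key=lambda i: clicks[i])
        # All items are equally appealing: a shown item is clicked half the time.
        if rng.random() < 0.5:
            clicks[item] += 1
    return max(clicks) / sum(clicks)


print("no exploration:  ", round(top_item_share(explore=0.0), 2))
print("50% exploration: ", round(top_item_share(explore=0.5), 2))
```

Although no item is intrinsically better than another, the purely popularity-driven recommender concentrates almost all clicks on a single item, because it learns only from data that its own recommendations generated.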
This PhD will be focused on applying ideas and techniques from closed-loop theory to state-of-the-art recommender systems. The candidate will investigate the modelling and deployment of closed-loop recommender systems using new neural network architectures, in comparison with and alongside traditional matrix factorization and BPR-based recommenders. The resulting systems will be evaluated both on public recommender-system benchmarks and within the experimental pipelines of some of our data partners in the EPSRC Closed-Loop Data Science project.
The successful candidate will have a strong interest/background in recommender systems, machine learning, and/or information retrieval.