Closed-Loop Data Science

Progress in sensing, computational power, storage and analytic tools has given us access to enormous amounts of complex data, which can show us better ways to manage our cities, run our companies or develop new medicines. However, the 'elephant in the room' is that when we act on that data we change the world, potentially invalidating the older data. Similarly, when monitoring living cities or companies, we cannot run clean experiments on them: the data we get is shaped by the way they are run today, which limits our ability to model these complex systems. We need ways to run ongoing experiments on such complex systems. We also need to support human interactions with large and complex data sets. In this project we will look at the overlap between the challenge faced by someone coping with all the choices involved in booking a flight for a weekend away and the challenge faced by an expert running complex experiments in a laboratory.
 
The project will test the core ideas in a number of areas, including personalisation of hearing aids, support for travel planning, analysis of cancer data, and media recommendation systems.
 

Partners

This project involves collaborative research with academic partners (Glasgow Polyomics, the Urban Big Data Centre and the University of Warwick) and industrial partners (Moodagent, Widex A/S and Aegean Airlines), and is further supported by DataLab Scotland.
 

Vacancies

We are advertising the following vacancy:

1 Post-Doctoral Research Associate/Fellow position in Computing Science on Data Systems 

To make a leading contribution to the EPSRC-funded project “Closed-Loop Data Science for Complex, Computationally- and Data-Intensive Analytics”, coordinated by PI Prof. Roderick Murray-Smith, working with Dr Christos Anagnostopoulos (line manager) and Dr Nikos Ntarmos.

The postdoctoral researcher will contribute to our research on machine learning (ML), explainable ML models and exploratory data analysis. ML methods are often used to guide exploratory data analysis over distributed data systems and to extract, infer and explain information from the data, as part of a data analytics pipeline. The resulting predictions often inadvertently change the behaviour of both systems and users (analysts, data scientists), potentially invalidating future predictions. Such closed-loop effects have so far gone largely unaddressed, leading to uncertain and unexpected results in practice. This is a serious problem for all stakeholders, as it affects both the operational characteristics (stability, scalability, performance) of the data system itself and the accuracy and perceived value of predictive and inferential analytics on its data.

The successful candidate is therefore expected to develop and experiment with sophisticated explainable ML algorithms and novel exploratory analytics methods and techniques to address these issues, building on results from the fields of adaptive/intermittent closed-loop control, machine learning and statistical learning, as applied to large-scale data systems. The successful candidate will also be able to collaborate closely with our industrial partners (including but not limited to Telefonica, Moodagent, BBC and Aegean Airlines) to apply our research results in real-world settings.

This position offers:

  • An exciting opportunity for inter-disciplinary and highly impactful research.
  • Mentoring and support from world-leading academic staff and industry professionals.
  • The freedom to determine and manage your own exciting and complex programme of research.
  • Access to state-of-the-art computing facilities and real-world datasets/workloads.
  • Opportunities for further funding/continuation of employment at the University of Glasgow via joint authorship of research proposals and internships/secondments at our industrial partners.

The post requires expert knowledge in applying machine learning to large-scale data systems and a strong background in mathematical analysis. The ideal candidate will have substantial research experience and a publication record in high-quality venues and journals in the areas of ML/AI and data analytics systems (e.g., NeurIPS, DSAA, ICML, JMLR, KDD, PKDD, ICDE, ICDM, TKDD, TKDE).

For more details and to apply for this post, please navigate to https://www.jobs.ac.uk/job/CAC233/research-associate-fellow 

Informal inquiries and requests for information can be made to Dr Christos Anagnostopoulos (christos.anagnostopoulos@glasgow.ac.uk).

 

Reference Number: 038185
Location: Gilmorehill Campus / Main Building
College / Service: College of Science & Engineering
Department: School of Computing Science
Job Family: Research and Teaching
Position Type: Full Time
Salary Range: level 7 (£35,845 - £40,322 per annum) or level 8 (£44,045 - £51,034 per annum)
 
 
 

Publications

Anagnostopoulos, C. (2020) Edge-centric inferential modeling & analytics. Journal of Network and Computer Applications, 164, 102696. (doi: 10.1016/j.jnca.2020.102696)

Savva, F., Anagnostopoulos, C. and Triantafillou, P. (2020) Adaptive learning of aggregate analytics under dynamic workloads. Future Generation Computer Systems, 109, pp. 317-330. (doi: 10.1016/j.future.2020.03.063)

Savva, F., Anagnostopoulos, C., Triantafillou, P. and Kolomvatsos, K. (2020) Large-scale data exploration using explanatory regression functions. ACM Transactions on Knowledge Discovery from Data. (Accepted for Publication)

Williamson, J. H., Quek, M., Popescu, I., Ramsay, A. and Murray-Smith, R. (2020) Efficient human-machine control with asymmetric marginal reliability input devices. PLoS ONE, 15(6), e0233603. (doi: 10.1371/journal.pone.0233603)

Savva, F., Anagnostopoulos, C. and Triantafillou, P. (2020) SuRF: Identification of Interesting Data Regions with Surrogate Models. In: 36th IEEE International Conference on Data Engineering (IEEE ICDE), Dallas, TX, USA, 20-24 April 2020, pp. 1321-1332. ISBN 9781728129037 (doi: 10.1109/ICDE48307.2020.00118)

Anagnostopoulos, C. and Kolomvatsos, K. (2020) Predictive intelligence of reliable analytics in distributed computing environments. Applied Intelligence. (doi: 10.1007/s10489-020-01712-5) (Early Online Publication)

Savva, F., Anagnostopoulos, C. and Triantafillou, P. (2020) Aggregate Query Prediction under Dynamic Workloads. In: 2019 IEEE International Conference on Big Data (IEEE BigData 2019), Los Angeles, CA, USA, 09-12 Dec 2019, pp. 671-676. ISBN 9781728108582 (doi: 10.1109/BigData47090.2019.9006267)

Anagnostopoulos, C. and Triantafillou, P. (2020) Large-scale predictive modeling and analytics through regression queries in data management systems. International Journal of Data Science and Analytics, 9(1), pp. 17-55. (doi: 10.1007/s41060-018-0163-5)

Ireland, D.G., Doring, M., Glazier, D.I., Haidenbauer, J., Mai, M., Murray-Smith, R. and Ronchen, D. (2019) Kaon photoproduction and the Lambda decay parameter alpha. Physical Review Letters, 123, 182301. (doi: 10.1103/PhysRevLett.123.182301)

Wandy, J., Davies, V., van der Hooft, J. J. J., Weidt, S., Daly, R. and Rogers, S. (2019) In silico optimization of mass spectrometry fragmentation strategies in metabolomics. Metabolites, 9(10), 219. (doi: 10.3390/metabo9100219) (PMID: 31600991)

Jadidinejad, A., Macdonald, C. and Ounis, I. (2019) How Sensitive is Recommendation Systems' Offline Evaluation to Popularity? In: REVEAL 2019 Workshop at RecSys, Copenhagen, Denmark, 20 Sep 2019.

Davies, V., Harvey, W. T., Reeve, R. and Husmeier, D. (2019) Improving the identification of antigenic sites in the H1N1 Influenza virus through accounting for the experimental structure in a sparse hierarchical Bayesian model. Journal of the Royal Statistical Society: Series C (Applied Statistics), 68(4), pp. 859-885. (doi: 10.1111/rssc.12338)

Tonolini, F., Jensen, B. S. and Murray-Smith, R. (2019) Variational Sparse Coding. In: Conference on Uncertainty in Artificial Intelligence (UAI 2019), Tel Aviv, Israel, 22-25 July 2019.

Savva, F., Anagnostopoulos, C. and Triantafillou, P. (2019) Explaining Aggregates for Exploratory Analytics. In: IEEE Big Data 2018, Seattle, WA, USA, 10-13 Dec 2018, pp. 478-487. ISBN 9781538650356 (doi: 10.1109/BigData.2018.8621953)

Jadidinejad, A. H., Macdonald, C. and Ounis, I. (2019) Unifying Explicit and Implicit Feedback for Rating Prediction and Ranking Recommendation Tasks. In: 5th ACM SIGIR International Conference on the Theory of Information Retrieval, Santa Clara, CA, USA, 02-05 Oct 2019, pp. 149-151. ISBN 9781450368810 (doi: 10.1145/3341981.3344225)

Moran, O., Caramazza, P., Faccio, D. and Murray-Smith, R. (2018) Deep, Complex, Invertible Networks for Inversion of Transmission Effects in Multimode Optical Fibres. In: 32nd Conference on Neural Information Processing Systems (NeurIPS 2018), Montréal, Canada, 02-08 Dec 2018.

Funded Projects

EPSRC-funded project (£3M, 2018-2022): Closed-Loop Data Science for Complex, Computationally- and Data-Intensive Analytics