PhD Projects

PhD Projects

Please browse the projects listed below, each of which has been proposed by a member of staff. Competitive scholarships are available for UK/EU students (and a very limited number for students from elsewhere). In many cases, these proposals can be modified - please contact the academic responsible to discuss. We also accept self-defined projects. For general information please email

Details of how to apply can be found on the Postgraduate research opportunities page.

Formal Analysis, Theory and Algorithms

Efficient Algorithms for Matching Problems Involving Preferences - Dr. David Manlove

Matching problems involving preferences are all around us: they arise when agents seek to be allocated to one another on the basis of ranked preferences over potential outcomes.  Examples include the assignment of school leavers to universities, junior doctors to hospitals and children to schools.  For example in the second case, junior doctors may have preferences over the hospitals where they wish to undertake their training, and hospitals may rank their applicants on the basis of academic merit.

In many of these applications, centralised matching schemes produce allocations in order to clear their respective markets.  One of the largest examples is the National Resident Matching Program in the US, which annually matches around 30,000 graduating medical students to their first hospital posts.  At the heart of schemes such as this are algorithms for producing matchings that optimise the satisfaction of the agents according to their preference lists.  Given the number of agents typically involved, it is of paramount importance to design efficient (polynomial-time) algorithms to drive these matching schemes.

There are a wide range of open problems involving the design and analysis of algorithms for computing optimal matchings in the presence of ranked (or ordinal) preferences.  Many of these are detailed in the book “Algorithmics of Matching Under Preferences” by David Manlove (  This PhD project will involve tackling some of these open problems with the aim of designing new efficient algorithms for a wide range of practical applications.  There will also be scope for algorithm implementation and empirical evaluation, and the possibility to work with practical collaborators such as the National Health Service.

The importance of the research area was recognised in 2012 through the award of the Nobel Prize in Economic Sciences to Alvin Roth and Lloyd Shapley for their work on algorithms for matching problems involving preferences.

Contact: email, web.

Empirical Algorithmics with Graphs - Dr. Patrick Prosser

Many real world problems map into hard problems in graph theory. For example, scheduling problems are at heart graph colouring problems, recognising communities in social networks is clique finding, some problems in computational chemistry are graph isomorphism problems and stable matching problems correspond to the selection of edges while maintaining stability.

One of the challenges is to develop new algorithms with improved performance for these hard problems. A second challenge is to identify what problem features most affect the performance of algorithms. And thirdly, given the availability of multi-core and parallel processing, can we implement efficient parallel variants of our algorithms? Therefore, our research is a mix of theory (propose new algorithms), engineering (implement those algorithms and make them parallel), empirical study (carry out experiments to investigate algorithmic behaviour) and application (solve real problems and incorporate our algorithms into constraint programming toolkits).

Recently we have reported new algorithms for maximum clique, algorithms for variants of this problem (labelled clique, k-cliques and k-clubs), subgraph isomorphism, parallel versions of most of these algorithms, new constraint encodings for the stable roommates problem and symmetry breaking in graph representations. We continue to see challenges in using smart backtracking search, learning while searching, exploiting parallelism, symmetry breaking in search and addressing enormous problems.

A PhD will address some of the above problems (matching, colouring, routing, clique finding, isomorphism) and will require the development and implementation of new algorithms, facilitating an empirical investigation of algorithmic behaviour. In the PhD we might expect the application of algorithms to enormous problems, the exploitation of parallelism techniques  (bit-parallel, multi-core, distributed parallelism), the incorporation of symmetry breaking and learning while searching and where applicable, making the implementation of parallel search less difficult for the programmer.

Contact: email, web.

Model checking UAVs - Dr Alice Miller

Increasingly, software controlled systems are designed to work for long periods of time without human intervention. They are expected to make their own decisions, based on the state of the environment and knowledge acquired through learning. These systems are said to be autonomous and include self-driving cars, autonomous robots and unmanned aerial vehicles (UAVs).

The verification of autonomous systems is a vital area of research. Public acceptance of these systems relies critically on the assurance that they will behave safely and in a predictable way, and formal verification can go some way to provide this assurance. In this project we aim to embed a probabilistic model checking engine within a UAV, which will enable both learning and runtime verification.

Model checking is a Computing Science technique used at the design stage of system development. A small logical model of a system is constructed in conjunction with properties expressed in an appropriate temporal logic. An automated verification tool (a model checker) allows us to check the properties against the model. Errors in the model may indicate errors in the system design. New verification techniques allow this process to take place at run-time, and so enable us to analyse possible future behaviours and the best course of action to take to achieve a desired outcome. For example, a verification engine on board a UAV might allow us to determine the correct response of the UAV during extreme weather conditions in order to avoid a collision. Or it might predict the best route to take to achieve its mission, given the likelihood of failure of some of the UAV components.


  • Identify and refine existing runtime verification techniques suitable for in-flight UAVs
  • Implement verification methodology demonstrator on board a UAV (developed in the School of Engineering at Glasgow)

Contact: email, web.

Modelling and Optimisation of Demand Side Management - Dr. Gethin Norman

The objective of this project is to apply quantitative verification to the modelling and optimisation of Demand Side Management (DSM) in the Smart Grid. The Smart Grid is distinguished from classical electrical grids by the two way flow of electricity and information, allowing communication and collaboration between the consumers and producers. DSM aims to use this collaboration to shape the consumers' electrical demand to match supply (load-balancing) [1]. DSM can yield large improvements in energy and operational efficiency, reduction in greenhouse gas emissions, improvements in customer satisfaction and reduce network investment. Matching demand to supply requires coordinated diagnostic and prediction techniques using the flow of information to make control-decisions. In fact, McDonald [2] explains that the Smart Grid framework is essentially a control-optimisation problem with objectives for demand, distribution, assets, transmission and workforce.

The proposed research will aim to synthesise control strategies for building energy management systems that optimise energy load-balance while satisfying constraints on the building's operation and evaluate trade-offs between building management policies, environmental requirements, performance and efficiency. The foundations for this research will be the quantitative verification paradigm and in particular quantitative multi-objective model checking [3,4]. In this quantitative approach, models can include information regarding both the likelihood and timing of a system's evolution. Such behaviour is required for modelling energy usage as a consumer's behaviour is neither deterministic nor independent of the time of day.

[1] G. Strbac. Demand-side management: benefits and challenges. Energy Policy, 36:4419–4426, 2008

[2] J. McDonald. Smart grid applications, standards development and recent deployments. IEEE Power & Energy Society Boston Chapter Presentation, September 2009.

[3] K. Etessami, M. Kwiatkowska, M. Vardi, and M. Yannakakis. Multi-objective model check- ing of Markov decision processes. Logical Methods in Computer Science, 4(4):1–21, 2008.

[4] V. Forejt, M. Kwiatkowska, G. Norman, D. Parker, and H. Qu. Quantitative multi-objective verification for probabilistic systems. In Proc. 17th International Conference on Tools and Algorithms for the Construction and Analysis of Systems (TACAS’11), volume 6605 of LNCS, pages 112–127. Springer, 2011.

Contact: email, web.


The Cyber Security of Safety-Critical Applications - Prof. Chris Johnson

We have developed a range of techniques to protect office-based systems from a growing array of cyber threats. These include intrusion detection systems, firewalls and event logging applications. Most of these techniques are either dangerous or impracticable for safety-critical systems. For example, we can only use intrusion detection systems if we can demonstrate that they will not compromise safety. This is hard when we may have very limited time to install the patches needed to protect against a zero day attack. My research looks at ways of addressing these problems so that we can be both safe and secure across military and civil applications. This work is supported by a host of organizations ranging from the United Nations to the UK nuclear industry.

Contact: email, web.

Tools and Methods for Reproducible Scientific Software Development - Dr. Tim Storer

The use of software is pervasive in all fields of science. Associated software development efforts may be very large, long lived and complex, requiring the commitment of significant resources. However, several authors have argued that the `gap' between software engineering and scientific programming is a serious risk to the production of reliable scientific results, as demonstrated in a huge number of case studies.

Problems can arise because (for example):

    • The source code for reproducing results is not available.
    • The software has been changed since the results were produced in undocumented ways.
    • The project and environmental dependencies are not fully documented or available.
    • The acceptable tolerance for variation in results is not stated.
    • There is a poor separation between source code for 'tools' and source code that describes an experimental method.

The aim of this PhD will be to investigate tools and methods that support the development of software for reproducible experiments by addressing one or more of these challenges. The work may involve the creation of completely novel approaches, or the adaptation of existing software engineering tools and methods to the highly dynamic domain of scientific software development.

Contact: email, web.

Safety and Security of Autonomous Systems - Prof. Chris Johnson

Autonomous systems, including ground based vehicles and aerospace systems but also network monitoring applications raise a host of safety concerns. It is for this reason that we typically segregate them from conventional applications. For instance, autonomous aircraft and even ground-controlled vehicles are banned from entering civil, controlled airspace. They also create unique security issues – when for example, an attacker might communications links to then control the autonomous platform and threaten other assets. Countermeasures often place very tight restrictions on innovative engineering techniques – for example, standards such as IEC61508 prevent the use of AI architectures in software with a high integrity level. This is intended to make the behavior of the autonomous system easier to reason about but also prevents industry from building applications that offer huge potential benefits to the safety industry. Solutions range from the application of formal reasoning and model-based development to the integration of safety and security argumentation. This work is supported by a host of organizations ranging from UK and Asian public transport agencies to the US Department of Defence.

Contact: email, web.

Safety Critical Software Engineering - Prof. Chris Johnson

Many aspects of safety-critical software engineering are not well developed. For example, redundant software provides little benefits without some form of diversity because two identical programs will contain the same bugs. However, getting two different companies to write the same code can be costly and they too may contain common problems if there are errors in the requirements. My research develops new approaches to the software engineering of complex systems based on existing industrial standards such as IEC61508 and ISO26262. Key areas include cost control and securing the supply chain, we also focus on supporting national and international regulators who determine whether applications in the healthcare, nuclear, energy, aerospace industries are sufficiently safe. This work involves a host of organizations ranging from the UK National Health Service to NASA.

Contact: email, web.

Interactive Systems

Multimodality in Urban Interaction - Dr. Julie Williamson

Interactive technologies for public spaces play a significant role in the experience of living and working in an urban environment.  For example, information displays can increase civic engagement, whole body interaction can create playful encounters at shop windows, and digital interactive art can augment public spaces.  The rapid development of novel input techniques means that urban interaction is no longer limited to public displays, mobile devices, and formally curated experiences.  New interaction techniques allow interactivity to be embedded into the built environment for users of all kinds to walk up and enjoy interactive content.

This PhD research will explore how multimodal interaction can be embedded in urban installations, comparing gesture, touch, and proxemic interaction techniques in real world settings.  This interactivity will utilise new forms of input and output, moving away from display focused interaction by adding interactivity to new objects and surfaces in the urban environment.  For example, possible projects could include making railings, benches, or trees interactive using multimodal input and output.

There are four main challenges that this research will address.

  1. Perception – What are the usability parameters of different modalities in an urban context?  This challenge will survey possible input techniques and evaluate the communication potential of different interactions.
  2. Action – How do newly interactive objects and surfaces communicate interactivity to passers-by?  What kinds of cues or outputs are successful without requiring a traditional visual display?
  3. Control – What feedback and output capabilities are required to maximise control and usability?  This challenge will explore a variety of feedback techniques for continuous and discrete interactions.
  4. Context – How are these modalities used in real world settings, exploring the social interaction and social acceptability of these new input techniques?

This research will be completed through a combination of lab-based studies and in the wild deployments.  The installations will include elements of playful interaction and more complex interaction with information.  For example, installations could present information about events in the city, visualise urban data sets, or present other contextually relevant information.  

Contact: email, web

In-car haptics - Professor Steven Brewster

The physical controls in cars are being replaced by touchscreens. Pressing physical buttons and turning dials are being replaced by tapping on touchscreens and performing gestures. These work well on smartphones but can be problematic in cars. For example, they require more visual attention, taking the driver’s eyes off the road; they are smooth and flat, so it is hard to find where to interact.

The aim in this PhD project will be to use haptics (touch) to improve in-car interaction. We can then add back in some of the missing feedback and create new interactions that take advantage of the touchscreen but without the problems. We will investigate solutions such as pressure-based input – by detecting how hard a person is pressing, we can allow richer input on the steering wheel or other controls. Free-hand gesture input can allow the driver to control car systems without reaching for controls. For output, we will study solutions such as tactile feedback, ultrasound haptics and thermal displays to create a rich set of different ways of providing feedback. We will also combine these with audio and visual displays to create effective multimodal interactions that allow fast and accurate interaction from the driver but don’t distract him/her from driving. We will test these in the lab in a driving simulator and on the road to ensure our new interactions are usable.

Contact: email, web

Social Robotics for Humour and Entertainment - Dr. Alessandro Vincarelli

The goal of this project is to make humanoid robots capable to tell jokes. In particular, the research focus is the synthesis of both verbal and nonverbal aspects of humour, the typically human ability to make others laugh. The experiments will be performed using the NAO Aldebaran robot (one of the most popular platforms for social robotics) available at the School of Computing Science. The methodologies that will be applied will be inspired by Social Signal Processing, the computing domain aimed at modelling, analysis and synthesis of nonverbal behaviour in human-human and human-machine interactions. Special attention will be dedicated to the definition of quantitative metrics aimed at assessing the effectiveness of machines expected to entertain users.

The project will be highly interdisciplinary and will contribute to both computing science (making computers capable to entertain users is a major step towards their adoption in everyday life) and psychology (understanding how humour works can provide insights about the way human cognition works). Furthermore, the project will involve extensive experiments revolving around the interaction between robots and humans. The ideal candidate has a solid computing background, in particular machine learning and artificial intelligence, but it is open to collaborate with colleagues active in human sciences (social psychology, anthropology, etc.).

The work will be done under the supervision of Dr Alessandro Vinciarelli ( in the framework of Glasgow Social Robotics ( Dr Vinciarelli has published more than 100 scientific works and he has accumulated over £2M in research funding. His current activities are supported by the European Commission, EPSRC and The Data Lab.

Contact: email, web

Inference, Dynamics and Interaction

Bayesian methods for Metabolomics - Dr. Simon Rogers

Metabolomics is the large-scale study of the molecules involved in the chemical reactions that sustain life (metabolites). Measuring the abundance of metabolites in complex samples (e.g. human blood or urine) using Mass Spectrometry (MS) is potentially very useful in understanding disease, and developing drugs. However, it is very challenging due to various artefacts of the MS systems. This project will address the following particular problems:

  1. The large number of mass peaks produces by each metabolite: each metabolite results in many peaks in the spectrum. Developing clustering methods to group these peaks together will reduce the number of false positive identifications.
  2. Absolute Quantitation: only charged molecules can be detected by the MS. Different molecules ionise with different levels of efficiency making it impossible to compare measured intensities between molecules in a particular sample. Using advanced regression techniques we will attempt to predict ionisation efficiency to enable us to correct measured intensities to values that can be compared.
  3. Text models for fragment analysis: Data in which molecules have been fragmented is useful for performing identification. Mining this data for patterns (e.g. co-occuring groups of fragments) has the potential to uncover novel metabolites and understand the effect of drugs.
  4. Calibrating identification scores: When fragment data is available, metabolites can be annotated by passing the observed fragments to external identification servers. However, each server returns a score on a different scale. We will build statistical models to enable us to map scores from one tool onto another.

Contact: email, web

Casual Interaction: creating novel styles of human-computer interaction that span a range of engagement levels - Professor Roderick Murray-Smith

The focused–casual continuum is a framework for describing interaction techniques according to the degree to which they allow users to adapt how much attention and effort they choose to invest in an interaction conditioned on their current situation. Casual interactions are particularly appropriate in scenarios where full engagement with devices is frowned upon socially, is unsafe, physically challenging or too mentally taxing.

This thesis will involve the design of novel interaction approaches which use a range of novel sensing and feedback mechanisms to go beyond direct touch and enable wider use of casual interactions. This will include ‘around device’ interactions, ‘Internet of things’ style interaction with specific objects. Making these systems work will require the development of systems based on signal processing and machine learning.

To test the performance of the system we will use motion capture systems and biomechanical models of the human body, wearable eye trackers and models of attention to infer the mental and physical effort required to interact with the system.

Related work:

  • H. Pohl, R. Murray-Smith, Focused and Casual Interactions: Allowing Users to Vary Their Level of Engagement ACM SIG CHI 2013 pdf
  • The BeoMoment from B&O is an example of Casual interaction which was designed together with our group in a recent Ph.D. thesis - Boland, Daniel (2015) Engaging with music retrieval.

Contact: email, web

Embedded, Networked, and Distributed Systems

Improving Internet Protocol Standards - Dr. Colin Perkins

The technical standards underpinning Internet protocols are described in the RFC series of documents ( These are supposed to provide accurate descriptions of the protocols used in the network, from which implementors can work. However, they’re often written in an informal and imprecise style, and interoperability and security problems frequently arise because the specifications contain inconsistent terminology, describe message syntax in imprecise ways, and specify protocol semantics informally or by example. The authors of the specifications tend to be engineers, expert at protocol design, but not at writing clear, consistent, specifications. Further, specifications are increasing written, and read, by those for whom English is not their native language, further complicating the issue.

Formal languages and tools for protocol specification and verification have been available for many years, but have not seen wide adoption in the Internet standards community. This is not because such tools offer no benefit, but because they have a steep learning curve. The benefits offered in terms of precision and correctness are not seen to outweigh the complexity in authoring and reading the resulting standards.

The goal of this project is to explore semi-formal techniques that can be used to improve protocol specifications, at a cost that’s acceptable to the engineers involved in the standards community. This will involve tools to parse protocol specifications and to encourage authors towards the use of structured English vocabulary, that is both precise for the reader, especially the non-native speaker, and offers some ability for machine verification of protocol properties and generation of protocol code. Success will not be perfection, but rather uptake by the community of tools and novel techniques that improve specification clarity, and help ensure correctness

Contact: email, web

Costed Computational Offload - Dr. Natalia Chechina, Dr. Jeremy Singer, Prof. Phil Trinder

Computations now often execute in dynamic networks with a range of compute devices available to them. For example a mobile robot may perform route planning or image analysis computations using its own resources, or might offload the computation to a nearby server to obtain a better result faster.

We have already developed cost models to minimise execution time by controlling the movement of mobile computations in networks. Such a computation periodically recalculates network and program parameters and will move if the estimated time to complete at the current location Th exceeds the time to complete at the fastest available location Tn plus the communication time Tc, that is Th > Tn + Tc.

The purpose of the PhD research is to develop and implement appropriate models to decide when and where to offload computation, and how much work to do. A key challenge is to monitor a dynamic network, e.g. can we predict how long a computational resource will be available from past network behaviour? Another challenge is to develop and implement appropriate models that scale the computation. For example how detailed should the offloaded planning activity be? If the computation takes too long we risk losing the connection.

The project will run within the vibrant Glasgow Parallelism research Group with both experienced and youthful supervisors, and is associated with the EPSRC AnyScale Apps project

Contact: email, web

Compact Routing for the Internet Graph - Dr. Colin Perkins

Internet routing algorithms do not scale. This project will build on a class of centralised algorithms, known as compact routing algorithms, that have appealing theoretical scaling properties, and develop them to form practical distributed algorithms and protocols that scale in real-world topologies, while also supporting traffic engineering and routing policies.

The currently deployed Internet routing protocol is BGPv4. This is a path vector protocol that, in the absence of policy constraints, finds shortest path routes, but that offers a wide range of tools to enforce routing policy and influence the chosen routes. Due to the underlying shortest path algorithm, however, state requirements for each node in the BGP routing graph have been proven to scale faster than linearly with the size of the network (i.e., with the number of prefixes). This has been manageable until now because the limited size of IPv4 address space has constrained the number of prefixes. However, with the uptake in deployment of IPv6, this can no longer be guaranteed, and we need to find long-term scalable routing protocols.

The so-called compact routing algorithms achieve sub-linear scaling with the size of the network by abandoning shortest path routing, and using landmark-based routing. While the theoretical worst case stretch for compact routing relative to shortest path routing is large, previous work in the School has demonstrated that the stretch achieved on the Internet graph is not significant, and has developed a distributed algorithm for landmark selection. This project will extend this work to develop a fully distributed version of the compact routing algorithm, and realise it as a prototype routing protocol that could, conceivably, be deployed on the Internet as a long-term scalable routing solution.

Contact: email, web

TCP Hollywood - Dr. Colin Perkins

On-demand TV distribution using Dynamic Adaptive Streaming over HTTP (DASH) is ubiquitous, but disruptive to interactive traffic on the Internet. This project will explore novel TCP congestion control, to allow DASH traffic to better coexist with other applications.

Systems such as NetFlix and the BBC iPlayer are converging to use the MPEG DASH standard for video streaming. This uses web content delivery networks and protocols to support near real time download of chunks of video content, providing each in multiple rate and quality levels, and performing receiver-driven rate adaptation on a per-chunk basis. DASH has been extremely successful, with recent estimates showing that the majority of Internet traffic is video delivered in this manner.

The combination of DASH streaming with TCP congestion control is known to be unstable, due to mismatched timing between the video and the TCP rate control loops, and due to the dynamics of TCP congestion control. This instability is hidden by over-buffering in the network, but this, in turn, increases network latency and causes problems for interactive applications.

This project will explore the interplay between video coding and rate adaptation, TCP congestion control, and network buffering/caching, to develop new transport protocols and congestion control algorithms that avoid instability, improve in-network cacheing, and reduce latency for interactive traffic sharing the network. It will build on previous work in the School on IPTV measurement, congestion control, and transport protocol design, and our close relations with industry partners in the area and the standards community. Outcomes will be novel congestion control algorithms and protocol, and can feed into the standards process for wide deployment.

Contact: email, web

Securing Future Networked Infrastructures through Dynamic Normal Behaviour Profiling - Dr. Dimitrios Pezaros and Dr. Simon Rogers

Since its inception, the Internet has been inherently insecure. Over the years, much progress has been made in the areas of information encryption and authentication. However, infrastructure and resource protection against anomalous and attack behaviour are still major open challenges. This is exacerbated further by the advent of Cloud Computing where resources are collocated over virtualised data centre infrastructures, and the number and magnitude of security threats are amplified.

Current techniques for statistical, network-wide anomaly detection are offline and static, relying on the classical Machine Learning paradigm of collecting a corpus of training data with which to train the system. There is thus no ability to adapt to changing network and traffic characteristics without collecting a new corpus and re-training the system. Assumptions as to the characteristics of the data are crude: assuming measured features are independent through a Naïve Bayes classifier, or that projections that maximise the variance within the features (PCA) will naturally reveal anomalies. Moreover, there currently is no framework for profiling the evolving normal behaviour of networked infrastructures and be able to identify anomalies as deviations from such normality.

The overarching objective of this PhD project is to design a network-wide anomaly detection framework that will be able to operate on (and integrate) partial data, work in short timescales, and detect previously unseen anomalies. The work will bridge machine learning with experimental systems research, and will evaluate the devised mechanisms over real-world virtualised networked environments and traffic workloads.

The student will use recent developments in statistical ML to develop flexible probabilistic models that can capture the rapidly evolving view of the network. For example, Dirichlet Process priors for mixture models that allow new clusters to emerge as new behaviours are observed.

On the systems side, the student will develop traffic monitoring, accounting, and analysis modules that will be distributed and deployed on-demand across the network to then synthesise information and construct network-wide traffic views in order to allow characteristics learnt at one point in the network to be used elsewhere.

The research will be jointly supervised by academics from the Embedded Networked and Distributed Systems (ENDS) and the Inference Dynamics and Interaction (IDI) groups at the School of Computing Science, and will be conducted as part of the Networked Systems Research Laboratory (netlab). The student will be given access to actual Internet traffic traces, and a state-of-the-art virtualised testbed with fully programmable platforms at all software and hardware layers to experiment with.

The work will spread across some very vibrant and cross-disciplinary research areas, and the student will be equipped with highly demanded skills in Machine Learning, CyberSecurity and next generation network architectures.

Contact: emailweb

Performance Verification for Virtualized and Cloud Infrastructures - Dr. Dimitrios Pezaros

  • How do you verify the performance of your distributed applications?
  • How do you configure your Cloud-based network-server farm to deliver maximum throughput?
  • How do you know you are getting the performance you have paid for from your provider?

The Internet has seen great success mainly due to its decentralised nature and its ability to accommodate myriad services over a simple, packet-switched communication paradigm. However, measurement, monitoring, and management of resources have never been a native part of the Internet architecture that prioritised efficient data delivery over accountability of resource usage.

This has led to a global, complex network that has been notoriously hard to debug, to measure its temporal performance, and to verify that it delivers consistent service quality levels. The lack of such capabilities has so far been swept under the carpet due to the distribution of resources across the global Internet, and the over-provisioning of network bandwidth which has also been the main stream of revenue for network operators.

However, the Internet landscape has been changing drastically over the past few years: the penetration of Cloud computing imposes significant centralisation of compute-network-storage resources over data centre infrastructures that exhibit significant resource contention; and providers’ revenue increasingly depends on their ability to differentiate, and offer predictable and high-performing services over this new environment. The increased collocation of diverse services and users over centralised infrastructures, as well as the many layers of virtualisation (VM, network, application, etc.) required to support such multi-tenancy make the development of always-on measurement and troubleshooting mechanisms a challenging research problem to tackle.

The overarching objective of this PhD project is to design native instrumentation and measurement support for performance verification over virtualised collocation infrastructures. This will enable data centre operators to monitor and troubleshoot their (physical and virtual) infrastructure on-demand, and provide “network measurement as a service” to tenants through exposing appropriate interfaces. Application providers (tenants) will in turn be able to define measurement metrics and instantiate the corresponding modules to verify their applications’ performance, and to validate that their service level agreements with the hosting infrastructure providers are being honoured.

The work will entail experimental research in the areas of Network Function Virtualisation (NFV) and Software-Defined Networking (SDN) with a view towards enabling programmable measurement at the various layers (and locations) of future virtualised infrastructures. For example, it will explore how network nodes can efficiently provide accounting and reporting functionality alongside their main forwarding operation; what support from the end-systems (and virtual machines) will be required in order to synthesise and deploy novel end-to-end performance verification services; and what the specification and execution interfaces of such programmable environment should be.

The research will be conducted as part of the Networked Systems Research Laboratory (netlab) at the School of Computing Science and the student will be given access to a state-of-the-art virtualisation infrastructure and relevant platforms. Through the strong experimental nature of this project, the student will contribute to a currently buzzing research area, and will be equipped with highly demanded expertise in virtualised systems design, development, and evaluation. Digging under the surface, this work can have transformative effects on the design of future converged ICT environments that will need to deliver high-performance services, and where the boundaries between network, end-system, and application are becoming increasingly blurry.

Contact: emailweb

ProgNets 2.0 - Dr. Dimitrios Pezaros

Active and programmable networks were a popular research area about 15 years ago but eventually faded due to security and isolation concerns (how do I trust someone else’s code to run on my router’s interface?), and the lack of adoption by the industry that was at the time making money from offering high-bandwidth products and services.

All this has now changed: resource (server, network) virtualisation has been pervasive, allowing the efficient sharing of the physical infrastructure; and network operators and service providers now try to differentiate based on services they offer over virtualised infrastructures. In this new landscape, Software-Defined Networking (SDN) has emerged over the past five years as a new paradigm for dynamically-configured next generation networks, and has already been embraced by major equipment vendors (e.g., HP, Cisco, etc.) and service providers (e.g., Google).

Fundamental to SDN is the idea that the whole control plane is abstracted from individual network nodes and all network-wide functionality is configured centrally in software. Switches and routers are therefore reduced to general-purpose devices (in contrast to the legacy, vertically-integrated and vendor-controller platforms) that perform fast packet switching and are being configured on-demand through a defined API (e.g., Openflow). All functionality that then controls the network (e.g., spanning tree computation, shortest-path routing, access control lists, etc.) is provided by a (set of) central controller(s), and the resulting rules are installed on the switches through the Openflow API. This separation between the network’s data and control planes is a first step in ‘softwarising’ future networks but is still a long way from enabling true programmability through softwarisation.

The overarching objective of this PhD project is to design next generation, fully programmable Software-Defined Networks above and beyond the current state-of-the-art. Currently, the main SDN implementation through Openflow lacks any support for real-time programmable service deployment, since it centralises all intelligence (and programmability) around a (set of) controller(s). Future, service-oriented architectures will need to provide data path programmability through distributing intelligence to the network nodes. This is the only way to support the deployment of real-time programmable services in the data path (e.g., distributed network monitoring and control, performance-based provisioning, anomaly detection, dynamic firewalls, etc.).

The work will entail experimental research in protocols and languages for network programmability, in switch architectures, and the software-hardware interface. It will explore platform-independent language representations and runtimes (e.g., bytecodes and intermediate representations) that can allow custom processing at the switches without requiring the manual extension of protocol fields to support new functionality and at the same time offer bound data forwarding performance. The work will also include the design of exemplar time-critical services that will benefit from such underlying network architecture.

The research will be conducted as part of the Networked Systems Research Laboratory (netlab) at the School of Computing Science and the student will be given access to a state-of-the-art SDN testbed with fully programmable platforms at all software and hardware layers. Through the strong experimental nature of this project, the student will contribute to a currently buzzing research area, and will be equipped with highly demanded expertise in Software-Defined Networks, and next generation network architectures.

Contact: emailweb

Compilers and runtime systems for heterogeneous architectures, in particular FPGAs - Dr. Wim Vanderbauwhede

The topic of this research is the development of a compiler for high-level programming of FPGAs in particular. However, the compiler will target OpenCL (e.g. using the recent SPIR draft for integration with LLVM) so that the code can also run on multicore CPUs and GPUs. The challenge lies in transforming the intermediate code to obtain optimal performance on the target platform by using all devices in the heterogeneous platform. This requires partitioning the code and transforming each part to create an optimal version for the device it will run on. This is a complex problem requiring the development of sophisticated cost models for the heterogeneous architecture as well as run-time systems that can dynamically schedule code to run on different devices. If you are keen to undertake cutting-edge compiler research, this is the topic for you!

Contact: emailweb

Acceleration of scientific code for clusters of multicore CPUs, GPUs and FPGAs - Dr. Wim Vanderbauwhede

The topic of this research is the development and application of automated refactoring and source translation technologies to scientific codes, in particular climate/weather-related simulation code, with the aim of efficient acceleration of these codes on multicore CPUs, GPUs and FPGAs. If you are interested in source-to-source compilation (e.g. the ROSE compiler framework), refactoring and GPUs or FPGAs, and have expertise in compiler technology, FPGA programming or GPU programming, this topic provides an exciting research opportunity. The ultimate aim is to develop a compiler that can take single-threaded legacy code and transform it into high-performance parallel code using MPI, OpenMP, OpenACC or OpenCL, either entirely automatically or based on a small number of annotations.

Contact: emailweb

Acceleration of Information Retrieval algorithms on GPUs and FPGAs ("Greener Search") - Dr. Wim Vanderbauwhede

The topic of this research is on accelerating search and data filtering algorithms using FPGAs and GPUs. In particular FPGAs have great potential for greening the data centres as they offer a very high performance-per-Watt. A lot depends on the actual algorithms, as well as the system partitioning. If you have expertise in FPGA programming would like to take part in the development of the next generation of low-power search technology, this is a great opportunity.

Contact: emailweb

A novel shared-memory overlay for HPC cluster systems - Dr. Wim Vanderbauwhede

Traditional programming models have assumed a shared-memory model; however, modern architectures often have many different memory spaces, e.g. across heterogeneous devices, or across nodes in a cluster. Maintaining the shared-memory abstraction for the programmer is both very useful and highly challenging. In general of course naive shared-memory programming over an exascale HPC cluster would lead to disastrous performance.

However, in the context of a task-based programming model such as the GPRM (Glasgow Parallel Reduction Machine), we have shown that shared-memory parallel programming of manycore systems can be highly effective. The aim of this project is to extend the GPRM framework from homogeneous manycore systems to heterogeneous distributed systems through the addition of a shared-memory overlay. This overlay will allow the programmer to use a shared memory abstraction within the task-based programming model, and will leverage GPRM's sophisticated runtime systems and full-system knowledge to make intelligent caching decisions that will effectively hide the latency of the distributed memory space.

The implementation of this novel shared-memory overlay can be considered either in user space or as functionality in the operating system. The latter approach is more challenging but offers the greatest potential. If you are interested in research into system-level aspects of parallel programming for future HPC systems, this is an excellent choice.

Contact: emailweb

Application-defined Networking for HPC systems - Dr. Wim Vanderbauwhede

Workloads in the area of climate modeling and numerical weather prediction are becoming increasingly irregular and dynamic. Numeric Weather Prediction models such as the Weather Research and Forecasting model already display poor scalabity as a result of their complex communication patterns. Climate models usually consist of four to five coupled models, and the communication between these models is highly irregular. Coupled models are a clear emerging trend, as individual models have become too complex for conventional integration. Combined with the growing scale of supercomputers and the ever-increasing computational needs (for example, for accurate cloud modeling in climate models), this trend poses a huge challenge in terms of performance scalability.

A lot of research and development effort has gone into optimizing the network hardware, the operating system network stack and the communication libraries, as well as optimization of the individual codes. Despite this, current supercomputer systems are not well equipped to deal efficiently with rapidly changing, irregular workloads, as the network is optimised statically, routing is static and there is no elastic bandwidth provisioning.

The aim of this PhD is to investigate a radically different approach to supercomputer networking, where the network is defined by the application in a highly dynamic and elastic way. This Application Defined Networking takes a holistic view of the complete system, from the application code down to the lowest level of the network layer. Decision about routing, prioritising traffic and bandwidth provisioning are taken using the information provided at run time by the application code. In this way, the network will adapt so that traffic will always be transferred in the optimal way (e.g. lowest possible latency or highest possible bandwidth).

As this is a very large topic, the actual PhD research project will likely focus on one or more specific aspects of the problem such as the machine learning algorithms required to predict the network behaviours, the inference of code annotations required for the application to notify the network of impending traffic, or the network subsystem required to handle the dynamic allocation of resources based on the application's needs.

Contact: emailweb

Information, Data, Events, and Analytics at Scale

Big Data Complex Query Analytics and Query Processing - Prof. Peter Triantafillou

Private and public organisations are inundated with a continuously increasing volume of data – giving rise to the much-discussed era of big data. The value emerging from analysing such vast volumes of data is increasingly recognised as a premier source of private or public value-added services. As a result of a decade of extensive research and development efforts, platforms and related data systems for big data analyses have emerged, which has pushed the boundaries, scaling analytics to unprecedented levels. However, a lot remains to be accomplished, as running complex analytics queries across very large datasets in such platforms is still very time-consuming and costly.

Research in this theme will focus on how to optimise execution of complex queries, e.g., involving joins of tables, ranking of table contents and results, and aggregation functions (min/max, mean, media, quartiles, etc.) over large-scale infrastructures running platforms such as Hadoop/MapReduce, Spark, etc.

In particular, research can focus on:

  • deriving and employing specialised index and statistical structures and
  • machine learning (ML) algorithms and models

that when utilised can yield orders of magnitude improvements in analytics-tasks’ running times, money-costs (when running in cloud environments), network bandwidth, etc.

The ML models can be of two types: data-driven (learning to make predictions for query results based on data characteristics) or query-driven (learning to make predictions for query results) based on previous queries and their associated results.

The overall goal of this theme’s research is to achieve near-line, large-scale, big-data analytics.

This theme can support a number of PhD theses, differentiated based on the types of queries considered, indexing techniques, and/or ML model types.

Contact: email web

Big Graph Databases and Graph Query Processing - Prof. Peter Triantafillou

A very large number of emerging and established applications access datasets that are (or can be modelled as) graphs. The list of applications includes spatial-data applications (such as maps in smart city analytics), social networks (such as those in Facebook, Twitter), large crowdsourced document collections (e.g., Wikipedia), biomedical/bioinformatics applications (e.g., protein to protein interaction networks), large software systems, etc. For this reason, the area of graph-analytics, graph-processing engines, and graph querying, has become a hotbed of research and development efforts in the last decade. A number of graph processing engines for graph mining have been produced, such as GraphX, Pregel, GraphLab (and its derivatives), etc. Currently, however, all these engines fail to support graph queries.

Research in this theme will fill this gap. Specifically, it will focus on how to process expensive graph queries (e.g., subgraph and supergraph queries) against databases containing potentially a very large number of graphs or against a single very large graph. Specifically, the problem entails the subgraph isomorphism problem (i.e., given a pattern graph, discover which graphs in the database contain it), which in itself is a task, which is very computationally intensive (proven to be NP-Complete).

The research will focus on expediting such queries, with a particular emphasis on three issues:

    Developing specialised (possibly distributed) graph indices for the task,
  • Developing scale-out systems and distributed data partition and query execution algorithms, which can employ a large number of computers in order to parallelise query execution, and
  • Developing distributed graph caches and managing them.

The aim is for these contributions to yield orders of magnitude improvement in query execution times.

This theme can support a number of PhD theses, differentiated based on the types of queries considered, and on whether the focus will be on indexing techniques, distributed data partitioning and load balancing algorithms, distributed query execution, etc.

Contact: email web

Bridging the gap between big data batch and streaming data management - Dr. Nikos Ntarmos

MapReduce, and related technologies, revolutionised the area of scale-out big data processing when they were first introduced, as they allowed for easy large scale parallelisation of batch data processing. On the other hand, their inherent latencies led to the introduction of scale-out stream processing solutions, purpose-built to handle large amounts of high-speed data. Yet, our current information needs require a mix of these two paradigms. This has led to the rise of the so-called ``lambda'' architecture, comprising both batch and stream processing components, and combining the accuracy and comprehensiveness of scale-out batch processing frameworks with the low latency and responsiveness of near-real-time distributed stream processing, in a fault tolerant and highly scalable ensemble. In this new environment, several data management research questions, solved in traditional systems, remain open. We are looking for an excellent candidate who will pursue a PhD on answering some of these questions, pertaining mainly to supporting high-quality approximate query answering and appropriate data consistency notions in this setting. This is a rather challenging research endeavour, aiming to advance the state-of-the-art closer to the elusive goal of ``the one'' data management infrastructure, capable of providing all of the guarantees we have come to rely on in centralised settings, but in a highly scalable, decentralised, fault tolerant, efficient and high performance package, capable of dealing with very large amounts of both in-rest and streaming data. The PhD candidate will need to identify classes of data processing workflows which benefit from deployment in the lambda architecture, propose novel data consistency notions appropriate in this loosely coupled distributed setting, revisit and revamp statistics and data summaries for use by both components of the architecture, and produce appropriate indexing and query processing methods and data structures providing low -latency but highly accurate approximate query answers.

Contact: email, web

Analytics, learning, and Knowledge Mining for Big Data Explotation - Dr. Christos Anagnostopoulos

Data analytics and the adoption of machine and statistical learning methods over large-scale data become necessity for gaining insight of complex processes to prove scientific theories and discoveries, to support decision making, and enhance strategic planning in different areas e.g., the economy, industry, healthcare, etc. Query-driven exploration over Big Data through discovering insights, patterns and trends in a timely fashion is becoming necessity in this field. We are looking for an excellent candidate who will pursue a PhD on a novel problem of predictive analytics over distributed data in the sense that it can be deployed in environments in which data owners and sources restrict access to their data (e.g., due to security or cost reasons) and allow only certain statistical summaries or aggregation functions to be constructed/executed over the data.

The challenge of this research relies on the idea of gaining insight knowledge only from interrogation/analytics queries and their results over distributed data. Moreover, it is challenging to develop novel query-driven supervised and unsupervised machine and statistical learning techniques over the analytics queries that can mine as much knowledge from restricted access data as possible. Such query-driven predictive analytics methods aim at multiple levels of knowledge abstraction and information fusion to allow the system to learn complex functions, mapping the input to the output indirectly, and provide evidence that on big data, sophisticated predictive analytics algorithms can achieve comparable or even better performance than those ‘traditional’ methods that explicitly require data access. In addition, the research will concentrate on problems associated with optimally scheduling of data access based on learning the query patterns over federated data nodes.

For more information, please visit the link:

Contact: email, web

Adaptive data management for NoSQL data stores - Dr. Nikos Ntarmos

The current state-of-the-art in decentralised NoSQL databases comprises a large number of highly scalable, elastic, fault tolerant, decentralised data management systems, often coupled with data processing frameworks with matching characteristics. As these systems are purpose-built to deal with large amounts of data, a core component of the paradigm is to ``move the processing to the data''. In this setting, the data are sharded and replicated across a large number of storage nodes, with the database/storage tier taking care of data persistency and consistency. Then, the processing framework strives to assign the processing of any part of the data to the same (or a nearby) nodes as those storing replicas of said part. What is still missing, however, is the ability of these systems to sense changes in their access patterns and to appropriately adapt their inner workings, so as to accommodate and expedite their users' changing data needs, without sacrificing any of the safety guarantees provided by the underlying system. We are looking for an excellent PhD candidate to perform research into mechanisms to identify access pattern changes online and to appropriately modify core design parameters of the system -- such as the degree of replication, the replica placement policy, the sharding criteria, etc. -- so as to expedite data processing via lower access costs and a fairer load distribution across the system nodes. On top of that, the proposed solutions should maintain, if not improve upon, the data consistency guarantees provided by the data management infrastructure, all while allowing the system to scale out to a large number of nodes. To accomplish these goals, the candidate will have to draw upon a large body of research, comprising statistical and indexing structures, access monitoring mechanisms, appropriate replication and consistency notions, and decentralised query processing frameworks.

Contact: email, web

Decentralised graph isomorphism testing - Dr. Nikos Ntarmos

Graph structured data are prevalent in many modern big data applications, ranging from chemical and bioinformatic databases and other scientific datasets, to social networking and social-based applications such as recommendation systems. Central to high performance graph analytics over such data, is the ability to locate occurrences of pattern graphs in dataset graphs -- an operation entailing the NP-complete problem of subgraph isomorphism (sub-iso). Relevant theoretical research has produced a large number of fast and efficient algorithms to deal with this problem, but a treatise in fully distributing these algorithms is still lacking. On the other hand, the data management community has produced a number of index-based techniques to reduce the number of sub-iso tests required for datasets consisting of a large number of small-to-medium sized graphs, but they all still require a final ``verification'' stage of sub-iso testing, which is largely serial on a per stored graph basis. Lately researchers started looking into the problem of sub-iso testing against a single very large graph, producing some very promising scale-out solutions, although usually for a limited set of queries or query types. We are looking for an excellent PhD candidate who will pursue a PhD on scale-out/scale-up subgraph isomorphism testing of arbitrary query pattern graphs, supporting graph node/edge labels, borrowing and augmenting ideas and intuitions from the leading state-of-the-art centralised sub-iso solutions, as well as the latest index-based and index-less subgraph matching solutions and scalable graph processing frameworks.

Contact: email, web

Large-Scale Learning over Distributed Streaming Data - Dr. Christos Anagnostopoulos

We are looking for an excellent candidate who will pursue a PhD on the development of new large-scale machine learning methods for distributed, streaming contextual data generated in the context of the Internet of Things (IoT). Such methods will become the basis for building intelligent applications over IoT streaming data. IoT is a part of future Internet and comprises many billions of devices (‘things’) that sense, compute, communicate, share knowledge, and actuate. Such devices incorporate machine intelligence, physical/virtual identities, contextual sensors, RFIDs, social media, etc. The vision of IoT is to allow ‘things’ to be connected any-time, any-place, with anything and anyone. Some emerging Big Data applications based on knowledge derived from streaming contextual information include emergency situations awareness, smart city applications, remote sensing and environmental monitoring.

Learning from contextual data and adaptation to changes in the context of IoT sets forth a number of challenges that have to do with the nature of the contextual data and the processes that generate them. Contextual information coming from IoT devices has a strong spatiotemporal dimension, which needs to be taken into account during the data modelling and learning process heading for reliable knowledge inference/reasoning and context awareness. Moreover, research challenges relate to in-network contextual data and knowledge fusion, and localized decision making, which deals with the redundancies and interactions that exist among the distributed contextual data sources. In addition, IoT devices regularly fail, e.g. limited battery life-time and loss of connectivity, thus, resulting to incomplete contextual data availability. It is challenging for a learning model, context prediction and adaptation algorithm to cope with incomplete and missing contextual information, concept drift and changing data distributions.

For more information, please visit the link:

Contact: email, web

Information Retrieval

A Framework for Selective Tradeoffs in Web Search - Dr. Craig Macdonald and Dr. Iadh Ounis

There are a variety of different information retrieval techniques that can enhance the effectiveness of a search engine (also known as Quality of Experience, QoE). However, these often come at the cost of decreased efficiency (degraded Quality of Service, QoS). Similarly, various techniques for efficient retrieval can degrade effectiveness. Recent research has demonstrated that it is possible to accurately predict the efficiency of a search engine for an unseen query, which has led to new selective approaches that can significantly benefit overall search engine server utilisation while maintaining QoS constraints. Similarly, different query classifications such as query performance prediction are also becoming increasing important within deployed search engines, to predict how to apply machine learned models, vertical results, diversification etc. However, an all-encompassing framework that can select appropriate configurations for the search engine, encapsulating all of the QoE characteristics, as well as QoS attributes, remains unsolved in the literature. This PhD scholarship will work towards the development of a framework for planning the execution of a query, considering aspects such as efficient retrieval, query expansion, stemming, learned models, diversification, to name but as few. By predicting efficiency cost, as well as effectiveness gain & risk for different aspects of control for each query, the overall QoS and QoE of a search engine can be upheld. Evaluation of the proposed framework will be conducted using open datasets e.g. from TREC.

Contact: email web

Risk-Sensitive Information Retrieval: Theory, Models, Evaluation Methodologies and Applications - Dr. Craig Macdonald and Dr. Iadh Ounis

Typically search engines apply a single retrieval strategy uniformly for all queries. This has led to a focus upon approaches that improve the mean average quality of results for a set of unseen queries. However, a system that performs poorly on some queries can also have a detrimental impact on user satisfaction. Indeed, retrieval approaches that are chosen solely based on mean average performance fail in general to perform every query equally well. Recently, there has been a shift from creating ranking techniques that behave well on average towards techniques that perform robustly for all queries. The proposed project aims to provide a new theoretical framework allowing the development and evaluation of new IR models that, based on the estimation of risk for each query, apply the best retrieval strategy that optimises effectiveness on that given query. We will study various applications of the framework, including the development of new machine-learned ranking approaches that are risk-sensitive, as well as new risk-sensitive evaluation measures for the online evaluation of web search engines, which allow to decide more quickly whether a new alternative retrieval variant is better than the existing baseline.

Contact: email web

Improving Emergencies and Event Responses in Smart Cities - Dr. Craig Macdonald and Dr. Iadh Ounis

With the rise of mobile internet-connected devices such as smart phones, reporting of events by the general public has become the norm. For instance, the helicopter crash in London in 2013 was captured on video via camera-phone and immediately shared by a resident who happened to be nearby. Indeed, over 72% of UK residents now own a smart phone, creating a huge user base that can report events in real-time. This is particularly true for urban areas with high population density and good mobile internet coverage. Meanwhile, new smart city infrastructures present an opportunity to better respond to events within such cities as they happen. For example, the remote manipulation of traffic-light patterns can be used to open routes for fast access by emergency services, while digital billboards can be used to divert the public from dangerous areas. By combining the real-time reporting of events by the general public with the activity-shaping capabilities of smart cities, there is great potential to more quickly and effectively respond to life-threatening events as they happen.

This PhD project will investigate how emergencies and similar events within smart cities can be automatically identified and how smart city infrastructures can be leveraged to better respond to these events. In particular, it will investigate how the detection of life-threatening events leveraging both social sensors and smart city sensors can be used to suggest prepared response patterns that increase the efficacy of the emergency services. For example, traffic diversion patterns can be used to either decrease the time it takes for emergency services to respond to those events or to minimise ongoing risks to the general public in the vicinity of the event.

The project will build upon our recent research within the SUPER FP7 project into automatic real-time event detection using social media platforms such as Twitter.

Contact: email web

Real-time Analysis of Smart city Data for Venue and Area Marketing - Dr. Craig Macdonald and Dr. Iadh Ounis

Touristic boards aim to attract tourists and citizens to various areas of the cities in order to grow the local economy as well as to promote its cultural heritage. However, the success of the promotions of venues or touristic areas on billboards and promotional marketing material is difficult to quickly measure, often involving the expensive and time-consuming surveys of venues in the participating areas, thereby reducing the possibility to quickly adapt and tailor the touristic marketing strategy to the needs of both tourists and the local businesses.

The advent of real-time sources of information such as CCTV crowd sensors, transport tracking, and location-based social networks such as Foursquare all permit to collect in real-time new valuable sources of information about how busy various areas of the city are. Hence, they can facilitate the analysis of emerging trends of venue occupancies, and the direct measurement of the success of any marketing promotions. Such analytics is challenging, as trends need to be separated from daily and weekly periodic cycles of citizen behaviour and venue occupancy, as well as several confounding factors such as bank holidays and weather (e.g. some areas of bars are busier during sunny days).

This PhD proposal will build on some of our recent research into venue recommendation as part of the SMART FP7 project.

Contact: email web

Temporal Update Hierarchies for Automatic Timeline Generation - Dr. Craig Macdonald and Dr. Iadh Ounis

With the rise of social media platforms such as Twitter and Facebook, users are increasingly using applications that are timeline-centric to access information. Timeline-centric applications are useful, as they are effective at presenting key information and are a natural fit for consumption on smaller mobile devices, such as smart phones. However, current technologies rely on human curation of content to produce the timelines. For instance, editors produce the live update streams that accompany many breaking stories reported on the BBC website. However, employing humans to produce high-quality timelines is costly and there may also be additional language barriers to creating such timelines. As a result, it is not possible to manually create timelines for many local events, or events originating in areas that use a different first language.

On the other hand, there are large volumes of information being reported over diverse news and social media streams that could be used to drive automatic timeline generation. However, so-far, there has been little research into how to produce timelines automatically. Current approaches suffer from the 'needle-in-a-haystack' problem, where relevant content is difficult to distinguish in a timely manner. Moreover, what a user might want to see within their timeline is highly contextual, i.e. effective timeline generation should customise the updates based on what the user read previously and the time period within the event. Some updates may be superseded by later updates, and users may only be interested in particular threads of discussion. This project will examine how to automatically build structured representations of ongoing events using news and social media content that can be used to generate personalized timelines for users tracking those events.

Contact: email web

Computer Vision for Autonomous Systems

Physical realism and reactive systems - Dr. Paul Cockshott

Investigate the extent to which existing and widely accepted formalisms within the Computing Science community like the π calculus and (PEPA) Performance Evaluation Process Algebra are well founded in terms of physics principles.

If you find that they are well founded you will provide proofs of this, if they turn out to be ill founded you will aim to provide a new process algebra whose axioms are physically sound. The student will attend courses in physics such as Quantum Theory, Quantum Information and Relativity and/or Computing Science such as Networked Systems and Modelling Reactive Systems, Research Methods and Techniques, dependent on prior training. The student will then examine a process algebra and map the abstract concepts to relativistic quantum mechanics in order to provide either a contradiction or a confirmation of the axiomatic basis. The student will then apply relativistic quantum mechanics to axiomatically construct a new process algebra. They would then apply this new algebra to high speed circuits such as a PCI bus arbiter.

It is envisaged that this will be co-supervised with the School of Physics and Astronomy.

Contact: email web

LSD-SLAM: versus stereo pyramids - Dr. Paul Cockshott

The project will involve comparing the algorithmic techniques used in large scale monocular SLAM with the established pyramidal stereo vision system used on the Departmental cloth folding robot. The aim would be to investigate whether the performance of cloth manipulation tasks could be improved in terms of speed of hand eye coordination by integrating LD-SLAM techniques. The research would involve both investigations of algorithmic complexity and suitability for GPU and CPU parallelisation of these algorithms.

Contact: email web

Markov Random Fields and History - Dr. Paul Cockshott

The research project would aim to investigate whether geographical markov random field models could be used to evaluate theories of historical evolution.

Markov random field models, and cellular automata models are popular in geographical computer gaming. However the potential of this sort of model for serious historical modelling has yet to be properly investigated. The advantage of such modelling is that it both allows the evaluation of the internal consistency of causal hypotheses about historical evolution and may also allow one to determine what is the simplest causal model that could be consistent with known events. Particular attention would be paid to for example the modeling of the European neolithic transition with a view to determining what may be the key parameters for simulating the Renfrew hypothesis for the origin of Indo European Languages.

Contact: email web

Reinforcement learning in robotics manipulation using visual feedback - Dr J. Paul Siebert and Dr Simon Rogers

The current state-of-the-art in robotics has reached the stage where affordable standard hardware platforms, such as arms and manipulators, controlled by powerful computers, are now available at relatively low cost. What is holding back the development of general purpose robots as the next mass-market consumer product is the current approach to painstakingly designing every behaviour control algorithm necessary to operate the robot for each task it is required to perform.

The objective of this research is to investigate methods that allow the robot to discover how to undertake tasks, such as manipulating objects, by itself by observing the outcome of hand-eye-object interactions. In this approach we propose to by combine visual representations of the scene observed by the robot with manipulator interaction to allow the machine to discover, i.e. learn, how to manipulate objects in specific situations, e.g. how to manipulate cloth to flatten it, or how to unfold or fold clothing, grasp rigid objects of different shapes etc., or how to grasp non rigid objects, or objects of widely differing shapes.

The project would comprise investigating:

    • Appropriate visual representations for surfaces and objects which can be extracted from images captured by the robot's cameras using computer vision techniques – we already have appropriate 3D vision systems able to represent clothing and certain classes of object.
    • The relationship between the physical action applied by a manipulator and the outcome of this action as determined by an algorithm that ranks to what degree the applied action has brought the observed world-state towards a desired world-state, e.g. by how much did this flattening action make the cloth flatter, or did this grasping action allow the griper to grab the cup?
    • Action sequences and their overall consequences for carrying our a task: for example, in non-monotonic learning strategies, early actions may actually reduce any intuitive measure of task progress prior to achieving larger gains later in the task (and therefore achieves greater gains overall than a behaviour that attempts to achieve the same goal by means of monotonic, and incremental improvements).

Our research group has excellent robot facilities on which this project will be based, including Dexterous Blue (housed in it's own laboratory on Level 7 of the Boys Orr building) – a large two armed robot richly sensorised with a high resolution (sub-millimetre) binocular vision system, lower resolution Xtion RGBD sensors, in-gripper tactile sensing and microphonic sensing for clothing and surface texture perception.

Our research using Dexterous Blue can be seen in action at: Access to the Baxter robot, situated in the Computing Science foyer ,will also be available and and example of a student project using Baxter can be viewed at:

Contact: email web

Cognitive Visual Architectures for Autonomous and Robotic Systems - Dr J. Paul Siebert

How well a robot senses the world in effect defines the types of activity it is able to undertake. Vision is the richest of the senses, and recent advances in computer vision have enabled robots to perceive the world sufficiently well to be able to map their environment, localise their position and recognise objects, other robots and the robots own actuators. As a result robots are now capable of navigating autonomously and interacting with specific objects (whose appearance has been learned) or classes of object, such as people or cars, again learned from large image databases.

Despite this progress the state-of-the-art in robot vision is still incomplete in terms of being able to represent objects richly, whereby a full set of visual characteristics are extracted and represented to feed higher reasoning processes. The class of an object gives a noun (e.g. human, car, cup, etc.) while object attributes such as colour, texture and spatial location and motion provide adjectives (blue car, distant person). Shape, geometry, motion and other visual properties discovered through interaction can provide affordances (verbs from the robot's perspective) – what the robot object can do with the object (move it by pushing, knock it over, lift it etc.). These object properties discovered by means of vision and interaction can then be used in reasoning processes, the classic being the ability of a machine to respond to the command “grasp the red box closest to the green bottle”. Therefore to be truly autonomous, a robot must be capable of understanding a scene and be able to reason about it. To achieve such cognitive ability it is necessary to couple sensed visual attributes (and potentially attributes obtained from other sensing modalities) to a reasoning engine to allow action plans to be deduced that allow the robot to achieve goals without the need for wholly pre-programmed interactions actions.

There are potentially a number of projects that could be based on the above theme of vision for robots and other autonomous systems, based on the following investigations:

    • Visual processing architectures to support the extraction of a rich set of visual attributes (2D and 3D) from images captured by the robot's camera systems. This might include space-variant (foveated) visual architectures, as found in the mammalian visual system, supporting attention control and real-time execution.
    • Reasoning and planning systems for controlling interaction of the robot with the environment coupled with visual attention and perception, binding visual perception to action and learning.

Our research group has excellent robot facilities on which this project can be based, including Dexterous Blue (housed in it's own laboratory on Level 7 of the Boys Orr building) – a large two armed robot richly sensorised with a high resolution (sub-millimetre) binocular vision system, lower resolution Xtion RGBD sensors, in-gripper tactile sensing and microphonic sensing for clothing and surface texture perception. Our research using Dexterous Blue can be seen in action at: Access to the Baxter robot, situated in the Computing Science foyer, will also be available and and example of a student project using Baxter can be viewed at:

Contact: email web