GLAsgow Systems Section (GLASS)
The GLAsgow Systems Section (GLASS) researches parallel and distributed systems, networked systems and (safety-critical) software systems. It is led by Professor Phil Trinder. We have a strong focus on real-world systems, working at all scales and across the hardware-software spectrum. We contribute to, develop and release open source research software. There are several research groups within the section:
- GPG: The Glasgow Parallelism Group (led by Phil Trinder)
- Networked Systems (led by Colin Perkins)
- Systems Engineering (led by Tim Storer)
Much of the research we undertake is collaborative and has industrial partners. We work closely with other groups in Computing Science as well as other schools including Engineering. We also work closely with other world-leading universities and many private and public sector organisations (recently: Airbus, Cisco Systems, EDF, Ericsson, GCHQ, IETF, Microsoft Research, NASA).
Ms Maria Evangelopoulou
Ms Natascha Harth
Dr Christos Anagnostopoulos
Dr Nikos Ntarmos
The section has around 30 research students, and their details are available on the research group pages and on the School postgraduate student page.
The group is engaged in a wide range of research projects and activities. Most of the research projects below are funded by national or international research programmes.
Crucially, the GPG leads the *MaRIONet* Network of Excellence and three of the five SADEA projects in the EPSRC Many-core Architectures and Concurrency in Distributed and Embedded Systems (MACDES) priority area.
SADEA 1 AnyScale Apps is an EPSRC project (EP/L000725) joint with Manchester and Edinburgh. The project aims to develop a "write once, scale anywhere" software development and deployment paradigm. Apps are composed of runtime components which interact. Each component has several API-compatible variants that offer different cost/benefit tradeoffs. At runtime, the dynamic resource availability determines which variants are executed at particular nodes in the distributed heterogeneous infrastructure. October 2013 - September 2017.
SADEA 2 Exploiting Parallelism through Type Transformations for Hybrid Manycore Systems (TyTra) is an EPSRC Project (EP/L00058X) joint with Imperial and Heriot-Watt that uses compilation technology to exploit parallelism without changing the program. January 2014 - January 2019.
SADEA 3 Adaptive JIT-based Parallelism (AJITPar) is an EPSRC Project (EP/L000687) in partnership with Microsoft Labs in Cambridge. The project investigates the feasibility of providing performance portability using a combination of dynamic scheduling and dynamic trace-based Just-In-Time (JIT) compilation. October 2013 - May 2017.
From Data Types to Session Types: A Basis for Concurrency and Distribution is an EPSRC Programme Grant (EP/K034413) exploring the potential for session types to structure concurrent and distributed software. It is led by Simon Gay (Glasgow), Philip Wadler (Edinburgh), and Nobuko Yoshida (Imperial). June 2013 - May 2018.
Border Patrol is an EPSRC project (EP/N028201) that addresses hardware and software cyber security concerns. Joint with Imperial College and Heriot-Watt University, together with EDF, ABB and Xilinx. February 2017 - January 2022.
COST Action IC1201: Behavioural Types for Reliable Large-Scale Software Systems (BETTY) is chaired by Simon Gay. October 2012 - October 2016.
A Situation-Aware Information Infrastructure (SAI2) is an EPSRC project (EP/L026015) that investigates the possibility of designing and developing a generic, resilient and adaptive situation-aware information infrastructure that would predict and confront the broad range of challenges faced by ICT networks. Joint with Lancaster University. February 2015 - August 2017.
Network Measurement as a Service (NMaaS) is an EPSRC project (EP/N033957) that aims to design and develop a native Network Measurement-as-a-Service (NMaaS) framework that will allow tenants to express their measurement needs, and to subsequently synthesise the corresponding complex service-level performance functions out of simple monitoring primitives. Joint with University of Edinburgh. September 2016 - August 2019.
The University of Glasgow Raspberry Pi Cloud project is a teaching and research project centred around the construction of a "scale-model" of a Cloud data centre using energy and cost efficient Raspberry Pi devices.
RELEASE: A High-Level Paradigm for Reliable Large-Scale Server Software is an EU FP7 STREP (287510) that aims to scale the radical concurrency-oriented programming paradigm to build reliable general-purpose software, such as server-based systems, on massively parallel machines (100 000 cores). Phil Trinder at Glasgow is the coordinator for the project, and partners include Ericsson, Uppsala Universitet, Kent University, Erlang Solutions Ltd., NTU Athens, and EDF (France). October 2011 - April 2015.
Instrumentation, Measurement, and Control for the Cloud (IMC2) is an EPSRC-funded Project (EP/L005255) that aims to design and develop an always-on Instrumentation, Measurement, and Control (IMC) framework that will dynamically and adaptively provision unified cloud resources in a unified manner and in short timescales. Joint with JANET. April 2014 - July 2015.
“WebRTC: Media Transport Protocols and Adaption” is funded by Ericsson. It develops standards for web-based interactive video conferencing and researches circuit breakers for multimedia congestion control. 2012 - 2015.
CLOPEMA is an EU-funded robotics project to develop a clothes-folding robot. The Glasgow group is working on the parallel robot vision system. April 2012 - April 2015.
HPC-GAP: High Performance Computational Algebra and Discrete Mathematics is an EPSRC project (EP/G05553X) to improve the software development technologies for HPC software. Our part of the project is to extend high-level parallel Haskell technologies to large scale HPC platforms. The project is a collaboration with Aberdeen, St Andrews and Edinburgh Universities. September 2009 - October 2013.
“Robust Video Streaming over IP” was funded by Cisco and studied the performance of HTTP-based adaptive streaming for IPTV applications, and improvements to TCP for multimedia. 2010 - 2012.
“Understanding and Reporting on IPTV Behaviour” was funded by Cisco and studied scalable reporting for multicast RTP-based IPTV, performance of residential broadband links, and forward error correction. 2010 - 2012.
Members of the Systems section helped design and build Glasgow Parallel Haskell (GpH). It is one of the early robust parallel functional languages, and remains one of the most widely used parallel Haskell models, e.g. the most popular Haskell compiler, GHC, supports it on multicores. The sophisticated GUM runtime system supports GpH on distributed-memory machines like clusters. The new GUMSMP runtime system supports GpH on hierarchical architectures like NUMAs or clusters of multicores.
Members of the Systems section helped design and build Haskell distributed parallel Haskell (HdpH). It's a parallel Haskell for large scale distributed-memory machines like clusters or HPC platforms. Crucially, HdpH is implemented in vanilla (GHC) Haskell.
Glasgow Network Functions (GNF).
Extending the matching abilities of OpenFlow.
SDN-based Virtual Machine Management for Cloud Data Centers.
Systems seminars are usually held on Wednesdays. Everyone from the University of Glasgow and beyond is welcome to attend these talks - see the Events tab for more details. We are happy to hear from anyone that would like to visit us to give a talk.
The Systems seminar is coordinated by Anna Lito Michala.
Meeting Society Challenges: Big Data Driven Approaches(05 October, 2018)
Speaker: Liangxiu Han
By 2020, the total size of digital data generated by social networks, sensors, biomedical imaging and simulation devices, will reach an estimated 44 Zettabytes (i.e. 44 trillion gigabytes) according to IDC reports. This type of 'big data', together with the advances in information and communication technologies such as Internet of things (IoT), connected smart objects, wearable technology, ubiquitous computing, is transforming every aspect of modern life and bringing great challenges and spectacular opportunities to fulfill our dream of a sustainable smart society.
This talk will focus on our recent research and development based on big data driven approaches to address society challenges through several real case studies in various application domains such as Health, Food, Smart Cities etc.
Algebraic Graphs (03 October, 2018)
Speaker: Andrey Mokhov
Abstract: Are you tired of fiddling with sets of vertices and edges when working with graphs? Would you like to have a simple algebraic data type for representing graphs and manipulating them using familiar functional programming abstractions? In this talk, we will learn a new way of thinking about graphs and a new approach to working with graphs in a functional programming language like Haskell. The ideas presented in the talk are implemented in the Alga library: https://github.com/snowleopard/alga. I hope that after the talk you will be able to implement a new algebraic graph library in your favourite programming language in an hour.
Speaker: Andrey Mokhov is a senior lecturer in computer engineering at Newcastle University and a Royal Society Industry Fellow. Andrey is interested in applying abstract mathematics and functional programming to industrial engineering problems. In 2015 he was a visiting researcher at Microsoft Research Cambridge redesigning the build system of the Glasgow Haskell Compiler; this project is currently continued as part of a 4-year Royal Society research fellowship dedicated to studying build systems in general.
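The algebraic view of graphs described in this abstract can be illustrated with a small sketch. It assumes the four-constructor formulation (empty, vertex, overlay, connect) used by the Alga library; the Python class and function names below are hypothetical and are not part of Alga, which is a Haskell library.

```python
from dataclasses import dataclass
from typing import FrozenSet, Sequence, Tuple, Union


@dataclass(frozen=True)
class Empty:
    """The graph with no vertices and no edges."""


@dataclass(frozen=True)
class Vertex:
    """A graph with a single vertex and no edges."""
    label: int


@dataclass(frozen=True)
class Overlay:
    """Union of the vertices and edges of two graphs."""
    left: "Graph"
    right: "Graph"


@dataclass(frozen=True)
class Connect:
    """Overlay, plus an edge from every left vertex to every right vertex."""
    left: "Graph"
    right: "Graph"


Graph = Union[Empty, Vertex, Overlay, Connect]


def vertices(g: Graph) -> FrozenSet[int]:
    """Collect the vertex set by folding over the expression."""
    if isinstance(g, Empty):
        return frozenset()
    if isinstance(g, Vertex):
        return frozenset({g.label})
    return vertices(g.left) | vertices(g.right)


def edges(g: Graph) -> FrozenSet[Tuple[int, int]]:
    """Collect the directed edge set by folding over the expression."""
    if isinstance(g, (Empty, Vertex)):
        return frozenset()
    combined = edges(g.left) | edges(g.right)
    if isinstance(g, Connect):
        combined |= frozenset(
            (a, b) for a in vertices(g.left) for b in vertices(g.right)
        )
    return combined


def path(labels: Sequence[int]) -> Graph:
    """Build the path l0 -> l1 -> ... by overlaying consecutive edges."""
    if len(labels) == 1:
        return Vertex(labels[0])
    g: Graph = Empty()
    for a, b in zip(labels, labels[1:]):
        g = Overlay(g, Connect(Vertex(a), Vertex(b)))
    return g
```

For example, `edges(Connect(Vertex(1), Overlay(Vertex(2), Vertex(3))))` yields the two edges from vertex 1 to vertices 2 and 3, with no explicit vertex or edge sets written by hand.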
Experts need not apply(26 September, 2018)
Speaker: Pavlos Petoumenos
Stagnating single core performance is driving us towards more complex hardware configurations. Extracting all available performance from such systems is not a straightforward task. It involves trial and error and expert knowledge about how hardware and software interact. Development tools have failed to keep up with this challenge. Even for simple optimisation problems, they might fail to extract three quarters of the available performance. Unless we find radical new ways for analysing and optimising applications, the gap between possible and typically achievable performance will only widen.
My research aims at bridging this gap with analysis and optimisation methodologies which are fast, easy to use, and require no supervision or expert guidance. My earlier work used state capturing and lightweight instrumentation to evaluate optimisation decisions without any input from the developer or the user. My current research goes even further, with deep language learning to test, analyse, and improve complex code. This line of work has the potential to permanently change how we create optimisation and analysis heuristics.
Efficient Cross-architecture Hardware Virtualisation(19 September, 2018)
Speaker: Tom Spink
Virtualisation is a powerful tool used for the isolation, partitioning, and sharing of physical computing resources. Employed heavily in data centres, becoming increasingly popular in industrial settings, and used by home-users for running alternative operating systems, hardware virtualisation has seen a lot of attention from hardware and software developers over the last ten–fifteen years.
From the hardware side, this takes the form of so-called hardware assisted virtualisation, and appears in technologies such as Intel-VT, AMD-V and ARM Virtualization Extensions. However, most forms of hardware virtualisation are typically same-architecture virtualisation, where virtual versions of the host physical machine are created, providing very fast isolated instances of the physical machine, in which entire operating systems can be booted. But there is a distinct lack of hardware support for cross-architecture virtualisation, where the guest machine architecture is different to the host.
I will talk about my research in this area, and describe the cross-architecture virtualisation hypervisor Captive, which can boot unmodified guest operating systems compiled for one architecture in the virtual machine of another.
I will talk about the challenges of full system simulation (such as memory, instruction, and device emulation), our approaches to this, and how we can efficiently map guest behaviour to host behaviour.
Finally, I will discuss our plans for open-sourcing the hypervisor, the work we are currently doing and what future work we have planned.
Machine Learning in Compiler Optimisation(12 September, 2018)
Speaker: Zheng Wang
Developing an optimising compiler is a highly skilled and arduous process and there is inevitably a software delay whenever a new processor is designed. It often takes several generations of a compiler to start to effectively exploit the processors' potential, by which time a new processor appears and the process starts again. This never-ending game of catch-up means that we rarely fully exploit a shipped processor and it inevitably delays time to market. As we move to multi- and many-core platforms, this problem increases.
This talk will look at some of our award-winning studies on using machine learning to automate the design process of compiler optimisation heuristics. It will demonstrate how machine learning can be employed to reduce expert involvement in compiler design while yielding significantly better performance than hand-tuned heuristics.
Zheng Wang is a Senior Lecturer at Lancaster University where he leads the distributed systems group. He develops methods and builds systems to allow computers to adapt to ever-changing environments. His research spans code optimisations, performance, energy efficiency, and systems security. His previous work has won three best paper awards and two best presentation awards at prestigious conferences in compilation and parallel computing.
TBC(29 May, 2018)
Speaker: Peter Inglis
TBC(22 May, 2018)
Speaker: Femi Olukoya
Virtualized Environment Memory Management for Future System Architectures(16 May, 2018)
Speaker: Paul V. Gratz
Hardware virtualization is a major component of large scale server and
data center deployments due to its facilitation of server
consolidation and scalability. Virtualization, however, comes at a
high cost in terms of system main memory utilization. Current virtual
machine (VM) memory management solutions impose a high performance
penalty and are oblivious to the operating regime of the system.
Therefore, there is a great need for low-impact VM memory management
techniques which are aware of, and reactive to, current system state
to drive down the overheads of virtualization. Further, as new memory
technologies become available in the cloud, traditional systems and
hypervisor software must be adapted to the changing systems
architectures to achieve optimal performance and efficiency. This talk
will examine techniques to address these challenges in memory
management for virtualized environments. First, we observe that the
host machine operates under different memory pressure regimes, as the
memory demand from guest VMs changes dynamically at runtime. Adapting
to this runtime system state is critical to reduce the performance
cost of VM memory management. We propose a novel dynamic memory
management policy called Memory Pressure Aware (MPA) ballooning. MPA
ballooning dynamically allocates memory resources to each VM based on
the current memory pressure regime. Moreover, MPA ballooning
proactively reacts and adapts to sudden changes in memory demand from
guest VMs. Next, I will discuss our work characterizing the impact of
current OS timeslice behavior on modern, shared, large last-level caches.
Here we show that, counter to recent trends shortening timeslices,
large last-level caches require lengthened timeslices to amortize
their fill time.
Paul V. Gratz is an Associate Professor in the department of
Electrical and Computer Engineering at Texas A&M University, currently
visiting the University of Edinburgh on sabbatical. His research
interests include efficient and reliable design in the context of high
performance computer architecture, processor memory systems and
on-chip interconnection networks. He received his B.S. and
M.S. degrees in Electrical Engineering from The University of Florida
in 1994 and 1997 respectively. From 1997 to 2002 he was a design
engineer with Intel Corporation. He received his Ph.D. degree in
Electrical and Computer Engineering from the University of Texas at
Austin in 2008. His papers "Path Confidence based Lookahead
Prefetching" and "B-Fetch: Branch Prediction Directed Prefetching for
Chip-Multiprocessors" were nominated for best papers at MICRO '16 and
MICRO '14 respectively. At ASPLOS '09, Dr. Gratz received a best
paper award for "An Evaluation of the TRIPS Computer System." In 2016
he received the "Distinguished Achievement Award in Teaching - College
Level" from the Texas A&M Association of Former Students and in 2017
he received the "Excellence Award in Teaching, 2017" from the Texas
A&M College of Engineering.
TBC(15 May, 2018)
Speaker: Dejice Jacob
Curried C++ Template Metaprogramming(09 May, 2018)
Speaker: Paul Keir
Rather than solely confirming aspects of correctness, types hold the potential to be the primary input in a program's construction. As a Turing complete sublanguage, C++ templates can run arbitrarily complex code at compile time. C++ templates are also a strict, functional language, though this aspect can be marginalised, owing to weak support for higher-order (meta)functions; and other omissions such as currying and type classes. In this talk we introduce a small library to allow idiomatic higher-order C++ metafunction classes to be implicitly curried, and demonstrate its application to a selection of interesting folds; with the assistance of the tacit/pointfree paradigm.
TBC(08 May, 2018)
Speaker: Ana Ibrahim
Novel certification challenges to airworthiness(02 May, 2018)
Speaker: Paul Caseley
New technologies pose evaluation challenges for airworthiness for future aircraft and their supporting systems. The UK Ministry of Defence has been investigating some of these new challenges through a 3 year research program which included academic and industry contributions. This presentation discusses some of the proposed and implemented solutions and ongoing challenges the research has highlighted for safety practitioners, developer policy holders and regulators. Topic areas for discussion include: Additive Manufacture (evidence based qualification /certification guidance), Multi Core processing environments (evaluating performance and provision of tools), Data Driven Safety (increased reliance on data in airworthiness, guidance strategies to evaluate), Pilot Substitution Functions (assisting operators through autonomous functions and reasoning of the system architectures including possible non deterministic functions), Mitigating Cyber – strategies to evaluate application of guidance / standards for threats and vulnerabilities affecting airworthiness.
TBC(01 May, 2018)
Speaker: Wing Li
On Uncoordinated Service Placement in Edge-Clouds, IEEE CloudCom 2017(25 April, 2018)
Speaker: Ioannis Psaras
TBC(24 April, 2018)
Speaker: Saad Nasser S Altamimi
SGXBounds: Memory Safety for Shielded Execution(18 April, 2018)
Speaker: Pramod Bhatotia
In this talk, I will present our work on how to efficiently
ensure memory safety for shielded execution in the untrusted environment.
TBC(17 April, 2018)
Speaker: Colin Perkins
TBC(10 April, 2018)
Speaker: Tim Storer
TBC(27 March, 2018)
Speaker: Ibrahim Alghamdi
Accelerating Deep Neural Networks on Low Power Heterogeneous Architectures(21 March, 2018)
Speaker: Jose Cano Reyes
Deep learning applications are able to recognise images and speech with great accuracy, and their use is now everywhere in our daily lives. However, developing deep learning architectures such as deep neural networks in embedded systems is a challenging task because of the demanding computational resources and power consumption. Hence, sophisticated algorithms and methods that exploit the hardware of the embedded systems need to be investigated. This paper is our first step towards examining methods and optimisations for deep neural networks that can leverage the hardware architecture of low power embedded devices. In particular, in this work we accelerate the inference time of the VGG-16 neural network on the ODROID-XU4 board. More specifically, a serial version of VGG-16 is parallelised for both the CPU and GPU present on the board using OpenMP and OpenCL. We also investigate several optimisation techniques that exploit the specific hardware architecture of the ODROID board and can accelerate the inference further. One of these optimisations uses the CLBlast library specifically tuned for the ARM Mali-T628 GPU present on the board. Overall, we improve the inference time of the initial serial version of the code by 2.8X using OpenMP, and by 9.4X using the most optimised version of OpenCL.
TBC(20 March, 2018)
Speaker: Abeer Ali
TBC(13 March, 2018)
Speaker: Mohammed Alhamed
Solving the Task Variant Allocation Problem in Distributed Robotics(06 March, 2018)
Speaker: Anna Lito Michala
We consider the problem of assigning software processes (or
tasks) to hardware processors in distributed robotics environments. We
introduce the notion of a task variant, which supports the adaptation
of software to specific hardware configurations. Task variants
facilitate the trade-off of functional quality versus the requisite
capacity and type of target execution processors. We formalise the
problem as a mathematical model that incorporates typical constraints
found in robotics applications; the model is a constrained form of a
multi-objective, multi-dimensional, multiple-choice knapsack problem.
We propose and evaluate three different solution methods to the
problem: constraint programming, a greedy heuristic and a local search
metaheuristic. We demonstrate the use of task variants in a real
interactive multi-agent navigation system, showing that constraint
programming improves the system's quality of service, as compared to the
local search metaheuristic, the greedy heuristic and a randomised
solution, by an average of 16%, 31% and 56% respectively.
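To make the idea of task variants concrete, the following is a minimal sketch of one possible greedy allocation. It is not the heuristic evaluated in the talk: the single-resource capacity model, the `Variant` type and the `greedy_allocate` function are all simplifying assumptions made for illustration.

```python
from typing import Dict, List, NamedTuple, Optional, Tuple


class Variant(NamedTuple):
    name: str
    quality: float  # functional quality delivered by this variant
    demand: float   # processing capacity the variant requires


def greedy_allocate(
    tasks: Dict[str, List[Variant]],
    capacities: Dict[str, float],
) -> Optional[Dict[str, Tuple[str, str]]]:
    """Pick one variant per task and place it on a processor.

    For each task, variants are tried from highest to lowest quality, and
    each is offered to the processor with the most spare capacity.
    Returns a mapping task -> (variant name, processor), or None if some
    task cannot be placed.
    """
    if not capacities:
        return None
    remaining = dict(capacities)
    allocation: Dict[str, Tuple[str, str]] = {}
    for task, variants in tasks.items():
        placed = False
        for variant in sorted(variants, key=lambda v: v.quality, reverse=True):
            # Processor with the most spare capacity right now.
            proc = max(remaining, key=remaining.get)
            if remaining[proc] >= variant.demand:
                remaining[proc] -= variant.demand
                allocation[task] = (variant.name, proc)
                placed = True
                break
        if not placed:
            return None
    return allocation
```

A greedy pass like this is fast but can settle for lower-quality variants than necessary, which is consistent with the abstract's finding that an exact method such as constraint programming achieves better quality of service.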
TBC(27 February, 2018)
Speaker: Wim Vanderbauwhede
Safe and Efficient Data Representations for Dynamic Languages with Shared-Memory Parallelism(21 February, 2018)
Speaker: Stefan Marr
The performance of dynamic languages has improved ever since the days of Self. Even so they provide a plethora of features seemingly at odds with an efficient
Ruby, or Smalltalk as fast as other less dynamic languages. However, there
remains one domain where dynamic languages haven't reached their more
conservative counterparts: shared-memory parallelism.
So far, only very few dynamic language implementations have shed their global
interpreter locks, which allow for simple and efficient data structures for
objects or collections because data races don't have to be considered. The few
implementations that did unfortunately expose applications to data races
originating in the VM.
This talk presents work on safe data representations for objects and built-in
collections that neither prevent parallel execution nor expose data races
originating in VM-level implementation choices. We show that it is possible to
avoid any overhead on single threaded code as well as making the data
structures scalable for parallel use cases.
Translating system models and paradigms(20 February, 2018)
Speaker: Tom Wallis
When we architect a new system, optimise an existing one, or investigate a system’s failure, we build models. Being able to represent and analyse what we build is essential. Unfortunately, different types of analysis require models of different kinds, and for very large systems the process of capturing and analysing that system model can be intractable.
One method to circumvent this is to translate between different sorts of system model. This way, we can capture information in a sensible format, but convert to something more sensible when performing analysis, and we already do this to a small extent when the paradigms being converted between are relatively similar. However, vastly different kinds of model — representing different system properties selected from a gamut of behaviours, traits, and environmental properties such as models of time — are difficult to convert. Why is this? What is the current state of the art? What are the potential risks? And how might we go about performing these translations?
A Design-Driven Methodology for the Development of Large-Scale Orchestrating Applications(14 February, 2018)
Speaker: Milan Kabáč
Our environment is increasingly populated with large amounts of smart objects to monitor free parking spaces, analyze material conditions in buildings, detect unsafe pollution levels in cities, etc. These massive amounts of sensing and actuation devices constitute large-scale infrastructures that span over entire parking lots, campuses of buildings or agricultural fields. Despite the fact that large-scale sensor infrastructures have been successfully deployed in a number of domains, the development of applications for these infrastructures remains challenging. In particular, considerable knowledge about the hardware/network specificities of the sensor infrastructure is required on the part of the developer. To address this issue, software development methodologies and tools raising the level of abstraction need to be introduced to allow non-expert developers to program applications.
Why aren't more users more happy with our VMs?(07 February, 2018)
Speaker: Laurence Tratt
Programming language Virtual Machines (VMs) are now widely used, from server applications to web browsers. Published benchmarks regularly show that VMs can optimise programs to the same degree as, and often substantially better than, traditional static compilers.
Yet, there are still people who are unhappy with the VMs they use. Frequently their programs don't run anywhere near as fast as benchmarks suggest; sometimes their programs even run slower than more naive language implementations. Often our reaction is to tell such users that their programs are "wrong" and that they should fix them.
This talk takes a detailed look at VM performance, based on a lengthy experiment: we not only uncovered unexpected patterns of behaviour, but found that VMs perform poorly far more often than previously thought. I will draw on some of my own experiences to suggest how we may have gotten into such a pickle. Finally, I will offer some suggestions as to how we might be able to make more VM users more happy in the future.
YewPar: A Framework for Scalable Re-useable Parallel Tree Search(06 February, 2018)
Speaker: Blair Archibald
How can applications benefit from NVRAM technology? Evaluation Methodology and Testing(31 January, 2018)
Speaker: Juan Herrera
The objective of NEXTGenIO, an EC H2020 project, is to design and implement a platform that can address the challenge of delivering scalable I/O performance to applications at the Exascale. NVRAM is used to reduce the latency gap between memory and storage. In order to evaluate the platform's effectiveness regarding I/O performance and throughput, a set of eight memory and I/O-bound applications have been selected. In this talk, the methodology for testing the NEXTGenIO platform will be presented. One of the topics to be discussed is how NVRAM will impact the end-to-end performance of these applications.
TBC(30 January, 2018)
Speaker: Niall Barr
Brain Computer Interface for Neurorehabilitation and Inclusive Gaming(24 January, 2018)
Speaker: Aleksandra Vuckovic
Two major clinical applications for Brain Computer Interfaces are for assistive devices and for neurorehabilitation. In my talk I will present research activities of the Rehabilitation Engineering group in these areas. The first part of the talk will be dedicated to two clinical applications that I’ve been working on for a number of years: rehabilitation of hand function and treatment of chronic pain in people with spinal cord injury. I will present our Impact Case study, a continuation from REF 2014, the development of BCI software for portable home based applications as medical device class 1 (ISO 62304) and first steps towards creating BCI service design. Following this, I will present some initial results and plans for the future for BCI inclusive serious games for rehabilitation and entertainment. I am looking forward to discussing areas of possible collaboration.
Sigma16: A computer architecture for teaching and research(23 January, 2018)
Speaker: John O'Donnell
On the Effort to Build Efficient Deep Learning Systems(13 December, 2017)
Speaker: Partha Maji
Quantum Computing and Computational Fluid Dynamics: potential applications and simulation on classical HPC systems(06 December, 2017)
Speaker: Rene Steijl
In recent years the field of quantum computing (QC) has grown into an active and diverse field of research and significant progress has been made with building quantum computers. For a small number of applications, quantum algorithms have been developed that would lead to a significant speed-up relative to classical methods when executed on a suitable quantum computer. Despite this research effort, progress in defining suitable applications for quantum computers has been relatively limited and two decades after their invention, Shor's algorithm for factoring composite integers and Grover's algorithm for quantum search are still among the main applications.
In the present work, we investigate the potential of quantum computing and suitably designed algorithms for future computational fluid dynamics (CFD) applications. In the absence of the required quantum hardware, large-scale parallel simulations on 'classical' parallel computers are required in developing such algorithms. The presentation will cover a number of quantum algorithms which can potentially be used effectively as part of larger CFD algorithms/methods. Errors introduced by quantum decoherence, gate errors as well as uncertainties introduced by quantum measurement operations all need to be accounted for in the analysis and design of new quantum algorithms for practical use. A parallel quantum computing simulator was developed as part of the present investigation. Challenges and results from simulations on HPC facilities form the second main aspect of this presentation. Finally, ideas and prospects for future developments will be presented.
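As a small illustration of simulating a quantum algorithm on a classical machine, the sketch below runs Grover's search on an explicit state vector. It is not the parallel simulator described in the talk: it is serial, ideal (no decoherence or gate errors), and the function name `grover_search` is hypothetical. It uses the standard iteration count of roughly (π/4)·√N for a search space of size N.

```python
import math
from typing import List


def grover_search(n_qubits: int, marked: int) -> List[float]:
    """Simulate Grover's algorithm on a classical state vector.

    Returns the measurement probability of each basis state after the
    standard number of Grover iterations, for a single marked index.
    """
    size = 2 ** n_qubits
    # Uniform superposition over all basis states.
    state = [1.0 / math.sqrt(size)] * size
    iterations = math.floor(math.pi / 4 * math.sqrt(size))
    for _ in range(iterations):
        # Oracle: flip the sign of the marked state's amplitude.
        state[marked] = -state[marked]
        # Diffusion: reflect every amplitude about the mean amplitude.
        mean = sum(state) / size
        state = [2 * mean - amplitude for amplitude in state]
    # Amplitudes stay real here, so probability is amplitude squared.
    return [amplitude * amplitude for amplitude in state]
```

The state vector has 2^n entries, so memory and time grow exponentially with the number of qubits; this is why large-scale simulations of this kind need HPC resources.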
Structural and Behavioural Types for SoC Design: Motivations and Challenges(05 December, 2017)
Speaker: Jan de Muijnck-Hughes
The Border Patrol Project seeks to investigate how state-of-the-art
advances in programming language theory can provide better guarantees
towards System-On-Chip (SoC) design and execution. Specifically we are
interested in extending existing work on structural type systems for
SoC with behavioural information. In particular, we are looking to
incorporate Value-Dependent Multi-Party Session Types using dependent
types.
In this talk I will aim to: discuss the goals of the Border Patrol
Project; outline some of the design challenges we have encountered so
far when looking to adapt multi-party session types for describing
hardware; and demonstrate how dependent types can help reason about the
structure of SoC architectures.
Towards High-Performance Code Generation for Streaming Applications on FPGA Clusters(28 November, 2017)
Speaker: S Waqar Nabi
High-performance computing on heterogeneous platforms in general and those with FPGAs in particular presents a significant programming challenge. We contend that compiler technology has to evolve to automatically optimize applications by transforming a given original program. We are developing a novel methodology based on “TYpe TRAnsformations” (TyTra) of a given scientific kernel.
I am going to talk about the overall TyTra framework, with a focus on the memory access optimizations required to maintain “streaming” on the FPGAs, which is essential to get performance out of these devices. A “2d-shallow-water” scientific model will be used as an illustration and I will discuss some recent results.
I will also discuss my experience of visiting the CHREC laboratory at the University of Florida this summer as part of a HiPEAC collaboration grant. I will talk a little about their FPGA cluster(s), and how working with them informs the development of the TyTra optimizing compiler.
Design Patterns for Robustness in Community Network Infrastructure(22 November, 2017)
Speaker: William Waites
End-host Driven Troubleshooting Architecture for Software-Defined Networks(21 November, 2017)
Speaker: Levente Csikor
The high variability in traffic demands, the advanced networking services at various layers (e.g., load-balancers), and the steady penetration of SDN technology and virtualization make the crucial network troubleshooting tasks ever more challenging over multi-tenant environments.
Service degradation is first noticed by the users, who are often the only ones with visibility of much of the relevant information (e.g., connection details) required for accurate and timely problem resolution. As a result, the infrastructure layer is often forced into continuous monitoring, which leads to wasteful resource management, not to mention long resolution times.
In this talk, I will propose an End-host-Driven Troubleshooting architecture (EDT), where users are able to share the application-specific connection details with the infrastructure to accelerate the identification of root causes of performance degradation, and to avoid the need for always-on, resource-intensive, and network-wide monitoring.
Utilizing EDT, I will show some essential tools for real end-to-end trace routing (PTR), identifying packet losses, and carrying out hop-by-hop latency measurements (HEL).
Energy Consumption per Operation in a Deep Learning Neural Network(15 November, 2017)
Speaker: Shufan Yang
A wide range of video/vision applications including robotics, advanced driver assistance and autonomous vehicles currently require high performance processing for object recognition. Many popular deep learning based object detection frameworks are quite impressive; however, these frameworks still require very high computation power. With the performance/power/area ratio limited in embedded systems, this poses an interesting problem. To build a quicker and more accurate video processing system, we have constructed a CPU/GPGPU/FPGA hybrid system to provide a flexible solution that combines software and hardware programmability to investigate energy consumption. This talk will cover our latest cross-cutting software and hardware programmable approach to address the performance/power/area ratio challenges posed by convolutional neural networks based on complex machine vision applications.
Bio: Dr Shufan Yang received a Ph.D. degree in Computer Science from the University of Manchester in 2010, under the supervision of Professor Steve Furber. She is currently a lecturer in the School of Engineering at the University of Glasgow. Prior to joining UoG, she was a post-doc in one of Europe's largest intelligent robotics groups (ISRC) at the University of Ulster from 2010 to 2012. Her research interests include System-on-Chip, Machine Vision and the implementation of reconfigurable architectures. She has published over 50 journal and conference papers. She has joined research projects in SpiNNaker, EUFP7 IM-CleveR and EU Si Elegans. Her research has been sponsored by DstL, EU and NCFS as well as industry partners from Rolls-Royce, Xilinx, ARM, NVidia and TIC Clean Companies.
From Monoliths to Microservices(14 November, 2017)
Speaker: Mircea Iordache
Network security has been struggling to protect modern networks because of non-parallelizable behaviour and monolithic functionality that doesn’t perform well in distributed environments due to design, often impacting overall performance for network users. Reimplementing security applications to conform to modern standards is unfeasible due to the scale of the undertaking, so I propose an alternative based on controlling behaviour to create flexible microservices that improve user-to-service latency and network infrastructure utilisation.
Caching the Internet(07 November, 2017)
Speaker: Marcel Flores
Towards a predictable cloud(01 November, 2017)
Speaker: Thomas Karagiannis
These are exciting times for technologies in the cloud. One of the key requirements for high-performance applications in today’s multi-tenant datacenters is performance predictability, a traditionally elusive property for shared resources like the network or storage. Yet, online services running in infrastructure datacenters need such predictability to satisfy application SLAs. Cloud datacenters require guaranteed performance to bound customer costs and spur adoption. In this talk, through the story of the Predictable Datacenters Project at Microsoft Research which resulted in key QoS features in Windows Server, I will describe how simple abstractions and mechanisms can offer predictable performance for shared cloud resources like the network and even storage. Finally, I will briefly discuss projects that the group in Cambridge is focusing on.
Thomas Karagiannis is a senior researcher with the Systems and Networking group of Microsoft Research Cambridge, UK. His research interests span most aspects of computer communications and networks with his current focus being on data centers and the cloud. His past work spans Internet measurements and monitoring, network management, home networks and social networks. He holds a Ph.D. degree in Computer Science from the University of California, Riverside and a B.S. from the Applied Informatics department of the University of Macedonia, in Thessaloniki, Greece. Thomas has published several papers in the premier venues for computer communications and networking and has served in several of the corresponding technical program committees.
Yesterday my Java profiler worked. Today it does not. Why?(31 October, 2017)
Speaker: Jeremy Singer
The Java virtual machine (JVM) and its hosted programming languages are evolving rapidly. Unfortunately there are two side effects. (1) Quantitative studies of characteristic behaviour are quickly outdated. (2) JVM profiling requires constant tool maintenance effort. This presentation explores how to make JVM profiling great again.
Software-Defined Datacenter Network Debugging(25 October, 2017)
Speaker: Myungjin Lee
Datacenter network debugging is complex. Existing network debuggers are even
more complex, requiring in-network techniques like dynamic switch rule
updates, collecting per-packet per-switch logs, collecting data plane
snapshots, packet mirroring, packet sampling, traffic replay, etc.
In this talk, I will call for a radically different approach: in contrast to
existing tools that implement the functionality entirely in-network (i.e., on
network switches), we should carefully partition the debugging tasks between the
edge and the network. To that end, I present PathDump, a minimalistic network
debugger that achieves two main goals: (1) implement a large fraction of
published network debugging functionalities using the network edge only; and (2)
for functionalities that cannot entirely be implemented at the edge, use
debugging at the edge to reduce the problem to a small subset of the network. In
particular, I will discuss the design, implementation and evaluation of PathDump
that runs over a real network comprising only commodity network components.
Mapping an Anycast CDN Using RIPE Atlas(24 October, 2017)
Speaker: Stephen McQuistin
Anycast CDNs announce the same IP address blocks from different physical sites, or Points-of-Presence (PoPs). They then rely upon Internet routing to map clients to PoPs, creating catchments: the set of clients that map to a given PoP. Optimisation of these catchments is important, as performance, scalability, and resilience are reduced by poor catchments (e.g., clients connecting to distant PoPs). Exploring new anycast configurations requires changes in anycast announcements, and understanding the impact of these changes is challenging. A large, diverse set of vantage points is required for coverage, but this makes it difficult to surface changes that are most significant to the CDN. In this talk, I’ll describe a methodology for mapping anycast catchments and evaluating changes in anycast configuration at a large CDN.
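As a rough illustration of the catchment idea in the abstract above, the sketch below groups vantage points by the PoP they reach and lists the clients that move when an anycast configuration changes. It is a minimal sketch only, not the methodology presented in the talk: the probe IDs, PoP names and function names are hypothetical, and the measurement step (for example, via RIPE Atlas) is assumed to have already produced a client-to-PoP mapping.

```python
from collections import defaultdict


def catchments(measurements):
    """Group clients by the PoP that served them.

    `measurements` maps a client (vantage point) ID to the PoP it reached.
    """
    result = defaultdict(set)
    for client, pop in measurements.items():
        result[pop].add(client)
    return dict(result)


def moved_clients(before, after):
    """Return {client: (old_pop, new_pop)} for clients whose PoP changed."""
    return {
        client: (before[client], after[client])
        for client in before
        if client in after and before[client] != after[client]
    }


# Hypothetical data: probe IDs and PoP names are invented for illustration.
before = {"probe-1": "LHR", "probe-2": "LHR", "probe-3": "FRA"}
after = {"probe-1": "LHR", "probe-2": "AMS", "probe-3": "FRA"}

print(catchments(before))
print(moved_clients(before, after))
```

Comparing the two mappings client by client is one simple way to surface which changes matter; a real analysis would also need to weight clients by traffic volume or latency, which this sketch ignores.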
Device Architectures, Networks and Applications: A Semiconductor Perspective(18 October, 2017)
Speaker: Tim Summers
The Lift Project: Performance Portable Parallel Code Generation via Rewrite Rules(17 October, 2017)
Speaker: Michel Steuwer, Adam Harries, Naums Mogers, Federico Pizzuti, Toomas Remmelg, Larisa Stoltzfus
The Lift project aims at generating high-performance code for parallel processors from a portable high-level program. Starting from a single high-level program, an optimisation process based on rewrite rules transforms the portable program into highly-specialised low-level code delivering high performance.
This talk will motivate the indispensability of performance portability given the increasing pace of the development of specialised hardware such as GPUs, FPGAs, or Google's TPU. After a brief introduction of the core aspects of Lift, the Lift team will give an overview of our ongoing research on using Lift for accelerating areas such as machine learning, physics simulations, graph algorithms, and linear algebra.
Improving Fuzzing with Deep Learning(10 October, 2017)
Speaker: Martin Sablotny
Today’s software products are complex entities with many functions. Those functions have to be tested thoroughly in order to prevent security issues. In the modern software development process, fuzzing plays an important role in finding security related bugs.
Nonetheless, developing generation based fuzzers for complex input formats is time-consuming work and requires a lot of knowledge about those formats. This work focuses on the use of deep learning algorithms in order to create HTML tags, which are combined into test cases and executed inside a browser. First results have shown that it is possible to learn the format from a generation based fuzzer and outperform it in terms of code coverage.
Winning the War in Memory(27 September, 2017)
Speaker: Prof Simon Moore
Memory safety bugs result in many vulnerabilities in our computer systems allowing exploits including recent security breaches: WannaCry, HeartBleed, CloudBleed and StackClash. To fundamentally improve computer system resilience to these attacks, we propose a new processor (CHERI) together with compiler and operating system support that mitigate these bugs with few changes to applications. CHERI provides fine grained memory protection using a new hardware supported type: the capability. Capabilities provide hardware enforced provenance, integrity and bounds checking for code and data references. We demonstrate how (code, data) capability pairs can be used for highly scalable and performant compartmentalisation. Efficient compartmentalisation allows the principle of least privilege to be widely applied, mitigating both known and unknown attacks. Though these changes to computer systems are radical, there is a clear adoption path and we are currently working with major commercial partners to transition the technology.
Towards Composable GPU Programming: Programming GPUs with Eager Actions and Lazy Views(26 September, 2017)
Speaker: Michel Steuwer
In this work, we advocate a composable approach to programming systems with Graphics Processing Units (GPU): programs are developed as compositions of generic, reusable patterns. Current GPU programming approaches either rely on low-level, monolithic code without patterns (CUDA and OpenCL), which achieves high performance at the cost of cumbersome and error-prone programming, or they improve the programmability by using pattern-based abstractions (e.g., Thrust) but pay a performance penalty due to inefficient implementations of pattern composition.
We develop an API for GPU programming based on C++ with STL-style patterns, together with its compiler-based implementation. Our API gives the application developers the native C++ means (views and actions) to specify precisely which pattern compositions should be automatically fused during code generation into a single efficient GPU kernel, thereby ensuring a high target performance. We implement our approach by extending the range-v3 library which is currently being developed for the forthcoming C++ standards. The composable programming in our approach is done exclusively in the standard C++14, with STL algorithms used as patterns which we re-implemented in parallel for GPU. Our compiler implementation is based on the LLVM and Clang frameworks, and we use advanced multi-stage programming techniques for aggressive runtime optimizations.
We experimentally evaluate our approach using a set of benchmark applications and a real-world case study from the area of image processing. Our codes achieve performance competitive with CUDA monolithic implementations, and we outperform pattern-based codes written using Nvidia’s Thrust.
Designing Processors to Accelerate Robot Motion Planning(20 September, 2017)
Speaker: Prof. Daniel J. Sorin
We have developed a hardware accelerator for motion planning, a critical operation in robotics. I will present the microarchitecture of our accelerator and describe a prototype implementation on an FPGA. Experimental results show that, compared to the state of the art, the accelerator improves performance by three orders of magnitude and improves power consumption by more than one order of magnitude. These gains are achieved through careful hardware/software co-design. We have modified conventional motion planning algorithms to aggressively precompute collision data, and we have implemented a microarchitecture that leverages the parallelism present in the problem.
Alternative Explicit Congestion Notification Backoff for TCP: Or how one small change makes the Internet Better(17 May, 2017)
Speaker: Gorry Fairhurst
Active Queue Management (AQM) with Explicit Congestion Notification (ECN) has been deployed in cloud data centres to minimise the latency and improve the near real-time deadlines for workflows such as Partition/Aggregate tasks. The talk explores how ECN can also reduce latency of transactional applications using the Internet. This leads to a simple sender-side change to TCP, “Alternative Backoff with ECN”, and how this can offer a compelling reason to deploy and enable ECN across the Internet. It finally outlines the path to standardisation and how future research can enable new applications.
Gorry Fairhurst is a Professor in the School of Engineering at the University of Aberdeen. His current research includes performance evaluation and protocol design, Internet transport architecture, rural broadband access and satellite networking. He has 20 years' experience working as an Internet Engineer, and is committed to open Internet standards and chairs the IETF’s Transport and Services Working Group (TSVWG).
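The sender-side change mentioned in the abstract above can be pictured as using a gentler multiplicative decrease when congestion is signalled by an ECN mark than when it is signalled by packet loss. The following is a minimal sketch under stated assumptions, not the specification discussed in the talk: the function name, the window floor and the factor of 0.8 for ECN are assumptions chosen for illustration, and the window is measured in segments for simplicity.

```python
BETA_LOSS = 0.5  # conventional halving of the congestion window on packet loss
BETA_ECN = 0.8   # assumed, gentler backoff factor for an ECN mark (illustrative)


def on_congestion(cwnd, signal, min_cwnd=2.0):
    """Return the new congestion window (in segments) after a congestion signal.

    `signal` is "loss" for a detected packet loss or "ecn" for an ECN mark.
    `min_cwnd` is a hypothetical floor so the window never collapses to zero.
    """
    if signal == "loss":
        beta = BETA_LOSS
    elif signal == "ecn":
        beta = BETA_ECN
    else:
        raise ValueError(f"unknown congestion signal: {signal!r}")
    return max(cwnd * beta, min_cwnd)


print(on_congestion(100.0, "loss"))
print(on_congestion(100.0, "ecn"))
```

The intuition is that an ECN mark from an AQM-managed queue usually indicates a short queue, so a smaller reduction keeps the link busy without building up delay.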
OpenCL Just-In-Time Compilation for Dynamic Programming Languages(03 May, 2017)
Speaker: Michel Steuwer & Juan Fumero
Computer systems are increasingly featuring powerful parallel devices with the advent of many-core CPUs and GPUs. This offers the opportunity to solve computationally-intensive problems at a fraction of the time traditional CPUs need. However, exploiting heterogeneous hardware requires the use of low-level programming language approaches such as OpenCL, which is incredibly challenging, even for advanced programmers.
On the application side, interpreted dynamic languages are increasingly becoming popular in many domains due to their simplicity, expressiveness and flexibility. However, this creates a wide gap between the high-level abstractions offered to programmers and the low-level hardware-specific interface. Currently, programmers must rely on high performance libraries or they are forced to write parts of their application in a low-level language like OpenCL. Ideally, nonexpert programmers should be able to exploit heterogeneous hardware directly from their interpreted dynamic languages.
In this talk, we present a technique to transparently and automatically offload computations from interpreted dynamic languages to heterogeneous devices. Using just-in-time compilation, we automatically generate OpenCL code at runtime which is specialized to the actual observed data types using profiling information. We demonstrate our technique using R, which is a popular interpreted dynamic language predominantly used in big data analytics. Our experimental results show the execution on a GPU yields speedups of over 150x compared to the sequential R implementation and the obtained performance is competitive with manually written GPU code. We also show that when taking into account start-up time, large speedups are achievable, even when the applications run for as little as a few seconds.
Simulating Variance in Socio-Technical Behaviours using Executable Workflow Fuzzing(02 May, 2017)
Speaker: Tim Storer
Socio-technical systems model the structure and interactions of
heterogeneous collections of actors, including human operators,
technical artefacts and organisations. Such systems are characterised
by the interactions of actors at different scales of activity and behave
according to a complex interplay of factors, including formally defined
business processes, legal or regulatory standards, technological
evolution, organisational culture or norms and interpersonal
relationships and responsibilities. The modelling and engineering of
such systems is still very much a craft, requiring repeated trial,
error and subsequent revision. Application of conventional systems
modelling methods is difficult, because socio-technical systems are not
readily disposed to functional decomposition, as the complex
interactions between components make a separation of concerns
difficult. As a consequence, existing techniques result in models that
either lack sufficient detail to capture the effect of subtle
contingencies; are too narrow to make useful assessments about the
larger system; are unable to
capture evolution in behaviours; or are so complex that analysis and
interpretation becomes intractable.
In this work, I will present a novel method for modelling
socio-technical systems that substantially reduces the difficulty of
simulating complex contingent behaviours. In our approach, informal,
contingent behaviours are modelled as aspects that can be applied
obliviously to alter actor behaviour described in idealised workflows.
The aspects apply code fuzzers to the workflow descriptions, adjusting
the flow of execution of a workflow and representing the variability
that can occur in real life systems. I will present a proof of concept
tool, Fuzzi Moss, and evaluate the approach using a case study of
software development workflows.
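To make the idea of applying a fuzzer to an idealised workflow more concrete, here is a minimal sketch. It is a deliberate simplification and not the Fuzzi Moss implementation: the workflow is represented as a plain list of step functions rather than source code, and all names (`remove_random_step`, `fuzzed`, the step names) are hypothetical.

```python
import random


def remove_random_step(steps, rng):
    """Fuzzer: drop one step from a workflow, e.g. a task that gets forgotten."""
    if not steps:
        return list(steps)
    skipped = rng.randrange(len(steps))
    return [step for i, step in enumerate(steps) if i != skipped]


def fuzzed(fuzzer, rng):
    """Aspect: wrap a workflow so its steps are fuzzed before each run.

    The workflow definition itself is left untouched and unaware of the aspect.
    """
    def decorate(workflow):
        def run(log):
            for step in fuzzer(workflow(), rng):
                step(log)
            return log
        return run
    return decorate


# Idealised workflow; the step names are invented for illustration.
def write_test(log):
    log.append("write_test")


def write_code(log):
    log.append("write_code")


def commit(log):
    log.append("commit")


def ideal_workflow():
    return [write_test, write_code, commit]


fuzzed_workflow = fuzzed(remove_random_step, random.Random(42))(ideal_workflow)
print(fuzzed_workflow([]))
```

Running the wrapped workflow repeatedly with different random seeds gives a population of varied executions, which is the kind of variance a simulation could then analyse.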
ePython: An implementation of Python for the micro-core Epiphany co-processor(26 April, 2017)
Speaker: Nick Brown
The Epiphany is a many-core, low power, low on-chip memory co-processor typical of a number of innovative micro-core architectures. The very low power nature of these architectures means that there is potential for their use in future HPC machines, and their low cost makes them ideal for HPC education & prototyping. However there is a high barrier to entry in programming due to the associated complexities and immaturity of supporting tools.
I will talk about ePython, a subset of Python for the Epiphany. Due to the idiosyncrasies of the hardware we have developed a new Python interpreter and this, combined with additional support for parallelism, has meant that novices can take advantage of Python to very quickly write parallel codes on the Epiphany and easily prototype their codes. In addition to running codes directly we have developed support for decorating kernels in existing Python codes and for these to be seamlessly offloaded, via ePython, to the Epiphany. I will discuss a prototype machine learning code for detecting lung cancer in 3D CT scans, where our decorators are used to offload the neural network onto the Epiphany in order to evaluate whether this technology is appropriate for these sorts of codes and what sort of performance one can expect.
Analytic Hierarchy Process Objective Function(25 April, 2017)
Speaker: Walaa Alayed
The IPv6 Routing Protocol for Low Power and Lossy Networks (RPL) relies on the use of external Objective Functions (OFs) for selecting the best path, where the majority of OFs are based on a single routing metric. In this talk I’ll be presenting an Analytic Hierarchy Process Objective Function (AHP-OF) inspired by multi-criteria decision making techniques. The idea of AHP-OF is to combine several routing metrics by using the Analytic Hierarchy Process (AHP) technique to provide a better neighbour selection compared to existing OFs. The motivation for designing AHP-OF is to satisfy the different application requirements for Low Power and Lossy Networks (LLN) such as reliable, real time and highly available applications.
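The general AHP step of turning pairwise importance judgements into metric weights, and then using those weights to rank neighbours, can be sketched as follows. This is an illustrative sketch only and not the AHP-OF design from the talk: it uses the common row geometric-mean approximation of the AHP priority vector, and the chosen metrics (ETX, energy, hop count), the comparison values and the normalised neighbour costs are all assumptions.

```python
import math


def ahp_weights(pairwise):
    """Approximate AHP priority weights with the row geometric-mean method.

    `pairwise[i][j]` states how much more important metric i is than metric j.
    """
    n = len(pairwise)
    geo_means = [math.prod(row) ** (1.0 / n) for row in pairwise]
    total = sum(geo_means)
    return [g / total for g in geo_means]


def best_neighbour(neighbours, weights):
    """Pick the neighbour with the lowest weighted cost.

    `neighbours` maps a neighbour name to its metric costs, each already
    normalised to [0, 1] with lower meaning better.
    """
    def cost(name):
        return sum(w * m for w, m in zip(weights, neighbours[name]))
    return min(neighbours, key=cost)


# Hypothetical judgements for (ETX, energy, hop count): ETX is twice as
# important as energy and six times as important as hop count.
pairwise = [
    [1.0, 2.0, 6.0],
    [1 / 2, 1.0, 3.0],
    [1 / 6, 1 / 3, 1.0],
]
weights = ahp_weights(pairwise)

# Hypothetical normalised costs per neighbour, in the same metric order.
neighbours = {"A": (0.2, 0.9, 0.5), "B": (0.3, 0.3, 0.4), "C": (0.9, 0.1, 0.1)}

print(weights)
print(best_neighbour(neighbours, weights))
```

A single-metric objective function would pick neighbour A on ETX alone or neighbour C on energy alone; combining the metrics lets a neighbour that is reasonable on every criterion win.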
Walk this Way(19 April, 2017)
Speaker: Prof. Des Higham
Many applications require us to summarize key properties of a large, complex network. I will focus on the task of quantifying the relative importance, or "centrality" of the network nodes. This task is routinely performed, for example, on networks arising in biology, security, social science and telecommunication. To derive suitable algorithms, the concept of a walk around the network has proved useful; through either the dynamics of random walks or the combinatorics of deterministic walks.
In this talk I will argue that some types of walk are less relevant than others. In particular, eliminating backtracking walks leads to new network centrality measures with attractive properties and, perhaps surprisingly, reduced computational cost. Defining, analysing and implementing these new methods combines ideas from graph theory, matrix polynomial theory and sparse matrix computations.
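To illustrate what eliminating backtracking means, the sketch below counts walks of a given length in a small undirected graph, with and without walks that step from u to v and immediately back to u. It is a minimal sketch and does not implement the centrality measures from the talk, which combine such walk counts in ways not described in this abstract; the function name and graph representation are assumptions.

```python
def count_walks(adj, length, backtracking=True):
    """Count walks with `length` edges in an undirected graph.

    `adj` maps each node to the set of its neighbours. With
    backtracking=False, walks containing a step u -> v -> u are excluded.
    """
    if length == 0:
        return len(adj)
    # State: the directed edge (prev, cur) traversed last, mapped to the
    # number of walks of the current length that end with that edge.
    counts = {(u, v): 1 for u in adj for v in adj[u]}
    for _ in range(length - 1):
        extended = {}
        for (prev, cur), c in counts.items():
            for nxt in adj[cur]:
                if not backtracking and nxt == prev:
                    continue
                extended[(cur, nxt)] = extended.get((cur, nxt), 0) + c
        counts = extended
    return sum(counts.values())


triangle = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
path = {0: {1}, 1: {0, 2}, 2: {1}}

print(count_walks(triangle, 3), count_walks(triangle, 3, backtracking=False))
print(count_walks(path, 2), count_walks(path, 2, backtracking=False))
```

Tracking only the last directed edge is enough to forbid immediate reversals, so the non-backtracking count needs no more state than the ordinary one.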
Scalable Computing Beyond the Cloud(29 March, 2017)
Speaker: Blesson Varghese
It is forecast that over 50 billion devices will be added to the Internet by 2020. Consequently, 50 trillion gigabytes of data will be generated. Currently, applications generating data on user devices, such as smartphones, tablets and wearables use the cloud as a centralised server. This will soon become an untenable computing model. The way forward is to decentralise computations away from the cloud towards the edge of the network closer to the user. In my talk, I will present challenges, my current research and vision to harness computing capabilities at the edge of the network. More information is available at www.blessonv.com.
Generative Programming and Product Family Engineering with WizardsWorkbench(21 March, 2017)
Speaker: Niall Barr
Language Workbenches are tools used to support the creation and use of Domain Specific Languages (DSLs), frequently for the purpose of supporting Language Oriented Programming (LOP) or Generative Programming. LOP is an approach to application development where a language that is close to the problem domain is created, and the application is developed in this language. Generative programming is the related approach where a language at a high level of abstraction is used, and source code in a more general purpose language is generated from that code. In this talk I will describe the approach to web application development using generative programming that I have been using and evolving over several years, and my simple language workbench, WizardsWorkbench. As these web applications tend to follow a fairly similar pattern, DSLs are reused and evolved as required, my approach can be considered to be a form of Product Family Engineering that utilises generative programming. I will also describe the example driven approach which is used with WizardsWorkbench to develop both the parsers and the code generation output templates as well as the two DSLs used internally by WizardsWorkbench for parsers and templates.
Programmable Address Spaces(15 March, 2017)
Speaker: Paul Keir
In the last decade, high-performance computing has made increasing use of heterogeneous many-core parallelism. Typically the individual processor cores within such a system are radically simpler than their predecessors; and an increased portion of the challenge in executing relevant programs efficiently is reassigned. Tasks, previously the responsibility of hardware, are now delegated to software. Fast, on-chip memory will primarily be exposed within a series of trivially distinct programming languages, through a handful of address space annotations, which associate discrete sections of memory with pointers; or similar low-level abstractions. Traditional CPUs would provide a hardware data cache for such functionality. Our work aims to improve the programmability of address spaces by exposing new functionality within the existing template metaprogramming system of C++.
GPU Concurrency: The Wild West of Programming(08 March, 2017)
Speaker: Tyler Sorensen
GPUs are co-processors originally designed to accelerate graphics computations. However, their high bandwidth and low energy consumption have led to general purpose applications running on GPUs. To remain relevant in the fast-changing landscape of GPU frameworks, GPU programming models are often vague or underspecified. Because of this, several programming constructs have been developed which violate the official programming models, yet execute successfully on a specific GPU chip, enabling more diverse applications to be written for that specific device. During my PhD, we have examined one such construct: a global synchronisation barrier (or GSB). In this talk, we will address three key questions around this rogue programming construct: (1) Is it *possible* to write a portable GSB that successfully executes on a wide range of today's GPUs? (2) Can a GSB be *useful* for accelerating applications on GPUs? And (3) can a programming model that allows a GSB be *sustainable* for future GPU frameworks? Our hope is that this investigation will help the GSB find a permanent home in GPU programming models, enabling developers to write exciting new applications in a safe and portable way.
Short Bio: Tyler’s research interests are in developing and understanding models for testing and safely developing GPU applications which contain irregular computations. In particular, he examines issues related to the GPU relaxed memory model and execution model. He received his MSc from University of Utah in 2014 and worked as an intern for the Nvidia compiler team during the summers of 2013 and 2014.
A Framework for Virtualized Security(07 March, 2017)
Speaker: Abeer Ali
Traditional network security systems consist of deploying high-performance and high-cost appliances (middleboxes) in fixed locations of the physical infrastructure to process traffic to prevent, detect or mitigate attacks. This limits their provisioning abilities to a static specification, hindering extensible functionality and resulting in vendor lock-in. Virtualizing security functions avoids these problems and increases the efficiency of the system. In this talk, we present the requirements and challenges of building a framework to deploy and manage virtualized security functions in a multitenant virtualized infrastructure like the Cloud, and how we can exploit the latest advances in Network Function Virtualization (NFV) and network services offered by Software-Defined Networking (SDN) to implement it.
Type-Driven Development of Communicating Systems using Idris(01 March, 2017)
Speaker: Dr. Jan de Muijnck-Hughes
Communicating protocols are a cornerstone of modern system design. However, there is a disconnect between the different tooling used to design, implement and reason about these protocols and their implementations. Session Types are a typing discipline that help resolve this difference by allowing protocol specifications to be used during type-checking to ensure that implementations adhere to a given specification.
Idris is a general purpose programming language that supports full dependent types, providing programmers with the ability to reason more precisely about programs. This talk introduces =Sessions=, our implementation of Session Types in Idris, and demonstrates the ability of =Sessions= to design and realise several common protocols.
=Sessions= improves upon existing Session Type implementations by introducing value dependencies between messages and fine-grained channel management during protocol design and implementation. We also use Idris' support for EDSL construction to allow for protocols to be designed and reasoned about in the same language as their implementation. This allows an intrinsic bond to be introduced between a protocol's implementation and specification, and also with its verification.
Using =Sessions=, we can reduce the existing disconnect between the tooling used for protocol design, implementation, and verification.
Next Generation Cyber-physical systems(22 February, 2017)
Speaker: Dr Steven J Johnston
Cyber-physical systems (CPS) have peaked in the hype curve and have demonstrated they are here to stay in one form or another. Many cities have attempted to retrofit 'smart' capabilities and there is no shortage of disconnected, often proprietary CPS addressing city infrastructure.
In the same way that online activity evolved from simplistic webpages to feature rich web 2.0, CPS also need to evolve. What will the Smart City 2.0 of tomorrow look like, how will the architectures evolve and, most importantly, how does this address the key challenges of cities: energy, environment and citizens? (Audience interaction welcomed)
Get Your Feet Wet With SDN in a HARMLE$$ Way(21 February, 2017)
Speaker: Levente Csikor
Software-Defined Networking (SDN) offers a new way to operate, manage, and deploy communication networks and to overcome many of the long-standing problems of legacy networking. However, widespread SDN adoption has not occurred yet, due to the lack of a viable incremental deployment path and the relatively immature present state of SDN-capable devices on the market. While continuously evolving software switches may alleviate the operational issues of commercial hardware-based SDN offerings, lagging standards-compliance, performance regressions, and poor scaling, they fail to match the cost-efficiency and port density. In this paper, we propose HARMLESS, a new SDN switch design that seamlessly adds SDN capability to legacy network gear, by emulating the OpenFlow switch OS in a separate software switch component. This way, HARMLESS enables a quick and easy leap into SDN, combining the rapid innovation and upgrade cycles of software switches with the port density and cost-efficiency of hardware-based appliances into a fully dataplane-transparent and vendor-neutral solution. HARMLESS incurs an order of magnitude smaller initial expenditure for an SDN deployment than existing turnkey vendor SDN solutions while, at the same time, yielding matching, or even better, data plane performance.
Network-layer QoE-Fairness for Encrypted Adaptive Video Streams(15 February, 2017)
Speaker: Dr Marwan Fayed
Netflix, YouTube, iPlayer, are increasingly targets of the following complaint: "How come my child gets HD streams on her phone, while I'm stuck with terrible quality on my 50 inch TV?" Recent studies observe that competing adaptive video streams generate flows that lead to instability, under-utilization, and unfairness behind bottleneck links. Additional measurements suggest there may also be a negative impact on users' perceived quality of experience as a consequence. Intuitively, application-generated issues should be resolved at the application layer. In this presentation I shall demonstrate that fairness, by any definition, can only be solved in the network. Moreover, that in an increasingly HTTP-S world, some form of client interaction is required. In support, a new network-layer 'QoE-fairness' metric will be be introduced that reflects user experience. Experiments using our open-source implementation in the home environment reinforce the network-layer as the right place to attack the general problem.
Bio: Marwan Fayed received his MA from Boston University and his PhD from the University of Ottawa, in 2003 and 2009 respectively, and in between worked at Microsoft as a member of the Core Reliability Group. He joined the faculty at the University of Stirling, UK in 2009 under the Scottish Informatics and Computer Science Alliance (SICSA) scheme. He recently held the appointment of 'Theme Leader' for networking research in Scotland. His current research interests lie in wireless algorithms, as well as general network, transport, and measurement in next generation edge networks. He is a co-founder of HUBS c.i.c., an ISP focussed on rural communities; recipient of an IEEE CCECE best paper award; and serves on committees at IEEE and ACM conferences.
The Last of the Big Ones: Crazy Stone, AlphaGo, and Master(14 February, 2017)
Speaker: John O'Donnell
The computer program AlphaGo made history in 2016 by defeating Lee Sedol, one of the top professional go players, in a five game match. A few weeks ago, an updated version of AlphaGo played 60 games against professionals and won them all. The current generation of strong go programs use neural networks and Monte Carlo tree search. These programs have a distinctive playing style and occasionally make astonishing moves, raising questions that are presently the focus of intensive research. This talk will explore some of these issues, and illustrate them with incidents from the history of go as well as from the recent games by computers.
Inference-Based Automated Probabilistic Programming in Distributed Embedded Node Networks(08 February, 2017)
Speaker: Dr. Mark Post
Driven by ever more demanding applications, modern embedded computing and automation systems have reached unprecedented levels of complexity. Dr. Post’s research focuses on applying novel software and hardware architectures to simplify and distribute the structure of robots and other embedded systems, to make them robust and able to operate under uncertainty, and also to allow for more efficient and automated development processes. One way to achieve this is via the unification of programming and data, made possible by using probabilistic abstractions of exact data. In a new methodology for embedded programming developed through this research, exact variables are replaced with random variables and a computation process is defined based on evidence theory and probabilistic inference. This has many advantages including the implicit handling of uncertainty, a guarantee of deterministic program execution, and the ability to apply both statistical on-line learning and expert knowledge from relational semantic sources. Implementation on real-time systems is made reliable and practical by applying modular and lock-free inter-process communication, semantic introspection and stochastic characterization of processes to build robust embedded networks based on wide-computing concepts. This methodology in general has a vast array of potential real-world applications, and some aspects have been applied successfully to embedded programming of planetary rovers and agricultural robots.
The Problem of Validation in Systems Engineering(07 February, 2017)
Speaker: Robbie Simpson
Systems Engineering makes extensive use of modelling and analysis methodologies to design and analyse systems. However, it is rare for these methodologies to be effectively validated for correctness or utility. Additionally, the common use of case studies as an implicit validation mechanism is undermined by the lack of validation of these case studies themselves. This talk explores the problem of validation with specific reference to requirements engineering and safety analysis techniques, identifies the main shortcomings and attempts to propose some potential solutions.
intra-systems: TBA(07 February, 2017)
Speaker: Robbie Simpson
Research On Network Intrusion Detection Systems and Beyond(06 February, 2017)
Speaker: Dr Kostas Kyriakopoulos
The talk will give an overview of research conducted in the "Signal Processing and Networks" group at Loughborough University, with a strong emphasis on the "Networks" side. We have developed algorithms for fusing cross-layer measurements using the Dempster-Shafer evidence framework to decide whether packets/frames in the network are coming from a malicious source or from the legitimate Access Point. We are currently researching how to infuse this system with contextual information besides the direct measurements from the network. The talk will also discuss other network-related topics, including ontologies for the management of networks, and give a brief introduction to the group’s signal processing expertise in defence areas.
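As a rough illustration of evidence fusion in the Dempster-Shafer framework, the sketch below applies Dempster's rule of combination to two mass functions. It is not the group's algorithm: the two measurement sources, the hypothesis names and all mass values are hypothetical.

```python
from itertools import product


def combine(m1, m2):
    """Dempster's rule: fuse two mass functions keyed by frozenset hypotheses."""
    fused = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        common = a & b
        if common:
            fused[common] = fused.get(common, 0.0) + wa * wb
        else:
            conflict += wa * wb  # mass that lands on contradictory hypotheses
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict")
    # Renormalise by the mass that was not in conflict.
    return {h: w / (1.0 - conflict) for h, w in fused.items()}


MALICIOUS = frozenset({"malicious"})
LEGITIMATE = frozenset({"legitimate"})
EITHER = MALICIOUS | LEGITIMATE  # ignorance: the source cannot tell

# Hypothetical beliefs about one frame from two cross-layer measurements.
signal_strength = {MALICIOUS: 0.6, LEGITIMATE: 0.1, EITHER: 0.3}
frame_timing = {MALICIOUS: 0.5, LEGITIMATE: 0.2, EITHER: 0.3}

fused = combine(signal_strength, frame_timing)
```

With these made-up inputs the fused belief in "malicious" is higher than either source reports alone, because the two sources agree and the mass they leave on ignorance is redistributed.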
Pycket: A Tracing JIT for a functional language(01 February, 2017)
Speaker: Sam Tobin-Hochstadt
Intra-Systems Seminar(24 January, 2017)
Speaker: Jeremy Singer
Jeremy presents an analysis of beginner Haskell code.
Exploiting Memory-Level Parallelism(18 January, 2017)
Speaker: Dr Timothy M Jones
Many modern data processing and HPC workloads are heavily memory-latency bound. Current architectures and compilers perform poorly on these applications due to the highly irregular nature of the memory access patterns involved. This leads to CPU stalling for the majority of the time. However, on closer inspection, these applications contain abundant memory-level parallelism that is currently unexploited. Data accesses are, in many cases, well defined and predictable in advance, falling into a small set of simple patterns. To exploit them though, we require new methods for prefetching, in hardware and software.
In this talk I will describe some of the work my group has been doing in this area over the past couple of years. First, I'll show a compiler pass to automatically generate software prefetches for indirect memory accesses, a special class of irregular memory accesses often seen in high-performance workloads. Next, I'll describe a dedicated hardware prefetcher that optimises breadth-first traversals of large graphs. Finally, I'll present a generic programmable prefetcher that embeds an array of small microcontroller-sized cores next to the L1 cache in a high-performance processor. Using an event-based programming model, programmers are able to realise performance increases of over 4x by manual creation of prefetch code, or 3.5x for the same application using an automatic compiler pass.
SpaceTime - A fresh view on Parallel Programming(14 December, 2016)
Speaker: Prof Sven-Bodo Scholz
Traditionally, programs are specified in terms of data structures and successive modifications of these. This separation dictates at what time which piece of data is located in what space, be it main memory, disc or registers. When aiming at high-performance, parallel executions of programs, it turns out that the choice of this time / space separation can have a vast impact on the performance that can be achieved. Consequently, a lot of work has been spent on compiler technology for identifying dependencies between data and on techniques for rearranging codes for improved locality with respect to both time and space. As it turns out, the programmer-specified choice of data structures often limits what can be achieved by such optimisation techniques. In this talk, we argue that a new way of formulating parallel programs that is based on a unified view on space and time not only matches typical scientific specifications much better, it also increases the re-usability of programs and, most importantly, it enables more radical space-time optimisations through compilers.
Reviewing the Systems Curriculum Review(13 December, 2016)
Speaker: Colin Perkins
Over the last few months, the Section has been engaged in a review of our undergraduate curriculum and teaching. This talk will outline the changes we’re proposing, and what we hope to achieve by doing so.
Knights Landing, MCDRAM, and NVRAM: The changing face of HPC technology(07 December, 2016)
Speaker: Mr Adrian Jackson
The hardware used in HPC systems is becoming much more diverse than we have been used to in recent times. Intel's latest Xeon Phi processor, the Knights Landing (KNL), is one example of such change; however, bigger changes in memory technologies and hierarchies are on the way. In this talk I will outline our experiences with the KNL, how future memory technologies are likely to impact the hardware in HPC systems, and what these changes might mean for users.
Performance Evaluation for CloudSim - Cloud Computing Simulator(06 December, 2016)
Speaker: Dhahi Alshammari
Much cloud computing research is performed using simulators. There are many simulators available. One of the most common simulators is "CloudSim", which is widely used as a cloud research tool. This talk will briefly review the CloudSim system and its various extensions. The extensions provide additional usability features and improved simulation fidelity. I will further present results of an empirical study to evaluate the precision of CloudSim by comparing it with actual test-bed results from the Glasgow Raspberry Pi Cloud infrastructure.
Erlyberly - Erlang tracing for the masses(30 November, 2016)
Speaker: Mr Andy Till
The BEAM virtual machine has flexible and powerful tooling for introspection, statistics and debugging without affecting the running application. Erlyberly is an ongoing project to lower the barrier to entry for using these capabilities, focusing on tracing.
Raspberry Pi based sensor platform for a smart campus(29 November, 2016)
Speaker: Dejice Jacob
In a sensor network, using sensor nodes with significant compute
capability can enable flexible data collection, processing and reaction. This
can be done using commodity single-board computers. In this talk, we will be
describing initial deployment, software architecture and some preliminary analysis.
Data Structures as Closures(23 November, 2016)
Speaker: Prof Greg Michaelson
In formalising denotational semantics, Strachey introduced a higher order update function for the modelling of stores, states and environments. This function relies solely on atomic equality types, lambda abstractions and conditions to represent stack disciplined association sequences as structured closures, without recourse to data structure constructs like lists.
Here, we present higher order functions that structure closures to model queue, linear ordered and tree disciplined look up functions, again built from moderately sugared pure lambda functions. We also discuss their type properties and practical implementation.
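To make the idea of a store built only from closures concrete, here is a minimal sketch of a Strachey-style update function, written in Python rather than the lambda notation of the talk. The names `empty` and `update` are illustrative, and only the stack-disciplined case is shown; the queue, ordered and tree variants from the talk are not reconstructed here.

```python
def empty(key):
    """The empty association: every lookup fails."""
    raise KeyError(key)


def update(assoc, key, value):
    """Wrap `assoc` in a closure that answers for `key` and defers otherwise."""
    return lambda k: value if k == key else assoc(k)


# No list or dictionary is involved: the "data structure" is a chain of closures.
# The most recent binding shadows earlier ones, giving stack discipline.
env = update(update(update(empty, "x", 1), "y", 2), "x", 3)
```

Looking up `env("x")` finds the newest binding for `x` first, while `env("y")` falls through one closure to the older binding; the whole structure relies only on equality tests, lambda abstraction and a conditional.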
intra-systems: TBA(22 November, 2016)
Speaker: John O'Donnell
Automatic detection of parallel code: dependencies and beyond(16 November, 2016)
Speaker: Mr Stan Manilov
Automatic parallelisation is an old research topic, but unfortunately, it
has always been over-promising and under-performing. In this talk, we'll
look at the main approaches towards automatically detecting parallelism in
legacy sequential code and we'll follow with some fresh ideas we're working
on, aiming to bring us beyond the ubiquitous dependence analysis.
Device Comfort for Information Accessibility(15 November, 2016)
Speaker: Tosan Atele-Williams
Device comfort is an augmented notion of trust that embodies a relationship between a device, its owner and the environment, with the device able to act, advise, encourage, and reason about everyday interactions, including a minutely precise comprehension of information management and personal security of the device owner. The growing privacy and security needs of an increasingly intuitive, interactive and interconnected society call for Device Comfort as an information security methodology based on trust reasoning. In this paper, an information accessibility architecture based on the Java security sandbox that uses the device comfort methodology is presented, with a further look at how information can be classified based on trust ratings and sensitivity, and how everything within this definition is confined to trusted zones or dimensions.
Dynamically Estimating Mean Task Runtimes(08 November, 2016)
Speaker: Patrick Maier
The AJITPar project aims to automatically tune skeleton-based parallel
programs such that the task granularity falls within a range that
promises decent performance: Tasks should run long enough to amortise
scheduling overheads, but not too long.
In this talk, I will sketch how AJITPar uses dynamic cost models to
accurately estimate mean task runtimes, despite irregular task sizes.
The key is random scheduling and robust linear regression.
(Joint work with Magnus Morton and Phil Trinder.)
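The abstract does not say which robust regression AJITPar uses, so the sketch below substitutes one well-known choice, the Theil-Sen estimator (median of pairwise slopes), purely to show why a robust fit helps when task sizes are irregular. The cost parameter, the runtimes and the straggler are all hypothetical.

```python
from itertools import combinations
from statistics import median


def theil_sen(xs, ys):
    """Fit y = slope * x + intercept using medians, so a few outliers barely matter."""
    slopes = [
        (y2 - y1) / (x2 - x1)
        for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2)
        if x2 != x1
    ]
    slope = median(slopes)
    intercept = median(y - slope * x for x, y in zip(xs, ys))
    return slope, intercept


# Hypothetical samples: a cost parameter per task and its measured runtime (ms).
# Runtimes follow 2 * size + 1, except for one straggler at size 5.
sizes = [1, 2, 3, 4, 5, 6, 7, 8]
runtimes = [3.0, 5.0, 7.0, 9.0, 60.0, 13.0, 15.0, 17.0]

slope, intercept = theil_sen(sizes, runtimes)
predicted_at_5 = slope * 5 + intercept  # estimate that ignores the straggler
```

An ordinary least-squares fit would be pulled towards the 60 ms straggler, whereas the median-based fit recovers the underlying linear trend from the remaining samples.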
Image processing on FPGAs with a DSL and dataflow transformations(02 November, 2016)
Speaker: Dr Rob Stewart
FPGAs are chips that can be reconfigured to exactly match the structure
of a specific algorithm. They are faster than CPUs and need less power
than GPUs, and hence are well suited for remote image processing needs.
They are however notoriously difficult to program, which is often done
by hardware experts working at a very low level. This excludes algorithm
designers across a range of real world domains from exploiting FPGA
technology. Moreover, time and space optimisation opportunities found in
compilers of high level languages cannot be applied to low level
This talk will be in three parts. 1) I will present RIPL, our image
processing FPGA DSL. It comprises algorithmic skeletons influenced by
stream combinator languages, meaning the RIPL compiler is able to
generate space efficient hardware. 2) I will demonstrate our compiler
based dataflow transformations framework, which optimises the dataflow
IR form of RIPL programs before they are synthesised to FPGAs. 3) I will
describe the FPGA based smart camera architecture that RIPL programs
slot into, which is used for evaluation.
Scaling robots and other stuff with Erlang(01 November, 2016)
Speaker: Natalia Chechina
I’m going to give this talk at the end of November at the BuildStuff developer conferences in Vilnius (Lithuania) and Kiev (Ukraine). So it’s a bit skewed towards the developer community rather than the research community. Any feedback will be very much appreciated.
I’ll talk about scalability and fault tolerance features of distributed Erlang. In particular, what makes it so good for large scale distributed applications on commodity hardware, where devices are inherently non-reliable and can disappear and re-appear at any moment.
The talk is based on experience of developing Scalable Distributed Erlang (SD Erlang -- a small extension of distributed Erlang for distributed scalability) and integrating Erlang in robotics. So, I’ll share the rationale behind the design decisions for SD Erlang, lessons learned, advantages, limitations, and plans for further development. I’ll then talk about the benefits of Erlang in distributed robotics, initial findings, and plans.
The Missing Link! A new Skeleton for Evolutionary Multi-Agent Systems in Erlang(26 October, 2016)
Speaker: Prof Kevin Hammond
Evolutionary multi-agent systems (EMAS) play a critical role in many artificial intelligence applications that are in use today. This talk will describe a new parallel pattern for parallel EMAS computations, and its associated skeleton implementation, written in Erlang using the Skel library. The skeleton enables us to flexibly capture a wide variety of concrete evolutionary computations that can exploit the same underlying parallel implementation. The use of the skeleton is shown on two different evolutionary computing applications: i) computing the minimum of the Rastrigin function; and ii) solving an urban traffic optimization problem. We can obtain very good speedups (up to 142.44× the sequential performance) on a variety of different parallel hardware from Raspberry Pis to large-scale multicores and Xeon Phi accelerators, while requiring very little parallelisation effort.
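For readers unfamiliar with the first benchmark, the sketch below defines the standard Rastrigin function and minimises it with a tiny sequential (1+1) evolution strategy. This is an illustration in Python only: it is not the EMAS skeleton, does not use the Skel library, and the mutation width, generation count and seed are arbitrary assumptions.

```python
import math
import random


def rastrigin(xs):
    """Rastrigin function: global minimum 0 at the origin, many local minima."""
    return 10 * len(xs) + sum(x * x - 10 * math.cos(2 * math.pi * x) for x in xs)


def evolve(dim=2, generations=2000, sigma=0.3, seed=0):
    """(1+1) evolution strategy: keep a mutated child only if it is no worse."""
    rng = random.Random(seed)
    parent = [rng.uniform(-5.12, 5.12) for _ in range(dim)]
    start_value = rastrigin(parent)
    for _ in range(generations):
        child = [x + rng.gauss(0, sigma) for x in parent]
        if rastrigin(child) <= rastrigin(parent):
            parent = child
    return start_value, parent, rastrigin(parent)


start_value, best, best_value = evolve()
```

The many local minima are what make the function a useful stress test: a single hill-climbing lineage like this one can get trapped, which is part of the motivation for population-based and agent-based approaches.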
Power, Precision and EPCC(25 October, 2016)
Speaker: Blair Archibald
I have recently returned from a summer working at EPCC, one of the
largest high performance computing (HPC) centres in the UK. In this
talk I'll give a whirlwind tour of what I got up to during my
I'll start by describing EPCC itself and how it fits into the wider
HPC community. Then I will dive into two of the projects I was involved
in over summer.
Firstly, the Adept project which tackles the challenges presented by
the need for energy efficient computing. This project relies heavily
on custom hardware to gain fine grain knowledge of power usage. We
will look at how energy scales with parallel efficiency, the
potential hidden cost of programming languages, and some interesting
future research directions.
Next, the ExaFLOW project aimed at providing the next generation of
computational fluid dynamics codes (ready for the "Exa-scale" era).
We will dive into mixed precision analysis and discover how we can
analyse floating-point behaviour of scientific codes by way of binary
From Robotic Ecologies to Internet of Robotic Things: Artificial Intelligence and Software Engineering Issues (19 October, 2016)
Speaker: Dr Mauro Dragone
Building smart spaces combining IoT technology and robotic
capabilities is an important and extended challenge for EU R&D&I, and a key
enabler for a range of advanced applications, such as home automation,
manufacturing, and ambient assisted living (AAL). In my talk I will provide an
overview of robotic ecologies, i.e. systems made up of sensors, actuators and
(mobile) robots that cooperate to accomplish complex tasks. I will discuss the
Robotic Ecology vision and highlight how it shares many similarities with the
Internet of Things (IoT): The ideal aim on both fronts is that arbitrary
combinations of devices should be able to be deployed in everyday environments,
and there efficiently provide useful services. However, while this has the
potential to deliver a range of disruptive services and address some of the
limitations of current IoT efforts, their effective realization necessitates
both novel software engineering solutions and artificial intelligence methods
to simplify their large scale application in real world settings. I will
illustrate these issues by focusing on the results of the EU project RUBICON
(fp7rubicon.eu). RUBICON built robotic ecologies that can learn to adapt to
changing and evolving requirements with minimum supervision. The RUBICON
approach builds upon a unique combination of methods from cognitive robotics,
machine learning, wireless sensor networks and software engineering. I will
summarise the lessons learned by adopting such an approach and outline
promising directions for future developments.
Mauro Dragone is Assistant Professor with the Research Institute of Signals,
Sensors and Systems (ISSS), School of Engineering & Physical Sciences at
Heriot-Watt University, Edinburgh Centre for Robotics. Dr. Dragone gained more
than 12 years of experience as a software architect and project manager in the
software industry before his involvement with academia. His research expertise
includes robotics, human-robot interaction, wireless sensor networks and
software engineering. Dr. Dragone was involved in a number of EU projects
investigating Internet of Things (IoT) and intelligent control solutions for
smart environments, before initiating and leading the EU project RUBICON
Data Plane Programmability for Software Defined Networks(18 October, 2016)
Speaker: Simon Jouet
OpenFlow has established itself as the de facto standard for Software Defined Networking (SDN) by separating the network's control and data planes. In this approach a central controller can alter the match-action pipeline of the individual switches using a limited set of fields and actions. This inherent rigidity prevents the rapid introduction of new data plane functionality that would enable the design of new forwarding logic and other packet processing such as custom routing, telemetry, debugging, security, and quality of service.
In this talk I will present BPFabric, a platform-, protocol-, and language-independent architecture to centrally program and monitor the data plane. It will cover the design of the switches and how they differ from "legacy" or OpenFlow switches, and the design of a control API to orchestrate the infrastructure.
Turbocharging Rack-Scale In-Memory Computing with Scale-Out NUMA(12 October, 2016)
Speaker: Dr Boris Grot
Web-scale online services mandate fast access to massive quantities of
data. In practice, this is accomplished by sharding the datasets across a
pool of servers within a datacenter and keeping each shard within a
server's main memory to avoid long-latency disk I/O. Accesses to non-local
shards take place over the datacenter network, incurring communication
delays that are 20-1000x greater than accesses to local memory. In this
talk, I will introduce Scale-Out NUMA -- a rack-scale architecture with an
RDMA-inspired programming model that eliminates chief latency overheads of
existing networking technologies and reduces the remote memory access
latency to a small factor of local DRAM. I will overview key features of
Scale-Out NUMA and will describe how it can bridge the semantic gap
between software and hardware through integrated support for atomic object
Boris Grot is a Lecturer in the School of Informatics at the University of
Edinburgh. His research seeks to address efficiency bottlenecks and
capability shortcomings of processing platforms for big data. His recent
accomplishments include an IEEE Micro Top Pick and a Google Faculty
Research Award. Grot received his PhD in Computer Science from The
University of Texas at Austin and spent two years as a post-doctoral
fellow at the Parallel Systems Architecture Lab at EPFL.
Full Section Meeting and Strategic Discussion(11 October, 2016)
Speaker: Phil Trinder
This session is essential for all members of the Systems Section. We will
- Meet new PhD students in the section
- Discuss progress since the Away Day
- Discuss strategic plans, including:
  - A Centre for Doctoral Training (CDT) proposal
  - A high-profile Section Workshop as part of the School’s 60th anniversary celebrations
Feel free to propose other topics by email to Phil.Trinder@glasgow.ac.uk
Towards Reliable and Scalable Robot Communication(10 October, 2016)
Speaker: Phil Trinder
The Robot Operating System (ROS) is the de facto standard middleware
for modern robots. However, communication between
ROS nodes has scalability and reliability issues in practice. This talk reports
an investigation into whether Erlang’s lightweight concurrency
and reliability mechanisms have the potential to address these issues.
The basis of the investigation is a pair of simple but typical
robotic control applications, namely two face-trackers: one using
ROS publish/subscribe messaging, and the other a bespoke Erlang
The talk reports experiments that compare five key aspects of the ROS
and Erlang face trackers. We find that Erlang communication scales
better, supporting at least 3.5 times more active processes (700 processes)
than its ROS-based counterpart (200 nodes) while consuming
half of the memory. However, while both face tracking prototypes
exhibit similar detection accuracy and transmission latencies
with 10 or fewer workers, Erlang exhibits a continuous increase in
the total time taken to process a frame as more agents are added,
which we have identified is caused by function calls from Erlang
processes to Python modules via ErlPort. A reliability study shows
that while both ROS and Erlang restart failed computations, the
Erlang processes restart 1000–1500 times faster than ROS nodes,
reducing robot component downtime and mitigating the impact of
Joint work with Andreea Lutac, Natalia Chechina, and Gerardo Aragon-Camarasa
Teach You a Haskell Course(20 September, 2016)
Speaker: Jeremy Singer
This week, our Functional Programming in Haskell course began. We have around 4000 learners signed up for this massive open online course. Wim and I have spent the past six months developing the learning materials, mostly adapted from the traditional Functional Programming 4 course.
In this talk, I will give an overview of the challenges involved in setting up and running an online course. In short, hard work but very rewarding!
Systems: Climate is what you expect, the weather is what you get!(13 September, 2016)
Speaker: Professor Saji Hameed
This old saying among weather forecasters is correct. Yet it gives little insight into the workings of the climate system. While weather can be understood and simulated as instabilities arising within the atmosphere, climate involves interactions and exchanges of properties among a wide variety of subsystems that include, for example, the atmosphere, the ocean and land subsystems. I will first discuss an example of these interactions at play, showcasing the El Nino phenomenon. In the rest of the talk, I will endeavor to describe how software for climate models integrates experiences and expertise across a wide range of disciplines and the computational challenges faced by the climate modeling community in doing so.
Biography: Professor Hameed is a Senior Associate Professor at the University of Aizu in Fukushima, Japan. He was the Director of Science at APCC, Korea and has been appointed a Senior Visiting Scientist at the Japan Agency for Marine-Earth Science and Technology (JAMSTEC).
He is credited with the discovery of an ocean-atmosphere coupled mode "Indian Ocean Dipole" which radically changed the prevailing paradigms. At APCC he pioneered an information technology based approach for generating and distributing climate information for societal benefit. He has also worked with APCC and its international partners to develop a climate prediction based approach to managing severe haze and forest fires in Southeast Asia, a severe environmental pollution issue in the area. He is also closely working with scientists at the National Institute of Advanced Science and Technology (Japan) to apply climate and weather science for renewable energy applications.
His current work includes investigating Super El Nino using computational modeling approaches, analyzing climate data using machine learning algorithms, tracking clouds and rain with low cost GPS chips, and continuing investigation into Indian Ocean Dipole that affects global climate.
Systems Seminar: Sorting Sheep from Goats - Automatically Clustering and Classifying Program Failures(07 September, 2016)
Speaker: Marc Roper
In recent years, software testing research has produced notable advances in the area of automated test data generation. It is now possible to take an arbitrary system and automatically generate volumes of high-quality test data. But the problem of checking the correctness or otherwise of the outputs (termed the "oracle problem") still remains.
This talk examines how machine learning techniques can be used to cluster and classify test outputs to separate failing and passing cases.
The feasibility of the approach is demonstrated and shown to have the potential to reduce by an order of magnitude the numbers of outputs that need to be examined following a test run.
This is joint work carried out with Rafig Almaghairbe.
Biography: Dr Marc Roper is a Reader in the Department of Computer and Information Sciences at the University of Strathclyde. He has an extensive background in software engineering, particularly in understanding and addressing the problems associated with designing, testing, and evolving large software systems. Much of his research has incorporated significant empirical investigations: either based around controlled participant-based experiments or through the analysis of open-source systems and large-scale repositories. His more recent work has explored the application of search-based strategies and machine learning techniques to software engineering problems such as test data generation, the identification of security anomalies, and automatic fault detection. His current interests lie in the area of software analytics, in particular building models of software systems behaviour to automatically identify and locate faults.