Big Data Analytics (Level M) STATS5016

  • Academic Session: 2023-24
  • School: School of Mathematics and Statistics
  • Credits: 10
  • Level: Level 5 (SCQF level 11)
  • Typically Offered: Semester 2
  • Available to Visiting Students: Yes

Short Description

This course introduces methods of computational inference, with an emphasis on practical issues and applications. It gives students an opportunity to learn more about one additional topic by reading books and papers and writing an essay to summarise what they have learnt.

Timetable

15 lectures (1 or 2 each week)

5 1-hour tutorials

5 2-hour computer-based practicals

Requirements of Entry

Some optional courses may be constrained by space and entry to these is not guaranteed unless you are in a programme for which this is a compulsory course.

Excluded Courses

STATS4042 Big Data Analytics

Assessment

90-minute, end-of-course examination (75%)

Coursework (25%)

Main Assessment In: April/May

Course Aims

This course aims:

■ to introduce students to big data methods commonly applied in Statistics, notably regularised regression;

■ to motivate regularised regression geometrically and to identify methodological links between regularised regression and linear models;

■ to introduce students to the basic aspects of the geometry of high-dimensional data and the role of sparsity in its analysis;

■ to introduce students to graphical models and how they can be used for structural inference in high-dimensional data;

■ to introduce Bayesian networks and Markov networks as elementary probabilistic graphical models;

■ to introduce the basics of network analysis, notably metrics of network characteristics.

Intended Learning Outcomes of Course

By the end of this course students will be able to:

■ describe the challenges of the analysis of high-dimensional data and discuss, in a particular context, strategies for tackling big data problems;

■ formulate and fit a regularised linear model, such as ridge regression, the LASSO or partial least squares;

■ understand the theory underpinning regularised regression and how it is connected mathematically with GLMs;

■ infer statements about (conditional) independence from graphical models and factorisations of the joint distribution;

■ describe methods for structural inference in graphical models (such as Bayesian networks and Markov networks) and apply them in a given context;

■ understand basic aspects of the structure of a network by means of metrics of network characteristics.

Minimum Requirement for Award of Credits

Students must submit at least 75% by weight of the components (including examinations) of the course's summative assessment.