Big Data Analytics STATS4042

  • Academic Session: 2018-19
  • School: School of Mathematics and Statistics
  • Credits: 10
  • Level: Level 4 (SCQF level 10)
  • Typically Offered: Semester 2
  • Available to Visiting Students: Yes
  • Available to Erasmus Students: Yes

Short Description

This course introduces methods of computational inference, with an emphasis on practical issues and applications.

Timetable

15 lectures (1 or 2 each week)

5 1-hour tutorials

5 2-hour computer-based practicals

Requirements of Entry

None

Excluded Courses

STATS5016 Big Data Analytics (Level M)

Assessment

90-minute, end-of-course examination (80%)

Project (20%)

Main Assessment In: April/May

Are reassessment opportunities available for all summative assessments? Not applicable

Reassessments are normally available for all courses, except those which contribute to the Honours classification. For non-Honours courses, students are offered reassessment in all or any of the components of assessment if the satisfactory (threshold) grade for the overall course is not achieved at the first attempt. This is normally grade D3 for undergraduate students and grade C3 for postgraduate students. Exceptionally, it may not be possible to offer reassessment of some coursework items, in which case the mark achieved at the first attempt will be counted towards the final course grade. Any such exceptions for this course are described below.

Course Aims

This course aims:

■ to introduce students to big data methods commonly applied in Statistics, notably regularised regression;

■ to introduce students to the basic aspects of the geometry of high-dimensional data and of the role of sparsity in the analysis of high-dimensional data;

■ to introduce students to graphical models and how they can be used for structural inference in high-dimensional data;

■ to introduce Bayesian networks and Markov networks as elementary probabilistic graphical models;

■ to introduce the basics of network analysis, notably metrics of network characteristics.

Intended Learning Outcomes of Course

By the end of this course students will be able to:

■ describe the challenges of the analysis of high-dimensional data and discuss, in a particular context, strategies for tackling big data problems;

■ formulate and fit a regularised linear model, such as ridge regression, the LASSO or partial least squares;

■ infer statements about (conditional) independence from graphical models and factorisations of the joint distribution;

■ describe methods for structural inference in graphical models (such as Bayesian networks and Markov networks) and apply them in a given context;

■ understand basic aspects of the structure of a network by means of metrics of network characteristics.

Minimum Requirement for Award of Credits

Students must submit at least 75% by weight of the components (including examinations) of the course's summative assessment.